DNS (Domain Name System)
HTTP (Hypertext Transfer Protocol)
IRC (Internet Relay Chat)
SMTP (Simple Mail Transfer Protocol)
Figure 38: The OSI Model and Common Protocols
This handoff normally occurs through an API of some kind.
Presentation Layer
Layer 6 – the Presentation layer – wraps more specific content into a generic wrapper that any computer implementing Layer 6 will understand. The Presentation layer also adds compression and encryption.
Protocols working at this layer typically are:
MIME (Multipurpose Internet Mail Extensions)
TIFF
GIF
JPG
MPEG
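To make this concrete, here is a minimal sketch, using Python's standard library, of how a format such as MIME from the list above wraps specific content in a generic, self-describing envelope that any receiver can interpret; the message text and subject are made up for illustration.

```python
# A minimal sketch of presentation-layer-style wrapping using MIME:
# specific content is placed inside a generic envelope whose headers
# describe its type and encoding. The text and subject are placeholders.
from email.mime.text import MIMEText

message = MIMEText("Quarterly report attached.", "plain", "utf-8")
message["Subject"] = "Report"

# The headers tell any receiving system how to interpret the body.
print(message.as_string())
```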
Session Layer
Layer 5 – the Session layer – is concerned with establishing a session between the same application running on two different computers. The Session layer can provide this communication in three modes:
Simplex – can communicate in one direction only
Half-duplex – can communicate in both directions, but one at a time
Full-duplex – can communicate in both directions simultaneously
Don’t get this confused with the next layer down, the transport layer. The Session layer sets up communication between applications, while the Transport layer sets up communication between computers. Interprocess communication, such as a remote procedure call, or RPC, takes place at this layer. RPC is insecure as it does not provide for authentication, but Secure RPC, or SRPC, does. Note that session layer protocols are no longer used very often, and it is considered a best practice to disable them.
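As a quick illustration of why plain RPC is considered insecure, here is a minimal sketch of interprocess communication using Python's built-in XML-RPC modules. The echo service and port number are hypothetical; the point is that nothing in the exchange identifies or authenticates the caller.

```python
# Minimal sketch of an unauthenticated remote procedure call.
# The echo service and port are hypothetical; the server has no idea
# who is calling it.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def echo(message):
    # No credentials are presented or checked anywhere in this exchange.
    return f"echo: {message}"

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(echo, "echo")
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy("http://localhost:8000")
print(client.echo("hello"))   # any process that can reach the port may call this
```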
Transport Layer
Layer 4 – the Transport layer – is all about ensuring data gets to the destination intact. In this layer, two computers will agree on:
How much information to send in a single burst
How to verify the integrity of the data
How to determine if some data was lost along the way
This is essentially agreeing on how two computers are going to communicate with each other. Connection-oriented protocols working at this layer, such as the Transmission Control Protocol (TCP), provide reliable data transmission with retries. Contrast this with the User Datagram Protocol (UDP), which is more of a ‘fire and forget’ mechanism – UDP sends the packet but doesn’t care if it made it. TCP, on the other hand, will send packets and then wait around to see if they made it; if it detects that a packet got lost somewhere, it will send it again.
The transport layer is where TCP and UDP ports are specified, such as port 80 for HTTP, or port 21 for FTP.
Protocols working at this layer are:
TCP
UDP
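The difference between the two is easy to see in code. The following is a minimal sketch using Python sockets; the host names and ports are placeholders, and error handling is omitted for brevity.

```python
# Contrasting TCP (connection-oriented) with UDP (fire and forget).
# Host names and ports are placeholders for illustration only.
import socket

# TCP: a handshake is performed, data is acknowledged, and lost
# segments are retransmitted automatically.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp:
    tcp.connect(("example.com", 80))   # port 80 = HTTP
    tcp.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    reply = tcp.recv(1024)             # we know this either arrives or the call fails

# UDP: the datagram is sent and immediately forgotten. If it is lost
# along the way, nobody is ever told.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
    udp.sendto(b"ping", ("example.com", 9999))
```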
Network Layer
Layer 3 – the Network layer – is all about making sure the packet gets to the correct location. For TCP, UDP and ICMP, this is where the IP address is added. Protocols working at this layer are:
Internet Protocol (IP)
Internet Control Message Protocol (ICMP)
Routing Information Protocol (RIP)
Open Shortest Path First (OSPF)
Border Gateway Protocol (BGP)
Internet Group Management Protocol (IGMP)
People often assume an IP address is the only way to address packets across a network, but the truth is that IP is simply the most common method, not the only one. The completed Network Layer 3 envelope is called a packet.
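If you want to see layer 3 addressing in action, here is a small sketch using Python's ipaddress module; the addresses and subnet are made up, but the decision shown is the one a router makes for every packet it handles.

```python
# A small sketch of the layer 3 routing decision: where does this packet
# go, based solely on its destination IP address? The addresses and
# subnet below are examples only.
import ipaddress

destination = ipaddress.ip_address("192.168.10.25")
local_net = ipaddress.ip_network("192.168.10.0/24")

if destination in local_net:
    print(f"{destination} is on the local network - deliver directly")
else:
    print(f"{destination} is on a remote network - forward to the next hop")
```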
Data Link Layer
Layer 2 – the Data Link layer – is probably the most complex of all the layers, because it is split into two sublayers – the Logical Link Control sublayer, or LLC, and the Media Access Control sublayer, or MAC.
By the time we get to the Data Link Layer, we are almost to the point of putting data onto the physical ‘wire’. The LLC sublayer communicates directly to the network layer above it, and:
Provides multiplexing – allows multiple protocols such as IP and IPX to exist on the network at the same time
Provides flow control
Manages errors
Once the LLC sublayer has performed its duties, it will hand the data down to the MAC sublayer, which knows what physical protocol the network is using – Ethernet, Token Ring, ATM, wireless, etc. The MAC sublayer adds a few additional header values right before it is physically sent.
Note that the IEEE standards, such as 802.11 (wireless), 802.3 (Ethernet), 802.5 (Token Ring), etc. all happen at the MAC sublayer.
Protocols working at the Data Link Layer are:
Point-to-Point Protocol (PPP)
ATM
Layer 2 Tunneling Protocol (L2TP)
FDDI
Ethernet
Token Ring
Each of the above protocols defines the physical medium used to transmit signals. The MAC sublayer takes bits and decides how to turn them into physical signals. For example, if a bit value of ‘1’ needs to be sent over an Ethernet network, the MAC sublayer will tell the physical layer to create a voltage of 0.5 volts. If the same bit needs to be sent over an ATM network, the voltage might be 0.9 volts. Just remember that the intelligence of how to create signals using the different protocols happens in the MAC sublayer, and therefore in the Data Link Layer. Actually producing the electrical voltages does not happen yet – just the decision on what the voltage is going to be. The finished Data Link Layer 2 envelope is called a frame.
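To give you a feel for what a frame actually is, here is a hedged sketch that lays out an Ethernet frame header by hand; the MAC addresses are invented, and the frame check sequence normally appended by the hardware is left out.

```python
# Sketch of a layer 2 'envelope': an Ethernet frame header followed by
# the payload handed down from layer 3. The MAC addresses are made up,
# and the trailing frame check sequence is omitted.
import struct

dst_mac = bytes.fromhex("aabbccddeeff")
src_mac = bytes.fromhex("112233445566")
ethertype = 0x0800                    # 0x0800 means the payload is an IPv4 packet

payload = b"...the layer 3 packet goes here..."
frame = struct.pack("!6s6sH", dst_mac, src_mac, ethertype) + payload
print(frame.hex())
```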
Physical Layer
Layer 1 – the Physical layer – converts the bits into voltages for transmission. The MAC sublayer of the Data Link Layer has already decided what voltage needs to be used, so the physical layer is responsible for creating the voltage. Depending on the physical medium being used, this layer will control synchronization, line noise, data rates and various transmission techniques. Physical optical, electrical and mechanical connectors used for transmission are a part of this layer.
Section 3: Secure Software Development
This section is focused on how to develop, deploy and maintain secure software that is completely contained within a company. In other words, it assumes that all development is being performed by employees or is partially outsourced to contractors directly managed by the company. In either case, we are assuming that the system is hosted by the company either internally or in the cloud. Purchased software and software that is 100% outsourced is addressed in the final section, Supply Chain Management. But for now, let’s focus on internally-developed software only.
This book is designed to address all roles involved in delivering secure software - from the very first point in time when the spark of an idea forms in someone’s mind, all the way through to the final retirement of the software and its data. While all organizations are different, there are twelve distinct roles that will always exist. Many times, a single individual or team will fill more than one role, but the role remains distinct regardless. A mature organization will recognize the various roles and be very intentional in ensuring the related security duties are carried out properly. These roles are the following:
An auditor, who ensures that all other roles play their part and that gaps in security do not exist.
The security team, who performs the day-to-day monitoring and auditing activities required to keep the system secure.
The product team, who owns the system, dictates requirements and takes responsibility for highlighting security requirements.
The project manager, who owns the project processes, and ensures smooth communication and progress.
The change advisory board, or CAB, who approve new releases and versions into the production environment.
The architect - one of the most crucial roles in a secure system - who considers security requirements, the overall design, development efforts and deployment capabilities.
The engineering manager, who acts as the secondary glue after the project manager.
The testing team, who is responsible for writing and automating test cases, and has the go/no-go power over releases, second only to the CAB.
The development team, who implements the architect’s designs in a secure manner, and performs peer code reviews.
The DBA, who ensures a safe and reliable database.
The infrastructure team, who owns the various environments such as development, staging, testing and production, and ensures that monitoring and auditing capabilities are continuously running.
The DevOps team, who takes care of deployments to the various environments, and ideally implements an automated build pipeline that executes tests for the testing team.
The content within this section is divided into each of the twelve roles. For example, if you are a developer and just want to know about secure coding patterns, you can jump right to the section on Development. If you are an architect, you can jump to the section labeled Architect. However, as an architect, you will also need to understand the information covered under Development. I have ordered the various roles in such a way to show such dependencies, as shown in Figure 39.
Keep in mind that this chart illustrates knowledge areas, not any type of organizational reporting hierarchy. Starting from the top and moving down, a security auditor must understand pretty much everything, regardless of the role. Specifically, an Auditor will need to cover all topics underneath Project, Security, and Change Management. The Project role will need to also read all content under the Product role. Change Management pretty much stands alone. The Security role must include everything an Architect understands, which covers everything that an Engineering Manager deals with. Under the Engineering Manager role, we find two more – Development and Infrastructure. Development includes the DBA role, while Infrastructure includes the DevOps role. Essentially, choose the role you wish to know about, and you will need to read all sections underneath that role as well. As this is a book about creating secure software, it should be no surprise to find out that the Developer role has by far the most content. Having said all of this, if you want to truly understand secure software development, you should read the entire book, just as an Auditor would need to do.
Underlying every role is a block called Core Concepts, which is the material we covered in Section 1. If you do not yet feel that you have a good grasp of the core concepts, you might want to reread that content now. Everything from this point forward builds on those subjects, and you might find yourself getting lost if you do not have a good grasp of the foundational elements.
Figure 39: Role Dependencies
Chapter 34: The DevOps Role
The DevOps role is responsible for deploying builds to the various environments. The term ‘DevOps’ is a combination of ‘development’ and ‘operations’, and the team is ideally composed of members from both groups. In this book, we use the term ‘infrastructure’ to refer to the operations team, but ‘DevInfra’ doesn’t sound nearly as cool. The idea behind this role is that deployment and infrastructure accountability is assigned to both teams, who work in close collaboration to achieve success. This prevents the blame game when something goes wrong, which will always happen – since both roles have skin in the game, they tend to work together instead of pointing fingers.
Environments
While every organization is different, the following environments are normally found in a well-oiled machine:
Development, which is controlled by the development team and is never guaranteed to be stable. This is where new things are tried out and is where code integrated from multiple developers is first tested by the development team. Builds are deployed multiple times per day.
Systems Integration Test, or SIT, is where builds from multiple systems are deployed to test end-to-end integration with changes that are not yet ready to be fully tested by the testing team. Builds are deployed several times a week.
Test, which is controlled by the testing team. Here is where test cases are written and executed as automated tests. Some manual tests are performed as well, but these should be kept to a bare minimum. Product often uses this environment to sign off on user stories. The test environment is rarely a mirror of the production environment, but it is crucial that we can extrapolate a reasonable estimate of production performance and load based on how code performs in the test environment.
Staging, which should be a mirror of Production. This is where we test and rehearse deployments prior to Production.
Production, or the real run-time environment that is public-facing. This environment must be protected at all costs.
Secure Build Environments
The final step that takes source code and generates the run-time files to deploy into an environment is called the build process. There are numerous tools out there to help with this, some of which are cross-platform and language-agnostic, while others are geared specifically toward a target language and run-time environment. What they all have in common, however, is that they typically use some type of scripting language to allow customization and require a considerable amount of configuration. Automated builds using these tools are absolutely a requirement in order to achieve quality and repeatability, but they can be a double-edged sword. If an attacker – or malicious employee – were able to gain control of the build pipeline, then all of our careful security code reviews would be absolutely useless, as he would be able to inject whatever malicious code he desires during the build process itself. The results would be virtually undetectable until it is too late.
Protecting the build pipeline is just as important as protecting the source code repository itself. The build environment should allow only a limited number of people to have modification rights, and all activity must be logged in the same manner as the code repository. Many times, the build process will use service accounts to carry out actions, and all activity from these accounts must be closely monitored.
I once inherited a SaaS software product (that seems to happen to me a lot) that was always breaking after each deployment. It did not take long to figure out that the problem centered on a completely, 100% manual deployment. This was a huge problem as every dependency had to be manually remembered and carried out, and of course many mistakes were made during each deployment. Inevitably, the team had to wait for the customer to call in and complain about some feature that no longer worked. It is no wonder that the product was hemorrhaging customers left and right. After implementing proper version control, automating the build process, locking down Production to only the deployment team, and mandating that no one was allowed to manually touch production outside of kicking off a deployment script, the problems virtually disappeared. Of course, it took the better part of a year to get to that point, and required many, many counseling sessions with developers who claimed the way they had always done it was best - but we got there!
The process to take raw source code to a production state is just as crucial to quality software as is the source code itself. It doesn’t matter how well your software is written if you are unable to roll it into production in a usable state. The integrity of the build environment depends on three things:
Physically securing access to the systems that build code.
Using access control lists, or ACLs, that prevent unauthorized access by users.
Using a version control system to ensure the code is from the correct version.
But it is not enough to simply have a build process – as we have already mentioned, in this day and age it must be automated. Build automation occurs when we script all tasks involved in going from source code to the final deployment. It is important, though, that the automation does not bypass the appropriate security checks – just because we have automated a process does not mean we have eliminated the need for proper security. When a machine is the ‘user’, it can still go rogue and cause all sorts of damage. Legacy source code is often an issue, as its build process is seldom automated and few people understand the nuances of how to carry it out. Since such code is not under active development, many months or even years may pass between releases, making the need for automation even more important.
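To show what this looks like in practice, here is a hedged sketch of a scripted build step in Python; the repository URL, tag, commands and artifact path are placeholders, but the shape is typical: build only from a known, versioned source, refuse to continue if the checks fail, and record a fingerprint of the artifact so later tampering can be detected.

```python
# Hedged sketch of an automated build with basic integrity controls.
# The repository URL, tag, and artifact path are placeholders.
import hashlib
import subprocess

def run(cmd):
    # Fail the build loudly if any step fails; never continue past an error.
    subprocess.run(cmd, check=True)

# 1. Pull the exact tagged version from the (access-controlled) repository.
run(["git", "clone", "--branch", "v1.4.2", "--depth", "1",
     "https://example.com/acme/app.git", "build/app"])

# 2. Run the tests - automation must not bypass the security checks.
run(["python", "-m", "pytest", "build/app/tests"])

# 3. Produce the deployable artifact (assumes the 'build' package is installed).
run(["python", "-m", "build", "--wheel", "build/app"])

# 4. Log a fingerprint of the artifact for later auditing.
artifact = "build/app/dist/app-1.4.2-py3-none-any.whl"
with open(artifact, "rb") as f:
    print(artifact, "sha256=" + hashlib.sha256(f.read()).hexdigest())
```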
Building
There are two independent sub-processes that make up the overall build process – packers and packagers.
A packer compresses an executable, with the primary intent of reducing storage requirements in the final package. During the installation process, some files will need to be ‘unpacked’. Packing also has the side effect of obfuscating some code, increasing the work factor when reverse engineering our code. However, packers can also be used by an attacker to evade malware detection tools, as packing changes a file’s signature without affecting its ability to execute.
Once a packer has done its job, we can use a packager to build a package to seamlessly install the software in the target environment. A packager ensures that all dependent components and resources are present. The Red Hat Package Manager, or RPM, and the Microsoft Installer, or MSI, are great examples of packagers.
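Conceptually, a packer is little more than compression applied to an executable, which is also why it changes the file's signature. The following hedged sketch uses gzip to make the point; the file names are placeholders, and real packers are considerably more sophisticated.

```python
# Conceptual sketch of packing: compress an executable for distribution,
# then unpack it at install time. The file names are placeholders.
import gzip, hashlib, shutil

def sha256(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

original_signature = sha256("app.exe")

# 'Pack' the executable to shrink the package...
with open("app.exe", "rb") as src, gzip.open("app.exe.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

# ...and 'unpack' it again during installation.
with gzip.open("app.exe.gz", "rb") as src, open("app_unpacked.exe", "wb") as dst:
    shutil.copyfileobj(src, dst)

assert sha256("app_unpacked.exe") == original_signature  # the program is unchanged
assert sha256("app.exe.gz") != original_signature        # but the packed file looks different
```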
Installation and Deployment
A development team can implement tons of security controls, the testing team can verify that they are effective, and the product team can accept the software for release – only for all of that hard work to be undone the minute the product is installed due to a bad deployment. Unless installation and deployment are carefully planned and monitored after the fact, owning a hack-resistant product will never happen. Specifically, there are four areas that need attention – hardening, configuration, releases, and startup.
Hardening
Hardening an environment starts well before we actually deploy. In a well-run infrastructure, all servers are expected to meet a minimum security baseline, or MSB, which is a base set of configuration parameters and controls that cannot be removed. The MSB is normally set up to comply with an organization’s security policy and can be very effective in preventing compromise due to an incorrectly configured system.
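As a simple illustration, the following sketch checks a server's reported configuration against an MSB; the parameter names and required values are invented, since a real baseline is dictated by the organization's security policy.

```python
# Hedged sketch of verifying a server against a minimum security baseline.
# The settings and required values below are invented for illustration.
MSB = {
    "ssh_root_login": "disabled",
    "password_min_length": 12,
    "audit_logging": "enabled",
}

def check_against_msb(actual_config):
    """Return a list of baseline settings the server fails to meet."""
    violations = []
    for setting, required in MSB.items():
        actual = actual_config.get(setting)
        if actual != required:
            violations.append(f"{setting}: expected {required!r}, found {actual!r}")
    return violations

# Example: a server that still allows root SSH logins fails the baseline.
server = {"ssh_root_login": "enabled", "password_min_length": 12, "audit_logging": "enabled"}
print(check_against_msb(server))
```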