Sharks in the Moat
Beyond the server itself, the software that runs on top of the operating system must be hardened. This not only applies to the product we are getting ready to deploy but also to peer software that will run side-by-side with our new product. For example, many servers come with an FTP capability built-in, and if this service is to be left running it will need to be hardened right along with any other software riding on top of the OS. Let’s run through some of the most common security misconfigurations found when deploying. Remember this applies to our new product as well as other software already installed on a server.
Hardcoded credentials or keys.
Credentials and keys stored as cleartext in configuration files.
Leaving the directory and file listing capabilities enabled on web servers.
Software being installed with the default accounts and settings.
The administrative console being installed with default configuration settings.
Leaving unneeded services, ports, and protocols installed.
Leaving unused or unprotected pages, files and directories.
Not keeping patches up-to-date, for both the OS and other software.
The lack of perimeter and host defensive controls such as a firewall, IDS or IPS.
Test and default accounts left enabled.
In addition to those just listed, certain tracing and debugging capabilities that are left enabled can cause information disclosure if they capture sensitive or confidential information. Debugging can often cause errors on the server that result in stack traces being sent to the client.
Whereas hardening the host operating system is primarily covered by following the MSB and ensuring proper patching, hardening application software requires a more code-centric approach. We will need to search for and remove such things as maintenance hooks, debugging code, and comments containing sensitive information.
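As a minimal sketch of what that code-centric sweep might look like – the patterns, file type and directory name below are purely illustrative assumptions – the following Python script walks a source tree and flags lines that look like hardcoded credentials, debug flags left enabled, or comments hinting at maintenance hooks:

import re
from pathlib import Path

# Hypothetical patterns for a pre-deployment sweep; tune them to your own code base.
SUSPECT_PATTERNS = {
    "hardcoded credential": re.compile(r"(password|passwd|secret|api_key)\s*=\s*['\"][^'\"]+['\"]", re.I),
    "debug flag left on":   re.compile(r"\b(DEBUG|TRACE)\s*=\s*(True|1|on)\b", re.I),
    "sensitive comment":    re.compile(r"#.*\b(TODO|FIXME|HACK|backdoor|maintenance hook)\b", re.I),
}

def scan_tree(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, finding) tuples for anything that looks risky."""
    findings = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
            for label, pattern in SUSPECT_PATTERNS.items():
                if pattern.search(line):
                    findings.append((str(path), lineno, label))
    return findings

if __name__ == "__main__":
    for file, lineno, label in scan_tree("src"):
        print(f"{file}:{lineno}: {label}")

A scan like this is no substitute for a proper code review, but it is a cheap gate to add to the deployment checklist.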
Again, hardening both the environment and our code base is crucial to a proper security posture and cannot be left up to chance.
Configuration
Pre-installation checklists are a great tool to ensure we don’t forget known ‘things to do’, but they will not be able to predict if the final environment configuration is secure, and certainly will not address dynamic configuration issues that result when we start combining components in real-time. However, there are a few things to look out for that repeatedly pop up.
A common need when deploying software is that administrative privileges are required. This is not a bad thing – in fact it is a good indicator that security has been locked down on a production server. The thing to watch out for is when the development team is given administrative access so that they can install the software. This is a major no-no, as it is a clear violation of the separation of duties principle. We cannot have the people developing software also responsible for deploying that software. If this happens, no one should be surprised when the environment is completely compromised by an attacker. Additionally, if services, ports or protocols must be enabled just for the deployment process, another red flag should rise up the pole. The deployment process must be handled completely by either a deployment team or the infrastructure team, and it should not require any actions that decrease security. A temporary decrease in security will inevitably become a permanent one when the deployment team forgets to re-enable certain security controls.
It is a rare software product that functions the first time it is deployed to production. In fact, in my experience it has never happened. The source of this problem is that the configuration differs between the development, test and production environments. In order to avoid violating the principle of separation of duties and to not have to disable security controls in the production environment, an organization will choose to allow the software to have temporary administrative access and to programmatically reconfigure the environment. Unfortunately, the code that ‘re-enables’ the security controls will eventually be disabled or removed, and we again find out that the environment has been hacked. A much better use of our time is to ensure the various environmental configurations match, including ensuring that access rights are denied by default and explicitly granted in exactly the same manner in all environments. This is not easy to do but will result in a far superior security stance.
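One practical way to keep environments matching is to diff their effective configurations automatically before each deployment. The sketch below is a rough illustration only – the file names and JSON format are assumptions – but it shows the idea of flagging any setting that differs between development, test and production:

import json

# Illustrative file names; in practice these would be exported from each environment.
ENVIRONMENTS = {
    "dev":  "config.dev.json",
    "test": "config.test.json",
    "prod": "config.prod.json",
}

def load(path: str) -> dict:
    with open(path) as handle:
        return json.load(handle)

def report_drift() -> None:
    configs = {name: load(path) for name, path in ENVIRONMENTS.items()}
    all_keys = set().union(*(cfg.keys() for cfg in configs.values()))
    for key in sorted(all_keys):
        values = {name: cfg.get(key, "<missing>") for name, cfg in configs.items()}
        if len(set(map(str, values.values()))) > 1:
            print(f"DRIFT: {key} -> {values}")

if __name__ == "__main__":
    report_drift()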
A final note on this topic references the ability for software to run on different platforms, such as .Net code that can run on both the x86 and x64 architectures. The x64 architecture allows software to run in a more efficient manner but must be explicitly taken advantage of. This means that software may not run the same depending on the platform it is deployed to. Again, all environments should match, and if the production environment might have multiple platforms, such as when creating desktop applications, then an environment for each must be created for proper testing.
Bootstrapping and Secure Startup
When a system first starts up, it is more vulnerable in many ways than when it has fully booted and is running. This process is called bootstrapping, and when we properly protect this fragile state, we are said to ensure a secure startup. There are several steps that comprise bootstrapping, which is sometimes called the initial program load, or IPL.
First, the computer will go through a power-on self-test, or POST, in which the basic input/output system, or BIOS, can overwrite portions of memory in a destructive fashion to ensure that no information left over from a previous session remains intact. However, this confidentiality control can be disabled if an attacker can gain access to the BIOS, which has only a single password option to control access. The BIOS protection does not perform any integrity checks, so it will not be aware if unauthorized changes have been made.
Once POST has been completed, the trusted platform module chip, or TPM chip, takes over. TPM is a cryptographically strong capability physically contained on the motherboard and is used to perform integrity checks and to secure entire storage mediums such as the hard drive. It also has the capability to store encryption keys in a secure manner and provide authentication and access management for mobile devices. TPM goes a long way in mitigating information disclosure in the event that a mobile hardware device is stolen or lost.
Once TPM has completed its job, the operating system is then loaded from disk. During this process various OS-level services will be enabled and started. After the OS has completed initialization, other software riding on top of the OS is loaded according to however the OS has been configured. When starting up, software also goes through a vulnerable state. For example, web server software will normally perform a one-time retrieval at startup of various settings from configuration files. During this time an attacker has the opportunity to inject his own settings, resulting in the web server software being compromised. Malware is often known to use this opportunity to inject itself as a program loads.
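As a small defensive measure for this vulnerable startup window, software can refuse to load a configuration file that looks tampered with. The following sketch assumes a known-good SHA-256 digest was recorded at deployment time; the file name and digest are placeholders:

import hashlib
import os
import stat

CONFIG_PATH = "app.conf"                             # illustrative path
EXPECTED_SHA256 = "replace-with-known-good-digest"   # recorded at deployment time

def verify_config(path: str) -> None:
    # Refuse to start if others can modify the file or its contents have changed.
    mode = os.stat(path).st_mode
    if mode & (stat.S_IWGRP | stat.S_IWOTH):
        raise RuntimeError(f"{path} is writable by group/other; refusing to start")
    digest = hashlib.sha256(open(path, "rb").read()).hexdigest()
    if digest != EXPECTED_SHA256:
        raise RuntimeError(f"{path} has been modified since deployment")

if __name__ == "__main__":
    verify_config(CONFIG_PATH)   # abort startup if the file looks tampered with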
Any interruption in the overall bootstrapping process can result in unavailability or information disclosure. Side channel attacks such as the cold boot attack are proof that the shutdown and reboot process can be abused, leading to information disclosure. We will discuss these types of attacks in greater detail later.
Chapter 35: The Infrastructure Role
Infrastructure is concerned with spinning up, maintaining, and winding down all resources required to host a software application. While the architect and development teams are responsible for generating the source code and choosing the software configuration once deployed, it is up to the infrastructure team to make sure the various hosts can support that configuration and the environments remain stable, secure and performant.
Operational Requirements
It is very true that developers create bugs that eventually wind up in the production environment, but the majority of major production issues are actually caused by some breakdown in operational procedures, not code. For example, an unexpected increase in traffic can saturate the available database connections, causing everyone to queue up and wait. Or storage space is exhausted due to a misconfiguration in logging. Or the database runs out of disk space because it grew faster than expected. Each one of these examples should have been caught during the requirements phase, but they often are not. These missed requirements are examples of operational requirements. When developing software to be deployed to the cloud, or when using a DevOps capability, the importance of nailing operational requirements increases.
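Many of these operational surprises are cheap to catch once they are stated as requirements. As a minimal illustration – the mount point and threshold are assumptions – the following sketch raises an alert when the logging volume approaches capacity:

import shutil

LOG_VOLUME = "/var/log"      # illustrative mount point for application logs
ALERT_THRESHOLD = 0.80       # alert once 80% of the volume is used

def check_disk(path: str, threshold: float) -> bool:
    """Return True if usage is healthy, otherwise raise an alert."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    if used_fraction >= threshold:
        # In a real deployment this would page the on-call rotation, not print.
        print(f"ALERT: {path} is {used_fraction:.0%} full")
        return False
    return True

if __name__ == "__main__":
    check_disk(LOG_VOLUME, ALERT_THRESHOLD)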
CONOPS
To reliably identify these types of requirements, the team must take on the mindset of a Concept of Operations, or CONOPS. This approach covers interoperability with other systems and how the software will interface and react. How the software will be managed is also part of CONOPS. Some good examples of operational requirements include the following:
“Cryptographic keys that are shared between applications should be protected using strict access controls.”
“Data backups and replications must be protected in secure logs with least privilege implemented.”
“Software patching must align with the enterprise process and changes to production environments must be done only after all necessary approvals have been granted.”
“Vulnerabilities in the software that can impact the business and the brand must be addressed and fixed as soon as possible, after being thoroughly tested in a simulated environment.”
“The incident management process should be followed to handle security incidents and root causes of the incidents must be identified.”
“The software must be continuously monitored to ensure that it is not susceptible to emerging threats.”
Deployment Environment
I once took over a project that had experienced severe outages for years with no improvement. After reviewing the code base, I was mystified as to the source of the unreliability, for it appeared to be well-written and followed decent design patterns. Then I witnessed the first deployment of a new version. What should have taken 30 minutes to deploy turned into a 3-day exercise which was ultimately rolled back due to numerous outages. While taking the team through a post-mortem, I discovered that the vast majority of struggles had to do with how each environment was configured – nothing was the same and there was almost no documentation on how the development, staging and production environments differed.
The configuration and layout of the various environments is seldom considered by a project team unless some very senior leaders are present. What works well in the development environment often does not work at all in the production environment due to increased security configurations. Pre-production environments are seldom configured exactly the same as the production environment, and as a result, load handling capabilities in production are based on best-guesses. Privacy and regulatory concerns compound the problem, as the same requirements may not apply to all environments equally.
Following are some great questions to ask during the requirements phase that will help flush out those hidden details.
Will the software be deployed in an Internet, Extranet or intranet environment?
Will the software be hosted in a Demilitarized Zone (DMZ)?
What ports and protocols are available for use?
What privileges will be allowed in the production environment?
Will the software be transmitting sensitive or confidential information?
Will the software be load balanced and how is clustering architected?
Will the software be deployed in a web farm environment?
Will the software need to support single sign-on (SSO) authentication?
Can we leverage existing operating system event logging for auditing purposes?
Archiving
Archiving is the act of moving data from a primary storage capability to a secondary location that is normally not accessed. For example, data older than 2 years might be moved from a database and stored on optical discs such as DVDs. The data is still available if absolutely needed, but the work required to retrieve it would be considerable. Data retention is closely aligned with archiving as long-term data retention needs are almost always going to be met through the use of some type of archival mechanism.
Data retention requirements can come from internal policies or external regulatory requirements. It is important that internal policies complement, not contradict, external regulatory needs. For example, if an internal policy requires that a data set be retained for 3 years, and the regulatory requirement is only 2 years, then the internal policy is complementary – it does not contradict the minimum required by the regulatory retention period. However, if the regulatory policy stated 5 years as the minimum, there would be an inherent conflict, and in this case the safer course to follow would be to require a 5-year retention period.
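Expressed as logic, the safe rule is simply to honor the longer of the two periods:

def effective_retention_years(internal_policy: int, regulatory_minimum: int) -> int:
    """Always keep data at least as long as the stricter (longer) requirement."""
    return max(internal_policy, regulatory_minimum)

print(effective_retention_years(3, 2))   # internal policy is stricter -> 3 years
print(effective_retention_years(3, 5))   # regulation is stricter      -> 5 years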
When stating retention requirements, three dimensions must be specified – the location, duration and format of the archived information. Some questions that will help determine the various dimensions are the following:
Where will the data or information be stored?
Will it be in a transactional system that is remote and online or will it be in offline storage media?
How much space do we need in the archival system?
How do we ensure that the media is not re-writable?
How fast will we need to be able to retrieve from archives when needed?
How long will we need to store the archives for?
Is there a regulatory requirement to store the data for a set period of time?
Is our archival retention policy contradictory to any compliance or regulatory requirements?
Will the data be stored in an encrypted format?
If the data or information is encrypted, how is this accomplished and are there management processes in place that will ensure proper retrieval?
How will these archives be protected?
Anti-Piracy
Anti-piracy requirements are important under two conditions – if the company is purchasing software from a third-party, or if the company is creating software to sell to a third-party. In either case, the following are good examples of requirements:
“The software must be digitally signed to protect against tampering and reverse engineering.”
“The code must be obfuscated, if feasible, to deter the duplication of code.”
“License keys must not be statically hard-coded in the software binaries as they can be disclosed by debugging and disassembly.”
“License verification checks must be dynamic with phone-home mechanisms and not be dependent on factors that the end-user can change.”
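To make the last two requirements a bit more concrete, here is a rough sketch of a dynamic license check. The server URL, payload and response format are purely hypothetical; the point is that the key is supplied at runtime and verified through a phone-home call rather than being hard-coded into the binary:

import json
import os
import urllib.request

LICENSE_SERVER = "https://licensing.example.com/verify"   # hypothetical endpoint

def verify_license() -> bool:
    # The key comes from the environment (or a protected store), never from the binary.
    license_key = os.environ.get("PRODUCT_LICENSE_KEY")
    if not license_key:
        return False
    payload = json.dumps({"key": license_key}).encode("utf-8")
    request = urllib.request.Request(
        LICENSE_SERVER, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response).get("valid", False)

if __name__ == "__main__":
    if not verify_license():
        raise SystemExit("License verification failed")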
Pervasive and Ubiquitous Computing
Pervasive computing is a term describing the everyday presence of a set of technologies that we take for granted and that have, to some extent, become invisible. Such technologies include wireless communication, the Internet, and mobile devices. Essentially, pervasive computing recognizes that any device can connect to a network of other devices.
This concept can be broken down into two components – pervasive communication and pervasive computation. Pervasive computation claims that any device having an embedded computer or sensor can be connected to a network. Pervasive communication implies that devices on a network can communicate across that network. Put more simply, pervasive computers by definition are connected to a network, and pervasive communication claims those computers can talk to each other using that network.
A key component in pervasive computing is that it happens transparently, which certainly implies the need for wireless networks. Additionally, pervasive computers can join and leave wireless networks at-will, and even create their own networks with nearby devices, forming ad-hoc networks. As an example, when your phone connects to your car via Bluetooth, an instant, ad-hoc network is formed. Of course, the term ‘instant’ is being used fairly loosely here, as we all know how frustrating it is to have to wait for your phone to connect to the car’s entertainment system.
Another example of pervasive computing is the bring your own device, or BYOD, attitude that many companies are adopting. With this approach, employees can use their personal mobile devices on the company’s intranet. This naturally brings with it additional security concerns, as the company is limited in its ability to secure devices it does not own. In fact, not only do these devices represent another vector from which an attacker could enter the intranet, they themselves represent a threat as they could be used to attack the network directly. Complete mediation implemented using a node-to-node authentication mechanism is the best way to combat such a threat. This is carried out when a mobile app authenticates to the device, which in turn authenticates to the internal application running on the intranet. Additionally, using the trusted platform module, or TPM, for identification and authentication is safer than relying on the MAC address of the device, which can be spoofed. Mobile device management systems, or MDM systems, provide additional security by allowing policies to control the use of third-party apps on the mobile device.
Because a mobile device is easily stolen, system designers need to go to extra lengths to protect data on these devices. For example, if a device is stolen, an application needs to be able to delete sensitive data, either by receiving a remote command or by monitoring a local trigger. One trigger might be exceeding a set number of attempts to enter a PIN to unlock the device. Of course, biometric authentication is preferred over a PIN, as it is harder to spoof a biometric attribute, thereby increasing the work factor.
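The local-trigger idea boils down to counting failed unlock attempts and destroying sensitive material once a limit is reached. A simple sketch, with the limit and the wipe routine standing in as illustrative placeholders:

MAX_ATTEMPTS = 5   # illustrative limit before a local wipe is triggered

class UnlockGuard:
    def __init__(self, correct_pin: str) -> None:
        self._correct_pin = correct_pin
        self._failures = 0

    def attempt(self, pin: str) -> bool:
        if pin == self._correct_pin:
            self._failures = 0
            return True
        self._failures += 1
        if self._failures >= MAX_ATTEMPTS:
            self._wipe_sensitive_data()
        return False

    def _wipe_sensitive_data(self) -> None:
        # On a real device this would securely erase keys and local data stores.
        print("Wipe triggered: sensitive data destroyed")

guard = UnlockGuard("482913")
for guess in ["0000", "1111", "2222", "3333", "4444"]:
    guard.attempt(guess)   # the fifth consecutive failure triggers the wipe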
Over the last decade, the increasing maturity of specific technologies has been a key enabler for pervasive computing. Some of these technologies are the following: