by Phil Martin
Building in the ability to swap out algorithms is not an easy task and must be purposefully designed from the beginning. Ideally, replacing an algorithm should not require code changes, rebuilds or redeployments. Only minimal testing should be required, primarily to ensure the new algorithms are compatible with ciphertext that has already been generated and stored.
All encryption and decryption services should be abstracted inside their own class or service. Ideally, from a code consumer's point of view, the only inputs to the encryption services should be the data context and the plaintext, with the only outputs being the ciphertext and any updates to the data context. Based on the provided data context, the cryptographic service should select the appropriate algorithm, block size, salt, key length and the actual key to be used.
As an example, let’s consider two scenarios – hashing a password, and encrypting PII.
When hashing a password, we should create a context that looks something like the following object:
{
    DataContext
    {
        Purpose: Purposes.PasswordStorage,
        ConfigurationRecord: passwordRecord,
    },
    plainTextPassword
}
The cryptography service could look at the Purpose, decide that SHA-2 should be used to produce a one-way hash of the password, use the salt contained within the ConfigurationRecord object, hash plainTextPassword, and return the resulting digest.
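As a concrete illustration, here is a minimal C# sketch of how the service might handle this purpose. The salt parameter stands in for the value carried by the hypothetical ConfigurationRecord object, and the sketch mirrors the SHA-2 example above rather than prescribing a particular password-hashing scheme.
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static byte[] HashPassword(byte[] salt, string plainTextPassword)
{
    // Prepend the salt from the ConfigurationRecord to the password bytes,
    // then compute the SHA-256 digest; the caller only ever sees the result.
    byte[] input = salt.Concat(Encoding.UTF8.GetBytes(plainTextPassword)).ToArray();
    using (var sha256 = SHA256.Create())
    {
        return sha256.ComputeHash(input);
    }
}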
When encrypting PII, the request might look like this:
{
    DataContext
    {
        Purpose: Purposes.PII,
        ConfigurationRecord: null,
    },
    ssnText
}
Here, the same cryptography service might decide that PII needs to use AES-256 and return the ciphertext. Now, where did the encryption key come from? The service itself knows where the keys are stored and would fetch the 256-bit key associated with AES-256 PII encryption. The calling code would have no knowledge of the algorithm used, where the key was stored, how long the key was, or any other details. Everything is hidden behind the cryptography service.
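A minimal sketch of this PII branch might look like the following, where GetKeyFor is a hypothetical lookup into whatever key store the service uses; the caller sees neither the algorithm nor the key.
using System.Linq;
using System.Security.Cryptography;
using System.Text;

static byte[] EncryptPii(string ssnText)
{
    using (var aes = Aes.Create())
    {
        aes.KeySize = 256;
        aes.Key = GetKeyFor("PII");   // hypothetical key-store lookup
        aes.GenerateIV();
        byte[] plain = Encoding.UTF8.GetBytes(ssnText);
        using (var encryptor = aes.CreateEncryptor())
        {
            byte[] cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
            // Prepend the IV so the service can decrypt the value later.
            return aes.IV.Concat(cipher).ToArray();
        }
    }
}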
The power of such an approach is that if the encryption class is compiled into the application with hard-coded logic, then we can simply update the code within that class, perform minimal testing and redeploy to production. But we can do better than that. What if, instead of hard-coding which algorithm to use based on the ‘Purpose’ and hard-coding how to get to the encryption key, we used a protected configuration file instead? Then we could update the configuration file without redeploying the application. The only thing we would have to worry about is whether the class is already capable of using the algorithms referenced in the configuration file. If we needed to roll out a new algorithm that was not supported, we would still need to redeploy the application. But we can even overcome that limitation by implementing the cryptography capabilities in a stand-alone service that our application calls. If a brand-new algorithm is called for, worst case we simply redeploy the encryption service.
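As a sketch of the configuration-driven variant, the purpose-to-algorithm mapping could be read from a protected file at startup; the file name and line format used here are hypothetical.
using System.Collections.Generic;
using System.IO;

// Each line maps a purpose to an algorithm, e.g. "PII=AES-256".
var algorithmByPurpose = new Dictionary<string, string>();
foreach (var line in File.ReadAllLines("crypto.config"))
{
    var parts = line.Split('=');
    algorithmByPurpose[parts[0]] = parts[1];
}
// Swapping algorithms is now a configuration change, not a redeploy,
// as long as the class already supports the named algorithm.
string algorithm = algorithmByPurpose["PII"]; // "AES-256"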
Using Cryptography API Next Generation, or CNG, can also help us remain agile. CNG is very extensible and agnostic when it comes to algorithms. It provides the following (a brief usage sketch follows the list):
A configuration system that supports agility.
Abstraction for key storage and separation of key storage from the algorithm operations.
Isolation of processes for operations using long-term keys.
Replaceable random number generators.
Better signing support for exports.
Thread-safety mechanisms throughout the entire stack, resulting in more stability and scalability.
A kernel-mode cryptographic API.
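As a brief sketch of CNG's abstraction at work, the .NET CNG types let us create a named, persisted key and sign with it without the calling code ever touching the key material. This assumes Windows and the System.Security.Cryptography CNG classes; the key name is made up.
using System;
using System.Security.Cryptography;
using System.Text;

// Create a named, persisted ECDSA key managed by CNG (throws if the
// name already exists). Key storage is handled by CNG, separate from
// the signing operation itself.
CngKey key = CngKey.Create(
    CngAlgorithm.ECDsaP256,
    "MyAppSigningKey",                       // hypothetical key name
    new CngKeyCreationParameters { KeyUsage = CngKeyUsages.Signing });

using (var ecdsa = new ECDsaCng(key))
{
    byte[] signature = ecdsa.SignData(Encoding.UTF8.GetBytes("hello"),
                                      HashAlgorithmName.SHA256);
    Console.WriteLine(Convert.ToBase64String(signature));
}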
Now, there is a rather significant fly in our cryptographically agile ointment that we have ignored so far. It’s great to swap out algorithms and keys, but what happens to all of that data-at-rest that is already encrypted? For example, if we stored the MD5 hash of all passwords and then decide to switch to SHA-2, the SHA-2 digest will never match the MD5 digest, meaning no one will be able to log in until they reset their password. In this case, we should store the name of the new hashing function as metadata alongside each new hash. Authentication can then continue to work until everyone has naturally updated their password, at which time we can retire the old algorithm completely. Using our abstracted class example, we could easily hide this new field in the ‘ConfigurationRecord’ object such that the main application logic is not even aware of the added business logic.
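A minimal sketch of that approach: store the algorithm name alongside each digest and branch on it at verification time. The StoredHash record and its fields are hypothetical.
using System.Linq;
using System.Security.Cryptography;
using System.Text;

record StoredHash(string Algorithm, byte[] Salt, byte[] Digest);

static bool VerifyPassword(string password, StoredHash stored)
{
    // Pick the algorithm named in the record's metadata.
    using HashAlgorithm alg = stored.Algorithm == "MD5"
        ? (HashAlgorithm)MD5.Create()
        : SHA256.Create();
    byte[] input = stored.Salt.Concat(Encoding.UTF8.GetBytes(password)).ToArray();
    bool matches = alg.ComputeHash(input).SequenceEqual(stored.Digest);
    // On a successful login, re-hash with SHA-256 and update the record;
    // once no MD5 records remain, the old algorithm can be retired.
    return matches;
}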
But there is an even worse case - we will not be able to decrypt persisted data that was encrypted with a different algorithm. In this case we will need to decrypt using the old algorithm and re-encrypt using the new. This will most likely require a custom one-time process to be written and executed to migrate all existing data to the new encryption scheme.
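Such a migration might look like the following sketch, where LoadEncryptedRecords, DecryptOld, EncryptNew and SaveRecord are all hypothetical stand-ins for the old scheme, the new scheme and the persistence layer.
using System;

// One-time migration: decrypt with the old algorithm and key,
// re-encrypt with the new, and persist the result.
foreach (var record in LoadEncryptedRecords())
{
    byte[] plain = DecryptOld(record.Ciphertext);
    record.Ciphertext = EncryptNew(plain);
    SaveRecord(record);
    Array.Clear(plain, 0, plain.Length); // scrub the plaintext from memory
}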
Beyond being able to read encrypted or hashed data, we must also consider any size changes that will result from moving to new algorithms. As an example, if we move from MD5, which produces a 128-bit digest, to SHA-2, which yields a 256-bit digest, we will need to ensure our database field can handle the increased size of the hash.
Secure key management includes the following (a key-generation sketch follows the list):
Properly generate keys of appropriate length using truly random methods.
When exchanging keys between processes, ensure the key is never exposed in an insecure manner, such as over unencrypted communication channels. Key exchange should also be implemented using an out-of-band mechanism or an approved key infrastructure process.
Keys should never be stored with the encrypted data.
Ensure keys are changed and rotated. When doing this, ensure that a strict process is used in which the data is first decrypted with the old key, and then encrypted with the new key. This might sound like simple common sense, but you would be surprised how often a self-inflicted DoS results from a failure to follow this process.
Protect the location of key archival and escrow. When keys are escrowed, be sure to properly maintain different versions of the key.
Ensure proper and timely destruction of keys when they are no longer needed. Be sure to decrypt data with the key before it is destroyed!
Safeguard active key storage and ensure secrecy of this location.
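To illustrate the first item in the list, here is a minimal sketch of generating a 256-bit key from a cryptographically secure source using .NET's RandomNumberGenerator; System.Random is never acceptable for key material.
using System.Security.Cryptography;

byte[] key = new byte[32]; // 32 bytes = 256 bits
using (var rng = RandomNumberGenerator.Create())
{
    rng.GetBytes(key); // fills the buffer from a CSPRNG
}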
Adequate access control and auditing are achieved when we control access to the keys for both internal and external users. This means we should:
Grant access explicitly, never implicitly as a by-product of some other permission.
Control and monitor access to keys with automated logging, and perform periodic reviews of the logs.
Ensure insecure permission configurations do not let users bypass the control mechanisms.
The access control process for keys should recognize the difference between one-way encryption and two-way encryption, and how that impacts the required security around the key. With one-way encryption, where the key used to encrypt the data is not the key required to decrypt it, only the recipient should have access to the decryption key. This is the case when using PKI, in which the public key encrypts and the private key decrypts. With two-way encryption, in which the same key is used to encrypt and decrypt, the key will need to be available to both sender and recipient.
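A minimal sketch of the two cases, using .NET's RSA and Aes types: with the asymmetric ("one-way") scheme the public key encrypts and only the private-key holder can decrypt, while the symmetric ("two-way") scheme uses one shared key for both operations.
using System.Security.Cryptography;
using System.Text;

byte[] data = Encoding.UTF8.GetBytes("secret");

// Asymmetric: encrypt with the public key, decrypt with the private key.
using (var rsa = RSA.Create(2048))
{
    byte[] cipher = rsa.Encrypt(data, RSAEncryptionPadding.OaepSHA256);
    byte[] plain = rsa.Decrypt(cipher, RSAEncryptionPadding.OaepSHA256);
}

// Symmetric: the same key (and IV) must be shared by sender and recipient.
using (var aes = Aes.Create())
{
    byte[] cipher = aes.CreateEncryptor().TransformFinalBlock(data, 0, data.Length);
    byte[] plain = aes.CreateDecryptor().TransformFinalBlock(cipher, 0, cipher.Length);
}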
Spoofing Attacks
If code manages sessions or authentication mechanisms, it should be checked to see if it is susceptible to spoofing attacks. Session tokens should not be predictable, passwords are never to be hard-coded, and credentials should never be cached. If impersonation is used, such as a common database user account representing all named application users, then there should be no code that changes the impersonated account – it should rely on configuration alone.
Anti-Tampering
Anti-tampering prevents unauthorized modification of code or data. In essence, it assures the ‘I’ in CIA by using three different approaches – obfuscation, anti-reversing, and code signing.
Obfuscation is the act of introducing intentional noise in the hopes of concealing a secret. Obfuscation is an example of security through obscurity, which as we have mentioned is no security at all. However, it can serve to increase the work factor in some cases. Obfuscation is most often used to scramble source code so that an attacker will have a tough time figuring out the algorithm and logic. Since compiled languages do not preserve the original source code during the compilation process, there is little need for obfuscation, although it is used for object code at times. When using scripting languages such as JavaScript, it is extremely simple for an attacker to download the source code – in fact, it is required for the browser to be able to run it. This is probably the best use case for obfuscation, in which random variable names are used, convoluted loops and conditional constructs are injected, and text blocks and symbols are renamed with meaningless character sequences. This is sometimes called shrouded code.
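As a tiny illustration of shrouded code, consider the same method before and after symbol renaming; the behavior is identical, only readability suffers. The names here are invented.
// Before obfuscation: the intent is obvious.
static int MonthlyPayment(int principal, int months) => principal / months;

// After obfuscation: functionally identical, but an attacker must work
// out what a7, x1 and x2 mean.
static int a7(int x1, int x2) => x1 / x2;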
Reverse engineering, or reversing, is the process of figuring out how a piece of software works by looking at the object code. This can be a legitimate exercise if we own the software and there is no documentation, but if we do not own it, there might be legal fallout if the licensing agreement forbids such actions. Even if the licensing agreement does not explicitly restrict reverse engineering, the owner should be notified about the activity. From a security point of view, reverse engineering can be very helpful for security research and for discovering vulnerabilities in published software. But the exact same actions can be used by an attacker to circumvent security protections. This puts an attacker in the position to tamper with the code and repackage it with less-than-honorable intentions. While obfuscation can make this more difficult, we can also use anti-reversing tactics by removing symbolic information from the executable, such as class names, member names, names of instantiated objects, and other textual information. This can be done by stripping them from the source code before compilation or by using obfuscation to rename symbols to something meaningless. We can also choose to embed anti-debugger code, which detects the presence of a debugger at run-time and terminates the process if found. Examples of anti-debugger APIs used to inject this type of code are the IsDebuggerPresent and SystemKernelDebuggerInformation APIs.
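A minimal Windows-only sketch of the IsDebuggerPresent technique via P/Invoke from C#; production anti-debugging is usually far more layered than this.
using System;
using System.Runtime.InteropServices;

class AntiDebug
{
    // Win32 API that reports whether a user-mode debugger is attached.
    [DllImport("kernel32.dll")]
    static extern bool IsDebuggerPresent();

    static void Main()
    {
        if (IsDebuggerPresent())
        {
            // Terminate rather than run under an attacker's debugger.
            Environment.Exit(1);
        }
    }
}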
To protect the integrity of deployed code, we can use a digital signature to implement code signing. With this process, the code object is hashed and the resulting digest is included with the deployed product so that consumers can verify their copy has not been tampered with. This is normally carried out by encrypting the hash with the signer’s private key, producing a digital signature that anyone holding the matching public key can verify – this not only provides integrity, authenticity and non-repudiation, but anti-tampering as well. Code signing can occur each time code is built, or delayed signing may be carried out by generating the hash and digital signature immediately before deployment.
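Stripped to its core, code signing looks like the following sketch: hash the code object, sign the digest with the private key, and let anyone holding the public key verify it. The file name is hypothetical, and .NET's SignData performs the hashing internally.
using System.IO;
using System.Security.Cryptography;

byte[] code = File.ReadAllBytes("app.dll"); // hypothetical code object

using (var rsa = RSA.Create(2048))
{
    // Sign: hash the code and encrypt the digest with the private key.
    byte[] signature = rsa.SignData(code, HashAlgorithmName.SHA256,
                                    RSASignaturePadding.Pkcs1);

    // Verify: recompute the hash and check it against the signature
    // using the public key.
    bool valid = rsa.VerifyData(code, signature, HashAlgorithmName.SHA256,
                                RSASignaturePadding.Pkcs1);
}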
Code signing becomes very important when dealing with mobile code. Mobile code is a stand-alone code block, usually compiled, that is downloaded and executed, typically by a browser. In years past the major types of mobile code included Java applets, ActiveX components, Adobe Flash files and other web controls. With the advent of HTML5 these types of mobile code have fallen out of favor, but today we still need to use code signing for JavaScript files and browser extensions. When we sign mobile code, it gives the code’s container permission to access system resources. For example, when a browser extension is installed, the browser will check the digital signature and, if it is valid, tell the sandbox running the extension to allow it access to certain features.
Reversible Code
Reversible code is code that an attacker can use to determine the internal architecture, design and implementation of software. Reversible code should be avoided, as should textual and symbolic information that can aid an attacker. If debugger detection is warranted, the reviewer should verify that it is actually present.
Privileged Code
Code should be examined to ensure it follows the principle of least privilege. Code that violates this principle is called privileged code, and while not something to be avoided, it should require administrative rights to execute.
Maintenance Hooks
A maintenance hook is any code that is intentionally introduced, usually to provide easy access for maintenance purposes. It will often look innocuous and can be used to troubleshoot a specific issue. However, it can also be used as a backdoor for malicious reasons and should never be allowed to enter the production environment. If a maintenance hook absolutely must be put in as a way to debug an issue, it should never be checked into any version that could eventually be deployed to production. Instead, it should be injected directly into a test environment and evaluated there without checking it into the main version control system. If the capability must exist in production, then it should be implemented in a way that is not specific to any customer or hard-coded to data, and it should be controlled by a configuration flag. If this rule is followed, it will almost always result in the developer finding a different way to debug the issue, as carrying this out in production for a single issue requires a significant amount of work.
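If the capability must exist in production, a configuration flag is the safer pattern, as in this sketch; the flag name is invented and the configuration API assumed is Microsoft.Extensions.Configuration.
using Microsoft.Extensions.Configuration;

static void RunDiagnostics(IConfiguration config)
{
    // Generic, flag-controlled diagnostics; nothing customer-specific
    // or hard-coded to particular data.
    if (config.GetValue<bool>("Diagnostics:VerboseTracing"))
    {
        EnableVerboseTracing(); // hypothetical diagnostic routine
    }
}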
Logic Bombs
Logic bombs are by definition malicious and are usually implanted by a disgruntled insider with access to the source code. A logic bomb is a code block that waits until some pre-defined condition is met and then activates by carrying out an unwanted and unintended operation. Such attacks have often been carried out by employees who feel they have been wronged by their employer and want to exact revenge. Logic bombs can cause destruction of data, bring a business to a halt, or even be used as a means of extortion. Reviewing for logic bombs becomes even more important when an external party develops the code. Although logic bombs are often triggered based on a date and time, the deactivation of trial software at a specific date due to an agreed-upon license is not considered to be a logic bomb.
Cyclomatic Complexity
Cyclomatic complexity measures the number of linearly independent paths through a program. We use the term linear as opposed to cyclical, as in code that calls other code that in turn calls the original code. While cyclical calls are not always a bad thing, they are a significant source of infinite loop constructs, and they quickly complicate the code path, making it difficult to detect impending issues and to debug. When code is highly cohesive and loosely coupled, cyclomatic complexity will naturally go down. Once calculated, this value can be used to judge how well code follows the principles of economy of mechanism and least common mechanism.
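As a quick worked example, cyclomatic complexity can be computed as the number of decision points plus one. The method below has two decision points (the if and the while), giving a complexity of three.
static int Example(int x)
{
    int total = 0;
    if (x > 10)       // decision point 1
        x = 10;
    while (x > 0)     // decision point 2
    {
        total += x;
        x--;
    }
    return total;     // complexity = 2 decision points + 1 = 3
}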
Code Reviews
Unless you are a lone developer working on a project with no other technical person involved, you should never, ever allow code to be checked into the source code repository until it has been reviewed by a peer. A code review, also called a peer review when peers from the development team are used, detects syntax issues and weaknesses in code that can impact the performance and security of an application. It can be carried out manually or by using automated tools. Note that while automated tools can help detect some issues, they are NEVER a substitute for manual code reviews carried out by a human.
While peer code reviews are a non-negotiable if we hope to build quality software, a security code review is just as important but seldom found. The general purpose of a security code review is to ensure software actively looks for and defeats exploitable patterns. Examples of such controls are input validation, output encoding, parameterized queries, not allowing direct object manipulation, using secure APIs and avoiding the use of maintenance hooks. A secondary activity is to look for malicious code that has been injected somewhere along the supply chain. Examples of malicious code are embedded backdoors, logic bombs, and Trojan horses. To accomplish both objectives, a security code review must encompass the three C’s - code, components and configuration.
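To make one of those controls concrete, here is a minimal sketch of a parameterized query in ADO.NET; the table, column and connection string are hypothetical.
using Microsoft.Data.SqlClient;

static int? FindUserId(string connectionString, string email)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "SELECT Id FROM Users WHERE Email = @email", conn))
    {
        // The value is bound as a parameter, never concatenated into SQL.
        cmd.Parameters.AddWithValue("@email", email);
        conn.Open();
        object result = cmd.ExecuteScalar();
        return result == null ? (int?)null : (int)result;
    }
}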
Some recommendations for good security code reviews are the following:
Review all code before it is checked back into the source code repository.
Perform a code review along the entire code path.
Partner with external development teams when conducting reviews.
Document all discovered weaknesses in a database so that none are lost.
When working in a collaborative manner with external development teams during security code reviews, you will have a much better chance at finding non-obvious logic problems and weaknesses.
It is important to remember that reviews should focus on the code and not the coder. For code reviews to be healthy, a proper respect for all members of the team must be maintained. Constructive criticism of code should not be taken as a personal insult, and explicit roles and responsibilities should be assigned to all participants. Moderators should be designated to handle any conflicts or differences of opinions. When meetings are held for code reviews, a scribe should be identified to write down the comments and action items so that each is not forgotten. It is important that reviewers have access to the code prior to the meeting so they come prepared. It should go without saying, but the author of code should not be the moderator, reviewer or scribe. It is usually of great help to continuously refer to established documents such as coding standards, internal policies and external compliance requirements to avoid personality push-backs, as well as to facilitate prioritization of any findings.
Larger issues such as business logic and design flaws are normally not going to be detected using a code review, but it can be used to validate the threat model generated during the design phase. From a security point of view, at a minimum we are looking for both insecure and inefficient code. We have already spent a good deal of time looking at insecure coding patterns, so let’s take a look at reviewing code for inefficiencies. Inefficient code can have a direct impact on software security as it can easily cause a self-inflicted DoS scenario. For example, making an invalid API call or entering an infinite loop can lead to memory leaks, hung threads, or resource starvation. The two biggest contributors to code inefficiencies are timing and complexity problems.