Sharks in the Moat
An attacker can analyze a cryptographic system in the same way – even though he does not have access to the original plain text or the resulting cipher text, the attacker can still figure out quite a bit based on external observation of the system in action. Let’s go over some of the most common side channel attacks – keep in mind that these attacks for the most part are carried out against a small hardware device but can be used for much larger systems as well.
When the attacker measures how long each computational cycle takes, he is carrying out a timing attack that leaks information about the internal makeup of the system. By varying the size of the input and measuring the time the system takes to complete encryption, the attacker can deduce certain things about the internal algorithm and logic.
A power analysis attack is carried out by measuring the amount of power consumed by a system. As an example, an RSA key can be decoded by analyzing the frequency and amplitude of power peaks, which represent times when the algorithms used multiplication operations.
An acoustic cryptanalysis attack is similar to a power analysis attack but listens to the sounds produced by hardware when operations are being carried out.
A TEMPEST attack, also called a van Eck attack or a radiation monitoring attack, watches for leaked radiation from a system to know when it is performing certain operations. This attack can be used to discover plain text.
A differential fault analysis attack intentionally injects faults into the system under study to see how it behaves. As an example, if we are trying to deduce how a cryptosystem responds to key lengths, we can inject keys of different lengths – regardless if they are real keys or not – to see how the system responds.
With a cold boot attack, the attacker finds a way to freeze the contents of memory chips and then boots the system up to recover the data. While conventional wisdom says that all contents of volatile memory such as RAM are lost when a system shuts down, there are ways of retrieving this information even after power to that memory has been removed. This underscores the need for a system to restore itself to a secure state on startup.
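To make the timing attack described above concrete, here is a minimal Python sketch of a comparison routine that stops at the first mismatching byte. It is not taken from any particular cryptosystem; the name `leaky_equal` and the sample secret are illustrative assumptions, and counting loop iterations stands in for measuring elapsed time.

```python
from typing import Tuple


def leaky_equal(secret: bytes, guess: bytes) -> Tuple[bool, int]:
    """Compare two byte strings, stopping at the first difference.

    Returns (is_equal, steps), where steps is the number of byte
    comparisons performed. The step count is a stand-in for elapsed time.
    """
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            # Early exit: the amount of work depends on how much of the
            # guess is already correct, which is what the attacker observes.
            return False, steps
    return True, steps


secret = b"K3Y!"  # hypothetical secret value
print(leaky_equal(secret, b"XXXX"))  # first byte is already wrong
print(leaky_equal(secret, b"K3XX"))  # two leading bytes are correct
```

The second guess takes more steps than the first even though both are rejected, so an observer who can measure the work done learns how many leading bytes were right and can recover the secret one byte at a time.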
The next group of side channel attacks is not specific to cryptosystems, but applies to systems in general.
A subset of a timing attack is to look for the amount of time elapsed before an error message is produced. For example, in a blind SQL injection attack, if the system returns an error message within 2 seconds each time an injection attack is carried out, and then suddenly takes 22 seconds to return, the attacker can deduce that he has hit upon a table that is either very large or is missing an index.
A blind SQL injection attack can also fall under the differential fault analysis attack category, as can fuzz testing. This approach can also be used to indicate the strength of input validation controls by randomly varying the values and format of input data. Fuzz testing can also be seen as a form of injection fault analysis.
A distant observation attack, often called a shoulder surfing attack, occurs when the attacker observes and collects secret information from a distance. This could be using direct eyesight, using a long-distance device such as binoculars or a telescope, or even from the reflection of the victim’s eyeglasses.
The following are a few mitigation controls we can apply to minimize side channel attacks:
Use standard and vetted cryptographic algorithms that are known to be less prone to side channel attacks.
Choose a system where the time to compute is independent of the input data or key size.
Balancing power consumption among all possible operation types, along with reducing the radiation signal size can help foil a power analysis attack. Likewise, adding noise can help fight an acoustic analysis.
Physical shielding is one of the best defenses against emanation or radiation attacks such as TEMPEST.
Although difficult to implement, see if you can avoid the use of complex branching conditional clauses in critical sections of code. In simpler terms, avoid the use of if-then-else flow control, and instead opt for much quicker AND, OR and XOR operations. This will limit the variance of timing based on conditions that can be indirectly controlled by an attacker. As an example, if we decide to change the arithmetic logic used based on the incoming block size, then an attacker can use the power difference of the CPU to determine that we have just executed an ‘if’ branch.
If you are going hard-core security, it is by far better that every operation takes the same time to complete. As an example, if a specific condition is twice as fast as another condition, we can purposefully add a delay so that both take the same amount of time. This does have a negative impact on performance, obviously. Alternatively, we can introduce a random delay to throw the attacker off, but again at the expense of performance.
To combat differential fault analysis in which the attacker purposefully causes a fault to occur, we can use double encryption in which we run the encryption algorithm twice and only provide an output if both operations match exactly. This is based on the assumption that the same fault is extremely unlikely to happen twice in a row.
To combat a cold boot attack, we can:
Physically protect the memory chips.
Prevent memory dumping software from executing.
Not store sensitive information in memory.
Scrub and overwrite contents that are no longer needed.
Perform a destructive power-on self-test, or POST.
Use the Trusted Platform Module, or TPM, chip.
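The earlier advice to replace if-then-else branching with AND, OR and XOR operations can be sketched as follows. This is a hedged illustration in Python, and the function name `constant_time_equal` is an assumption made for the example. Pure Python gives no hard timing guarantees, so the sketch shows the shape of the technique rather than a production implementation; the standard library's `hmac.compare_digest` is the vetted option for real code.

```python
import hmac


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Compare two byte strings without branching on their contents."""
    if len(a) != len(b):
        # Assumption for this sketch: the length itself is not secret.
        return False
    diff = 0
    for x, y in zip(a, b):
        # XOR is 0 only for matching bytes; OR accumulates any mismatch.
        diff |= x ^ y
    # Every byte pair was visited, regardless of where a mismatch occurred.
    return diff == 0


print(constant_time_equal(b"K3Y!", b"K3Y!"))
print(constant_time_equal(b"K3Y!", b"K3XX"))
# The vetted standard-library equivalent:
print(hmac.compare_digest(b"K3Y!", b"K3Y!"))
```

Because the loop never exits early, the number of operations depends only on the input length and not on how many bytes of a guess happen to be correct.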
Obviously, we are not going to take extreme measures to secure web applications. But if our software will be running on some type of embedded device and it will handle extremely sensitive information, or the main reason for our software to exist is to provide encryption security, then we will want to consider some of these measures. Regardless, it is a good idea to be aware of side channel attacks and how to combat them if the need ever arises.
Code Vulnerabilities
Code vulnerabilities are found primarily on the server. While many could equally apply to native mobile apps on smartphones, or desktop applications running on Windows or macOS, I will leave it to you to make that extrapolation. Because we are covering good coding habits, it should not be too difficult.
Error Handling
An error is a condition that does not follow the ‘happy’ path. For example, during a login procedure, the happy path – and yes, that is the real term for it – assumes the following:
1) Show login page.
2) User enters user name.
3) User enters password.
4) User presses ‘Submit’ button.
5) Server receives form and extracts user name and password.
6) Server creates SQL statement and executes to validate credentials.
7) On successful login forward user to home page.
The ‘sad’ path is not addressed – what happens if the credentials do not match? To accommodate the sad path, we can change step 7 and add a new step:
7) If credentials match, forward user to home page.
8) If credentials do not match, return same page with message ‘Credentials do not match’.
The sad path represents an ‘error’ and is handled properly. Now, what happens if Step 6 results in the database throwing an error? Neither the happy nor the sad path expects this, and so we refer to this as an ‘exceptional’ condition, and we say an ‘exception has been thrown’. If exceptions are not properly managed, they can cause instability, crash our application, cause business logic to be bypassed or leak sensitive information to an attacker.
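The happy, sad and exceptional paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `handle_login` and the `check_credentials` callable are hypothetical names, and the strings returned stand in for the pages a real server would render.

```python
import logging

logger = logging.getLogger("login")


def handle_login(username, password, check_credentials):
    """Return the name of the page to show next.

    check_credentials is a hypothetical callable that returns True or
    False, or raises if the database fails.
    """
    try:
        if check_credentials(username, password):
            return "home"  # happy path
        return "login: Credentials do not match"  # sad path
    except Exception:
        # Exceptional path: record the details internally and show the
        # user nothing about what actually went wrong.
        logger.exception("credential check failed")
        return "error: An unexpected error was encountered."


def broken_database(username, password):
    raise ConnectionError("database unreachable")


print(handle_login("alice", "secret", lambda u, p: True))
print(handle_login("alice", "wrong", lambda u, p: False))
print(handle_login("alice", "secret", broken_database))
```

The third call exercises the path that neither the happy nor the sad case anticipates: the failure is logged and the caller still receives a controlled result.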
All modern languages that are worth anything have a native capability to handle these unexpected exceptions, usually in the form of a try/catch clause. For example, if we suspect that a certain block of code has a good chance at causing an exception to be thrown, then we wrap it in a try/catch:
try
{
//execute risky code here
}
catch
{
//do something intelligent here that doesn’t crash our app
}
Some naïve developers are aware that many runtime environments implement a global ‘last chance’ exception catcher that will generically handle all exceptions that we do not explicitly catch, and they rely on it instead of handling exceptions themselves. This is a bad idea for several reasons.
We lose the ability to safely and securely recover.
We lose the ability to log the condition for later analysis.
It’s a bad user experience.
We will more than likely leak information to an attacker.
As an example of leaking information due to improper exception handling, consider what happens when an improperly-configured ASP.Net application encounters an unhandled exception, as shown in Figure 90.
Figure 90: The Result of an Unhandled Exception (credit troyhunt.com)
In this example, the resulting web page reveals the error message, the full location of the file being executed, stack trace details and even our source code. Don’t think this is a .Net problem – pretty much all platforms and languages have the same problem.
In addition to explicitly handling exception conditions, we can also leverage flags available during the compilation and linking steps, such as the safe exception handler flag, commonly referenced as ‘/SAFESEH’. When specified, this flag creates a table listing all safe exception handlers in the code and places this table into the executable. At run-time, when an exception is encountered, the OS checks the handler that is about to run against this table, and if a matching entry is not found, the OS terminates the process instead of risking an insecure action. In short, this approach favors a secure stance over allowing the application to continue running.
When an exception occurs, three actions must be taken. First, the application must handle the error condition itself, in a way that does not require the environment to handle it. An exception that the application leaves for the environment to deal with is called an unhandled exception. For example, if we do not catch an exception in a server web page, it will fall through to the web server software, which will return a 500 error to the browser, more than likely containing sensitive information about how our web application is constructed. This is a form of information leakage, and hackers absolutely love this scenario. We have already covered just such a scenario earlier.
Second, the application must log the error condition with sufficient detail to allow a root cause analysis later on. The root cause is the first condition that was the source of an error. It is important to be able to differentiate between root cause and secondary causes. For example, if the network connection to a database goes down, and a web application attempts to execute a query, the data access layer might throw an exception that says, “Unable to execute query”, which does not provide us enough information to later decide the network is unstable instead of the database.
Third, the application must prevent information leakage by returning just enough error information for the user to know what to do next without revealing too many details about what actually happened. For example, instead of sending back an error message that says, “Unable to connect to the DBOAPP01 SQL Server database”, we send back a message stating “An unexpected error was encountered. Please try again later.”
It is essential to use try/catch blocks, or whatever mechanism the language of choice provides. I once had a developer with a nasty habit of writing unstable code, and in a moment of frustration I told him “You are NEVER to write another block of code that is not wrapped in a try/catch!” With much grumbling he complied, and to his surprise the number of bugs due to his code went down drastically. While using a try/catch block will not magically make developers write better code, it will make it much easier to identify root cause. The use of finally blocks is also encouraged if operations within the try/catch block allocate resources that must be explicitly released. This is true even for languages that support garbage collection like .Net and Java, as some operations still require an explicit release at times.
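As a hedged illustration of why a finally block matters, the Python sketch below uses a made-up `Connection` class (an assumption for this example, not a real library type) whose query always fails. The point is that `close()` runs whether the query succeeds, fails, or returns early.

```python
class Connection:
    """Hypothetical resource that must be released explicitly."""

    def __init__(self):
        self.is_open = True

    def query(self, sql):
        # Simulate the network dropping in the middle of a query.
        raise RuntimeError("network down")

    def close(self):
        self.is_open = False


def run_query(conn, sql):
    try:
        return conn.query(sql)
    except RuntimeError:
        # Handle the failure here instead of letting it escape.
        return None
    finally:
        # Runs on success, on failure and on early return alike.
        conn.close()


conn = Connection()
print(run_query(conn, "SELECT 1"))
print(conn.is_open)
```

In everyday Python the same guarantee is usually obtained with a `with` statement, which is shorthand for this try/finally pattern.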
Some good examples of requirements around this area are the following:
“All exceptions are to be explicitly handled using try, catch and finally blocks.”
“Error messages that are displayed to the end user will reveal only the needed information without disclosing any internal system error details.”
“Security exception details are to be audited and monitored periodically.”
Input validation not only serves to keep data safe and prevent injection attacks, but it also prevents errors from being thrown later when the data is being processed. When unacceptable input is received that should be rejected, it is important to properly word the message sent back to the user to avoid information disclosure. Not only should we avoid displaying the exact error message that the exception condition generates, we must also take care not to reveal inner workings. Let’s use the example of a login to demonstrate this.
Suppose that an attacker attempts to carry out a SQL injection attack by entering “’ 1=1” into the password field. We do not carry out input validation, and so we simply concatenate this with SQL which generates an exception ‘Unable to locate the ‘’ field in the passwords table’.
The worst case is that we send this error message back to the user. Never, ever send raw error information to the browser, even in development environments. Instead, always log the detailed information and explicitly craft the message back to the user. A better result would be to say, ‘An invalid password was provided’.
Unfortunately, this message is also a bad idea, as now the user knows the username he randomly selected is actually valid – we are still leaking information. A better result would be to simply say, ‘The login information provided is incorrect.’
Of course, the best scenario is one in which we perform input validation and strip the password input down to “ 1 1” because we recognize the danger from the single quote and equal symbols. In this case, we have completely defeated the attacker’s attempt and simply told him that the login attempt failed.
If we have properly applied security, a clipping level will kick in after a specific number of failed attempts and lock the account out. This is an example of software ‘failing secure’.
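A clipping level can be sketched as a simple failure counter. The Python below is an in-memory illustration only; the class name `LoginGuard`, the threshold of three and the status strings are assumptions for the example, and a real system would persist the counts and decide how a lockout is eventually cleared.

```python
class LoginGuard:
    """Tracks failed logins per user and locks the account at a threshold."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = {}
        self.locked = set()

    def attempt(self, username, credentials_ok):
        if username in self.locked:
            # Fail secure: even correct credentials are refused once locked.
            return "locked"
        if credentials_ok:
            self.failures.pop(username, None)  # reset the count on success
            return "ok"
        count = self.failures.get(username, 0) + 1
        self.failures[username] = count
        if count >= self.max_failures:
            self.locked.add(username)
        return "failed"


guard = LoginGuard(max_failures=3)
for _ in range(3):
    print(guard.attempt("alice", credentials_ok=False))
print(guard.attempt("alice", credentials_ok=True))
```

The status values are internal; the message shown to the user should stay as generic as the one recommended above, so that neither a failure nor a lockout confirms which part of the login was wrong.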
There are three recommendations to follow to ensure errors are handled properly. The first we have already covered, which states to never send back details and instead show a generic user-friendly message that does not leak internal information.
The second recommendation helps in determining what went wrong. While we should always log exception conditions, it is sometimes helpful for an end-user who contacts the help desk to be able to identify the specific failure without our disclosing too much information to that user. We can accomplish this by generating a GUID that represents the error condition and providing only that GUID to the end user. We can then map the GUID back to internal logs to find detailed information about the error condition.
The third recommendation is to redirect users who encounter an unrecoverable error or exception condition to a default handling location such as an error web page. Based on the privilege level of the user, and whether the user is remote or local, we can customize the information shown.
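The first two recommendations can be combined in a short Python sketch: log the full details internally under a freshly generated GUID, and hand the user only a generic message plus that reference. The names `run_safely`, `GENERIC_MESSAGE` and the dictionary layout are assumptions made for this illustration.

```python
import logging
import uuid

logger = logging.getLogger("app")

GENERIC_MESSAGE = "An unexpected error was encountered. Please try again later."


def run_safely(operation):
    """Run operation(); on failure, log details and return a safe response."""
    try:
        return {"ok": True, "result": operation()}
    except Exception:
        error_id = str(uuid.uuid4())
        # The stack trace and original message go to the internal log only,
        # tagged with the reference the help desk can search for.
        logger.exception("error_id=%s", error_id)
        return {"ok": False, "message": GENERIC_MESSAGE, "reference": error_id}


def failing_query():
    raise ConnectionError("Unable to connect to the DBOAPP01 SQL Server database")


response = run_safely(failing_query)
print(response["message"])
print(response["reference"])
```

The user can quote the reference to the help desk, and support staff can look up the matching log entry without the server name or stack trace ever reaching the browser.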
In summary, error and exception handling is an extremely important aspect of code reviews, as both security and stability depend on a proper implementation. All critical sections of code must be wrapped by a try/catch with proper attention being paid to prevent information disclosure to the end-user. The software must fail securely, meaning that after an error is caught, the code should revert to a known state that is secure. ‘finally’ clauses should be implemented when memory or handles have been allocated and must be released.
Sequence and Timing
Any senior developer will tell you that some of the toughest bugs to fix deal with either timing or threads due to the difficulty in reproducing the root cause. In these cases, a good logging capability is crucial as the order in which seemingly non-dependent events happen will hold the key to the root cause. A primary attack vector is to take advantage of sequencing and timing design flaws that produce a race condition. A race condition occurs when two or more threads or processes access the same shared object at the same time without coordinating, and at least one of them changes it, so that the outcome depends on which one happens to get there first. You can envision this as a couple of two-year-old children grabbing for the same toy at the same moment – who ends up with it, and in what condition, depends purely on timing. (The related case in which each process holds on to a resource while waiting for the other to let go, so that both ‘lock up’ and refuse to move, is properly called a deadlock.) To create a race condition, three factors must be in place:
1) At least two threads or control flows are executing concurrently.
2) The concurrent threads all attempt to access the same shared object.
3) At least one of the concurrent threads alters the shared object.
Attackers love race conditions as they are generally missed during testing and can result in anything from a DoS to a complete compromise of the entire system. These scenarios are extremely difficult to reproduce and debug, and effective logging in the production environment is usually the only way to figure out where the error condition lies.
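The three factors listed above can be seen together in a small Python sketch: several threads run concurrently, all touch the same counter, and each one modifies it. The function name `run_counter` and its parameters are assumptions for illustration. With `use_lock=False` the read-modify-write is unprotected and updates may be lost, although whether that actually happens on a given run depends on the interpreter and on timing, which is exactly why such bugs are hard to reproduce.

```python
import threading


def run_counter(num_threads=4, increments=10000, use_lock=True):
    """Increment a shared counter from several threads; return the total."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            if use_lock:
                with lock:
                    counter += 1  # protected: one thread at a time
            else:
                counter += 1  # unprotected read-modify-write

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_counter(use_lock=True))
print(run_counter(use_lock=False))
```

With the lock held around the update, the total always equals the number of threads multiplied by the number of increments; without it, no such guarantee exists.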
There are multiple ways in which we can mitigate the existence and impact of race conditions.
First, identify and eliminate race windows, which are the scenarios under which a race condition can occur. Only a careful review of the source code will reveal potential problems. Otherwise, you have to wait until it occurs in production, and you had better hope you have put in plenty of logging.
Second, perform atomic operations on the shared resources. This happens when we ensure that code cannot be interrupted by another process until it has completed the entire manipulation of the shared resource. Here are a few suggestions on how to keep the functionality atomic:
Use synchronization primitives such as mutexes, condition variables and semaphores. This approach in its simplest form checks a global variable to see if another thread is already in a sensitive area. If it is, the current thread will block until the active thread has finished running the protected code block. For example, consider the following pseudo-code: