Sharks in the Moat
When writing confidentiality requirements, the following statements provide good examples of well-written and applicable requirements. Keep in mind that these are just examples, and are not necessarily requirements that you should use in your own projects.
“Personally identifiable information must be protected against disclosure by using approved encryption mechanisms.”
“Passwords and other sensitive input fields must be masked.”
“Passwords must be hashed before persistent storage using the SHA-256 hash function or better.”
“Transport layer security (TLS) must be used for all traffic between the DMZ and the Internet.”
“Log files must not contain any PII in a form that is humanly readable or easily decipherable.”
We just spent a great deal of time introducing encryption, because it is core to confidentiality. It is just as core to integrity, so before we go on to that topic, let’s take a few minutes and perform a high-level flyover of encryption.
Chapter 11: Encryption
Encryption is the process of converting plain text into a secure form called ciphertext, which is unreadable until converted back into the original plain text. The conversion of ciphertext back into the original plain text form is called decryption. Both processes use mathematical functions along with a password called a key to carry out the conversions.
There are four general uses for encryption:
To ensure data is not intercepted or manipulated as it is transmitted across networks.
To ensure information stored on computers is not viewed or manipulated in an unauthorized manner.
To prevent and detect both accidental and intentional alteration of data.
To verify the authenticity of a transaction or document.
It should be noted that in many countries governmental laws prohibit the use of certain encryption techniques, as the government would not be able to decrypt the data. This is primarily found in countries with an oppressive government that does not value individual privacy or freedom. It should also be noted that while encryption can detect modification of data, it cannot prevent the loss or modification of that data. In other words, while encryption can prevent someone from reading confidential information, and it can detect when someone changes it through hashing, it does nothing to prevent someone from actually changing or deleting it.
Key Elements of Encryption Systems
Any encryption system includes three primary variables – the algorithm, the key, and the length of the key.
The algorithm is a function based on some type of mathematical formula that can both encrypt and decrypt data.
The key is a piece of information required to both encrypt and decrypt the plain text message. Think of it as a password. The decryption algorithm will still produce output when given an incorrect key, but that output will be a garbled mess.
The key length is a predetermined length that the key must match. The longer the key, the longer the encryption and decryption processes will take, but the more difficult it will be to break.
Just like a password, an attacker can try random key values in a brute force attack until he finds one that works. The longer the key, the more values the attacker will have to try. Keys should be randomly generated, as using words or phrases can drastically reduce the amount of time required to break encryption. Another method to reduce the number of brute force attempts is to analyze the underlying algorithm in a process called cryptanalysis. For example, if a portion of the plain text and the resulting ciphertext are known, an attacker might be able to deduce part of the algorithm, allowing him to narrow down the range of keys to try.
One of the few absolutes in the security world is that there is no such thing as an un-hackable system – it is simply a matter of how much work is required. The same is true of cryptography - any scheme can be broken given enough time and computing power. The only thing we can do is to make breaking our encryption so difficult that it is not worth anyone’s time. The amount of effort that must be applied to break a given encryption scheme is called the work factor. As the work factor for encryption increases, so does the amount of effort required to defeat it.
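To put some rough numbers on key length and work factor, here is a minimal sketch. The guess rate of one trillion keys per second is an assumption picked purely for illustration, and the function names are hypothetical. It also shows one way to generate a random key rather than basing it on a word or phrase.

```python
import secrets

GUESSES_PER_SECOND = 10**12            # assumed attacker speed, illustration only
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def keyspace(bits: int) -> int:
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** bits


def years_to_exhaust(bits: int) -> float:
    """Years needed to try every possible key at the assumed guess rate."""
    return keyspace(bits) / GUESSES_PER_SECOND / SECONDS_PER_YEAR


for bits in (56, 128, 256):
    print(f"{bits}-bit key: {years_to_exhaust(bits):.3e} years to try every key")

# Keys should come from a cryptographically secure random source,
# not from words or phrases.
random_key = secrets.token_bytes(32)   # 32 bytes = 256 bits
```

Each extra bit doubles the number of keys an attacker has to try, which is why a modest increase in key length raises the work factor so sharply.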
Hashing
There is a slightly different take on encryption that is just as important to security. Usually we focus on being able to encrypt and decrypt messages, but the ability to create a hash is equally valuable. A hash is a one-way encryption that results in the same length of ciphertext regardless of the amount of plain text and cannot be reversed. We have already discussed the value of such a capability when storing passwords. It is also crucial in the ability to determine if data’s integrity has been violated, or if it has been changed in an unauthorized manner. Think of it this way – if I need to send you a message and I don’t care who sees it as long as the message arrives intact, then a hash function is the way to go.
For example, let’s say I need to get a shopping list to you and I don’t care who reads it as long as they don’t change the word ‘cupcakes’ to ‘carrots’. In this example, I would run the entire shopping list through a hashing algorithm, which gives me a 64-character string of letters and numbers. No matter how many times I hash the shopping list, it will always result in the same hash value, or digest. But if I change even one single letter – say I misspell ‘cup’ as ‘cap’ – then the entire hash value changes. So, I write my hash value at the bottom of my shopping list and give it to a courier to deliver to you. Unfortunately, my courier also happens to be my wife who is trying to get me to lose weight, so she surreptitiously changes ‘cupcakes’ to ‘carrots’ (because I foolishly wrote my shopping list in pencil). Now, you receive my shopping list, use the same hashing algorithm I used to generate a hash value, and notice that the two values don’t match. You instantly realize that my list has been altered. And that is why a hashing function is so valuable – it detects whether the integrity of a shopping list has been compromised or not. I suppose it could be used for things other than shopping lists as well.
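The shopping list scenario can be sketched in a few lines. This assumes SHA-256, whose hexadecimal digest happens to be 64 characters long; the text does not name a specific algorithm, and the list contents and the `digest` helper are made up for illustration.

```python
import hashlib


def digest(text: str) -> str:
    """Return the SHA-256 digest of the text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = "milk, eggs, cupcakes"
sent_digest = digest(original)           # written at the bottom of the list

received = "milk, eggs, carrots"         # altered by the courier

print(len(sent_digest))                  # 64, regardless of input length
print(digest(original) == sent_digest)   # True: same input, same digest
print(digest(received) == sent_digest)   # False: the list was changed
```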
When using hashing for passwords, as soon as a user changes their password, we hash it and store the hash value. One of the side-effects of hashing is that a hash is completely irreversible, so no one will be able to un-hash the value and see the original password. Therefore, when someone tries to log in, we take the password the user types in, hash it and compare the resulting value to that stored in our database. If it matches, they are now authenticated, all without the risk of someone stealing the password from our database.
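A minimal sketch of that login flow follows. The per-user random salt, the PBKDF2 function and the iteration count are assumptions added here because they are common practice; the text itself only calls for hashing and comparing. The function names and the sample password are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password (PBKDF2 and the salt are assumptions)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# When the user sets a password, store only the salt and the hash.
salt = secrets.token_bytes(16)
stored_hash = hash_password("correct horse battery staple", salt)


def login(attempt: str) -> bool:
    """Hash the attempt the same way and compare it to the stored value."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)
```

Calling `login("correct horse battery staple")` succeeds, while any other input fails, and the original password never has to be stored anywhere.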
Quantum Cryptography
Quantum computing represents the next evolution of computers, but we haven’t quite reached it yet. Theoretically, quantum computing will advance computers overnight to unimaginable speeds. So fast, in fact, that many of our current encryption algorithms, public key schemes in particular, could become useless against attacks from these mega-computers. After all, current encryption schemes are built with the understanding that they are all ‘breakable’, but that it would take such a long time to break a specific message that no one would even try. That’s what the inventors of DES thought back in the 1970s until computer speeds of 1999 slapped them in the face. Quantum computing would render current schemes such as RSA useless.
Quantum encryption schemes are already being designed that allow each party to know when a key has become compromised. The parties simply have to generate a new key and continue. Post-quantum algorithms are also being designed as we speak, but until we have these super-beasts in hand, the quantum encryption schemes will run too slowly to be applicable to our modern computers.
Symmetric vs. Asymmetric
Now, there are actually two types of encryption schemes – symmetric and asymmetric – and they differ primarily in how the secret keys are used.
A symmetric scheme uses a single ‘shared key’ that both the sender and recipient know. This is basically how a password is used – both parties have to know the password before access is granted. The primary weakness with symmetric schemes is in how to get the secret key from the sender to the recipient without it being intercepted and revealed. This must be done ‘out of band’, or in some other manner than how the encrypted message is sent. It doesn’t do a whole lot of good to encrypt a message and then send the key in clear text with it. We could encrypt the key as well, but then we would have to send the second key in clear text. And then we could encrypt that key and…well, you get the idea.
Symmetric systems got their name from the fact that the same key is used to both encrypt and decrypt the message, meaning the keys are symmetric. Originally, the most common symmetric system was the data encryption standard, or DES. As computers increased in power, it was just a matter of time before DES was broken, and sure enough that happened in the late 1990s. To provide enough time to find a proper successor, 3DES, pronounced triple-DES, was implemented by simply executing DES three times in a row. Eventually the advanced encryption standard, or AES, was completed, which is still in use today. Symmetric key systems have simpler keys and perform faster than asymmetric systems. The whole problem with symmetric schemes is the need to somehow get the secret key to the receiver.
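To see the ‘same key in both directions’ idea in action, here is a toy sketch. It is deliberately not DES or AES, and it is not secure. It simply stretches a shared key into a stream of bytes using SHA-256 and XORs that stream with the data. Every name in it is hypothetical.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)                # both parties must hold this
ciphertext = xor_cipher(shared_key, b"meet at noon")
plaintext = xor_cipher(shared_key, ciphertext)      # same key, same function

# A different key still produces output, but it is a garbled mess.
wrong = xor_cipher(secrets.token_bytes(32), ciphertext)
```

Notice that nothing in the sketch helps with getting `shared_key` to the other party, which is exactly the weakness described above.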
An asymmetric scheme addresses the secret key conundrum by using one key for encryption, and a different key for decryption. Now, that may not make much sense, but mathematically speaking the two keys share some common attributes that are calculated. Instead of trying to figure out how that works, just accept that it’s not magic and is instead based on some extremely advanced mathematics. With an asymmetric scheme, neither the sender nor the receiver is expected to know, trust or have any previous contact with the other. In fact, one key of the pair can be publicly disclosed, while the other key remains hidden and secret. The problem with asymmetric systems is that they are slower relative to symmetric systems. The first workable solution to public keys was an asymmetric algorithm called RSA, named after its inventors.
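For the curious, the ‘mathematically linked’ keys can be shown with a toy version of RSA. The primes below are tiny textbook values chosen so the numbers stay readable; real keys are thousands of bits long and use padding that this sketch leaves out. The helper name `apply_key` is hypothetical, and the modular inverse call needs Python 3.8 or later.

```python
# Toy RSA with tiny primes; for illustration only.
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120, must stay secret
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent, the modular inverse of e

public_key = (e, n)          # can be disclosed to anyone
private_key = (d, n)         # must remain hidden


def apply_key(key: tuple, value: int) -> int:
    """Raise the value to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                     # must be smaller than n
ciphertext = apply_key(public_key, message)      # anyone can encrypt
recovered = apply_key(private_key, ciphertext)   # only the key owner can decrypt
```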
So, if symmetric systems are fast but have a shared key problem, and asymmetric systems solved the shared key problem but are slow, surely we could somehow combine them to come up with the perfect system, right?
The answer is YES, and that is where we’re heading next.
Public Key Systems
Before we get too deep in this discussion, we need to define four different aspects of public key systems.
When a person or entity - let’s say his name is Bobby - creates a pair of public/private keys, he gives the public key to a certificate authority, or CA. The CA confirms that the entity is the real owner of the public key, and creates a public certificate stating so. The CA encrypts the certificate with its own private key, so everyone knows it came from the CA. Now, when I want someone else’s public key, I can just go to the CA and ask for the certificate, decrypt it with the CA’s public key, and be completely confident that the public key inside of that package is the public key for the person I want to talk to. Why do I trust the CA? Because it is a public entity that has been set up specifically to be trustworthy and has a lot of processes in place so that I feel comfortable trusting it. Don’t worry too much yet if the public and private key stuff makes sense or not – we’re going to go over it in great detail in just a moment.
Now, how does the CA know Bobby is who he claims to be? Well, that is where a registration authority, or RA, comes into play. The RA does some real-world magic on behalf of a CA to see if you are a real person or company before telling the CA “Yep, this guy is for real.”
What happens if someone steals your private key, or an entity with a public certificate goes out of business? In that case, the CA will put your certificate on a certificate revocation list, or CRL. This list is always available and should be checked before you trust a certificate. When you visit a web site and use HTTPS, the browser will check to see if the certificate the web site hands out is on the CRL or not, and if it is, will strongly suggest you go someplace else.
The last thing to note before we continue is something called a certification practice statement, or CPS. This is a detailed set of rules defining what a CA does. The CPS tells us how a CA validates authenticity and how to use its certificates, among other things.
Now, let’s continue with public key systems.
Let’s say that Bobby creates an asymmetric key pair – a public key and a private key which are mathematically linked. The public key is designed for everyone to know about, but only Bobby should know his private key. If anyone discovers that private key, then the public key has been compromised, and they both must be tossed. Now let’s say that Sam does the same and creates his own asymmetric key pair, and Sam decides to talk to Bobby. Now Bobby doesn’t trust Sam, and Sam doesn’t trust Bobby, but somehow, they are going to carry on a conversation across the Internet and be completely comfortable.
The first step is for Sam to get Bobby’s public key. That is where the certificate authority, or CA, comes into the picture. Even though they don’t trust each other, both Bobby and Sam trust the CA. So, Sam goes to the CA and says ‘Hey, give me Bobby’s public key’. Sam is very comfortable that he now has Bobby’s real public key because the CA has done their homework to make sure this is true.
Sam then creates a symmetric secret key that both he and Bobby will share, encrypts it with Bobby’s asymmetric public key and sends it off to Bobby. Bobby receives it, decrypts it using his matching private asymmetric key, and both parties now have a shared symmetric secret key that no one else could have intercepted. From this point forward, symmetric algorithms can be used for all communication. Notice that we have achieved something important – we have combined the key sharing capabilities of an asymmetric scheme with the performance of symmetric schemes.
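The exchange between Sam and Bobby can be sketched with the same kind of toy RSA arithmetic. The two primes are known Mersenne primes, picked only so the modulus is larger than a 16-byte session key; this is textbook RSA without padding, so treat it as an illustration and not something to deploy. All variable names are hypothetical, and `pow(e, -1, ...)` needs Python 3.8 or later.

```python
import secrets

# Bobby's toy asymmetric key pair.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))
bobby_public, bobby_private = (e, n), (d, n)

# Sam creates a symmetric session key and encrypts it with Bobby's public key.
session_key = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(session_key, "big"), bobby_public[0], bobby_public[1])

# Bobby decrypts it with his matching private key.
unwrapped = pow(wrapped, bobby_private[0], bobby_private[1]).to_bytes(16, "big")

# Both sides now hold the same symmetric key for the rest of the conversation.
print(unwrapped == session_key)
```

Only the short session key goes through the slow asymmetric step; everything after that can use a fast symmetric algorithm.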
Now comes another difficult question – how does Bobby know the message really came from Sam? After all, anyone can encrypt something using Bobby’s public key. While we have solved the shared key problem, we still aren’t sure where the message came from. Enter digital signatures!
Digital Signatures
One of the interesting things about asymmetric key pairs is that they operate in both directions. For example, you can encrypt a message with the private key that can then be decrypted with the corresponding public key. The reverse is true as well - you can encrypt a message with the public key that can then be decrypted with the corresponding private key. This is a very important capability because it gives us an easy way to prove we are who we claim to be.
Remember the CA? Not only do they provide public keys, but they also provide a certificate that is guaranteed to be unique and will not change as long as the public key pair does not change. In other words, we have this third party that everyone trusts who is willing to give us some kind of electronic file – the certificate - associated with a public key. Here is why we care. If I retrieve the certificate associated with my public key from the CA, encrypt it with my corresponding private key, and then send the resulting ciphertext to you, guess what you can do? You can decrypt my ciphertext with my own public key, and if it works, then you know for certain that it came from me, because only I could have encrypted it with my private key. That is the very definition of non-repudiation – there is no way I can claim I did not send the message, because only I have the private key. Either that or someone has stolen it and is masquerading as me, which very rarely happens.
Let’s revisit Sam and Bobby to show how this solves Bobby’s problem. When we last saw these two, Bobby was wondering if Sam really sent that secret key. Well, what if Sam encrypted his own public certificate from the CA with his private key, and sent that along with the shared secret key? Now all Bobby has to do is to try and decrypt Sam’s encrypted certificate with Sam’s public key. If it works, then Bobby knows for certain that the shared key came from Sam. And with that, we have not only solved the shared secret key problem, but also how to make sure the secret key originated from the right person.
We have therefore used a public/private key pair to create a digital signature. The whole point of a digital signature is to prove that something was sent from a specific entity. The way this works in real life is this:
1) Create content you want to send
2) Generate a hash on the entire content
3) Encrypt the hash with your private key, resulting in a digital signature
4) Send the content and digital signature
5) The recipient decrypts the digital signature with your public key
6) The recipient creates a hash of the content
7) If both hashes – the hash recovered from the digital signature and the hash the recipient computed – agree, then the recipient knows two things: the content has not been altered, and it was sent by the person who owns the public key.
Note that we have achieved integrity, because if the content was altered, then our hashes would not have matched. We have also achieved authentication because we know only the owner of the public key sent it, because only they would have had the corresponding private key. And finally, we have achieved non-repudiation, as the owner of the public key could not deny having sent the message as only the owner of the private part of the key pair could have encrypted it.
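The numbered steps above can be sketched with toy RSA arithmetic and SHA-256. This is an illustration under stated assumptions: the primes are two known Mersenne primes, the hash is reduced to fit under the toy modulus, and no padding is applied, all of which a real signature scheme would handle differently. The function names are hypothetical, and `pow(e, -1, ...)` needs Python 3.8 or later.

```python
import hashlib

# The signer's toy key pair.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))     # private exponent


def content_hash(content: bytes) -> int:
    """Steps 2 and 6: hash the content, reduced to fit under the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n


def sign(content: bytes) -> int:
    """Step 3: encrypt the hash with the private key."""
    return pow(content_hash(content), d, n)


def verify(content: bytes, signature: int) -> bool:
    """Steps 5 to 7: decrypt the signature with the public key and compare."""
    return pow(signature, e, n) == content_hash(content)


content = b"milk, eggs, cupcakes"
signature = sign(content)             # step 4 sends both content and signature

print(verify(content, signature))                 # True
print(verify(b"milk, eggs, carrots", signature))  # False
```

The second check fails because the altered content hashes to a different value than the one locked inside the signature.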
In general, digital signatures provide the following benefits:
The signature cannot be forged.
The signature is authentic and encrypted.
The signature cannot be reused, meaning that a signature on one document cannot be transferred to another document.
The signed document cannot be altered, and any alteration to the document - whether or not it has been encrypted - renders the signature invalid.
And that is how a digital signature works.