THE CODEBREAKERS
The Index of Coincidence intermingles the two techniques, but they are easier to understand separately. Furthermore, the rudimentary formula used in that publication for the statistical technique has been superseded by one growing out of Friedman’s 1925 solution of a cipher machine using cryptographic rotors, or wired codewheels. During this analysis Friedman refined his theory, evolving two parameters of great importance in modern cryptology. Hence, despite the violation of chronology, it seems wiser to begin with the improved theory.
Imagine an urn containing one each of the 26 letters of the alphabet. The chance of drawing any specified letter, say r, is one in 26, or 1/26. Now imagine another, identical urn. The chance of drawing an r from it is equally one in 26, or 1/26. What are the odds on drawing a pair of r’s, one from each urn? The two draws are independent, so the probability of both occurring together is the product of the individual chances. The chance of drawing two r’s in a single event, or “simultaneously,” one from each urn, is thus 1/26 × 1/26. Similarly, the probability of drawing two a’s is 1/26 × 1/26, of two b’s, 1/26 × 1/26, and so on. Consequently, the chance of drawing a pair of letters—any pair of letters, no matter which pair may come up—is the sum of all these probabilities. It is (1/26 × 1/26) + (1/26 × 1/26) + … + (1/26 × 1/26), repeated 26 times, or 26 × (1/26 × 1/26), or 1/26. This quantity may be written as the decimal 0.0385.
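The arithmetic of the two-urn argument can be checked directly; a minimal sketch, assuming only the 26-letter alphabet:

```python
# Chance that two independent draws from 26-letter urns coincide:
# the sum over all 26 letters of (1/26 x 1/26), which collapses to 1/26.
kappa_r = sum((1 / 26) * (1 / 26) for _ in range(26))

print(round(kappa_r, 4))  # 0.0385
```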
Assume now an ideal cryptosystem whose ciphertexts yield a perfectly flat frequency count—one with as many a’s as b’s as c’s … as z’s. Polyalphabetics approach this in varying degrees and may, for practical purposes, be regarded as generating such ciphertexts. These texts are called “random” because they are what would be obtained if letters were drawn at random from the urn (each letter being replaced after being noted and the urn shaken to mix the lot, chance alone dictating their identities). If two such random texts are superimposed, the chance that the letter above will be the same as the letter below is the same as the chance of drawing a pair of identical letters from the two urns. This is 0.0385, or, to put it another way, there will be 3.85 such coincidences in every 100 vertical pairs. Experiment will confirm this.
Now imagine an urn filled with 100 letters of English in the proportion in which they are used in normal text—8 a’s, 1 b, 3 c’s, 13 e’s, and so on. The chance of drawing a specified letter is now proportional to its frequency. The probability that an a will emerge is 8/100ths, that an e will is 13/100ths. With two such urns, the chance of drawing two a’s is, as before, the product of the individual probabilities, or 8/100 × 8/100; the chance of drawing two e’s is consequently 13/100 × 13/100. And the probability of drawing a pair—any pair—of identical letters is the sum of all these pair-probabilities: (8/100 × 8/100) + (1/100 × 1/100) + (3/100 × 3/100)…, and so on through all 26 letters. This calculation has been made (with a slightly different frequency table). The result is 0.0667.
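The same sum-of-squares calculation can be run over a table of English letter frequencies. The table below is an illustrative rounded one, chosen to match the figures quoted above (8 a’s, 1 b, 3 c’s, 13 e’s); as the text notes, the published result of 0.0667 came from a slightly different table:

```python
# Illustrative rounded English letter frequencies per 100 letters
# (not the exact table used for the published figure of 0.0667).
freq = {'a': 8, 'b': 1, 'c': 3, 'd': 4, 'e': 13, 'f': 3, 'g': 2,
        'h': 5, 'i': 7, 'j': 0, 'k': 0, 'l': 4, 'm': 2, 'n': 7,
        'o': 8, 'p': 2, 'q': 0, 'r': 7, 's': 6, 't': 9, 'u': 3,
        'v': 1, 'w': 2, 'x': 0, 'y': 2, 'z': 1}

# kappa-p: the chance that two independent draws from this
# distribution yield the same letter -- the sum of the squared
# individual probabilities.
kappa_p = sum((f / 100) ** 2 for f in freq.values())

print(round(kappa_p, 4))  # 0.0668, close to the book's 0.0667
```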
These two plaintext urns may likewise be replaced by two strings of plaintext. If they are superimposed, there will be as much likelihood that two letters will coincide vertically as there was that two identical letters will be drawn from the two urns. This probability is 0.0667, or 6.67 coincidences per 100 pairs. For example:
There are just seven coincidences in the 100 pairs—precisely what theory predicts.
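Counting such vertical coincidences is mechanical. A minimal counter, demonstrated on two short strings of our own rather than the book’s 100-pair example:

```python
def coincidences(upper: str, lower: str) -> int:
    """Count the vertical pairs whose upper and lower letters match."""
    return sum(a == b for a, b in zip(upper, lower))

# Two illustrative plaintext strings (not the book's example):
print(coincidences("tobeornottobe", "thequickbrown"))  # 2 of 13 pairs
```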
The quantities 0.0385 and 0.0667 are important enough to be given names. The first is called κr, read as “kappa sub r” (for random), the second is κp, read “kappa sub p” (for plaintext).* They will naturally differ for other alphabets and other languages. In Russian’s 30-letter Cyrillic alphabet, for example, κr will be 30 × 1/30 × 1/30, or 0.0333. Changing frequency characteristics alters κp. Thus, it is 0.0778 for French, 0.0762 for German, 0.0738 for Italian, 0.0775 for Spanish, and 0.0529 for Russian.
The establishment of the kappa values permits the finding of a quick and easy answer to one of the most important and recurring problems in cryptanalysis: how to superimpose two or more polyalphabetic ciphertexts so that the letters in each column will have the same keyletter. The problem arises in cases in which different messages use the same portion of a very long key, such as that generated by a machine. Discovery of these overlaps opens the door to a Kerckhoffs solution. A test based on the kappa values and called “the kappa test” tells quantitatively whether a given superimposition has brought together identically enciphered texts.
To understand it, one must recognize first that the superimposition of two monalphabetically enciphered texts will result in the κp figure of about 6.67 coincidences per 100 vertical pairs, or 6.67 per cent of coincidences. This is because the coincidences will occur whether the letters are clothed in ciphertext disguises or not. The calculation does not ask the letters for their identities. It merely notes their coincidence. By the same token—and this is important—two polyalphabetic cryptograms enciphered in the same key and superimposed so that the two occurrences of that key are in synchronization with one another will also show 6.67 per cent of coincidences. The reason is this: In a correct (in-phase) superimposition, the two letters of each vertical pair have the same keyletter. Thus whenever a coincidence occurs in the plaintext, the letters of the pair will be identically enciphered. This results in an identical pair—a coincidence—in the ciphertext. It does not matter that a pair of e’s may be enciphered into V’s at one point and into Q’s at another, or that a coincidence of a’s becomes a coincidence of L’s here and a coincidence of F’s there. The total number of coincidences will remain the same as the number in the plaintext.
On the other hand, if the two cryptograms are improperly superimposed, so that the keys are not in step, any coincidences will result from different keyletters operating on different plaintext letters to accidentally produce the same ciphertext letter. The coincidences will be caused, in other words, by chance. Chance alone will produce 3.85 coincidences per 100 vertical pairs in random text, and polyalphabetic ciphertext is equivalent to random text. Hence an incorrect superimposition should yield about 3.85 per cent of coincidences. But 3.85 per cent is substantially less than 6.67 per cent, and so a comparison of the percentages of coincidences at various test superimpositions should show which superimposition is correct.
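The decision rule just described can be sketched in a few lines. The function name and test strings are illustrative, not part of the original; the observed count is simply compared against the random and plaintext expectations:

```python
KAPPA_R, KAPPA_P = 0.0385, 0.0667

def kappa_test(c1: str, c2: str) -> str:
    """Classify a superimposition by comparing the observed number of
    coincidences with the random and plaintext expectations."""
    n = min(len(c1), len(c2))        # number of vertical pairs
    observed = sum(a == b for a, b in zip(c1, c2))
    expect_random = n * KAPPA_R      # keys out of step
    expect_causal = n * KAPPA_P      # keys in phase
    if abs(observed - expect_causal) < abs(observed - expect_random):
        return "probably correct"
    return "probably wrong"
```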
An example should make things clear. A cryptosystem with the Vigenère running key THE BARD OF AVON IS THE AUTHOR OF THESE LINES … starts the key for the first message with the first keyletter, but starts the key for successive messages with the third, fifth, and so on, keyletters. If plaintext 1 is If music be the food of love, play on, and plaintext 2 is Now is the winter of our discontent, the encipherments will be these:
A cryptanalyst, receiving these two cryptograms, will superimpose them so that they start at the same point:
Since there are 28 vertical pairs, the cryptanalyst would expect 28 × 0.0667 = 1.8676, or about 2, coincidences for a proper superimposition. But in fact he finds none, so he shifts the second cryptogram one space to the right and tries again. There will now be 27 vertical pairs. The cryptanalyst again calculates the theoretical expected number of coincidences for random and for correctly superimposed texts of this length so that he may compare the values with what he actually observes. Thus, a wrongly superimposed text would yield 27 × 0.0385 = 1.0395, or about 1 coincidence produced by chance alone, while a correct superimposition would yield 27 × 0.0667 = 1.8009, or about 2. (These fractional differences become more pronounced with longer texts.) One coincidence appears:
Since the differences between the chance and the caused values are so slight, with so few letters, the cryptanalyst might wonder whether this is not in fact a random result (which in fact it is: the upper w resulting from the encipherment of plaintext o with key I, the lower w resulting from the encipherment of plaintext e with key S) and try the next superimposition. Here the number of coincidences immediately jumps. This superimposition is obviously correct.
If the cryptanalyst wishes to continue, he will find that at the next superimposition the number of coincidences falls again, to 2, and will return to begin his attack with the third superimposition.
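The whole trial-shift procedure amounts to a loop: slide one cryptogram along the other one place at a time and note the coincidence count at each superimposition. A sketch, with an illustrative string in place of real ciphertext:

```python
def best_offset(c1: str, c2: str, max_shift: int = 10):
    """Slide c2 to the right one place at a time, as in the example,
    and return the (shift, coincidences) pair with the highest count."""
    results = []
    for shift in range(max_shift + 1):
        pairs = zip(c1[shift:], c2)
        results.append((shift, sum(a == b for a, b in pairs)))
    # the correct superimposition should show the burst of coincidences
    return max(results, key=lambda t: t[1])

# Illustrative check: the second text is the first minus its opening
# three letters, so the correct shift is 3.
text = "thequickbrownfoxjumps"
print(best_offset(text, text[3:], 5))  # (3, 18)
```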
It is like shifting, an inch at a time, two identical picket fences with very wide pales and very narrow slits at irregular locations. From time to time, light will shine through when two slits coincide by chance. But there will be a burst of radiance when the fences are correctly juxtaposed and light can stream through all the slits at once. Similarly with the cryptograms: the right superimposition allows the coincidences that lie in the original plaintext to stand forth, even though the polyalphabetic key produces different ciphertext letters for the same plaintext letter.
The importance of the kappa test in modern cryptology can hardly be overestimated. Computers can automatically make the vertical comparisons necessary to determine coincidences at rates of thousands per second, can check the total against the two theoretical figures, and then can ring a bell to signalize the correct superimposition or can automatically shift the texts one place and try again. Cipher machines employ keys millions of letters long in attempts to preclude superimposition, but in heavy traffic several cryptograms may be enciphered with overlapping portions of these keys. Only the computerized kappa test makes practicable the search for these overlaps through the scores or hundreds of messages that are needed to make finding them likely. If enough are found to permit their alignment in depth, a Kerckhoffs attack—frequency analysis of the columns, plus anagramming of the plaintext along the horizontal, aided by symmetry of position to reconstruct the cipher alphabets—can solve the cryptograms. The kappa test thus opens the door to the solution of the most complex of modern ciphers.
The parameters κp and κr animate two other Greek-letter tests, the phi and the chi tests. Both derive from the basic principle of coincidence. And just as a frequency count concentrates the spread-out occurrences of individual letters for easier assimilation, so the phi and chi tests coalesce the separate tabulations of a frequency count to make it easier to compare counts. These two tests were devised in 1935 by one of Friedman’s assistants, Dr. Solomon Kullback. Since Friedman’s original test in The Index of Coincidence has been supplanted by the chi test, it seems preferable to give the latter.
The phi test, which is its basis, can determine whether a given frequency count reflects a monalphabetic or a polyalphabetic encipherment. It might be used to see whether a Kasiski determination of a period is correct by testing the letters in the column for monalphabeticity. If the period is correct, the frequency counts of the columns will show as monalphabetic; if not, they will be only random.
To use it, the cryptanalyst first multiplies the total number of letters in the message (N) by that total less one (N-1). He then multiplies this product by κr to find what is known as the polyalphabetic expected phi (Φr). Then he performs the same operation with κp to find what is known as the monalphabetic expected phi (Φp). He sets these two aside and goes through his frequency count of the cryptogram, multiplying each letter’s frequency (f) by that frequency less one (f-1). He adds up these products. If the sum—the observed phi—is closer to the monalphabetic expected phi than to the polyalphabetic, the frequency count is monalphabetic, and vice versa. For example, with a 26-letter cryptogram the expected phis are:
26×25×0.0385 = 25 for Φr
26×25×0.0667 = 43 for Φp
The cryptogram’s frequency count determines its observed phi:
The observed phi of 28 is noticeably closer to the polyalphabetic expected phi; the assortment of letters on which the count is based is thus probably polyalphabetic. The test can determine this fairly accurately for small distributions, where the eye cannot discriminate between the two types of count.
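The phi test is easily mechanized. In the sketch below, the 26-letter string is a hypothetical cryptogram of our own (the book’s actual count is not reproduced here), constructed only so that its observed phi comes out to the same 28:

```python
from collections import Counter

KAPPA_R, KAPPA_P = 0.0385, 0.0667

def phi_test(ciphertext: str):
    """Return (observed phi, polyalphabetic expected phi,
    monalphabetic expected phi) for a piece of ciphertext."""
    n = len(ciphertext)
    expected_poly = n * (n - 1) * KAPPA_R
    expected_mono = n * (n - 1) * KAPPA_P
    observed = sum(f * (f - 1) for f in Counter(ciphertext).values())
    return observed, expected_poly, expected_mono

# A hypothetical 26-letter cryptogram (not the book's example):
obs, exp_r, exp_m = phi_test("QQQQWWWEEERRTTABCDFGHIJKLM")
print(obs, round(exp_r), round(exp_m))  # 28 25 43
```

Since 28 lies much nearer 25 than 43, this count, like the book’s, would be judged polyalphabetic.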
The chi test uses this procedure to compare two frequency distributions. It can tell whether the letters they represent have been enciphered with the same key, either mon- or polyalphabetic. For example, it can tell whether two Vigenère cryptograms have the same keyword, or, more importantly, it can pick out the columns in a Kerckhoffs superimposition that have been enciphered by the same keyletter, thus permitting their letter counts—which are usually scanty—to be amalgamated.
Its mechanics remain the same whether a polyalphabetic or a monalphabetic distribution is being tested, the only difference being that κr is used in the polyalphabetic calculations and κp in the monalphabetic. The chi test compares only two distributions at a time. The procedure is this: Multiply the number of letters in one distribution by the number in the other and by κp or κr. This is the expected chi. Then multiply the number of a’s in one by the number of a’s in the other, the number of b’s by the number of b’s, and so on. Total these products. The sum constitutes the observed chi. If the observed chi is reasonably close to the expected, the distributions represent identically enciphered assortments of letters.
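The mechanics above can be sketched directly; the function name is illustrative, and the monalphabetic constant κp is used here (swap in κr for the polyalphabetic case):

```python
from collections import Counter

KAPPA_P = 0.0667  # monalphabetic case; use 0.0385 for polyalphabetic

def chi_test(text1: str, text2: str):
    """Return (observed chi, expected chi) for two letter assortments."""
    c1, c2 = Counter(text1), Counter(text2)
    expected = len(text1) * len(text2) * KAPPA_P
    # number of a's times number of a's, b's times b's, and so on
    observed = sum(c1[letter] * c2[letter] for letter in c1)
    return observed, expected
```

If the observed chi comes out reasonably close to the expected, the two distributions may be pooled, just as counts 2 and 3 are combined below.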
For example, the three following counts have all been found to be monalphabetic. Are they identically enciphered?
Since they are monalphabetic, κp is used for the calculations:
The individual letter-multiplications produce the following:
The only expected and observed chis that agree to any extent are those for counts 2 and 3; their messages may then be regarded as having identical encipherments and may be combined in all respects, making identification of plaintext letters that much easier. With Kerckhoffs superimpositions in which the columns run only 10 or 15 letters deep, the chi test in effect makes solution practicable.
The same procedure may be used to correctly line up frequency distributions that have been shifted relative to one another—a task almost impossible to do by eye when the counts are small. For example, a cryptanalyst knows that two frequency counts represent the same cipher alphabet but standing at different positions relative to the normal alphabet. He can run the chi test at each of the 26 possible juxtapositions of the two to see at which point they represent identical encipherments. If the cryptanalyst can determine this, he will know the distance that one has to be slid to match the other and so their relative displacement. This knowledge plays an essential role in the other technique that Friedman described in The Index of Coincidence.
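The sliding chi procedure can be sketched with a synthetic example: one 26-entry frequency list is the other displaced three places, and the observed chi peaks at the juxtaposition that lines them up. The frequency list here is an illustrative rounded one, not a count from the book:

```python
def displacement(count1, count2):
    """Return the shift (0-25) at which two 26-entry frequency lists
    line up best, judged by the observed chi (sum of products)."""
    best_shift, best_chi = 0, -1
    for shift in range(26):
        chi = sum(count1[(i + shift) % 26] * count2[i] for i in range(26))
        if chi > best_chi:
            best_shift, best_chi = shift, chi
    return best_shift

# Illustrative rounded frequency list and the same list slid 3 places:
c1 = [8, 1, 3, 4, 13, 3, 2, 5, 7, 0, 0, 4, 2, 7,
      8, 2, 0, 7, 6, 9, 3, 1, 2, 0, 2, 1]
c2 = c1[3:] + c1[:3]
print(displacement(c1, c2))  # 3
```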
One of the two ciphers that he was analyzing in that publication was a progressive-alphabet system. For simplicity’s sake, this may be imagined as a St.-Cyr slide with a mixed cipher alphabet that shifts forward one space after each plaintext letter is enciphered. The period is 26, and a cryptanalyst would have no trouble in distributing the letters of a cryptogram into 26 columns, each enciphered at a setting of the slide. If the cryptanalyst focuses on one letter of the ciphertext alphabet as it creeps forward with the slide, he will see that this letter adopts at any given position the frequency of the plaintext letter above it. It exists with that frequency in the column representing that setting of the slide, and “deposits” this frequency in a frequency count for that column. At the next setting of the slide, it attires itself in the frequency of the plaintext letter above it at this point, and again sheds the frequency of that plaintext letter in a frequency count for the column representing that setting. The cryptanalyst now looks at his 26 frequency counts, which represent the successive positions of the slide as the key progressed. He singles out this one letter in the successive counts. Its differing frequencies mark the differing plaintext letters it has represented as it has moved along. The point to see is that these successive frequencies reflect the plaintext letters in their order in the plaintext alphabet. If this order happens to be the normal alphabet, things will be simplified, but the order itself is immaterial to what follows.
While this cipher letter is creating this pattern of frequencies, another cipher letter is also creating it. As this other cipher letter moves past the letters of the plaintext alphabet, it too is piling up little mounds of frequencies in the successive column counts. These mounds likewise mirror the order of the letters in the plaintext alphabet. The two patterns will be virtually identical, differing only by the usual and slight variations in plaintext. Now if one letter precedes another on the ciphertext slide by, say, three places, its pattern, as seen cutting through the 26 frequency counts, will obviously be shifted three places forward of the pattern of the other letter. So if the cryptanalyst can determine the displacements of the patterns with respect to one another, he can find the relative positions of those two ciphertext letters in the ciphertext alphabet. By determining the relative displacement of all the ciphertext letters in this fashion, the cryptanalyst can reconstruct the entire ciphertext alphabet! And he can do it without guessing at a single plaintext letter!
Friedman developed the ancestor of the chi test to compare the frequency patterns to determine the displacements. This comparison is a crafty and ingenious idea, with many applications in the cryptanalyses of complex systems, especially machines using cryptographic rotors, which are progressive. But it has had nowhere near the impact of the statistical concept. Friedman presented both ideas in The Index of Coincidence, and cryptology has never been the same since.
Before Friedman, cryptology eked out an existence as a study unto itself, as an isolated phenomenon, neither borrowing from nor contributing to other bodies of knowledge. Frequency counts, linguistic characteristics, Kasiski examinations—all were peculiar and particular to cryptology. It dwelt a recluse in the world of science. Friedman led cryptology out of this lonely wilderness and into the broad rich domain of statistics. He connected cryptology to mathematics. The sense of expanding horizons must have resembled that felt by chemists when Friedrich Wöhler synthesized urea, demonstrating that life processes operate under well-known chemical laws and are therefore subject to experimentation and control, and leading to today’s vast strides in biochemistry. When Friedman subsumed cryptanalysis under statistics, he likewise flung wide the door to an armamentarium to which cryptology had never before had access. Its weapons—measures of central tendency and dispersion, of fit and skewness, of probability and sampling and significance—were ideally fashioned to deal with the statistical behavior of letters and words. Cryptanalysts, seizing them with alacrity, have wielded them with notable success ever since.