THE CODEBREAKERS
Page 58
But the reversal in his personal fortunes seemed to depress him. Each night he sank deeper and deeper into the newspaper. Finally, on February 7, 1960, after a long bout with Parkinson’s disease, the man who had automated cryptography died in obscurity in his home in Hackensack, New Jersey.
It is self-evident that, as the number of plaintext elements increase, so does the security of a cryptosystem. Thus a large code is harder to solve than a small one. Similarly, among cipher systems in which only a single set of plain-to-cipher equivalents is used (that is, nonpolyalphabetic systems), those enciphering two letters at a time offer more resistance than those enciphering just one at a time, all other things being equal. In other words, a digraphic substitution such as the Playfair is stronger than a monographic substitution. The reason is that digraphs are harder to identify than single letters—partly because there are more to choose from, partly because their characteristics are less sharply defined. These problems would be aggravated for trigraphic substitutions, and would rapidly approach insuperable proportions for tetragraphic, pentagraphic, hexagraphic, and even larger polygraphic substitutions.
Such substitutions have always been possible in principle simply by listing plaintext polygraphs opposite their ciphertext polygraphs. The first such list was constructed for digraphs when Porta set one up as a tableau, using distinctive symbols for each plaintext digraph. Many digraph lists using letters have been compiled, usually in big 26 × 26 tables. But such lists have almost never been produced for trigrams, and never for tetragrams or larger polygram substitutions. Their bulk (263 or 17,576 entries for the trigraphic, 264 or 456,976 for the tetragraphic, and so on) is prohibitive, and much of the labor would be wasted on useless polygrams, as jgt or wqh.
Ever since Wheatstone’s Playfair showed how a digraphic substitution could be achieved compactly and without a lengthy list, other cryptographers have tried to extend his geometrical technique to trigraphic substitution. Nearly all have failed. Perhaps the best known effort was that of Count Luigi Gioppi di Türkheim, who in 1897 produced a pseudo-trigraphic system in which two letters were monalphabetically enciphered and the third depended only on the second. Finally, about 1929, a young American mathematician, Jack Levine, used six 5 × 5 squares to encipher trigraphs in an ingenious extension of the Playfair. But he did not disclose his method.
This was the situation when a 38-year-old assistant professor of mathematics at Hunter College in New York published a seven-page paper entitled “Cryptography in an Algebraic Alphabet” in The American Mathematical Monthly for June-July 1929. He was Lester S. Hill, a five-foot-six, blue-eyed, black-haired native of New York and a Phi Beta Kappa graduate of Columbia College who had taken his Ph.D. in mathematics at Yale in 1926. Hill had taught mathematics at the University of Montana, at Princeton, at the University of Maine, and at Yale before coming to Hunter in 1927. While at Yale, he had written three articles for Telegraph and Telephone Age dealing with mathematical means of checking the accuracy of telegraphed code numbers. He hoped to make some money from his checking scheme, which he was seeking to have patented. This did not go anywhere, but it sparked in Hill an interest in secret communications. Later in the summer in which his paper on algebraic cryptography appeared, he expanded the topic before the American Mathematical Society at Boulder, Colorado. This lecture was later published in The American Mathematical Monthly as “Concerning Certain Linear Transformation Apparatus of Cryptography.”
Hill successfully used algebra as a process for cryptography. Probably many mathematicians had toyed with this idea; two proposals had even reached print—one by a German, F. J. Buck, as far back as 1772, the other by the young mathematician Jack Levine in a 1926 issue of a detective magazine. But Hill alone devised a method of power and generality. In addition, his procedure made polygraphic cryptography practical for the first time.
It employed equations in which the keys and plaintext letters had numerical values. Encipherment consisted of solving the equations. There were as many equations as letters in the polygraph. Since there are 26 letters in the alphabet, and since he had to make decipherment possible, Hill performed his computations modulo 26. This means that the mathematician uses only the integers from 0 to 25. Any number higher than 25 must be reduced by dropping out as many multiples of 26 as possible; the remainder equals that number modulo 26. Thus 28 is 2 modulo 26, because 28 minus 26 leaves 2. Likewise, 68 is 16 modulo 26, for 68 minus 2 times 26, or 52, leaves 16.
To demonstrate a tetragraphic substitution, Hill framed the following set of simultaneous equations. The x’s represent the plaintext letters, x1 being the first, x2 the second, and so on; the y’s represent the ciphertext letters:
In the first step of actual encipherment, Hill converted the letters of his plaintext—Delay operations—into numbers according to the following arbitrary alphabet:
Then he inserted the numerical values for the first four-letter group—dela, or 20, 10, 16, 5—into the equations as x1,x2, x3, and x4, resulting in the following:
Hill then carried out the multiplications and additions in each equation modulo 26. For example, solving for y 1 gave: 8 × 20 = 4, 6 × 10 = 8, 9 × 16= 14, and 5 × 5 = 25. They summed to 25. Reverting to literal values, 25 became J, the first cipher letter. The others were found in the same way, and the full ciphertext for dela became JCOW. The completed cryptogram is JCOW ZLVB DVLE QMXC.
Suppose, now, that the plaintext message had begun Demand …, which would alter only the third letter of the first tetragram from an l to an m. The replacement in the four equations of the 16 of l by the 13 of m would change the products of the third elements in each equation, thereby modifying the sums. Consequently the ciphertext for dema, which is CMZQ, appears entirely different from JCOW of dela. Such a system is genuinely polygraphic, and its cryptographic security is substantial.
The fixed values in the equations—the numbers that multiply the plaintext numbers—cannot be selected at random if the system is to work in reverse. Hill specified the requirements and derived the deciphering equations. For the above key, they are:
x1 = 23y1+20y2+ 5y3+ 1y4
x2 = 2y1+11y2+18y3+ 1y4
x3 = 2y1+20y2+ 6y3+25y4
x4 = 25y1+ 2y2+22y3+25y4
Hill eliminated separate deciphering equations by constructing “involutory transformations.” A single set of these equations serves both to encipher and decipher. Involutory transformations are constructed according to a special formula, which limits their number compared to noninvolutory ones. In theory this also reduces the cryptanalytical resistance. But the security loss is negligible, especially when measured against the increased facility of operation.
Hill further simplified the cipher’s operation by introducing matrices. A matrix is simply a square of numbers. Matrices can be added and multiplied together under their own rules. The numbers in a matrix may represent plaintext letters. And since each matrix may be handled arithmetically as if it were a single number, two equations serve for the encipherment of two matrices, no matter how many numbers each contains. Thus, by disposing plaintext in matrices, more letters can be handled with fewer equations. Thus two 3 × 3 matrices will encipher 18 letters at a time with only two equations instead of the 18 that would be needed for the so-called linear encipherment. Hill gave an example of this massive polygraphic encipherment in his second article. His plaintext was Hold out. Supporting air squadrons en route, and, using a different numerical alphabet than in his first example, he prepared his first two x, or plaintext, matrices as follows:
These he inserted into the equations, which are involutory and have an extra arbitrary matrix to be added in to further complicate the encipherment:
When he performed the appropriate matrix multiplications and additions modulo 26, y1 and y2 were found to be:
Such an encipherment virtually obliterates ciphertext repetitions. Even if an exact 18-letter plaintext group recurs, it must begin at exactly the same point in the encipherment equation to produce a ciph
ertext repetition—and there is only one chance in 18 of this happening. More importantly, a poly-graphic encipherment of this magnitude is possible only with a Hill transformation. The more than 40 quintillion 18-letter groups, printed 100 to each side of a page with their ciphertext equivalents, would fill a codebook thicker than the distance from the sun to Pluto.
Mathematically, there is no limit either to matrix size or to the number of equations. A cryptographer may use matrices ten letters square in five simultaneous equations to catapult 500 letters into cipher at once. Or he may set up 500 simultaneous linear equations each with 500 terms to encipher that regiment of letters together. From a practical standpoint, the matrix method is superior because it enciphers more letters for a given amount of work than the linear method, and larger polygraphs resist cryptanalysis more strongly than smaller ones. But from a purely theoretical standpoint, the matrix encipherment is less secure than a linear encipherment of the same number of letters. This is because the linear encipherment employs a greater number of arbitrary key constants in its equations. Many of the matrix constants reduce to zero when the matrix equations are written out in their linear equivalent. These play no role in the arithmetic. As a result, while the change of a single plaintext letter in a linear encipherment will affect every ciphertext letter, such a change will affect only every second ciphertext letter in a 2 × 2 matrix, every third in a 3 × 3 matrix, and so on. Conversely, an error in a linear ciphertext will garble the entire plaintext group, whereas an error in a matrix ciphertext will garble only every second letter, or every third, or fourth, and so on, depending on the size of the matrix.
In general, the Hill system defends itself well against the direct onslaughts of cryptanalysis. Without a knowledge of the basic letter-to-number conversion alphabet, the cryptanalyst may not even be able to start. Even with it, a straightforward frequency-analysis attack is out of the question: octogram frequencies, for example, are hard to collect and even harder to differentiate. Probable words require tedious testing for possible locations and then much mathematical juggling to determine the correct equations; even so, only the relatively trivial trigraphic encipherments have been solved. The cipher has, however, at least one curious chink in its armor. If a cryptanalyst obtains two ciphertexts resulting from a single plaintext enciphered with different involutory equations (of the same type and polygram size), and if he knows the conversion alphabet, he can, in general, recover the equations fairly easily.
The real obstacle to practical use of the Hill system is, of course, its ponderousness. Hill sought to minimize this by patenting a device that will encipher small polygrams (up to hexagrams). It consists of a series of geared wheels connected by a sprocketed chain so that the rotation of one wheel will turn all the others, but the range of its keys appears to be limited. Mechanisms could also be built to compute the encipherments of large polygrams, which give the best security, but they would be so complicated that they could not compete on a practical basis with simpler, though possibly less secure, cipher machines. For such reasons, the Hill system has served as a U.S. governmental cryptosystem in only one minor capacity—to encipher the three-letter groups of radio call-signs.
Hill never published any further papers on cryptology, but he kept writing them, turning over most of his studies to the Navy (probably because he was a lieutenant in that service in World War I). They mostly concerned further variations on the polygraphic scheme or elaborate complications on Vigènere-type systems. But though none ever approached his first publications in significance, the Navy welcomed his suggestions. In 1955, Rear Admiral H. C. Bruton, director of naval communications, wrote him: “I am pleased to acknowledge that you furnished material to naval communications during World War II, and that the ideas presented were ingenious, detailed, and complete. The cryptographic system which you proposed at the time demonstrated competence and inventiveness of a high order in the application of advanced mathematical concepts to the field of cryptography. May I again express to you the appreciation of naval communications….” Hill retired from Hunter in 1960, and on January 9, 1961, died in Lawrence Hospital in Bronxville, New York, after a long illness.
Although Hill’s cipher system itself saw almost no practical use, it had a great impact upon cryptology. When he published his articles in 1929 and 1931, cryptology, like other applied sciences, was beginning its drift toward a widespread application of mathematics to its problems. Friedman had just linked cryptanalysis to statistics. Two of the junior cryptanalysts he hired were mathematicians. Kunze, in the German Foreign Office, a Ph.D. in mathematics, was applying his mathematical knowledge to his work. Hill accelerated this trend.
The mechanism of the cipher machine, U.S. Patent 1,845,947, that was invented by Lester Hill and Louis Weisner, for polygraphic substitution
The elegance and generality of his work engaged the interest of mathematicians and cryptologists. Dr. A. Adrian Albert was perhaps the first to observe that, as he put it, “all of these methods [of cryptography] are very special cases of the so-called algebraic cipher systems.” On November 22, 1941, Albert, then 36, professor of mathematics at the University of Chicago and winner two years before of the Cole prize for outstanding research in algebra, expounded this view before an American Mathematical Society meeting at Manhattan, Kansas. “We shall see that cryptography is more than a subject permitting mathematical formulation, for indeed it would not be an exaggeration to state that abstract cryptography is identical with abstract mathematics,” he said. He adapted Hill’s basic algebraic idea to simple cipher systems, such as transposition, periodic Vigenère, and autokey, and derived their mathematical equations. Complicated systems, he explained, are often merely “the product” of two of these simple systems.
This reformulation of cipher systems in mathematical terms bares their essential structure. It shows up weaknesses and helps the cryptographer to correct them. It may suggest analyses. More importantly, however, it may enable the cryptanalyst to bring to bear mathematical techniques that were not previously applicable and that make entirely new solutions possible. Take, for example, the case of two Playfair cryptograms, enciphered in different keysquares but known to have the same plaintext. In the ordinary geometical solution of a Playfair, the extra knowledge of the identical plaintext in the second cryptogram does not assist in reconstructing the first key. But if the two ciphertexts are translated into the appropriate mathematical equations for Playfair, the fact that the plaintext elements in these equations are identical and so may be cancelled out may greatly simplify finding the unknowns in these equations and so facilitate solving the cryptograms. Thus the application of mathematical techniques that make explicit fundamental relationships often obscured permits the resolution of otherwise intractable problems. In much the same way, the invention of the calculus made previously unsolvable problems solvable. The complex ciphers generated by modern electromechanical means would lie virtually beyond cryptanalysis without the help of the new, high-powered mathematical weapons.
The cryptology of today is saturated with mathematical operations, mathematical methods, mathematical thinking. In practice, it has become virtually a branch of applied mathematics. Its sophistication, its range, and its power have grown far beyond the imaginings of the most imaginative cryptologist in Yardley’s Black Chamber. And in this evolution, Lester Hill was a prime mover.
The history of science is replete with coincidence. Adams and Leverrier deduced the existence of Neptune almost simultaneously. While Darwin was elaborating his theory of evolution, Wallace sent him a short paper that succinctly set it forth. Five years after Morse invented his telegraph, Wheatstone independently invented another. So it is not surprising that coincidence brushed cryptography in the crucible years of the First World War and just after. Its fabled long arm reached out and tapped four men in four countries. Spurred by the vast wartime use of secret communications, and beckoned by the new age of mechanization, they independently created the machine whose principle
is perhaps the most widely used in cryptography today. This principle is that of the wired codewheel, the rotor.
The body of a rotor consists of a thick disk of insulating material, such as Bakelite or hard rubber, commonly two to four inches in diameter and half an inch thick. Embedded around the circumference of each face are 26 evenly spaced electrical contacts, often of brass. Each contact is connected at random by a wire to a contact on the opposite face. Thus a path for an electric current is set up that starts at one point on the circumference of one side and ends at another point on the other.
The contacts on the starting, or input, face represent plaintext letters and those on the output face ciphertext letters. The wire connections between the two then provide a way of converting plaintext letters to ciphertext. To encipher, one need only fire a burst of current into the rotor at the input contact of the desired plaintext letter, say, a; this current then courses along the wire to emerge at an output contact representing the ciphertext letter, say, R. If a list be drawn up of all the rotor’s wire connections from the plaintext to the ciphertext face, it will constitute a monalphabetic substitution alphabet. The rotor thus embodies a cipher alphabet in a form suitable for electromechanical manipulation.
To carry out this manipulation, the rotor is placed between two fixed plates, each also of insulating material and with 26 contacts studded in a circle to match those on each face of the rotor. Each contact on the input plate is connected to a typewriter key that represents a plaintext letter. Each contact on the output plate is connected with some kind of device to indicate the ciphertext letter, such as a lamp or a typebar. When the encipherer strikes the key representing the plaintext letter a, he allows electricity to flow from the power source, into the input plate contact for a, across the junction into the rotor at the input contact for a, through the wire heart of the rotor to the output contact for ciphertext R, across to the output plate contact for R, and to the bulb that lights up the letter R as the ciphertext letter.