The Many Worlds of Hugh Everett III: Multiple Universes, Mutual Assured Destruction, and the Meltdown of a Nuclear Family
The following definition of game theory was written in 1967 by former RAND5 economist (and 2005 Nobel Laureate in economics) Thomas Schelling, not long after he played an instrumental role in designing the rationale for carpet bombing North Vietnamese cities during America’s war on Vietnam:
Two or more individuals have choices to make, preferences regarding the outcomes, and some knowledge of the choices available to each other and of each other’s preferences. The outcome depends on the choices that both of them make, or all of them if there are more than two. There is no independently ‘best’ choice that one can make; it depends on what the others do…. Game theory is the formal study of the rational, consistent expectations that participants can have about each other’s choices.6
The only group that benefited from bombing Vietnamese civilians was American munitions contractors, as the brutal campaign was militarily ineffective. On the other hand, from a humanitarian point of view, carpet bombing was preferable to using nuclear weapons! So, within that decision matrix, using conventional explosives was more rational than using hydrogen bombs. The weakness in game theory, of course, is that the decision matrix may be constructed from fundamentally irrational choices (such as the utility assigned to invading and trying to occupy Vietnam in the first place).
The meaning of zero sum
In a zero-sum contest the winning player gains the same value that his opponent loses, i.e. wins and losses add up to zero. For two-player zero-sum games, von Neumann defined the solution as a “saddle-point,” or “mini-max,” equilibrium. In this equilibrium, each player maximizes his minimum guaranteed gain or, equivalently, minimizes his maximum possible loss. Tic-tac-toe and chess are zero-sum games in which each player has exactly the same information concerning the playing field and defeat is total. As children quickly learn, tic-tac-toe is easily solved: the saddle-point is a draw. In chess, each player has the same information about the state of play, but the game is much more complex than tic-tac-toe and its saddle-point strategy (if it has one) is unknown. Nonetheless, making random moves is not a rational strategy in tic-tac-toe or chess.
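A saddle-point can be checked mechanically: it is a payoff entry that is simultaneously the minimum of its row and the maximum of its column (taking payoffs from the row player’s point of view). A minimal sketch, using a made-up 2×2 payoff matrix rather than any game from the text:

```python
def saddle_points(matrix):
    """Return (row, col) positions where the entry is the minimum of its
    row and the maximum of its column -- von Neumann's equilibrium for a
    two-player zero-sum game (payoffs from the row player's view)."""
    points = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [r[j] for r in matrix]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points

# A made-up matrix: the row player wins payoffs[i][j] from the column player.
payoffs = [[3, 1],
           [2, 0]]
print(saddle_points(payoffs))  # -> [(0, 1)]: the game's value is 1
```

Neither rational player can do better by deviating alone: the row player’s first row guarantees at least 1, and the column player’s second column concedes at most 1.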
But in zero-sum games without a discernible saddle-point, such as poker, where vital information is hidden, it is rational to adopt a “mixed strategy,” e.g. allowing bluffing or chance to enter the fray. Bluffing has perils: second-guessing a second-guessing opponent can lead to an infinite regress. Over the course of a poker game, players consciously and unconsciously absorb information about each other, while attempting to shield information about their own situation. Bluffing on a random basis is more rational than always bluffing.
In the much simpler zero-sum game, matching pennies, each player secretly turns a penny heads or tails up. The rule is that if the pennies match, one player wins; if they do not match, the other wins. The most rational strategy is for both players to randomly select either heads or tails, without trying to outguess the other, because guessing leads to a feedback loop of, “he thinks that I think that he thinks that I think,” ad infinitum. Random play in matching pennies (equivalent to flipping coins) ensures that each player will win 50 percent of the time over the long run. So, why would anyone want to play?7
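The arithmetic behind that 50 percent can be checked directly. A short sketch, taking payoffs from the matcher’s point of view (+1 for a match, −1 for a mismatch) and letting both players randomize evenly:

```python
# Matching pennies from the "matcher's" point of view:
# +1 if the pennies match, -1 if they do not.
payoff = {("H", "H"): 1, ("T", "T"): 1,
          ("H", "T"): -1, ("T", "H"): -1}

# Each player shows heads or tails with probability 1/2, independently,
# so each of the four outcomes occurs with probability 1/4.
expected = sum(0.25 * payoff[(a, b)] for a in "HT" for b in "HT")
print(expected)  # -> 0.0: each player breaks even over the long run
```

An expected payoff of zero is exactly the “win half the time” result: neither player can improve on it, which is why the game is pointless to play deliberately.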
It turns out that non-zero-sum games with multiple players get closer to reflecting how real life plays out than do two-person zero-sum games. But game theory, in general, diverges from real life by assuming that all players know and agree to the rules of the game. It presumes that players understand the stakes, and can calculate the probabilities, or “expected values,” associated with making certain choices. Furthermore, it assumes that the rational players equally desire a particular type of outcome, a quantifiable utility. But as preferences obviously vary from person to person, and group to group, agreements about what is rational smear into subjectivity.
Despite its weakness as a replica of reality, game theory has been usefully applied to retroactively modeling how decision paths emerge in political science, to tracing stages of development in evolutionary biology, and to analyzing the history of competitive forces in capitalist economies. It is not very successful at predicting the future. Nonetheless, von Neumann-type game theory was influential for about two decades as a guide to war-planning for the governments of the United States and the Soviet Union. Some credit the use of game theory with successfully deterring nuclear war during that period, others say it brought the world unconscionably close to destruction.
Everett was its adept.
A critique of military game theory
In 1960, one of RAND’s top game theorists, Herman Kahn, wrote a controversial book, On Thermonuclear War. Kahn rationalized the fighting of “limited” and “preventative” nuclear wars on the basis of a high expected utility for liquidating communism. Kahn calculated that “winning” a nuclear war was worth the sacrifice of scores of millions of American lives, along with genetic mutations and damage to industrial and agrarian infrastructure that would set capitalist society back thousands of years—until it rebounded as a consumer heaven! Some of Kahn’s colleagues, including Everett, admired his coldblooded, cost-benefit approach to “thinking the unthinkable.” Others were righteously horrified and attacked both Kahn and game theory, particularly its assumption of rationality.
One particularly vocal Kahn critic was University of Michigan mathematician, Anatol Rapoport, a Russian émigré with sterling credentials as a statistician and a game theorist. In 1964, Rapoport penned Strategy and Conscience. In it, he explained how and why (in his opinion) game theory fails as a model for policy-making:
It is generally assumed in a theory of decision under risk that a rational decision maximizes expected utility. This is a tautology if utilities are so defined that the action with the maximum expected utility is always preferred.8
It is not always rational to maximize utility. For example, if the preferred utility for players in a war game is to win all battles in the shortest time possible, then the rational choice for winning a battle might be to launch a surprise attack; but using up all of one’s resources on that attack might cause another battle (and the war) to be lost. It is more important, therefore, to define the situational context of a game in relation to the particular rules of the game. In order to reflect reality, games should not be treated in isolation from their greater context. Rapoport summed up,
Therefore, if a normative [ethically prescriptive] theory of risky decisions is not to be vacuous, utilities must be defined independently of the choices made in the context examined. If this is done, the principle of maximizing expected utility becomes an additional criterion of rationality.9
For instance, if an important but non-contextual utility of a policy choice is a preference for peaceful co-existence, then it would not necessarily be rational for a government to choose a political strategy focused on military victory, because a greater game would thereby be lost. Contrary to the basic assumptions of von Neumann-style game theory, said Rapoport, true rationality cannot be bound by the limits of a subjectively constructed decision matrix.
He commented,
In the last analysis, then, arguments in support of probabilities assigned to events are pleas to pay attention to some facts more than to others. A change in our attitude brings up a different set of facts as the relevant set and changes our perceived probability of the event in question.10
In other words, the assignment of probabilities to possible events is a subjective act; political and ideological agendas bias the choice of utilities in a policy game, thereby defeating the purpose of using game theory as an objective model.
Rapoport struck at the heart of the matter:
For the most part, decisions depend on the ethical orientations of the decision-makers themselves. The rationales of choices so determined may be obvious to those with similar ethical orientations but may appear to be only rationalizations to others. Therefore, in most contexts, decisions cannot be defended on purely rational grounds. A normative theory of decision which claims to be ‘realistic,’ i.e., purports to derive its prescripts exclusively from ‘objective reality,’ is likely to lead to delusion.11
Everett’s game
Princeton’s Harold Kuhn has made many significant contributions to game theory as a scientist and historian. As a young professor of mathematics in the 1950s, Kuhn co-edited two collections of seminal papers on game theory for “Annals of Mathematics Studies.” And in 1953 he published one of the field’s first textbooks, Lectures on the Theory of Games. Kuhn was so lastingly impressed by Everett that he included a paper Everett wrote during his first year at Princeton in a book of “pioneering” work from the “heroic age” of game theory titled Classics in Game Theory (1997). Everett’s relatively unknown “Recursive Games” paper appears alongside a famous paper by John Forbes Nash, the co-winner of the 1994 Nobel Memorial Prize in Economic Sciences.
Kuhn recalls,
I had no social interaction with Everett outside the weekly game theory seminars and teas, but I got to know him very well. In retrospect, I thought of him as a physics grad student slumming over in math. ‘Recursive Games’ was a beautiful piece of work; it would have been sufficient to get a Ph.D. in math.
The weekly seminars were small and informal, but extremely important in the history of the Cold War. Practically everybody working on game theory in the 1950s regularly gathered in Fine Hall, including von Neumann, Nash, Shapley, and Everett. We also held four formal conferences that were mile markers in the history of game theory.12
On January 31, 1955, von Neumann chaired a formal game theory conference at Fine Hall. Everett, who was midway through his second year as a grad student, and already writing his thesis on quantum mechanics, presented his “Recursive Games” paper on the first day of the conference; among a dozen other presenters were Nash and Shapley.13 Kuhn describes how Everett’s attempt to model the complexity of real life broadened the scope of game theory:
There were two competing, but related theories at that time. One, formulated by Shapley was called stochastic games, and the other by Hugh Everett was called recursive games. And the idea of both of them was that in real life we keep making choices and formulating strategies, and we do not necessarily get a pay-off, but, rather, we are led into another game. So instead of having a matrix of pay-off numbers, we have a matrix in which some entries are other games. And these were attempts to make game theory more realistic.14
Shapley’s paper, “Stochastic Games,” defined a type of zero-sum game with randomized moves in which the probabilities relevant to deciding the next move do not depend upon the history of the game. Shapley found a way to design optimal strategies for each player, so that pay-offs could be determined at some point, and the game would end after a finite number of steps. He theorized that his method could be extended to cover games with “infinite sets of alternatives.”
Everett picked up where Shapley left off. He defined “recursive” games where the outcome of a single randomized play can either be a pay-off, or another game, allowing for games with infinite moves. The problem, then, was how to identify an acceptable pay-off in the face of the possibility of infinite plays (a superior pay-off might emerge in the far, far future). Everett found a strategy for making such decisions, and this was an important discovery. His fascination with cybernetics paid off in game theory. He remarked:
The situation is fully analogous to servomechanism analysis, where the complex behavior of a closed loop servomechanism is analyzed in terms of the (open loop) behavior of its parts [via feedback]. The theory of servomechanisms is concerned solely with the problem of predicting this closed loop behavior from known behavior of the components. An appropriate name for recursive games would be ‘games with feedback.’15
Everett illustrated the military application of his theory:
Colonel Blotto commands a desert outpost staffed by three military units, and is charged with the task of capturing the encampment of two units of enemy tribesmen which is located ten miles away.16
Under the rules of Blotto’s recursive game, daylight raids are forbidden, and night raids require that the attacking force have one more unit than the defending force. Each player must divide his units into attacking and defending forces. Given the constraints and rules of the game, one can see the possibility for an endless loop of excursions and retreats by larger and smaller forces passing each other by in the night, each hoping to overwhelm the enemy, but beset by the necessity of protecting his home base.
In accord with common sense, Everett’s innovative probabilistic approach determined that if Blotto is patient, he will eventually win, because he commands a larger force—but winning might take a very long time.
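A toy Monte Carlo makes the recursive structure concrete. The rules below fill in Everett’s setup with assumptions of my own: each night both commanders split their units at random, a raid succeeds only if the attackers strictly outnumber the defenders, and a night on which both camps would fall (the forces passing each other in the dark) simply restarts the game.

```python
import random

# Assumed rules for Everett's Colonel Blotto recursive game: random
# splits each night; a raid succeeds only with strictly superior numbers;
# mutual success restarts the game (another "game" rather than a pay-off).

BLOTTO_UNITS, ENEMY_UNITS = 3, 2

def play_one_game(rng, max_nights=10_000):
    """Return 'blotto', 'enemy', or 'unresolved'."""
    for _ in range(max_nights):
        b_attack = rng.randint(0, BLOTTO_UNITS)   # the rest defend camp
        e_attack = rng.randint(0, ENEMY_UNITS)
        blotto_takes_camp = b_attack > ENEMY_UNITS - e_attack
        enemy_takes_camp = e_attack > BLOTTO_UNITS - b_attack
        if blotto_takes_camp and not enemy_takes_camp:
            return "blotto"
        if enemy_takes_camp and not blotto_takes_camp:
            return "enemy"
        # Neither camp falls, or both do: play again tomorrow night.
    return "unresolved"

rng = random.Random(0)
results = [play_one_game(rng) for _ in range(1000)]
print(results.count("blotto"), results.count("enemy"))
# Blotto wins essentially every game. Under these assumed rules the
# smaller force can never win outright: a successful enemy raid needs
# b_attack + e_attack > 3, which also makes Blotto's raid succeed,
# restarting the game instead of ending it.
```

Even with both sides playing blindly, patience plus the one-unit edge decides the game, which is the common-sense result Everett’s analysis formalized.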
7 Origin of MAD
The Kodak slogan, ‘You push the button, we do the rest,’ which since 1889 has helped so much to popularize photography all over the world, is symbolic. It is one of the earliest appeals to the push-button power-feeling; you do nothing, you do not have to know anything, everything is done for you; all you have to do is push the button. Indeed, the taking of snapshots has become one of the most significant expressions of alienated visual perception, of sheer consumption. The ‘tourist’ with his camera is an outstanding symbol of an alienated relationship to the world. Being constantly occupied with taking pictures, actually he does not see anything at all, except through the intermediary of the camera. The camera sees for him and the outcome of his ‘pleasure’ trip is a collection of snapshots, which are the substitute for an experience which he could have had, but did not have.
Erich Fromm, 19551
Prisoner’s dilemma
As the Cold War intensified, a fear-based equilibrium evolved. It was called mutual assured destruction, widely known as MAD. As a game, it defined the Cold War, and guaranteed Everett’s career in operations research.
The winning strategy for a rational player in a zero-sum game is to wipe out the opponent while minimizing the risk of meeting that fate. Short of winning, the optimal strategy is to avoid defeat by forcing a tie. Risk avoidance mediates decision-making; and game theory attempts to quantify risk. But in real life, not all players are equally rational, and their utility values may be diametrically opposed: one might value revenge, another mercy. And not all stakes are zero-sum. For instance, in non-zero-sum games cooperation is possible between two or more competitive players. The impulse to make bargains is more reflective of the complexity of real life than winner-take-all, or no-risk options. Consider a game in which players can make enforceable agreements, in which each can maximize his own perceived self-interest, and minimize risk, by choosing the best action available to him individually, given that the other players are also choosing their best actions: this is the well-known “Nash equilibrium.”
Building on the von Neumann-Morgenstern analysis of cooperative (non-zero-sum) games, Nash’s “bargaining” theorem shocked Kuhn and Tucker with its mathematical and sociological beauty. Achieving the Nash equilibrium point is synonymous with the result that no rational player would change his initial decision after the results of the game are known. It rewards cooperative bargaining and penalizes unilateralism. It has many applications in market economics and other forms of group dynamics, including geopolitics. It does not, however, solve the problem that game theory relies upon an ideal of rationality and an assumption of shared utility values.
Nash’s breakthrough in 1950 was immediately countered by a game that questioned the rationality of searching for his equilibrium points. Developed at RAND, it was dubbed “Prisoner’s Dilemma.” It questioned the logic of game theory itself.
Prisoner’s Dilemma considers the situation of two prisoners accused of jointly committing a crime (which they did commit). Each prisoner is separately questioned by detectives and informed of the stakes. Each is told that if he testifies against his partner, he will be set free, whereas his silent partner will be sentenced to three years in prison. If they testify against each other they will both be incarcerated for two years. These are the “defect” options. But if both prisoners refuse to turn state’s evidence, then each will spend only one year in prison. This is the “cooperation” option. The shared utility is that each prisoner wants to minimize the risk of doing hard time; both are presumed to be rational (defined as self-interested).
Obviously, if the prisoners could communicate, the best choice would be for both to cut a deal and clam up (cooperate), and each settle for one year in prison. But that would not be the best retroactive choice (Nash equilibrium) in this scenario, because if your partner defects, while you cooperate, you get slammed with three years. Therefore, the least risky and most rational choice is to defect, as betraying your partner will at worst give you two years in prison, and you have a chance of being set free. But if both prisoners act “rationally” and defect, then each prisoner ends up worse off than if both had acted “irrationally” and cooperated.
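The dilemma’s arithmetic can be laid out as a small payoff table, using the sentences from the text (years in prison, so lower is better). A short sketch checks both claims at once: defecting is each prisoner’s best reply whatever the partner does, yet mutual defection leaves both worse off than mutual cooperation:

```python
# Years in prison for (my_move, partner_move); lower is better.
# C = cooperate (stay silent), D = defect (testify against the partner).
YEARS = {("C", "C"): 1, ("C", "D"): 3,
         ("D", "C"): 0, ("D", "D"): 2}

def best_reply(partner_move):
    """The move that minimizes my sentence, given the partner's move."""
    return min("CD", key=lambda my_move: YEARS[(my_move, partner_move)])

# Defection is the best reply to either choice -- a dominant strategy...
print(best_reply("C"), best_reply("D"))      # -> D D
# ...so two "rational" players land on (D, D) and serve 2 years each,
# though (C, C) would have cost each of them only 1 year.
print(YEARS[("D", "D")], YEARS[("C", "C")])  # -> 2 1
```

Mutual defection is thus the game’s only Nash equilibrium: neither prisoner, learning the outcome, would change his choice unilaterally, even though both can see that cooperation was collectively cheaper.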
Prisoner’s Dilemma exposes the contradiction of using a rationality construct based on perceived self-interest as a guide to solving conflicts. It demonstrates that even in a situation with a built-in cooperative option that would equally benefit all players, it is not necessarily rational to cooperate. But it is not necessarily rational to defect, either. The game highlights the paradox inherent in the Cold War strategy of assured destruction, which is based on the assumption that no winner will emerge from the ashes of a nuclear war if one or both defect by striking first—and, yet, it is not necessarily rational not to strike first.