Games Indians Play

by V Raghunathan

If I had had the good sense to contain my impatience instead of rushing to defect, both of us could have enjoyed the benefits of a prolonged mutual cooperation, thus enhancing my collection of points. But now, my defection and your retaliation with Never Again strategy will cost both of us all future satisfaction points.

Now, suppose my defection had not been deliberate but due to an unavoidable circumstance, and on the first of the fifth month I leave a cheque as usual but find no goat in return as you have decided ‘Never again’. It soon becomes clear to me that there will be no more goats from you to me and all our transactions are doomed to be D–D type.

However, suppose you did not take my one-time defection so much to heart and went looking for my cheque in the fifth month and found it, though you left no goat in return as I had defected the previous month. You realize I am not an all-out scoundrel. Not seeing your goat at the delivery site, I too would understand that you wish to get even with me for my last month’s defection. After this hiccup, from the sixth month, life could well be back on a smooth sail, with each of us back to earning our two points a month.

So is Massive Retaliation the best strategy? No, as it impairs all further accumulation of points. So what other strategy could you have adopted, given my fourth-month defection? Perhaps you could have allowed me one chance, that is, one more defection, before your massive retaliation. This would have earned both of us the future string of two points from the fifth month onwards, had the reason for my defection in the fourth month not been deliberate. If I had defected deliberately, I might get your goat yet again, leaving no cheque, thus exploiting your goodness and earning four points once more; but neither of us would earn anything from each other any more.

Yet other possible strategies for you could be to allow me two more defections before retaliating, or to spring some random defections of your own, such as, say, one, two or three random defections every ten interactions, and so on. Of course, when I choose to defect, and when and how I retaliate against your defections, would be part of my own strategy.

Clearly, our objective of maximizing satisfaction points appears to be best achieved through a long string of rewards of twos that we can garner from mutual cooperation, rather than through temptation points of four gained by defecting.

So asking ‘What’s the best strategy?’ is the same as asking ‘Which of the strategies will ensure maximum points for me from the sum total of my transactions with others over a prolonged period of time?’ There does not appear to be a single answer. For example, my simple strategy could be to take you to the cleaners if I know that your strategy is to give at least two chances before retaliating. If there are enough suckers giving two chances apiece, I could make a living amassing four points twice off each godly man. After all we have a billion-strong population and I could get seriously rich exploiting the naive. On the other hand, if I knew the population is made of massive retaliators galore, I would go easy on my propensity to defect.

AXELROD’S EXPERIMENT AND THE GENTLEMAN STRATEGY

In the late 1970s, Robert Axelrod, the mathematician and political scientist best known for his work on the evolution of cooperation, investigated whether there was any one strategy that outperformed the others.1

The experiment, structured as a competition, invited experts in game theory to send in algorithms which could maximize the total points in the iterative PD environment in the Brownian motion scenario, as delineated earlier. The experiment required each program to outline how it would respond to a Cooperate or Defect (C or D) move from another competitor, with the payoff defined as in Figure 3.2. The objective of the algorithm was to maximize the total number of points accumulated for itself over a large number of transactions. Here is a summary of the experiment.2

There were fourteen entries that qualified for the competition, with program lengths ranging from seventy-seven lines to a mere three words. Each program was effectively trying to compete with all the other programs by first ‘understanding’ the other’s behaviour before unleashing its specific strategy on the other. So here was each program trying to find chinks in the other’s armour, thrusting into those chinks or parrying; now enticing the other to continue cooperating while itself defecting to wrest those four temptation points; and now trying to guess what the next move of the other was going to be, so as to decide whether or not it was worth defecting; here trying to appear like a perfect gentleman who almost always cooperates except to defect once in a while to gain those four seductive points after a long series of twos; and there trying to exploit the ‘simpletons’ who were found to be naive enough to give two or more chances to the defector before retaliating, and so forth.

Axelrod made each strategy interact with every other strategy 200 times, and the tournament itself was run five times to smoothen out random statistical fluctuations in the outcomes, so that nobody could accuse his results of suffering from statistical vagaries.

At the end of this marathon competition, the program that won the tournament was the shortest of the many entries, comprising essentially three words, Tit for Tat, entered by the famous game theorist Anatol Rapoport. The Tit for Tat strategy was truly simple: ‘Never be the first to defect; thereafter do what the other one did the last time.’

The Tit for Tat strategy always ‘plays’ a C to begin with. The response of the other party, say also a C, resulting in a C–C transaction, is stored in memory. The next time the same party is encountered, Tit for Tat plays a C again (remember, it never defects first) and so on, until it encounters a D against its C, resulting in a C–D transaction and stores that fact in memory. The next time it encounters the same party, it plays a D. But the strategy holds no permanent grudge. If the other party responds with a C this time, resulting in a D–C interaction, it plays a C the next time. On the other hand, if the other plays a D once again, Tit for Tat returns a D. At any future time, should the other party revert to C, in the move after that Tit for Tat returns a C, and so on. In short, Tit for Tat bears no long-time grudge. But it does seem to subscribe to that old adage ‘You cheat me once, shame on you; you cheat me twice, shame on me.’
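The rule described above is short enough to write down as a few lines of code. What follows is only a minimal sketch in Python, not the program actually entered in the tournament; the function name `tit_for_tat` and the scripted list of moves are assumptions made for illustration.

```python
def tit_for_tat(other_moves):
    """Cooperate on the first encounter, then repeat the other party's last move.

    `other_moves` is the list of "C"/"D" moves the other party has played so far.
    Only the most recent entry is ever consulted, so no long-time grudge is held.
    """
    if not other_moves:
        return "C"              # never the first to defect
    return other_moves[-1]      # do what the other one did the last time

# A scripted (hypothetical) partner: cooperates, defects twice, then cooperates again.
partner_moves = ["C", "D", "D", "C", "C"]

seen_so_far = []
replies = []
for move in partner_moves:
    replies.append(tit_for_tat(seen_so_far))  # decide before seeing this move
    seen_so_far.append(move)

print("".join(replies))  # prints CCDDC
```

Tit for Tat answers each D exactly one move later, and returns to C one move after the partner does.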

TIT FOR TAT NEVER WINS

Playing against any one strategy, Tit for Tat never amasses more points than its opponent at any point in time. In other words, it never ever wins against any other strategy. At best, it equals the score vis-à-vis another individual. This is because, by never defecting first, it is always the first to suffer a deficit of five points, by losing a point and giving away four points to a defector consequent to a C–D transaction. And when it defects in turn in the next move, it remains at a disadvantage if the other strategy continues to defect and all future transactions between the two degenerate to D–D type. On the other hand, if in the next move the other strategy reverts to cooperation while Tit for Tat defects in retaliation, it merely manages to equal the score by recouping the deficit of five.
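The arithmetic can be checked with a small sketch. The point values below (two each for C–C, four and minus one for a one-sided defection, nothing for D–D) are inferred from the text of this chapter rather than copied from Figure 3.2, and the two rival strategies are hypothetical examples, so treat this as an illustration under those assumptions.

```python
# Points per transaction as (first player, second player).
# Assumed values, inferred from the text rather than taken from Figure 3.2.
PAYOFF = {
    ("C", "C"): (2, 2),    # mutual cooperation: two points each
    ("C", "D"): (-1, 4),   # cooperator loses a point, defector gains four
    ("D", "C"): (4, -1),
    ("D", "D"): (0, 0),    # mutual defection: nobody earns anything
}

def tit_for_tat(own_moves, other_moves):
    """Cooperate first, then repeat the other party's previous move."""
    return "C" if not other_moves else other_moves[-1]

def always_defect(own_moves, other_moves):
    """Hypothetical out-and-out scoundrel."""
    return "D"

def defect_once(own_moves, other_moves):
    """Hypothetical partner who defects in the fourth transaction only."""
    return "D" if len(own_moves) == 3 else "C"

def play(first, second, rounds=200):
    """Return the total points of both players after `rounds` transactions."""
    moves_1, moves_2 = [], []
    score_1 = score_2 = 0
    for _ in range(rounds):
        move_1 = first(moves_1, moves_2)
        move_2 = second(moves_2, moves_1)
        points_1, points_2 = PAYOFF[(move_1, move_2)]
        score_1 += points_1
        score_2 += points_2
        moves_1.append(move_1)
        moves_2.append(move_2)
    return score_1, score_2

for rival in (always_defect, defect_once, tit_for_tat):
    print(rival.__name__, play(tit_for_tat, rival))
# always_defect (-1, 4)
# defect_once (399, 399)
# tit_for_tat (400, 400)
```

In each pairing the first number, Tit for Tat's own total, is never larger than the second. Against the scoundrel it trails by exactly the five-point deficit; against the one-time defector it recoups that deficit and draws level.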

Then how did Tit for Tat come out with top honours in Axelrod’s experiment? How is Tit for Tat a winning strategy if it never wins against any one strategy?

This is where we need to understand the overall objective of the tournament more deeply. The objective was to accumulate as many points as possible for oneself, taking all the interactions into account, and not to win against any one strategy or individual, or even to score the maximum number of wins against any, let alone all, individuals.

THE GENTLEMAN STRATEGY

It does not take long to realize that Tit for Tat is a gentleman strategy. It never defects first. But it does get provoked to retaliate. It does not believe in ‘turning the other cheek’. Gentleman, yes; Gautam Buddha, no! A defecting individual is rebuked with a defection next time. So a defector soon learns that, unless he undoes the damage caused by his earlier defection, he cannot expect to profit any further from his interactions with Tit for Tat. It is also forgetting and forgiving. It holds no long-time grudge like the Never Again strategy. It has a memory only for the other’s last action and all earlier memory is erased, since the earlier defections, if any, have already been ‘paid for’ by the retaliatory defections in the following moves. Lastly, it is not envious of others doing well, so long as it does well enough for itself. With all such attributes, one cannot but see that Tit for Tat is simple and truly a gentleman strategy.

To be nice and simple, to get provoked in the face of injustice, to forget and forgive and not be envious, all are gentlemanly traits. People with such traits do make more friends and go further ahead of others. Perhaps that answers why the Tit for Tat strategy came out on top.

A WINNING STRATEGY

Once you begin interacting with the Tit for Tat strategy, you soon see through its simplicity. You realize that you can always bank upon it not to defect first. You realize too that if you defect, for whatever reason, it will defect the next time and not cooperate until you correct the imbalance created by your defection. You learn that, if you wish to profit from Tit for Tat, it pays not to defect against it very early in the interactions. Suppose you are an occasional defaulter, where you cooperate most of the time but defect occasionally, say 10 per cent of the time. If you defect against Tit for Tat, the consequence is that you are either up or, at worst, equal in score to Tit for Tat in due course. If you compensate for your defection by cooperating in the move after Tit for Tat punishes you, and go back to cooperation, you can hope to earn a lot of reward points of twos in the future together with Tit for Tat. But what if you run into a Never Again strategy? Clearly, you will never ever be able to earn any profit points from it after your first defection. This will not be the case if Tit for Tat and Never Again interact, since neither of them ever defects first and hence will earn a perennial stream of two points from each other, amassing a large number of points. But Never Again loses out to Tit for Tat because, when it comes up against the occasional defaulter, it stops all future interactions in a fit of massive retaliation. On the other hand, when Tit for Tat comes up against a default from an occasional defaulter, it retaliates in the next move, but reverts to cooperation once the defaulter does the same, thus earning a profitable string of twos in the future.
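To put rough numbers on this comparison, here is a small sketch under stated assumptions. The point values are inferred from the text (not copied from Figure 3.2), and the `occasional_defaulter` below is a hypothetical model: it defects in every tenth transaction, accepts a single punishment, and gives up cooperating only when the other party has defected twice in a row.

```python
# Assumed points per transaction as (first player, second player).
PAYOFF = {("C", "C"): (2, 2), ("C", "D"): (-1, 4),
          ("D", "C"): (4, -1), ("D", "D"): (0, 0)}

def tit_for_tat(own_moves, other_moves):
    """Cooperate first, then repeat the other party's previous move."""
    return "C" if not other_moves else other_moves[-1]

def never_again(own_moves, other_moves):
    """Cooperate until the first defection, then defect for good."""
    return "D" if "D" in other_moves else "C"

def occasional_defaulter(own_moves, other_moves):
    """Hypothetical defaulter: lapses in every tenth transaction."""
    transaction = len(own_moves) + 1
    if transaction % 10 == 0:
        return "D"                      # the occasional, unprovoked lapse
    if other_moves[-2:] == ["D", "D"]:
        return "D"                      # no point paying a partner who keeps defecting
    return "C"                          # otherwise cooperate, even right after a punishment

def play(first, second, rounds=200):
    """Return the total points of both players after `rounds` transactions."""
    moves_1, moves_2 = [], []
    score_1 = score_2 = 0
    for _ in range(rounds):
        move_1 = first(moves_1, moves_2)
        move_2 = second(moves_2, moves_1)
        points_1, points_2 = PAYOFF[(move_1, move_2)]
        score_1 += points_1
        score_2 += points_2
        moves_1.append(move_1)
        moves_2.append(move_2)
    return score_1, score_2

print(play(tit_for_tat, occasional_defaulter))   # (378, 383)
print(play(never_again, occasional_defaulter))   # (25, 20)
print(play(tit_for_tat, never_again))            # (400, 400)
```

Under these assumptions Tit for Tat finishes five points behind the defaulter yet collects 378 points from the relationship, while Never Again finishes ahead of the same defaulter and collects only 25. Against each other, the two nice strategies earn the full string of twos.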

So even though Tit for Tat may not win against any one single strategy, it manages to amass more points than most other strategies in the long run.

AGAIN AND AGAIN!

Axelrod did not stop there. He carried his experiment a good deal further using computer simulation. He ran a similar but larger tournament, where the competitors were provided with detailed results of the first tournament along with an explanation of the strategic qualities of niceness, simplicity, provocability and forgiveness that had made Tit for Tat the best strategy in the first tournament, so that experts could try and work on more sophisticated strategies that could outperform Tit for Tat. The new format of the tournament put the competing strategies through a variety of situations to ensure that a strategy was truly and genuinely robust and not a flash in the pan.

What happened in the subsequent tournament is remarkable again. Rapoport resubmitted his original strategy, Tit for Tat, in its original form. Others turned in strategies like Tit for Two Tats. There were also some ‘villainous strategies’ that tried to amass points by preying upon the tolerance of the ‘good strategies’ like Tit for Two Tats. Then there were strategies that would almost never defect first, but use a random number generator to defect once in a while, say 1, 2, 5 or 10 per cent of the time, hoping their defection would go unpunished. In all, this tournament attracted sixty-two entries.

Taking a cue from biological evolution—survival of the fittest, which results in a greater number of progeny in the next generation for the successful species—Axelrod picked the best strategies from each simulation run, and increased the weightage of such strategies in the subsequent run of simulations. What does this imply? If a certain strategy happens to be robust, it may be reasonably assumed that the strategy will multiply itself faster in due course, as more and more people come to adopt it. Axelrod simulated this realistic scene by ensuring that the best strategies in one run of the tournament were made more numerous in the next generation of the tournament. By varying the proportions of the best strategies in the next generation, Axelrod was able to arbitrarily create a large number of environments in which various strategies competed with each other.

As the experiment took off, in the initial runs, it was found that many villainous strategies were ranked very high (in fact among the ‘best’, alongside the likes of Tit for Tat) in terms of amassing the maximum number of points, as they prospered by preying upon strategies such as Tit for Two Tats. As the experiments progressed to subsequent generations of simulations, with the proportion of the best strategies (which included the villainous ones) increasing in each successive generation of runs, it was found that in due course the villainous strategies started dropping out of the race.

The reasons, though a trifle perplexing, are not too difficult to fathom. To begin with, as the villainous strategies gained ground in the environment, the good strategies started thinning out. As the ‘exploitable’ good strategies began disappearing, the villains had fewer and fewer simpletons to exploit and consequently they too began to thin out. As successive runs of the tournament progressed, it turned out that Tit for Tat and similar strategies that were not exploitable began to gain ground.
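The rise and fall of the villains can be imitated with a toy simulation. This is only a sketch under loud assumptions, not Axelrod's actual procedure: there are just three hypothetical strategies (`tit_for_tat`, an always-cooperating `simpleton` and an always-defecting `villain`), each strategy's share in the next generation is taken to be proportional to its share times its average score, and the point values inferred from the text are each raised by one so that no score is negative.

```python
# Assumed per-transaction points as (first player, second player): the values
# inferred from the text (2, 4, -1, 0), each raised by 1 to avoid negative scores.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_moves, other_moves):
    return "C" if not other_moves else other_moves[-1]

def simpleton(own_moves, other_moves):
    return "C"      # hypothetical exploitable strategy: always cooperates

def villain(own_moves, other_moves):
    return "D"      # hypothetical villainous strategy: always defects

STRATEGIES = {"tit_for_tat": tit_for_tat, "simpleton": simpleton, "villain": villain}

def match_score(first, second, rounds=200):
    """Points collected by `first` in one match of `rounds` transactions."""
    moves_1, moves_2, total = [], [], 0
    for _ in range(rounds):
        move_1 = first(moves_1, moves_2)
        move_2 = second(moves_2, moves_1)
        total += PAYOFF[(move_1, move_2)][0]
        moves_1.append(move_1)
        moves_2.append(move_2)
    return total

# Score of every strategy against every strategy, itself included.
SCORES = {(a, b): match_score(fa, fb)
          for a, fa in STRATEGIES.items() for b, fb in STRATEGIES.items()}

def next_generation(shares):
    """Make each strategy more or less numerous according to its average score."""
    fitness = {a: sum(shares[b] * SCORES[(a, b)] for b in shares) for a in shares}
    average = sum(shares[a] * fitness[a] for a in shares)
    return {a: shares[a] * fitness[a] / average for a in shares}

shares = {name: 1 / 3 for name in STRATEGIES}
history = [shares]
for _ in range(50):
    shares = next_generation(shares)
    history.append(shares)

for generation in (0, 1, 2, 5, 50):
    print(generation, {k: round(v, 3) for k, v in history[generation].items()})
```

In this toy environment the villain's share grows in the first generations, while there are still plenty of simpletons to prey upon, and then shrinks to almost nothing as Tit for Tat becomes the most numerous strategy.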

Not only did Tit for Tat emerge with top honours, its rate of growth in the successive generations was also among the highest. Yet, predictably, Tit for Tat never beat a single one of its competing programs!

TIT FOR TAT IN EVERYDAY LIFE

So what if the Tit for Tat strategy emerged numero uno in the iterative prisoner’s dilemma tournament? How does it relate to our daily lives? Let us take a simple situation. I spot a colleague walking towards me at some distance in the office corridor. While there is no law that requires us to greet each other, it is the polite thing to do. So I greet him but he, for some reason, looks through me. A ‘royal ignore’ is how I read it. In my mind, I ‘cooperated’ while he ‘defected’, so we have here a C–D like transaction.

The next time I encounter the same colleague, I have to decide whether or not to greet him. If I were a massive retaliator, I would ignore him forever. Alternatively, I could adopt the Tit for Tat strategy. Or else, I could follow the random strategy or ‘two chances’ strategy or any other strategy.

As colleagues we are socially called upon to decide whether or not to greet each other every time; hence we are in a sort of iterative PD situation. What if I had adopted the Never Again strategy? I would have ignored the colleague completely, no matter whether or not he greeted me again. Chances are after one or two such encounters he would stop greeting me. As a result, neither of us would ever be of much collegial comfort to each other.

On the other hand, if I were to adopt the Tit for Tat strategy, I would try to find a reason for my colleague’s action. Perhaps, as the light was behind me, he could not make out my silhouette very clearly in the dark corridor. Or maybe he was preoccupied with some problem and wasn’t really looking at me carefully. Or he thought I was waving to somebody behind him and moments later he probably reflected on the possibility that I had after all waved to him, but before he could reciprocate my greeting, I had already turned the corner. So the next time we meet, I remain uneffusive while he, knowing his earlier omission, makes amends with a hearty greeting. I am now satisfied that he had not really ignored me the last time. The next time we pass each other again, we greet heartily and this continues.

If my colleague had deliberately ignored me the first time, he would have continued being stand-offish when we met next and I could have reciprocated with my own cold treatment (Tit for Tat), leading to D–D behaviour from then on.

You can see how Tit for Tat scores over Never Again, even though ‘morally’ both are similar in that they never defect first.

Never Again makes a strong value judgement on a fellow’s actions based on a single experience. On the other hand, Tit for Tat is more tolerant of human frailties and allows room for one-off defections and always allows a defector to recover. Never Again is judgemental, while Tit for Tat is practical.

TIT FOR TAT IN POLITICAL LIFE

I have always been intrigued by the rather regrettable success of Tit for Tat in politics. It is common to see two politicians belonging to different parties go for each other’s throats. Yet, come elections, the two get together, like long-lost siblings in the final reel of a Bollywood flick, to form a coalition. Unfortunately, and mostly for all the wrong reasons, politicians understand the power of Tit for Tat far too well and never ever hold long grudges.

Let us explore in the next chapter whether competition can lead to cooperation. Before that, you may want to take a look at Appendix 2, which presents a digression on pseudo dilemmas.

CHAPTER 5

Can Competition Lead to Cooperation?

Sometimes cooperation emerges where it is least expected. During the First World War, the Western Front witnessed horrible battles. But between these battles, and even during them at other places along the 500-mile line in France and Belgium, the enemy soldiers often exercised considerable restraint. A British staff officer on a tour of the trenches remarked that he was:

. . . astonished to observe German soldiers walking about within rifle range behind their own line. Our men appeared to take no notice. I privately made up my mind to do away with that sort of thing when we took over; such things should not be allowed. These people evidently did not know there was a war on. Both sides apparently believed in the policy of ‘live and let live.’1
