The prisoner’s dilemma at 70 – and what we get wrong about it
Once upon a time, a pianist was arrested by the secret police and accused of spying. He was carrying sheets of paper covered with a mysterious code. Despite protesting that it was merely the sheet music for Beethoven’s Moonlight sonata, the poor man was marched to the cells. A couple of hours later, a sinister interrogator walked in. “You’d better tell us everything, comrade,” he announced with a thin smile. “We have caught your friend Beethoven. He is already talking.”
This sets up the most famous problem in game theory: the prisoner’s dilemma. The interrogator explains that if one man confesses and the other does not, the talkative prisoner will go free and the other will do 25 years in a gulag. If they both remain silent, they will each spend five years in prison. If they both confess, 20 years each. The dilemma is clear enough: each would do better to confess, regardless of what the other does; yet collectively they could profit by sticking together.
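For readers who like to check the arithmetic, here is a minimal sketch of the interrogator's offer as a table of sentences. The names `YEARS`, `best_reply` and `total_years` are illustrative inventions, and the sketch assumes each prisoner cares only about minimising his own years in prison.

```python
SILENT, CONFESS = "silent", "confess"

# Years in prison for (my_choice, other_choice), as the interrogator describes them.
YEARS = {
    (SILENT, SILENT): 5,     # both stay silent
    (SILENT, CONFESS): 25,   # I stay silent, he talks
    (CONFESS, SILENT): 0,    # I talk, he stays silent
    (CONFESS, CONFESS): 20,  # both confess
}


def best_reply(other_choice):
    """Return the choice that minimises my own sentence, given the other's choice."""
    return min((SILENT, CONFESS), key=lambda mine: YEARS[(mine, other_choice)])


def total_years(choice_a, choice_b):
    """Return the combined sentence served by both prisoners."""
    return YEARS[(choice_a, choice_b)] + YEARS[(choice_b, choice_a)]


for other in (SILENT, CONFESS):
    print(f"Other prisoner chooses {other}: my best reply is {best_reply(other)}")

print("Combined years if both stay silent:", total_years(SILENT, SILENT))  # 10
print("Combined years if both confess:", total_years(CONFESS, CONFESS))    # 40
```

Confessing is the best reply in both cases, yet mutual confession costs the pair 40 years between them against 10 for mutual silence.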
The dilemma is now 70 years old — it was developed in a simple mathematical form in 1950 by mathematicians Merrill Flood and Melvin Dresher and wrapped in a story by Albert Tucker. (My own retelling owes a debt to economists Avinash Dixit and Barry Nalebuff.)
Dresher, Flood and Tucker worked at the Rand think-tank. The prisoner’s dilemma distilled the tension between selfishness and co-operation into a potent form, making it emblematic of the risk of nuclear destruction and much more besides. The dilemma received a second burst of attention in 1981, after the publication of “The Evolution of Cooperation” by political scientist Robert Axelrod and evolutionary biologist William Hamilton. Their article is not only the most cited in political science, but as popular as the next three works put together.
I hope readers will forgive my dredging up such a venerable idea, because it remains relevant, instructive, and widely misunderstood. One common misunderstanding is that the problem is one of communication: if only the pianist and Beethoven could get together and agree a strategy, they’d figure out that they should stick together. Not so. Communication doesn’t solve anything. The attraction of teaming up is obvious; so is the temptation to betray. Those who believe talking helps much should watch Golden Balls, a game show based on a modified prisoner’s dilemma. What makes the show fun to watch is the emptiness of the promises contestants make to each other.
More problematic is the mistaken belief that the prisoner’s dilemma means we are doomed to selfish self-destruction. Moral philosophers have tied themselves in knots trying to refute it, to show that it is somehow rational to collaborate in a one-shot prisoner’s dilemma. It isn’t. Fortunately, most human interaction is not a one-shot prisoner’s dilemma. The 1981 paper — and subsequent book — may have pushed the pendulum too far in an optimistic direction. Prof Axelrod ran tournaments in which computer programs competed against each other, playing the prisoner’s dilemma hundreds of times. Repeating the game allows co-operation to be enforced through the threat of punishment — something game theorists had known since the 1950s. When Prof Axelrod enshrined that idea in a simple program called “Tit for Tat”, it routinely triumphed.
Tit for Tat responds to co-operation with co-operation, and betrayal with betrayal. Whatever you do to it, it does right back. Prof Axelrod highlighted the fact that although the program was tough, it was “nice” — it tried co-operation first. And he drew broader parallels, arguing that the success of the strategy explains why soldiers in the trenches of the first world war were able to agree informal ceasefires. His inspiring message was that in the worst possible circumstances, nice guys finish first — provided they have an inner steel.
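The rule is simple enough to sketch in a few lines. What follows is an illustration, not a reconstruction of Prof Axelrod's tournament: it reuses the interrogator's prison sentences as payoffs (fewer years is better), and the 200-round length, the `always_confess` opponent and the function names are all assumptions made for the example.

```python
SILENT, CONFESS = "silent", "confess"

# Years in prison for (my_choice, other_choice); lower is better.
YEARS = {
    (SILENT, SILENT): 5,
    (SILENT, CONFESS): 25,
    (CONFESS, SILENT): 0,
    (CONFESS, CONFESS): 20,
}


def tit_for_tat(own_moves, other_moves):
    """Co-operate (stay silent) first, then copy the other player's last move."""
    return other_moves[-1] if other_moves else SILENT


def always_confess(own_moves, other_moves):
    """A hypothetical opponent that betrays in every round."""
    return CONFESS


def play(strategy_a, strategy_b, rounds=200):
    """Play the repeated game and return the total years served by each player."""
    moves_a, moves_b = [], []
    years_a = years_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_a, moves_b)
        b = strategy_b(moves_b, moves_a)
        moves_a.append(a)
        moves_b.append(b)
        years_a += YEARS[(a, b)]
        years_b += YEARS[(b, a)]
    return years_a, years_b


print(play(tit_for_tat, tit_for_tat))        # (1000, 1000)
print(play(tit_for_tat, always_confess))     # (4005, 3980)
print(play(always_confess, always_confess))  # (4000, 4000)
```

In this sketch Tit for Tat never beats the opponent in front of it: it is suckered once by the betrayer and then retaliates. Its advantage shows up only in the totals, because two Tit for Tat players serve a quarter of the time that two betrayers do.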
But that goes too far. A simpler explanation of “live and let live” in the trenches is that popping up to shoot at the enemy is nothing like ratting out Beethoven. It is dangerous. One needs no game theory to explain why soldiers might prefer to lie low.
Prof Axelrod also set far too much store by Tit for Tat’s “niceness”. Other strategies prosper in prisoner’s dilemma tournaments, depending on details of the rules. Among them is “Pavlov”, a strategy that tries to exploit suckers and changes tactics when it encounters a punishing response. It can be co-operative, sure — but it is hardly “nice”.
Prisoner’s dilemmas do exist. The most pressing example today is climate change. Every nation and every individual benefits if others restrain their pollution, but we all prefer not to have to restrain our own. It would be foolish to hope that Tit for Tat will save the day here — and we don’t have to. We have tools available to us: domestically, taxes and regulations; internationally, treaties and alliances. Such tools change the incentives. We could and should be using them more. The pianist and his suspected accomplice were trapped. We are not. Unlike them, we can change the game.
Written for and first published in the Financial Times on 24 January 2020.