Recent news stories, picked up all over blogland, reported that Tit-for-Tat has been dethroned as the best strategy in iterated prisoners’ dilemma games. In a computer tournament, a team from Southampton University won with a new strategy, beating the Tit-for-Tat strategy for the first time.
Here’s the background. Prisoners’ Dilemma is a game with two players. Each player chooses a move, which is either Cooperate or Defect. Then the players reveal their moves to each other. If both sides Cooperate, they each get three points. If both Defect, they each get one point. If one player Cooperates and the other Defects, then the defector gets five points and the cooperator gets none. The game is interesting because no matter what one’s opponent does, one is better off choosing to Defect; but the most mutually beneficial result occurs when both players Cooperate.
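To make the payoffs concrete, here is a small Python sketch of the scoring table described above. The names `PAYOFF`, `defect_dominates`, and `joint_score` are my own, chosen for illustration; only the point values come from the rules of the game.

```python
# Payoffs as described above: (my_move, their_move) -> my points.
C, D = "C", "D"
PAYOFF = {
    (C, C): 3,  # both Cooperate
    (D, D): 1,  # both Defect
    (D, C): 5,  # I Defect, they Cooperate
    (C, D): 0,  # I Cooperate, they Defect
}

def defect_dominates():
    """True if Defect scores strictly more than Cooperate against every opposing move."""
    return all(PAYOFF[(D, other)] > PAYOFF[(C, other)] for other in (C, D))

def joint_score(a, b):
    """Combined points for both players in a single round."""
    return PAYOFF[(a, b)] + PAYOFF[(b, a)]

print(defect_dominates())                  # True
print(joint_score(C, C), joint_score(D, D))  # 6 2
```

Defecting is better for each player individually (5 beats 3, and 1 beats 0), yet mutual cooperation yields a combined 6 points against 2 for mutual defection. That tension is the dilemma.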
Things get more interesting when you iterate the game, so that the same pair of players plays many times in a row. A player can then base its strategy on what the opponent has done recently, which changes the opponent’s incentives in subtle ways. This game is an interesting abstract model of adversarial social relationships, so people are interested in understanding its strategy tradeoffs.
For at least twenty years, the best-looking strategy has been Tit-for-Tat, in which one starts out by Cooperating and then copies whatever action the opponent used last. This strategy offers an appealing combination of initial friendliness with measured retaliation for an opponent’s Defections. In tournaments among computer players, Tit-for-Tat won consistently.
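Tit-for-Tat is simple enough to write down in a few lines. The sketch below is my own illustration, not code from any tournament: the `play` helper, the `always_defect` opponent, and the ten-round match length are assumptions made for the example.

```python
# Payoffs: (my_move, their_move) -> my points.
PAYOFF = {("C", "C"): 3, ("D", "D"): 1, ("D", "C"): 5, ("C", "D"): 0}

def tit_for_tat(my_history, their_history):
    """Cooperate on the first move, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    """A hypothetical opponent that Defects on every move."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated match and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30)
print(play(tit_for_tat, always_defect))  # (9, 14)
```

Two Tit-for-Tat players cooperate throughout and collect 30 points each. Against a constant defector, Tit-for-Tat loses only the first round and then retaliates, so it never wins an individual match; its strength lies in piling up points across many opponents.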
But this year, the Southampton team unveiled a new strategy that won the latest tournament. Many commentators responded by declaring that Tit-for-Tat had been dethroned. But I think that conclusion is wrong, for reasons I’ll explain.
First, let me explain the new Southampton strategy. (This is based on press accounts, but I’m confident that it’s at least pretty close to correct.) They entered many players in the tournament. Their players divide into two groups, which I’ll call Stars and Stooges. The Stars try to win the tournament, and the Stooges sacrifice themselves so the Stars can win. When facing a new opponent, each of these players starts out by making a distinctive sequence of moves. Southampton’s players watch for this distinctive sequence, which allows them to tell whether their opponents are other Southampton players. When two Southampton players are playing each other, they collude to maximize their scores (or at least the score of the Star(s), if any, among them). When a Star plays an outsider, it tries to score points normally; but when a Stooge plays an outsider, it always Defects, to minimize the opponent’s score. And indeed, the final results show a few Stars at the top of the standings (above Tit-for-Tat players) and a group of Stooges near the bottom.
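Here is a rough Python sketch of how such colluding players might work. Nearly everything in it is my assumption rather than Southampton’s actual design: the three-move `HANDSHAKE` is made up, a Star simply Defects against any recognized teammate while a Stooge Cooperates (so only the Star-versus-Stooge pairing is modeled sensibly), and "scoring normally" against an outsider is approximated by copying the opponent’s last move.

```python
# Payoffs: (my_move, their_move) -> my points.
PAYOFF = {("C", "C"): 3, ("D", "D"): 1, ("D", "C"): 5, ("C", "D"): 0}

# Hypothetical recognition sequence; the real one was not reported.
HANDSHAKE = ["C", "D", "C"]

def make_colluder(role):
    """Build a strategy for role 'star' or 'stooge' (illustrative only)."""
    def strategy(my_history, their_history):
        turn = len(my_history)
        if turn < len(HANDSHAKE):
            return HANDSHAKE[turn]              # announce team membership
        is_teammate = their_history[:len(HANDSHAKE)] == HANDSHAKE
        if is_teammate:
            # Assumed collusion: the Stooge feeds points to the Star.
            return "D" if role == "star" else "C"
        if role == "star":
            return their_history[-1]            # assumed stand-in for normal play
        return "D"                              # Stooge punishes outsiders
    return strategy

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def play(strategy_a, strategy_b, rounds=13):
    """Play an iterated match and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

star, stooge = make_colluder("star"), make_colluder("stooge")
print(play(star, stooge))         # (57, 7)
print(play(stooge, tit_for_tat))  # (22, 17)
```

In this toy version, a Star collects five points per round from a Stooge once the handshake is done, while a Stooge drags an outsider such as Tit-for-Tat down to roughly one point per round. The numbers depend entirely on the assumed handshake and match length, so they illustrate the mechanism rather than reproduce the tournament.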
If we look more closely, the Southampton strategy doesn’t look so good. Apparently, Tit-for-Tat still scores higher than the average Southampton player – the sacrifice (in points) made by the Stooges is not fully recouped by the Stars. So Tit-for-Tat will still be the best strategy, both for a lone player, and for a team of players, assuming the goal is to maximize the sum of the team members’ scores. (Note that a team of Tit-for-Tat players doesn’t need to use the Southampton trick for recognizing fellow team members, since Tit-for-Tat players who play each other will always cooperate, which is the team-optimal thing to do.)
So it seems that all the Southampton folks discovered is a way to exploit the rules of this particular tournament, with its winner-take-all structure. That’s clever, but I don’t think it has much theoretical significance.
UPDATE (Friday 22 October): The comments on this post are particularly good.