Lecture OverviewSolving the prisoner’s dilemmaInstrumental r.docx

Lecture Overview
Solving the prisoner’s dilemma
Instrumental rationality
Morality & norms
Repeated games
Three ways to solve the prisoner’s dilemma
Sequential games
Backward induction
Subgame perfect Nash equilibrium
Common knowledge of rationality
Mixed strategies
Game theory: underlying assumptions
Remember:
Homo economicus: instrumental rationality and preferences
Common knowledge of rationality and consistent alignment of
beliefs: given the same information individuals arrive at the
same decisions
Individuals know the rules of the game which are exogenously
given and independent of individuals’ choices
We will look at these one by one, analysing alternative
assumptions.

We will use the prisoner’s dilemma as an example.
Why?
Coordination game with conflict
Arguably it describes many social situations, e.g. the free rider
problem:
Voting
Trade union affiliation
Wage cuts to increase profit
Domestic work
Prisoner’s dilemma
The homo economicus maximises his/her utility.
In a prisoner’s dilemma the dominant strategy is to confess
(defect).
Fallacy of composition: what is individually rational is neither
Pareto optimal nor socially rational.
But do people really defect?
Kant’s categorical imperative: not the outcome but the act is
crucial (morality)
Altruism: blood donation
Social norms: forest people hunting in the Congo (Turnbull
1963)

Gauthier: it is instrumentally rational to cooperate rather than to
defect
Assume there are two sorts of maximisers in the economy:
straight maximisers (SM) and constrained maximisers (CM);
SMs defect, CMs cooperate with other CMs:
Let p be the probability that the other player is a CM:
E(return from CM) = p*(-1) + (1-p)*(-3)
E(return from SM) = -3
The difference is 2p, so for any p > 0 the CM strategy is better
than the SM one.
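The comparison can be checked numerically. This is a minimal sketch of the expected returns on the slide, not Gauthier's own formulation; the function names and the sample values of p are illustrative assumptions.

```python
def expected_return_cm(p, cooperate=-1, defect=-3):
    """Expected return of a constrained maximiser (CM).

    p is the probability that the other player is also a CM.
    The pay-offs -1 and -3 are the ones used on the slide.
    """
    return p * cooperate + (1 - p) * defect


def expected_return_sm(defect=-3):
    """A straight maximiser (SM) always ends up in mutual defection."""
    return defect


# Illustrative values of p (assumed, not from the lecture).
for p in (0.0, 0.1, 0.5, 1.0):
    gain = expected_return_cm(p) - expected_return_sm()
    print(f"p = {p}: CM advantage = {gain}")
```

The advantage is zero only at p = 0 and grows linearly in p.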
Tit-for-tat
Unsurprisingly (maybe), in a repeated Prisoner’s dilemma the
best strategy is not to defect but to adopt a tit-for-tat strategy.
In the 1980s, Robert Axelrod invited professional game
theorists to enter strategies into a tournament of a repeated
game (200 times).
The winning strategy was tit-for-tat entered by Anatol
Rapoport:
Start off with cooperation
If opponent defects punish him/her by defecting
If opponent comes back to cooperation ‘forgive’ them and go
back to cooperation
Overall, forgiving and cooperative strategies did better.
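The three rules above can be sketched as a small simulation. This is not Axelrod's tournament code. The pay-offs -1 (mutual cooperation) and -3 (mutual defection) follow the lecture's earlier example, while 0 and -4 for one-sided defection are hypothetical values, as are all function names.

```python
# Pay-offs (player A, player B) per round. -1 and -3 follow the lecture's
# example; 0 and -4 are hypothetical values chosen for illustration.
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}


def tit_for_tat(opponent_moves):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_moves[-1] if opponent_moves else "C"


def always_defect(opponent_moves):
    """Defect in every round, whatever the opponent did."""
    return "D"


def play(strategy_a, strategy_b, rounds=200):
    """Play the repeated game and return the total pay-off of each player."""
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees the other's history
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFFS[(a, b)]
        total_a += pay_a
        total_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat))
print(play(tit_for_tat, always_defect))
print(play(always_defect, always_defect))
```

Under these assumed pay-offs tit-for-tat loses narrowly to always-defect head to head, but two tit-for-tat players do far better together than two defectors, which is how a cooperative strategy can come out ahead over a whole tournament.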
Repeated games & reputation
A tit-for-tat strategy can only be played in repeated games.
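The folk theorem states that in an infinitely repeated game (or
given uncertainty about the end of the game) any feasible payoff
that gives each player at least what they could secure on their
own can be sustained as an equilibrium, provided players care
enough about the future.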
This is important for social interaction: the prisoner’s dilemma
can be overcome without (!) external authority.
Players enforce compliance (cooperate rather than defect)
through punishment.
The loss of future returns deters players from defecting.
The surprising thing about Axelrod’s tournament was that the
tit-for-tat strategy won in a finitely repeated game of known
length, where backward induction predicts defection in every round.
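The deterrence argument (the loss of future returns deters players from defecting) can be made concrete with a small calculation. The sketch below assumes a specific punishment rule that the slides do not name: permanent defection after the first deviation (often called a grim trigger). It also assumes that the game continues after each round with probability d. The pay-offs -1 and -3 follow the lecture's example; the temptation pay-off 0 and all function names are hypothetical.

```python
def value_of_cooperating(d, reward=-1):
    """Expected total pay-off from mutual cooperation in every round.

    d is the probability that the game continues after each round
    (0 <= d < 1), so the sum of the series is reward / (1 - d).
    """
    return reward / (1 - d)


def value_of_defecting(d, temptation=0, punishment=-3):
    """Defect once, then face mutual defection in all later rounds."""
    return temptation + d * punishment / (1 - d)


def min_continuation_probability(temptation=0, reward=-1, punishment=-3):
    """Smallest d at which cooperating is at least as good as defecting."""
    return (temptation - reward) / (temptation - punishment)


def cooperation_is_sustainable(d):
    return value_of_cooperating(d) >= value_of_defecting(d)


print(min_continuation_probability())
print(cooperation_is_sustainable(0.5))
print(cooperation_is_sustainable(0.2))
```

With these assumed numbers cooperation holds whenever the game is likely enough to continue (d of at least one third); below that, the one-off gain from defecting outweighs the lost future returns.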

Solutions to the prisoner’s dilemma
Authority (state, mafia etc.) imposes cooperation by changing
the payoff matrix (e.g. binding contracts).
Individuals are motivated by morality/norms (but: if some
individuals in society are motivated by morality it becomes
rational – according to Gauthier – to cooperate)
In repeated games with uncertainty about the end of the game
cooperation can be policed by the players through punishment
(e.g. defecting in a future round). This insight is backed by the
Folk Theorem.
Remember this game?
This is the matrix form of a simultaneous game.
Are there any equilibria?
Yes, player A has a dominant strategy: ‘Top’ (10 > 9 against
Left, 1 > 0 against Right).
Given Top, B’s best response is Right (5 > 4).
The equilibrium in dominant strategies is (Top, Right).
This is simultaneously a Nash equilibrium.
Dynamic games & backward induction
          Left     Right
Top       10, 4    1, 5
Bottom    9, 9     0, 3
Now imagine the game is played sequentially: A moves first.
To solve this game, we can use backward induction.
A: What will B do once I’ve played Top/Bottom?

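[Game tree: A chooses Top or Bottom, then B chooses Left or Right.
Pay-offs (A, B): Top-Left 10, 4; Top-Right 1, 5; Bottom-Left 9, 9;
Bottom-Right 0, 3.]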
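A's question ("What will B do once I’ve played Top/Bottom?") can be answered mechanically. The following is a minimal sketch of backward induction on this two-stage game, using the pay-offs from the matrix; the data layout and function name are illustrative assumptions.

```python
# Pay-offs (A, B) for each path through the tree: A moves first, then B.
TREE = {
    "Top": {"Left": (10, 4), "Right": (1, 5)},
    "Bottom": {"Left": (9, 9), "Right": (0, 3)},
}


def backward_induction(tree):
    """Solve a two-stage game by working back from B's decisions."""
    # Step 1: at each of B's nodes, B picks the reply with the higher
    # pay-off for B (second entry of the pair).
    b_plan = {
        a_move: max(replies, key=lambda reply: replies[reply][1])
        for a_move, replies in tree.items()
    }
    # Step 2: A anticipates B's replies and picks the move with the
    # higher pay-off for A (first entry of the pair).
    a_move = max(tree, key=lambda move: tree[move][b_plan[move]][0])
    return a_move, b_plan, tree[a_move][b_plan[a_move]]


a_move, b_plan, outcome = backward_induction(TREE)
print(a_move, b_plan, outcome)
```

B would answer Top with Right and Bottom with Left, so A compares a pay-off of 1 (after Top) with 9 (after Bottom) and plays Bottom. The sequential outcome (9, 9) differs from the simultaneous equilibrium (Top, Right).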
Let’s play a game to demonstrate backward induction:
The race to 20
Your aim is to get to 20 first!
You can either pick up one pen or two pens.
Whoever gets the last pen wins!
Who will win?
Whoever gets to 2 first (the first player if she/he is aware of
backward induction).
From 2 you can always get to 5, 8, 11, 14, 17 and 20, whatever
the other player does.
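The list of target totals can be derived by working backwards from 20. This is a sketch under the rules stated above (take one or two pens, whoever reaches 20 wins); the function name and parameters are illustrative.

```python
def target_totals(goal=20, max_take=2):
    """Return the running totals a player should aim to reach.

    mover_wins[t] is True if the player who has to move when the total
    is t can force a win. At the goal the player to move has already
    lost, because the opponent just took the last pen.
    """
    mover_wins = {goal: False}
    for total in range(goal - 1, -1, -1):
        # The mover wins if some legal move leaves the opponent
        # in a position from which the opponent cannot win.
        mover_wins[total] = any(
            not mover_wins[total + take]
            for take in range(1, max_take + 1)
            if total + take <= goal
        )
    # Good targets are totals that leave the opponent without a winning move.
    return sorted(t for t, wins in mover_wins.items() if not wins)


print(target_totals())
```

Because the starting total 0 is not in the list, the player who moves first can force a win by taking two pens.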
Backward induction helps us to identify ‘subgame perfect Nash
equilibria’.
A subgame is a segment of a dynamic game, if:
The subgame starts from a node
It includes all successor nodes of that node
It must end at the pay-offs of the terminal nodes
The initial node must be a singleton information set (i.e. it is
clear what the previous move was).
While (Top, Right) was an equilibrium in dominant strategies, it
is not a subgame perfect Nash equilibrium (SPNE).
For backward induction we ask ‘what will player B do if A
plays strategy x, y or z?’
But if it is not rational for player A to play bottom, why should
B’s reaction to a non-rational strategy be rational?

This is a game of imperfect information.
Common knowledge of rationality is not assumed. Therefore,
you can build a reputation for being rational/irrational, for
playing cooperatively/defecting etc.
[Figure: game tree with decision nodes for players R and C]
A reputation as an ‘irrational’ individual might increase your
pay-off.
But your opponent might know that. He/she might suspect that
you’re just pretending to be irrational.
If a player does not have perfect information about her
opponent’s strategy choices she will play a game of mixed
strategies (rather than pure strategies).
Assume C plays left with probability c and right with
probability (1-c). R plays top with probability r and
bottom with probability (1-r).

For R pay-offs are:
For C:
Row will be happy with his assigned probabilities if a small
change in r does not change his pay-off:
Same is true for Column.
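The indifference argument can be sketched for a general 2x2 game. Row's pay-off stops responding to small changes in r exactly when Column's c makes Top and Bottom equally good for Row, and symmetrically for Column. The pay-off matrices below are hypothetical examples (a matching-pennies game), not the ones from the lecture, and the function name is an assumption.

```python
def mixed_equilibrium(row_payoffs, col_payoffs):
    """Interior mixed-strategy equilibrium of a 2x2 game.

    Each argument is [[Top-Left, Top-Right], [Bottom-Left, Bottom-Right]].
    Returns (r, c): r = probability that Row plays Top,
    c = probability that Column plays Left.
    """
    (a_tl, a_tr), (a_bl, a_br) = row_payoffs
    (b_tl, b_tr), (b_bl, b_br) = col_payoffs

    # Row is indifferent between Top and Bottom when
    # c*a_tl + (1-c)*a_tr == c*a_bl + (1-c)*a_br.
    denom_c = a_tl - a_tr - a_bl + a_br
    # Column is indifferent between Left and Right when
    # r*b_tl + (1-r)*b_bl == r*b_tr + (1-r)*b_br.
    denom_r = b_tl - b_tr - b_bl + b_br
    if denom_c == 0 or denom_r == 0:
        raise ValueError("indifference conditions have no unique solution")

    c = (a_br - a_tr) / denom_c
    r = (b_br - b_bl) / denom_r
    if not (0 <= r <= 1 and 0 <= c <= 1):
        raise ValueError("no mixed equilibrium with valid probabilities")
    return r, c


# Hypothetical example: matching pennies (Row wins on a match).
row = [[1, -1], [-1, 1]]
col = [[-1, 1], [1, -1]]
print(mixed_equilibrium(row, col))
```

Note that each player's equilibrium probability is pinned down by the other player's pay-offs: c is computed from Row's pay-offs and r from Column's.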
Mixed strategies
These are best response curves.
The intersection shows a Nash equilibrium.
There are three ways in which the prisoner’s dilemma can be
resolved:
enforcement by authority,
non-instrumental rationality and
the folk theorem.
In dynamic games we use backward induction to find subgame
perfect Nash equilibria.

If we drop common knowledge of rationality, pay-offs
change and might increase (reputation).
Given incomplete information we can play mixed strategies.
Summary
Varian (2006): chapters 24, 28 and 29.
Hargreaves Heap & Varoufakis (1995): Game theory: A critical
introduction, chapters 3, 5 and 6.
Required readings
