Resolution of the stochastic strategy spatial prisoner’s dilemma by means of particle swarm optimization

Zhang J, Zhang C, Chu T, Perc M
PLoS ONE 6(7): e21787. doi:10.1371/journal.pone.0021787 (2011)
Presented by: Stephen Daedalus E. Separa

Outline
● abstract
● prisoner’s dilemma
● particle swarm optimization
● methods
● results
● summary

Abstract
We study the evolution of cooperation among selfish individuals in the stochastic strategy spatial
prisoner’s dilemma game. We equip players with the particle swarm optimization technique, and find that it
may lead to highly cooperative states even if the temptations to defect are strong. The concept of
particle swarm optimization was originally introduced within a simple model of social dynamics that can
describe the formation of a swarm, i.e., analogous to a swarm of bees searching for a food source.
Essentially, particle swarm optimization foresees changes in the velocity profile of each player, such that the
best locations are targeted and eventually occupied. In our case, each player keeps track of the highest payoff
attained within a local topological neighborhood and its individual highest payoff. Thus, players make use of
their own memory that keeps score of the most profitable strategy in previous actions, as well as use
of the knowledge gained by the swarm as a whole, to find the best available strategy for themselves
and the society. Following extensive simulations of this setup, we find a significant increase in the level of
cooperation for a wide range of parameters, and also a full resolution of the prisoner’s dilemma. We
also demonstrate extreme efficiency of the optimization algorithm when dealing with environments that
strongly favor the proliferation of defection, which in turn suggests that swarming could be an important
phenomenon by means of which cooperation can be sustained even under highly unfavorable
conditions. We thus present an alternative way of understanding the evolution of cooperative behavior
and its ubiquitous presence in nature, and we hope that this study will be inspirational for future efforts
aimed in this direction.

Prisoner’s dilemma
             Cooperate    Defect
Cooperate    R, R         S, T
Defect       T, S         P, P
● Cooperate (stay silent), Defect (betray the other prisoner)
● Payoffs: R = reward, T = temptation, S = “sucker’s” payoff, P = punishment
● T > R > P > S
● R > P: mutual cooperation is superior to mutual defection
● T > R and P > S: defection is the dominant strategy for both players
● mutual defection is the only strong Nash equilibrium
● Developed by Merrill Flood and Melvin Dresher at the RAND Corporation, later formalized by Albert W. Tucker (mathematician)
Prisoner's dilemma, https://en.wikipedia.org/wiki/Prisoner’s_dilemma (last visited Aug. 23, 2015)
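
The payoff ordering above can be checked mechanically. The following is a minimal sketch, assuming illustrative payoff values T = 5, R = 3, P = 1, S = 0 (these numbers are not from the slides; any values with T > R > P > S would do). The helper names best_response and nash_equilibria are hypothetical.

```python
from itertools import product

# Illustrative payoffs satisfying T > R > P > S (the values are an assumption).
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_move, other_move)] is my payoff; "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): R,
    ("C", "D"): S,
    ("D", "C"): T,
    ("D", "D"): P,
}


def best_response(other_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, other_move)])


def nash_equilibria():
    """List pure-strategy profiles where neither player gains by deviating alone."""
    return [
        (a, b)
        for a, b in product("CD", repeat=2)
        if a == best_response(b) and b == best_response(a)
    ]


if __name__ == "__main__":
    # Defection is the best response to either move, so it is dominant.
    print(best_response("C"), best_response("D"))
    # Mutual defection is the only profile that survives unilateral deviation.
    print(nash_equilibria())
```

With these values, defection is the best response to both moves, and mutual defection is the only pure-strategy equilibrium even though mutual cooperation pays each player more (R > P).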

Particle swarm optimization algorithm
[Equation: velocity update rule, with terms labeled “global best” and “current best”]
● Nature-inspired optimization algorithm based on swarming behavior
● schools of fish, herding in quadrupeds, flocking in birds, ant and bee colonies
● forming swarms for: migration, foraging, evading predators
● algorithm has deterministic and stochastic components
● each particle is fully aware of the other particles’ positions and of the swarm’s global best
Kennedy J, Eberhart R (1995) Particle swarm optimization. IEEE International Conference on Neural Networks, Piscataway, volume 4. pp 1942–1948
Xin-She Yang, Introduction to Mathematical Optimization: From Linear Programming to Metaheuristics, Cambridge International Science Press: Cambridge, UK (2008)
Particle swarm optimization, https://en.wikipedia.org/wiki/Particle_swarm_optimization (last visited Aug. 23, 2015)
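
A generic particle swarm optimizer can be sketched in a few lines. The version below is a minimal illustration, not the formulation shown on the slide: the toy objective sphere, the inertia weight, the coefficients alpha and beta, and all default parameter values are assumptions chosen for the example. Each velocity update pulls a particle toward the swarm's global best and toward its own best position, with random factors supplying the stochastic component.

```python
import random


def sphere(x):
    """Toy objective (an assumption for illustration): sum of squares, minimum at 0."""
    return sum(xi * xi for xi in x)


def pso(objective, dim=2, n_particles=20, n_iter=200,
        alpha=1.5, beta=1.5, inertia=0.7, seed=0):
    """Minimize `objective` with a basic particle swarm; returns (position, value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's own best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # the swarm's global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                e1, e2 = rng.random(), rng.random()  # stochastic component
                vel[i][d] = (inertia * vel[i][d]
                             + alpha * e1 * (g_pos[d] - pos[i][d])
                             + beta * e2 * (best_pos[i][d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:               # update the particle's own best
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:                 # update the global best
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


if __name__ == "__main__":
    position, value = pso(sphere)
    print(position, value)
```

Passing a fixed seed makes the run reproducible, which is convenient when comparing parameter choices.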

Methods
● payoff matrix for pairwise interactions (cooperation C or defection D) between each player and its 4 nearest neighbors
● velocity update
● strategy update
● average level of cooperation
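
The quantities named on this slide (payoff matrix, velocity update, strategy update, average level of cooperation) can be tied together in a short simulation sketch. The equations themselves did not survive on the slide, so everything specific below is an assumption made for illustration and may differ from the paper: a periodic L × L lattice, a weak prisoner's dilemma with R = 1, T = b, S = P = 0, a strategy W read as the probability to cooperate, a velocity update that weights the player's own best strategy by ω and the best strategy in its neighborhood by 1 − ω, and clipping of W to [0, 1]. The function name simulate and its defaults are hypothetical.

```python
import random


def simulate(L=10, b=1.1, omega=0.5, steps=50, seed=0):
    """Return the average strategy W after `steps` synchronous rounds (a sketch)."""
    rng = random.Random(seed)
    n = L * L
    W = [rng.random() for _ in range(n)]   # strategy: probability to cooperate
    V = [0.0] * n                          # velocity: pending change of W
    best_W = W[:]                          # personal best strategy so far
    best_pay = [float("-inf")] * n         # payoff earned with best_W

    def neighbors(i):
        # Four nearest neighbors on a lattice with periodic boundaries.
        r, c = divmod(i, L)
        return [((r - 1) % L) * L + c, ((r + 1) % L) * L + c,
                r * L + (c - 1) % L, r * L + (c + 1) % L]

    def payoff(me, other):
        # Assumed weak prisoner's dilemma: R = 1, T = b, S = P = 0.
        if me and other:
            return 1.0
        if not me and other:
            return b
        return 0.0

    for _ in range(steps):
        # Each player draws one action (True = cooperate) with probability W.
        acts = [rng.random() < W[i] for i in range(n)]
        pay = [sum(payoff(acts[i], acts[j]) for j in neighbors(i))
               for i in range(n)]
        for i in range(n):
            if pay[i] > best_pay[i]:       # remember the most profitable strategy
                best_pay[i], best_W[i] = pay[i], W[i]
        new_W, new_V = W[:], V[:]
        for i in range(n):
            # Best-paid player in the local neighborhood (including the player).
            top = max(neighbors(i) + [i], key=lambda j: pay[j])
            # Assumed velocity update: omega weights personal memory,
            # 1 - omega weights the neighborhood's best strategy.
            v = (V[i] + omega * (best_W[i] - W[i])
                 + (1 - omega) * (W[top] - W[i]))
            new_V[i] = v
            new_W[i] = min(1.0, max(0.0, W[i] + v))  # strategy update, clipped
        W, V = new_W, new_V
    return sum(W) / n                      # average level of cooperation


if __name__ == "__main__":
    for omega in (0.01, 0.5, 0.99):
        print(omega, simulate(omega=omega))
```

In this sketch ω near 0 makes players follow the best performer around them, while ω near 1 makes them return to their own most successful strategy, matching the two limits described on the results slides. The lattice is far smaller and the run far shorter than a study would need.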

Average level of cooperation in dependence on b for different values of ω
● imitating the best-performing player in the swarm (ω → 0) is beneficial in low-b environments
● imitating personal success (ω → 1) is better for the evolution of cooperation in high-b environments
● each data point is an average over the final outcomes of 100 simulations

Distribution of strategies in the whole population, as obtained for different combinations of b and ω
● ω = 0.01: the game is fully dominated by two strategies, W = 0 (full defection) and W = 1 (full cooperation)
● ω = 0.99: the full spectrum of strategies is available
● vertical axis: probability that strategy W is present in the population

Characteristic spatial distributions of strategies
● for low values of ω, clustering of cooperators is observed; the game is dominated by two strategies, W = 0 and W = 1
● for high values of ω, clustering of cooperators is less pronounced and more strategies W are observed
● survival of cooperators at high b

Characteristic spatial distributions of velocities
● at ω = 0.99, the population becomes a “swarm”, with velocities ~ 0, regardless of b. The stationary state has been reached by means of adaptive, locally inspired, slow strategy changes
● at ω = 0.01, only a few clusters can be considered “swarms”. Many players pursue the highest payoff but are unable to attain it. Swarming is an important factor promoting cooperation at high b

Summary
● impact of particle swarm optimization on the evolution of cooperation in the stochastic strategy spatial prisoner’s dilemma game
● strategy update guided by individual memory and swarm knowledge
● cooperative behavior can prevail
● resolution of the prisoner’s dilemma game in favor of prosocial behavior
● the most profitable swarm strategy at moderate b leads to full dominance of cooperation
● the best individual strategy at high b leads to survival of cooperative behavior
● spatial distributions of strategies and velocities
