CPS 270: Artificial Intelligence
http://www.cs.duke.edu/courses/fall08/cps270/
Two-player, zero-sum, perfect-information
Games
Instructor: Vincent Conitzer
Game playing
• Rich tradition of creating game-playing programs in AI
• Many similarities to search
• Most of the games studied
– have two players,
– are zero-sum: what one player wins, the other loses
– have perfect information: the entire state of the game is known to
both players at all times
• E.g., tic-tac-toe, checkers, chess, Go, backgammon, …
• Will focus on these for now
• Recently more interest in other games
– Esp. games without perfect information; e.g., poker
• Need probability theory, game theory for such games
“Sum to 2” game
• Player 1 moves, then player 2, finally player 1 again
• Move = 0 or 1
• Player 1 wins if and only if all moves together sum to 2
[Game tree figure: Player 1 picks 0 or 1 at the root, Player 2 picks 0 or 1 at the second level, Player 1 picks again at the third. The (0, 0) branch ends in an early -1 terminal (the sum can no longer reach 2); reading the leaf utilities left to right: -1, -1, 1, -1, 1, 1, -1.]
Player 1’s utility is in the leaves; player 2’s utility is the negative of this
Backward induction (aka. minimax)
• From leaves upward, analyze best decision for player at
node, give node a value
– Once we know values, easy to find optimal action (choose best value)
[Game tree figure, annotated by backward induction: each bottom Player 1 node takes the max of its leaves, giving values 1, 1, 1; the left Player 2 node takes min(-1, 1) = -1 (it can move to the early -1 terminal); the right Player 2 node takes min(1, 1) = 1; the root takes max(-1, 1) = 1, so Player 1 wins by playing 1 first.]
Modified game
• From leaves upward, analyze best decision for
player at node, give node a value
[Same tree shape with new leaf utilities -2, -3, 4, -5, 6, 7, -8 (the early terminal is still -1). Backward induction gives the bottom Player 1 nodes the values 4, 6, 7, the Player 2 nodes the values -1 and 6, and the root the value max(-1, 6) = 6.]
A recursive implementation
• Value(state)
• If state is terminal, return its value
• If (player(state) = player 1)
– v := -infinity
– For each action
• v := max(v, Value(successor(state, action)))
– Return v
• Else
– v := infinity
– For each action
• v := min(v, Value(successor(state, action)))
– Return v
Space? Time?
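As a sanity check, the pseudocode above can be turned into a short runnable sketch for the "Sum to 2" game; the state encoding (the tuple of moves made so far) is my own choice, not from the slides:

```python
# Minimax (backward induction) for the "Sum to 2" game.
# A state is the tuple of moves made so far; each move is 0 or 1.
# Player 1 moves on turns 0 and 2, player 2 on turn 1.

def value(moves=()):
    """Return the minimax value (to player 1) of the state `moves`."""
    if len(moves) == 3:                      # terminal: all three moves made
        return 1 if sum(moves) == 2 else -1  # player 1 wins iff the sum is 2
    if len(moves) in (0, 2):                 # player 1 (maximizer) to move
        return max(value(moves + (a,)) for a in (0, 1))
    else:                                    # player 2 (minimizer) to move
        return min(value(moves + (a,)) for a in (0, 1))

print(value())  # 1: player 1 can force a win (by playing 1 first)
```

On space and time: the recursion keeps only one root-to-leaf path at a time, so space is linear in the depth, while time is exponential in the depth because every node is visited.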
Do we need to see all the leaves?
• Do we need to see the value of the question
mark here?
[Same tree, only partially evaluated: the leaves shown are -2, ?, 4, where the question mark is a leaf we have not looked at yet.]
Do we need to see all the leaves?
• Do we need to see the values of the question
marks here?
[Same tree, partially evaluated: the leaves -2, 6, 7 are known, and the remaining leaves show ?, ?, -5, -8.]
Alpha-beta pruning
• Pruning = cutting off parts of the search tree
(because you realize you don’t need to look at
them)
– When we considered A* we also pruned large parts
of the search tree
• Maintain alpha = value of the best option for
player 1 along the path so far
• Beta = value of the best option for player 2
along the path so far
Pruning on beta
• Beta at node v is -1
• We know the value of node v is going to be at least
4, so the -1 route will be preferred
• No need to explore this node further
[Same tree as on the first "Do we need to see all the leaves?" slide, with leaves -2, ?, 4; the Player 1 node above the ? and the 4 is labeled node v.]
Pruning on alpha
• Alpha at node w is 6
• We know the value of node w is going to be at most
-1, so the 6 route will be preferred
• No need to explore this node further
[Same tree as on the second "Do we need to see all the leaves?" slide, with leaves -2, 6, 7 and ?, ?, -5, -8; the node whose unexplored leaves are marked ? is labeled node w.]
Modifying recursive implementation
to do alpha-beta pruning
• Value(state, alpha, beta)
• If state is terminal, return its value
• If (player(state) = player 1)
– v := -infinity
– For each action
• v := max(v, Value(successor(state, action), alpha, beta))
• If v >= beta, return v
• alpha := max(alpha, v)
– Return v
• Else
– v := infinity
– For each action
• v := min(v, Value(successor(state, action), alpha, beta))
• If v <= alpha, return v
• beta := min(beta, v)
– Return v
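A minimal runnable sketch of this routine over list-encoded trees; the tree below is an assumed reconstruction of the modified game (its node values -1, 4, 6, 7 and root value 6 match the earlier slide, but the exact leaf placement is my guess):

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """node: a number (terminal value) or a list of child nodes."""
    if isinstance(node, (int, float)):       # terminal: return its value
        return node
    if is_max:                               # player 1: maximize
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            if v >= beta:                    # player 2 will avoid this node
                return v
            alpha = max(alpha, v)
        return v
    else:                                    # player 2: minimize
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, True, alpha, beta))
            if v <= alpha:                   # player 1 will avoid this node
                return v
            beta = min(beta, v)
        return v

# Assumed reconstruction of the modified game: the left player-2 node can
# move straight to the -1 terminal; the other lists are player-1 nodes.
tree = [[-1, [-3, 4]], [[-5, 6], [7, -8]]]
print(alphabeta(tree, True))  # 6, as in the backward-induction slide
```

On this tree the routine returns 6 and cuts off before ever examining the -8 leaf.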
Benefits of alpha-beta pruning
• Without pruning, need to examine O(b^m) nodes
• With pruning, depends on which nodes we
consider first
• If we choose a random successor, need to
examine O(b^(3m/4)) nodes
• If we manage to choose the best successor first,
need to examine O(b^(m/2)) nodes
– Practical heuristics for choosing next successor to
consider get quite close to this
• Can effectively look twice as deep!
– Difference between reasonable and expert play
Repeated states
• As in search, multiple sequences of moves
may lead to the same state
• Again, can keep track of previously seen
states (usually called a transposition table
in this context)
– May not want to keep track of all previously seen
states…
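The transposition-table idea can be sketched with memoization on a canonical state. In the "Sum to 2" game the move sequences (0, 1) and (1, 0) reach the same underlying position, so keying the cache on (number of moves made, running sum) merges them; this encoding is my own, not from the slides:

```python
from functools import lru_cache

@lru_cache(maxsize=None)                     # acts as the transposition table
def value(num_moves, total):
    """Minimax value when `num_moves` moves summing to `total` were made."""
    if num_moves == 3:                       # terminal
        return 1 if total == 2 else -1
    vals = [value(num_moves + 1, total + a) for a in (0, 1)]
    # player 1 moves on turns 0 and 2, player 2 on turn 1
    return min(vals) if num_moves == 1 else max(vals)

print(value(0, 0))                           # 1: same answer as before
print(value.cache_info().currsize)           # 10 cached states, vs 15 tree nodes
```

In real games the table cannot hold everything, so bounded tables with eviction are used, which is the "may not want to keep track of all previously seen states" caveat above.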
Using evaluation functions
• Most games are too big to solve even with alpha-
beta pruning
• Solution: Only look ahead to limited depth
(nonterminal nodes)
• Evaluate nodes at depth cutoff by a heuristic
(aka. evaluation function)
• E.g., chess:
– Material value: queen worth 9 points, rook 5, bishop 3,
knight 3, pawn 1
– Heuristic: difference between players’ material values
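A minimal sketch of the material heuristic; the board encoding (a string of piece letters, uppercase for White, lowercase for Black) is a hypothetical simplification:

```python
# Material values from the slide: queen 9, rook 5, bishop 3, knight 3,
# pawn 1 (kings are never captured, so they carry no material value).
PIECE_VALUE = {"q": 9, "r": 5, "b": 3, "n": 3, "p": 1, "k": 0}

def material_eval(pieces):
    """Difference of material: positive favors White, negative favors Black."""
    score = 0
    for ch in pieces:
        v = PIECE_VALUE[ch.lower()]
        score += v if ch.isupper() else -v   # uppercase = White piece
    return score

# White: king, two rooks, a pawn (11); Black: king, bishop, three pawns (6)
print(material_eval("KRRP" + "kbppp"))  # 11 - 6 = 5
```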
Chess example
• White to move
[Chess position figure: kings, rooks, a bishop, and pawns.]
• Depth cutoff: 3 ply
– Ply = move by one player
[Search tree figure, alternating White/Black levels: the line Rd8+ with Black replies Rxf8 and Kb7, then Re8, with evaluations 2 and -1 at the cutoff; further branches elided ("…").]
Chess (bad) example
• White to move
[Chess position figure, slightly rearranged from the previous slide.]
• Depth cutoff: 3 ply
– Ply = move by one player
[Search tree figure: the same line Rd8+, Rxf8/Kb7, Re8, with evaluations 2 and -1 at the cutoff; further branches elided ("…").]
Depth cutoff obscures fact that white R will be captured
Addressing this problem
• Try to evaluate whether nodes are
quiescent
– Quiescent = evaluation function will not
change rapidly in near future
– Only apply evaluation function to quiescent
nodes
• If there is an “obvious” move at a state,
apply it before applying evaluation function
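A toy sketch of a quiescence-aware cutoff: the search stops at the depth limit only at quiescent nodes and keeps expanding noisy ones. The node encoding, a (static value, quiescent flag, children) triple, is an illustrative assumption, not the lecture's definition:

```python
def search(node, depth, is_max):
    """Depth-limited minimax that only cuts off at quiescent nodes."""
    static_eval, quiescent, children = node
    if not children:
        return static_eval                    # true terminal
    if depth <= 0 and quiescent:
        return static_eval                    # cut off only where it is safe
    # otherwise keep searching (depth may go negative at noisy nodes)
    vals = [search(c, depth - 1, not is_max) for c in children]
    return max(vals) if is_max else min(vals)

# A noisy node whose static value (+2) hides that the position collapses (-1):
noisy = (2, False, [(-1, True, []), (0, True, [])])
calm = (1, True, [(5, True, []), (-5, True, [])])
root = (0, True, [noisy, calm])
print(search(root, 1, True))  # 1: the noisy branch is searched past the cutoff
```

A plain depth-1 cutoff would instead trust the noisy branch's misleading static value 2, the same failure as the bad chess example above.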
Playing against suboptimal players
• Minimax is optimal against other minimax
players
• What about against players that play in
some other way?
Many-player, general-sum games
of perfect information
• Basic backward induction still works
– No longer called minimax
[Game tree figure: Player 1 at the root, Player 2 nodes at the second level, Player 3 nodes at the third; two of the leaf utility vectors are (1,2,3) and (3,4,2), and backward induction selects (1,2,3).]
vector of numbers gives each player’s utility
What if other
players do not
play this way?
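Vector-valued backward induction can be sketched as follows: the player to move picks the child maximizing her own coordinate. Only the leaves (1,2,3) and (3,4,2) come from the slide; the rest of the tree is made up to fill out the example:

```python
def backward_induction(node, depth=0):
    """node: a utility tuple (leaf) or a list of children.
    Player (depth mod 3) moves at depth `depth`; returns a utility tuple."""
    if isinstance(node, tuple):
        return node
    player = depth % 3                       # whose turn it is at this depth
    child_values = [backward_induction(c, depth + 1) for c in node]
    return max(child_values, key=lambda u: u[player])

leaf_a, leaf_b = (1, 2, 3), (3, 4, 2)        # the two leaves from the slide
tree = [
    [[leaf_a, leaf_b], [(0, 0, 5), (2, 2, 2)]],      # left Player 2 subtree
    [[(4, 1, 1), (0, 3, 3)], [(2, 0, 4), (1, 1, 0)]],
]
print(backward_induction(tree))  # (1, 2, 3), as on the slide
```

Note there is no single "minimax" value here: each node carries a whole utility vector, and nothing forces the other players to actually play this way.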
Games with random moves by “Nature”
• E.g., games with dice (Nature chooses dice roll)
• Backward induction still works…
– Evaluation functions now need to be cardinally right (not just ordinally)
– For two-player zero-sum games with random moves, can we generalize
alpha-beta? How?
[Game tree figure: Player 1 at the root, Nature nodes at the second level (branch probabilities 50%/50% and 40%/60%), Player 2 nodes at the third; the leaves hold utility vectors (1,3), (3,2), (3,4), (1,2). Player 2 picks (1,3) over (3,2) and (3,4) over (1,2), and the 50/50 Nature node averages these to (2, 3.5).]
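For the two-player zero-sum case, backward induction with chance nodes (often called expectiminimax) averages over Nature's moves, which is exactly why the evaluation function must be cardinally right: averaging only makes sense on a cardinal scale. The node encoding and the numbers below are assumptions for illustration:

```python
# A leaf is a number (player 1's utility); a chance node is
# ("chance", [(prob, child), ...]); a player node is ("max"|"min", [children]).

def expectiminimax(node):
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "chance":                     # Nature: probability-weighted average
        return sum(p * expectiminimax(c) for p, c in children)
    vals = [expectiminimax(c) for c in children]
    return max(vals) if kind == "max" else min(vals)

# Player 1 picks a gamble; Nature resolves it; player 2 then minimizes.
tree = ("max", [
    ("chance", [(0.5, ("min", [1, 4])), (0.5, ("min", [3, 5]))]),
    ("chance", [(0.4, ("min", [0, 9])), (0.6, ("min", [2, 2]))]),
])
print(expectiminimax(tree))  # max(0.5*1 + 0.5*3, 0.4*0 + 0.6*2) = 2.0
```

A partial generalization of alpha-beta does exist for such trees: with bounded leaf values, bounds on a chance node's average allow cutoffs ("*-minimax"), though the pruning is weaker than in the deterministic case.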
Games with imperfect information
• Players cannot necessarily see the whole current
state of the game
– Card games
• Ridiculously simple poker game:
– Player 1 receives King (winning) or Jack (losing),
– Player 1 can bet or stay,
– Player 2 can call or fold
• Dashed lines indicate
indistinguishable states
• Backward induction does not work, need random
strategies for optimality! (more later in course)
[Game tree figure: "Nature" deals Player 1 the King or the Jack; Player 1 then bets or stays; Player 2 then calls or folds, with dashed lines connecting Player 2's indistinguishable states. Leaf payoffs to Player 1, left to right: 2, 1, 1, 1, -2, -1, 1, 1.]
Intuition for need of random strategies
• Suppose my strategy is
“bet on King, stay on Jack”
– What will you do?
– What is your expected utility?
• What if my strategy is
“always bet”?
• What if my strategy is
“always bet when given
King, 10% of the time bet
when given Jack”?
[Same game tree figure as on the previous slide.]
The state of the art for some games
• Chess:
– 1997: IBM Deep Blue defeats Kasparov
– … there is still debate about whether computers are really better
• Checkers:
– Computer world champion since 1994
– … there was still debate about whether computers are really better…
– until 2007: checkers solved optimally by computer
• Go:
– Computers still not very good
– Branching factor really high
– Some recent progress
• Poker:
– Competitive with top humans in some 2-player games
– 3+ player case much less well-understood
Is this of any value to society?
• Some of the techniques developed for games
have found applications in other domains
– Especially “adversarial” settings
• Real-world strategic situations are usually not
two-player, perfect-information, zero-sum, …
• But game theory does not need any of those
• Example application: security scheduling at
airports