Game playing
Outline
• Optimal decisions
• α-β pruning
• Imperfect, real-time decisions
Games vs. search problems
• "Unpredictable" opponent ⇒ a solution must specify a move for every possible opponent reply
• Time limits ⇒ unlikely to find the goal, must approximate
Game tree (2-player, deterministic, turns)
Optimal strategy
• Perfect play for deterministic games
• Minimax Value for a node n
• This definition is used recursively
• Idea: minimax value is the best achievable payoff
against best play
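The recursive definition mentioned above can be written out as follows. This is a sketch in the usual textbook notation; the names UTILITY and successors are assumed here and may differ from the symbols on the original slide.

$$
\text{MINIMAX-VALUE}(n) =
\begin{cases}
\text{UTILITY}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \text{successors}(n)} \text{MINIMAX-VALUE}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \text{successors}(n)} \text{MINIMAX-VALUE}(s) & \text{if } n \text{ is a MIN node}
\end{cases}
$$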
Minimax example
• Perfect play for deterministic games
• Minimax decision at the root: choose the action a that leads to the maximal minimax value
• MAX is guaranteed a utility at least equal to the minimax value, provided MAX plays rationally.
Minimax algorithm
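A minimal sketch of the minimax computation in Python. The nested-list tree encoding (a leaf is a number, an internal node is a list of children) and the sample tree are assumptions made for illustration only; they are not taken from the slide.

```python
def minimax_value(node, maximizing):
    """Return the minimax value of `node`.

    Hypothetical encoding: a leaf is a number (utility for MAX),
    an internal node is a list of child nodes.
    """
    if not isinstance(node, list):      # terminal node: return its utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def minimax_decision(root):
    """Return the index of the root action with the largest minimax value."""
    values = [minimax_value(child, False) for child in root]  # MIN moves next
    return max(range(len(values)), key=values.__getitem__)


# Illustrative two-ply tree: MAX moves at the root, MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(tree, True))   # 3
print(minimax_decision(tree))      # 0
```

The three MIN nodes back up 3, 2 and 2, so MAX picks the first action.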
Properties of minimax
• Complete? Yes (if tree is finite)
• Optimal? Yes (against an optimal opponent)
• Time complexity? $O(b^m)$, for branching factor b and maximum depth m
• Space complexity? $O(bm)$ (depth-first exploration)
• For chess, b ≈ 35, m ≈ 100 for "reasonable" games ⇒ exact solution completely infeasible
Multiplayer games
• Each node must hold a vector of values
• For example, for three players A, B, C: (v_A, v_B, v_C)
• The vector backed up at node n is always the successor vector that maximizes the payoff of the player choosing at n
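The backing-up rule for utility vectors can be sketched as below. The node encoding (a tuple for a leaf, a dict with hypothetical keys "player" and "children" for an internal node) and the numbers are assumptions chosen for illustration.

```python
def backed_up_vector(node):
    """Return the utility vector backed up to `node`.

    Hypothetical encoding: a leaf is a tuple (v_A, v_B, v_C); an internal
    node is {"player": index_of_player_to_move, "children": [...]}.
    """
    if isinstance(node, tuple):          # leaf: utility vector
        return node
    player = node["player"]
    vectors = [backed_up_vector(child) for child in node["children"]]
    # The player to move keeps the vector that is best for that player.
    return max(vectors, key=lambda v: v[player])


A, B, C = 0, 1, 2
tree = {
    "player": A,
    "children": [
        {"player": B, "children": [(1, 2, 6), (4, 3, 3)]},
        {"player": B, "children": [(6, 1, 2), (7, 4, 1)]},
    ],
}
print(backed_up_vector(tree))  # (7, 4, 1)
```

B backs up (4, 3, 3) and (7, 4, 1) from its two nodes, and A then prefers the vector with the larger first component.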
α-β pruning example
Properties of α-β
• Pruning does not affect final result
• Good move ordering improves effectiveness of pruning
• With "perfect ordering," time complexity = $O(b^{m/2})$ ⇒ doubles the depth of search
• A simple example of the value of reasoning about which
computations are relevant (a form of metareasoning)
The α-β algorithm
Why is it called α-β?
• α is the value of the best (i.e., highest-value) choice found so far for MAX at any choice point along the path to the root.
• If v is worse than α, MAX will avoid it ⇒ prune that branch
• β is the value of the best (i.e., lowest-value) choice found so far for MIN at any choice point along the path to the root.
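The roles of α and β described above can be sketched in Python as follows. The nested-list tree encoding, the sample tree and the `visited` argument (used only to record which leaves were evaluated) are assumptions for illustration, not part of the original slides.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with α-β pruning.

    Hypothetical encoding: a leaf is a number, an internal node is a list.
    `visited`, if given, collects the leaves that were actually evaluated.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            if value >= beta:       # MIN above will never allow this branch
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        if value <= alpha:          # MAX above already has something better
            break
        beta = min(beta, value)
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, True, visited=seen))  # 3
print(seen)                                 # [3, 12, 8, 2, 14, 5, 2]
```

In this run the leaves 4 and 6 are never evaluated: once the second MIN node sees 2, which is worse for MAX than α = 3, the rest of that branch is pruned, and the result is the same as plain minimax.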
Another example
5 7 10 3 1 2 9 9 8 2 9 3
How much do we gain?
• Assume a game tree of uniform branching factor b
• Minimax examines $O(b^h)$ nodes, and so does alpha-beta in the worst case
• The gain for alpha-beta is maximal when:
– The MIN children of a MAX node are ordered in decreasing backed-up values
– The MAX children of a MIN node are ordered in increasing backed-up values
• Then alpha-beta examines $O(b^{h/2})$ nodes [Knuth and Moore, 1975]
• But this requires an oracle (if we knew how to order nodes perfectly, we would not need to search the game tree)
• If nodes are ordered at random, then the average number of nodes examined by alpha-beta is ~$O(b^{3h/4})$
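To get a feel for these orders of magnitude, the sketch below compares the number of leaves in a uniform tree with the best-case leaf count for alpha-beta. The closed form b^⌈h/2⌉ + b^⌊h/2⌋ − 1 is the minimal-tree count usually attributed to Knuth and Moore; the function names and the choice of b = 35 are illustrative assumptions.

```python
def minimax_leaves(b, h):
    """Leaves examined by plain minimax on a uniform tree: b^h."""
    return b ** h


def alphabeta_best_case_leaves(b, h):
    """Leaves examined by alpha-beta under perfect move ordering.

    Uses b^ceil(h/2) + b^floor(h/2) - 1, which grows as O(b^(h/2)).
    """
    return b ** ((h + 1) // 2) + b ** (h // 2) - 1


for h in (2, 4, 8):
    print(h, minimax_leaves(35, h), alphabeta_best_case_leaves(35, h))
```

For b = 3 and h = 2 the formula gives 5 leaves instead of 9: all three leaves under the first MIN node, then a single leaf under each of the other two.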
Heuristic Ordering of Nodes
• Order the nodes below the root according to the values backed up at the previous iteration
• Order the children of MAX (resp. MIN) nodes in decreasing (resp. increasing) values of the evaluation function computed at these children
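The ordering rule can be sketched in a few lines. The function name `order_children` and the dictionary of estimated values are hypothetical; in a real program the estimates would come from the evaluation function or from the previous search iteration.

```python
def order_children(children, evaluate, maximizing):
    """Sort children so the most promising one is searched first.

    At a MAX node the children are sorted by decreasing estimate,
    at a MIN node by increasing estimate.
    """
    return sorted(children, key=evaluate, reverse=maximizing)


# Hypothetical estimates for three successor positions.
estimates = {"a": 1, "b": 5, "c": 3}
print(order_children(["a", "b", "c"], estimates.get, maximizing=True))   # ['b', 'c', 'a']
print(order_children(["a", "b", "c"], estimates.get, maximizing=False))  # ['a', 'c', 'b']
```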
Imperfect, real-time decisions
• Minimax and alpha-beta pruning require too many leaf-node evaluations.
• This may be impractical within a reasonable amount of time.
• SHANNON (1950):
– Cut off search earlier (replace TERMINAL-TEST by CUTOFF-TEST)
– Apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta)
Cutting off search
• Change:
– if TERMINAL-TEST(state) then return UTILITY(state)
into
– if CUTOFF-TEST(state, depth) then return EVAL(state)
• This introduces a fixed depth limit, depth
– It is selected so that the time used does not exceed what the rules of the game allow.
• When the cutoff occurs, the evaluation is performed.
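The substitution above can be sketched on a toy game. Everything specific here is an assumption made for illustration: the game (players alternately remove 1 to 3 stones, whoever takes the last stone wins), the state encoding, the heuristic, and the use of plain depth-limited minimax rather than alpha-beta to keep the example short.

```python
def successors(state):
    """States reachable by removing 1, 2 or 3 stones."""
    stones, max_to_move = state
    return [(stones - k, not max_to_move) for k in (1, 2, 3) if k <= stones]


def cutoff_test(state, depth, limit):
    """Stop at terminal states or when the depth limit is reached."""
    stones, _ = state
    return stones == 0 or depth >= limit


def evaluate(state):
    """EVAL: exact utility at terminal states, a heuristic guess elsewhere."""
    stones, max_to_move = state
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1.0 if max_to_move else 1.0
    # Hypothetical heuristic: multiples of 4 are bad for the player to move.
    score = -0.5 if stones % 4 == 0 else 0.5
    return score if max_to_move else -score


def h_minimax(state, depth=0, limit=2):
    """Depth-limited minimax: CUTOFF-TEST and EVAL replace TERMINAL-TEST and UTILITY."""
    if cutoff_test(state, depth, limit):
        return evaluate(state)
    values = [h_minimax(s, depth + 1, limit) for s in successors(state)]
    return max(values) if state[1] else min(values)


print(h_minimax((10, True)))  # 0.5
print(h_minimax((8, True)))   # -0.5
```

Note that `evaluate` agrees with the true utility on terminal states, so wins found within the depth limit are still scored exactly.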
Heuristic EVAL
• Idea: produce an estimate of the expected
utility of the game from a given position.
• Performance depends on quality of EVAL.
• Requirements:
– EVAL should order terminal nodes in the same way as UTILITY.
– The computation must not take too long.
– For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
• Only useful for quiescent states (no wild swings in value in the near future)
Heuristic EVAL example
$\mathrm{Eval}(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s)$
• Addition assumes independence of the features
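A weighted linear evaluation of this form might look as follows for chess material. The weights (the commonly quoted 1/3/3/5/9 piece values), the position encoding and the function names are assumptions for illustration, not values from the slides.

```python
# Hypothetical weights w_i: commonly quoted chess piece values.
WEIGHTS = {"pawn": 1.0, "knight": 3.0, "bishop": 3.0, "rook": 5.0, "queen": 9.0}


def material_features(position):
    """f_i(s): White's count minus Black's count for each piece type."""
    white, black = position["white"], position["black"]
    return {piece: white.get(piece, 0) - black.get(piece, 0) for piece in WEIGHTS}


def evaluate(position):
    """Eval(s) = sum_i w_i * f_i(s), from White's point of view."""
    features = material_features(position)
    return sum(WEIGHTS[piece] * features[piece] for piece in WEIGHTS)


position = {
    "white": {"pawn": 8, "rook": 2, "queen": 1},
    "black": {"pawn": 7, "rook": 2, "knight": 1},
}
print(evaluate(position))  # 1*1 - 3*1 + 9*1 = 7.0
```

Because the terms are simply added, the sketch ignores interactions between features (for example, a bishop pair being worth more than two single bishops), which is the independence assumption noted above.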
Heuristic difficulties
• The heuristic counts pieces won
Horizon effect
• A fixed-depth search makes Black think it can avoid the queening move of the White pawn
Games that include chance
• Possible moves: (5-10,5-11), (5-11,19-24), (5-10,10-16) and (5-11,11-16)
• Chance nodes represent the dice rolls: each double, [1,1] through [6,6], has probability 1/36; every other roll has probability 1/18
• We cannot calculate a definite minimax value, only an expected value
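Expected minimax value

$$
\text{EXPECTED-MINIMAX-VALUE}(n) =
\begin{cases}
\text{UTILITY}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \text{successors}(n)} \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \text{successors}(n)} \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a MIN node} \\
\sum_{s \in \text{successors}(n)} P(s) \cdot \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a chance node}
\end{cases}
$$

These equations can be backed up recursively all the way to the root of the game tree.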
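The four cases of the definition map directly onto a short recursive function. The node encoding used here (a number for a terminal node, a `(kind, children)` pair otherwise, with `(probability, child)` pairs under chance nodes) and the sample tree are assumptions for illustration.

```python
def expectiminimax(node):
    """Expected minimax value of `node`.

    Hypothetical encoding: a terminal node is a number; otherwise a node is
    ("max", [children]), ("min", [children]) or
    ("chance", [(probability, child), ...]).
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")


tree = ("max", [
    ("chance", [(0.5, ("min", [4, 5])), (0.5, ("min", [2, 7]))]),
    ("chance", [(0.25, ("min", [8, 10])), (0.75, ("min", [0, 4]))]),
])
print(expectiminimax(tree))  # 3.0
```

The first chance node is worth 0.5·4 + 0.5·2 = 3.0 and the second 0.25·8 + 0.75·0 = 2.0, so MAX prefers the first move.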
Position evaluation with chance nodes
• Left: A1 is the best move
• Right: A2 is the best move
• The move chosen may change when the evaluation values are scaled differently, even if their order is preserved.
• Behavior is preserved only by a positive linear transformation of EVAL.
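The effect of rescaling can be checked numerically. The leaf values, probabilities and helper names below are illustrative assumptions; they are chosen so that an order-preserving but nonlinear rescaling flips the preferred move, while a positive linear rescaling does not.

```python
def expected_value(outcomes):
    """Value of a chance node given (probability, leaf_value) pairs."""
    return sum(p * v for p, v in outcomes)


def best_move(moves):
    """Name of the move whose chance node has the highest expected value."""
    return max(moves, key=lambda name: expected_value(moves[name]))


def rescale(moves, f):
    """Apply f to every leaf value, keeping the probabilities."""
    return {name: [(p, f(v)) for p, v in outcomes] for name, outcomes in moves.items()}


original = {"A1": [(0.9, 2), (0.1, 3)], "A2": [(0.9, 1), (0.1, 4)]}

# Order-preserving but nonlinear: 1 < 2 < 3 < 4 becomes 1 < 20 < 30 < 400.
nonlinear = rescale(original, {1: 1, 2: 20, 3: 30, 4: 400}.get)
# Positive linear transformation: v -> 10 v + 5.
linear = rescale(original, lambda v: 10 * v + 5)

print(best_move(original))   # A1  (2.1 vs 1.3)
print(best_move(nonlinear))  # A2  (21.0 vs 40.9)
print(best_move(linear))     # A1  (26.0 vs 18.0)
```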
State-of-the-Art
Checkers: Tinsley vs. Chinook
Name: Marion Tinsley
Profession: Taught mathematics
Hobby: Checkers
Record: Lost only 3 games of checkers in over 42 years
World champion for over 40 years
Mr. Tinsley suffered his 4th and 5th losses against Chinook
Chinook
First computer to become official world champion of Checkers!
Has complete endgame tables for all positions with 10 pieces or fewer: over 39 trillion entries.
Chess: Kasparov vs. Deep Blue

              Kasparov              Deep Blue
Height        5’10”                 6’5”
Weight        176 lbs               2,400 lbs
Age           34 years              4 years
Computers     50 billion neurons    32 RISC processors + 256 VLSI chess engines
Speed         2 pos/sec             200,000,000 pos/sec
Knowledge     Extensive             Primitive
Power Source  Electrical/chemical   Electrical
Ego           Enormous              None

1997: Deep Blue wins the match with 2 wins, 1 loss, and 3 draws
Chess: Kasparov vs. Deep Junior
February 7, 2003: Match ends in a 3–3 tie!
Deep Junior
8 CPU, 8 GB RAM, Win 2000
2,000,000 pos/sec
Available at $100
Othello: Murakami vs. Logistello
Takeshi Murakami
World Othello Champion
1997: The Logistello software crushed Murakami
by 6 games to 0
Backgammon
• 1995: TD-Gammon by Gerald Tesauro plays at a level close to that of the world's best human players
• BGBlitz won the 2008 computer backgammon olympiad
Go: Goemate vs. ??
Name: Chen Zhixing
Profession: Retired
Computer skills: self-taught programmer
Author of Goemate (arguably one of the strongest Go programs)
Goemate was given a 9-stone handicap and was still easily beaten, with its opponent winning $15,000
Jonathan Schaeffer
Go has too high a branching factor for existing search techniques
Current and future software must rely on huge databases and pattern-recognition techniques
AlphaGo – March 2016
• Developed by Google DeepMind in London to play the board game Go.
• Plays full 19x19 games
• October 2015: the distributed version of AlphaGo defeated the European Go champion Fan Hui five games to zero
• March 2016: AlphaGo played South Korean professional Go player Lee Sedol, ranked 9-dan and one of the best Go players, and won four games to one.
• A significant breakthrough in AI research!
Secrets
• Many game programs are based on alpha-beta + iterative deepening + extended/singular search + transposition tables + huge databases + ...
• For instance, Chinook searched all checkers configurations with 8 pieces or fewer and created an endgame database of 444 billion board configurations
• The methods are general, but their implementation is dramatically improved by many specifically tuned-up enhancements (e.g., the evaluation functions)
Perspective on Games: Con and Pro
"Chess is the Drosophila of artificial intelligence. However, computer chess has developed much as genetics might have if the geneticists had concentrated their efforts starting in 1910 on breeding racing Drosophila. We would have some science, but mainly we would have very fast fruit flies."
John McCarthy
"Saying Deep Blue doesn’t really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings."
Drew McDermott
Other Types of Games
• Multi-player games, with alliances or not
• Games with randomness in the successor function (e.g., rolling dice)
– Expectiminimax algorithm
• Games with partially observable states (e.g., card games)
– Search of belief state spaces
See R&N p. 175-180
