Improving the Performance of
MCTS-Based μRTS Agents
Through Move Pruning
Abdessamed Ouessai (University of Mascara, Algeria)
Mohammed Salem (University of Mascara, Algeria)
Antonio M. Mora (University of Granada, Spain)
INDEX
◦ RTS and μRTS
◦ Problem Statement
◦ Monte Carlo Tree Search
◦ Move Pruning
◦ Experiments and Results
◦ Conclusions
RTS and μRTS
Main features
REAL-TIME STRATEGY GAMES
Features:
◦ Real-Time (10~50 decision cycles per second)
◦ Simultaneous actions for different units
◦ Durative actions (more than one decision cycle)
◦ Deterministic / non-deterministic
◦ Partially / fully observable map
REAL-TIME STRATEGY GAMES
◦ Commercial RTS games were not
conceived with AI research in mind.
◦ There are normally no adequate APIs,
just independent solutions such as
BWAPI (for StarCraft) or Wargus (for
Warcraft).
◦ In 2016, an official AI API for StarCraft II
was announced by DeepMind® and
Blizzard®
◦ There are also other solutions:
◦ OpenRTS
◦ SparCraft
◦ μRTS
μRTS
◦ Open-source simulator (in Java) created by Santiago Ontañón.
◦ Simple, fast and lightweight.
◦ Designed for research.
◦ Extensible and scalable.
Unit-Actions
• move
• attack
• harvest
• produce
Player-Action
Combination of
unit-actions
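As a rough illustration of why player-actions multiply so quickly, the sketch below builds them as the Cartesian product of each unit's unit-actions. It is a simplification under stated assumptions: the function names are hypothetical, and the real simulator would also have to discard conflicting combinations (for example two units moving into the same cell), which this sketch ignores.

```python
from itertools import product


def player_actions(unit_actions):
    """Enumerate player-actions: one unit-action chosen for every unit.

    `unit_actions` is a list with one entry per unit, each entry being the
    list of unit-actions that unit could issue (hypothetical representation).
    """
    return list(product(*unit_actions))


def count_player_actions(unit_actions):
    """Count player-actions without enumerating them."""
    total = 1
    for actions in unit_actions:
        total *= len(actions)
    return total


if __name__ == "__main__":
    units = [
        ["move", "attack"],              # unit 1
        ["move", "harvest", "produce"],  # unit 2
        ["move"],                        # unit 3
    ]
    for pa in player_actions(units):
        print(pa)
    # Ten units with four unit-actions each already give 4**10 combinations.
    print(count_player_actions([["move", "attack", "harvest", "produce"]] * 10))
```

The count is the product of the per-unit choices, so it grows exponentially with the number of units.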
Problem Statement
Creating autonomous agents that play RTS games competitively
COMPETITIVE AGENT
The aim is to create an agent for RTS games that can challenge human players.
It should manage:
◦ Several units simultaneously
◦ Very short decision time
◦ Growing map (if it is not known)
◦ Growing number of units
COMPETITIVE AGENT
It should also deal with:
Exponential increase of possible decisions/actions
Growing Branching factor in the decision tree of the agent
                   Chess    Go        StarCraft
Branching Factor   36       180       10^50
State Space        10^47    10^171    10^1685
MCTS
Monte Carlo Tree Search
MCTS
Heuristic search algorithm.
Very effective method in games such as Go (AlphaGo) or chess.
The search repeats four phases N times:
◦ Selection: selection is applied recursively until a leaf is reached
◦ Expansion: one or more nodes are created
◦ Simulation: one random simulated game is played
◦ Backpropagation: the result of the game is backpropagated
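The four phases on this slide (selection down to a leaf, creation of a node, one random simulated game, backpropagation of the result) can be sketched on a toy game. The example below is not a μRTS agent: it assumes a small Nim-like game (take 1 to 3 stones, whoever takes the last stone wins) and uses UCB1 for selection. All names and the exploration constant are illustrative choices.

```python
import math
import random


class Node:
    def __init__(self, pile, player, parent=None, move=None):
        self.pile = pile            # stones left in this state
        self.player = player        # player to move here (0 or 1)
        self.parent = parent
        self.move = move            # move that led to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= pile]
        self.visits = 0
        self.wins = 0.0             # wins for the player who moved INTO this node


def ucb1(child, parent_visits, c=1.4):
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def rollout(pile, player, rng):
    """Play uniformly random moves; return the player taking the last stone."""
    while True:
        pile -= rng.randint(1, min(3, pile))
        if pile == 0:
            return player
        player = 1 - player


def mcts(pile, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(pile, player=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb1(ch, parent.visits))
        # 2. Expansion: create one new child, if any move is untried.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.pile - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: one random game from the new node.
        if node.pile == 0:
            winner = 1 - node.player   # the previous mover took the last stone
        else:
            winner = rollout(node.pile, node.player, rng)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

With three stones left the search should settle on taking all three. The same loop applies to larger games, but every additional child at a node dilutes the iterations available per branch, which is the scalability issue this deck addresses.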
MCTS
However, MCTS has a scalability problem:
◦ In RTS games the state space (decision space) is huge
◦ MCTS struggles with this space
◦ When the branching factor grows past a certain threshold,
MCTS becomes a worse option
◦ It is only suitable for small maps or tactical planning
MCTS
Solutions:
◦ Abstracting the decision space → consider coarser actions
>> tactical performance is worse <<
◦ Reducing the decision space → remove some actions (decisions)
We propose to reduce this space by
pruning unnecessary and
detrimental player-actions
Move Pruning
To reduce the branching factor in the search
MOVE PRUNING
To improve MCTS performance in growing decision spaces:
◦ The aim is to reduce the branching factor of the decision tree
◦ The algorithm won’t explore irrelevant or unfavourable actions
◦ Focus on the so-called Inactive Player Actions (IPAs) → player-actions that keep
one or more units in an inactive state
◦ Explore more promising player-actions
◦ The saved time will be used to improve playing performance
Unit-actions in μRTS
MOVE PRUNING
Waiting is useful in situations such as a trapped unit or tactical waiting,
so these actions cannot be pruned completely.
Four Pruning approaches:
◦ Random Inactivity Pruning – Fixed (RIP-F)
(Allows a constant number k of IPAs)
◦ Random Inactivity Pruning – Relative (RIP-R)
(Allows a percentage p of IPAs from the total number)
◦ Dynamic Random Inactivity Pruning – Fixed (DRIP-F)
(Allows k1 IPAs if the agent has more units than the opponent and k2 IPAs otherwise)
◦ Dynamic Random Inactivity Pruning – Relative (DRIP-R)
(Allows a percentage p1 of IPAs if the agent has more units and p2 of IPAs otherwise)
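A minimal sketch of how the four approaches could be expressed, assuming a player-action is a tuple of unit-action names and that an IPA is any player-action containing at least one `"wait"`. The function names, the representation, and the rounding used for the relative variants are assumptions for illustration; the slides do not give the authors' implementation.

```python
import random

WAIT = "wait"  # assumed label for the inactive unit-action


def is_ipa(player_action):
    """A player-action is inactive if it keeps at least one unit waiting."""
    return any(action == WAIT for action in player_action)


def _prune(player_actions, allowed_for, rng):
    """Keep all active player-actions plus a random subset of the IPAs."""
    active = [pa for pa in player_actions if not is_ipa(pa)]
    ipas = [pa for pa in player_actions if is_ipa(pa)]
    allowed = max(0, min(len(ipas), allowed_for(len(ipas))))
    return active + rng.sample(ipas, allowed)


def rip_f(player_actions, k, rng):
    """RIP-F: allow a constant number k of IPAs."""
    return _prune(player_actions, lambda n: k, rng)


def rip_r(player_actions, p, rng):
    """RIP-R: allow a fraction p (0..1) of the IPAs."""
    return _prune(player_actions, lambda n: round(p * n), rng)


def drip_f(player_actions, k1, k2, my_units, opp_units, rng):
    """DRIP-F: k1 IPAs if the agent has more units than the opponent, else k2."""
    return rip_f(player_actions, k1 if my_units > opp_units else k2, rng)


def drip_r(player_actions, p1, p2, my_units, opp_units, rng):
    """DRIP-R: fraction p1 if the agent has more units, else p2."""
    return rip_r(player_actions, p1 if my_units > opp_units else p2, rng)
```

For two units that can each wait, move, or attack, 5 of the 9 player-actions are IPAs, so `rip_f(..., k=2, ...)` would leave 4 active player-actions plus 2 randomly chosen IPAs.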
Experiments and Results
Testing the proposed approaches
EXPERIMENTAL SETUP
Two different MCTS approaches have been tested:
◦ UCT → uses Upper Confidence Bounds and treats
selection as a Multi-Armed Bandit (MAB) problem.
◦ NaïveMCTS → designed to better deal with the combinatorial
search space in RTS games; selection is treated as a
combinatorial MAB problem.
63000 matches have been conducted for each approach…
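To make the "combinatorial MAB" idea concrete, here is a rough sketch of naive sampling at a single decision point: one local bandit per unit proposes unit-actions, and a global bandit chooses among player-actions that have already been tried. The slides do not describe this mechanism in detail, so the class, the epsilon values, and the toy reward are all assumptions for illustration.

```python
import random


class NaiveSampler:
    """Toy combinatorial bandit: one local bandit per unit plus a global one."""

    def __init__(self, unit_actions, eps0=0.4, eps_local=0.3, eps_global=0.0, seed=0):
        self.unit_actions = unit_actions         # legal unit-actions, one list per unit
        self.eps0 = eps0                         # probability of exploring
        self.eps_local = eps_local
        self.eps_global = eps_global
        self.rng = random.Random(seed)
        self.local = [{} for _ in unit_actions]  # unit-action -> [visits, total reward]
        self.glob = {}                           # player-action -> [visits, total reward]

    def _eps_greedy(self, stats, candidates, eps):
        tried = [c for c in candidates if c in stats]
        if not tried or self.rng.random() < eps:
            return self.rng.choice(candidates)
        return max(tried, key=lambda c: stats[c][1] / stats[c][0])

    def sample(self):
        if not self.glob or self.rng.random() < self.eps0:
            # Explore: assemble a player-action unit by unit from the local bandits.
            return tuple(self._eps_greedy(stats, actions, self.eps_local)
                         for stats, actions in zip(self.local, self.unit_actions))
        # Exploit: choose among player-actions that were already evaluated.
        return self._eps_greedy(self.glob, list(self.glob), self.eps_global)

    def update(self, player_action, reward):
        for stats, action in zip(self.local, player_action):
            entry = stats.setdefault(action, [0, 0.0])
            entry[0] += 1
            entry[1] += reward
        entry = self.glob.setdefault(player_action, [0, 0.0])
        entry[0] += 1
        entry[1] += reward

    def best(self):
        return max(self.glob, key=lambda pa: self.glob[pa][1] / self.glob[pa][0])


def demo(iterations=2000):
    """Toy reward: the fraction of units that attack (purely illustrative)."""
    units = [["attack", "move", "wait"], ["attack", "move", "wait"]]
    sampler = NaiveSampler(units)
    for _ in range(iterations):
        pa = sampler.sample()
        sampler.update(pa, sum(a == "attack" for a in pa) / len(pa))
    return sampler.best()
```

The design point is that the local bandits never enumerate the full product of unit-actions, so the cost of proposing a player-action grows with the number of units rather than with the number of combinations.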
EXPERIMENTAL SETUP
◦ Two different sets of values for k and p:
k = {0, 1, 5, 10, 50, 100, 500, 1000, 5000, 10000}
p = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}
◦ All the values of these parameters have been considered
◦ For each value of k or p, the pruning agent has played 500 games
against a non-pruning version of the same agent
◦ 100 ms per decision
◦ 8x8, 12x12 and 16x16 maps
RESULTS (UCT)
The score on the vertical axis is a measure of the
number of wins of the pruned agent against
the standard one
RESULTS (NaïveMCTS)
The score on the vertical axis is a measure of the
number of wins of the pruned agent against
the standard one
RESULTS (Best agents)
The best agents per map size and configuration
(k or p value) have been matched
against other state-of-the-art agents
[Chart: WINS]
Conclusions
and Future Work
CONCLUSIONS
◦ μRTS MCTS agents have been improved.
◦ Four move pruning approaches have been proposed and
studied, focusing on waiting actions.
◦ The results show a good improvement for some configurations
◦ The best agents are competitive against baseline agents from
the μRTS competition
◦ They perform worse against script-based ones (which use expert
knowledge)
◦ However, our best agent got better results on one map than one
of the top agents in the 2019 competition, MixedBot.
FUTURE WORK
◦ Analyse other possible actions to prune.
◦ Consider larger and more complex scenarios.
◦ Apply IPA pruning with other search methods, such as Strategic
Learning + Tactical Search, or Multi-Armed Bandit Tree Search.
◦ Use IPA pruning as part of a scripted agent to gain
performance.
Thanks!
Contact us:
abdessamed.ouessai@univ-mascara.dz
salem@univ-mascara.dz
amorag@ugr.es
