Improving the Performance of MCTS-Based μRTS Agents Through Move Pruning

Abdessamed Ouessai (University of Mascara, Algeria)
Mohammed Salem (University of Mascara, Algeria)
Antonio M. Mora (University of Granada, Spain)

INDEX
◦ RTS and μRTS
◦ Problem Statement
◦ Monte Carlo Tree Search
◦ Move Pruning
◦ Experiments and Results
◦ Conclusions

RTS and μRTS
Main features

REAL-TIME STRATEGY GAMES
Features:
◦ Real-Time (10~50 decision cycles per second)
◦ Simultaneous actions for different units
◦ Durative actions (more than one decision cycle)
◦ Deterministic / non-deterministic
◦ Partially / fully observable map

REAL-TIME STRATEGY GAMES
◦ Commercial RTS games were not conceived with AI research in mind.
◦ They normally lack adequate APIs; there are only independent solutions such as BWAPI (for StarCraft) or Wargus (for Warcraft).
◦ In 2016, DeepMind® and Blizzard® announced an official AI API for StarCraft II.
◦ There are also other solutions:
  ◦ OpenRTS
  ◦ SparCraft
  ◦ μRTS

μRTS
◦ Open-source simulator (in Java) created by Santiago Ontañón.
◦ Simple, fast and lightweight.
◦ Designed for research.
◦ Extensible and scalable.
Unit-actions:
• move
• attack
• harvest
• produce
Player-action: a combination of unit-actions
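The relation between unit-actions and player-actions can be sketched in a few lines. This is a minimal illustration, not μRTS code: the unit names and action lists are hypothetical, and the sketch ignores conflicts between units (for example, two units moving into the same cell).

```python
from itertools import product

# Hypothetical unit-action sets for three units (names are illustrative only).
unit_actions = {
    "worker_1": ["wait", "move_up", "move_down", "harvest"],
    "worker_2": ["wait", "move_left", "move_right"],
    "base": ["wait", "produce_worker"],
}

def player_actions(actions_by_unit):
    """Yield every player-action: one unit-action chosen for each unit."""
    units = sorted(actions_by_unit)
    for combo in product(*(actions_by_unit[u] for u in units)):
        yield dict(zip(units, combo))

all_actions = list(player_actions(unit_actions))
print(len(all_actions))  # 4 * 3 * 2 = 24
```

Because the count is a product over units, every additional unit multiplies the number of player-actions, which is where the large branching factor of RTS games comes from.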

Problem Statement
Creating autonomous agents that play RTS games competitively

COMPETITIVE AGENT
The aim is to create an agent that can challenge human players in RTS games.
It should manage:
◦ Several units simultaneously
◦ Very short decision times
◦ A growing map (if it is not fully known)
◦ A growing number of units

COMPETITIVE AGENT
It should also deal with:
◦ An exponential increase in possible decisions/actions
◦ A growing branching factor in the agent's decision tree

                   Chess    Go       StarCraft
Branching factor   36       180      10^50
State space        10^47    10^171   10^1685

MCTS
Monte Carlo Tree Search

MCTS
Heuristic search algorithm.
Very effective method in games such as Go (AlphaGo) or chess.
◦ Selection is applied recursively until a leaf is reached
◦ One or more nodes are created
◦ One random simulated game is played
◦ The result of the game is backpropagated
Repeat N times
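The four steps above can be sketched on a toy game. This is a minimal sketch under stated assumptions, not the agents' implementation: the game is a simple Nim variant (take 1 to 3 stones, whoever takes the last stone wins), selection uses a UCB-style rule with an arbitrary constant c = 1.4, and all names (`Node`, `mcts`, and so on) are hypothetical.

```python
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones      # stones left on the table
        self.player = player      # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move          # move that led here from the parent
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def select(node, c=1.4):
    # Selection: descend while the node is fully expanded and not terminal.
    while not node.untried and node.children:
        node = max(node.children, key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(node.visits) / ch.visits))
    return node

def expand(node, rng):
    # Expansion: create one child for a randomly chosen untried move.
    if not node.untried:
        return node               # terminal node, nothing to expand
    move = node.untried.pop(rng.randrange(len(node.untried)))
    child = Node(node.stones - move, 1 - node.player, node, move)
    node.children.append(child)
    return child

def simulate(stones, player, rng):
    # Simulation: random playout; returns the player who takes the last stone.
    if stones == 0:
        return 1 - player         # the previous player already took it
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player
        player = 1 - player

def backpropagate(node, winner):
    # Backpropagation: update statistics from the new node up to the root.
    while node is not None:
        node.visits += 1
        if node.parent is not None and node.parent.player == winner:
            node.wins += 1
        node = node.parent

def mcts(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones, player=0)
    for _ in range(iterations):   # "Repeat N times"
        leaf = expand(select(root), rng)
        backpropagate(leaf, simulate(leaf.stones, leaf.player, rng))
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts(5))  # the most visited first move from a 5-stone position
```

Returning the most visited child rather than the one with the highest win rate is a common choice, since visit counts are less noisy than averages computed over few playouts.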

MCTS
However, MCTS has a scalability problem:
◦ In RTS games the state space (decision space) is huge
◦ MCTS struggles with a space of this size
◦ When the branching factor grows past a certain threshold, MCTS becomes a worse option
◦ It is only suitable for small maps or tactical planning

MCTS
Solutions:
◦ Abstract the decision space → consider coarser actions
>> tactical performance is worse <<
◦ Reduce the decision space → remove some actions (decisions)
We propose reducing this space by pruning unnecessary and detrimental player-actions

Move Pruning
To reduce the branching factor in the search

MOVE PRUNING
To improve MCTS performance in growing decision spaces:
◦ The aim is to reduce the branching factor of the decision tree
◦ The algorithm will not explore irrelevant or unfavourable actions
◦ Focus on the so-called Inactive Player Actions (IPAs) → player-actions that keep one or more units in an inactive state
◦ Explore more promising player-actions instead
◦ The time saved will be used to improve playing performance

MOVE PRUNING
[Figure: unit-actions in μRTS]

MOVE PRUNING
The wait action is useful in situations such as a trapped unit or tactical waiting, so these actions cannot be pruned completely.
Four pruning approaches:
◦ Random Inactivity Pruning – Fixed (RIP-F)
(Allows a constant number k of IPAs)
◦ Random Inactivity Pruning – Relative (RIP-R)
(Allows a percentage p of IPAs from the total number)
◦ Dynamic Random Inactivity Pruning – Fixed (DRIP-F)
(Allows k1 IPAs if the agent has more units than the opponent and k2 IPAs otherwise)
◦ Dynamic Random Inactivity Pruning – Relative (DRIP-R)
(Allows a percentage p1 of IPAs if the agent has more units and p2 of IPAs otherwise)
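The four approaches listed above can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes a player-action is a mapping from unit to unit-action, that an IPA is any player-action containing at least one "wait", and that the IPAs to keep are drawn uniformly at random. All function names are hypothetical.

```python
import random

def is_ipa(player_action):
    """True if at least one unit is left inactive (assigned "wait")."""
    return "wait" in player_action.values()

def prune(player_actions, allowed, rng):
    """Keep all active player-actions plus at most `allowed` random IPAs."""
    active = [a for a in player_actions if not is_ipa(a)]
    ipas = [a for a in player_actions if is_ipa(a)]
    return active + rng.sample(ipas, min(allowed, len(ipas)))

def rip_f(player_actions, k, rng):
    # RIP-F: allow a constant number k of IPAs.
    return prune(player_actions, k, rng)

def rip_r(player_actions, p, rng):
    # RIP-R: allow a fraction p (0..1) of the available IPAs.
    n_ipas = sum(is_ipa(a) for a in player_actions)
    return prune(player_actions, round(p * n_ipas), rng)

def drip_f(player_actions, k1, k2, own_units, enemy_units, rng):
    # DRIP-F: k1 IPAs when the agent has more units, k2 otherwise.
    return rip_f(player_actions, k1 if own_units > enemy_units else k2, rng)

def drip_r(player_actions, p1, p2, own_units, enemy_units, rng):
    # DRIP-R: fraction p1 when the agent has more units, p2 otherwise.
    return rip_r(player_actions, p1 if own_units > enemy_units else p2, rng)

rng = random.Random(0)
actions = [
    {"u1": "wait", "u2": "wait"},
    {"u1": "wait", "u2": "attack"},
    {"u1": "move", "u2": "wait"},
    {"u1": "move", "u2": "attack"},
]
print(len(rip_f(actions, 1, rng)))    # 2: the active player-action plus one IPA
print(len(rip_r(actions, 1.0, rng)))  # 4: p = 1 keeps every IPA
```

The sketch does not handle the case where every available player-action is an IPA and the allowance is zero; a real agent would need a fallback there.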

Experiments and Results
Evaluating the proposed approaches

EXPERIMENTAL SETUP
Two different MCTS approaches have been tested:
◦ UCT → uses Upper Confidence Bounds and treats selection as a Multi-Armed Bandit (MAB) problem.
◦ NaïveMCTS → designed to better deal with the combinatorial search space in RTS games; selection is treated as a combinatorial MAB problem.
63000 matches have been conducted for each approach…
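As an illustration of the bandit view behind UCT, the standard UCB1 score can be written as a short function. This is a generic sketch, not the agents' code: the arm names, statistics and the exploration constant c = √2 are assumptions, and the combinatorial sampling used by NaïveMCTS is not shown.

```python
import math

def ucb1(mean_reward, arm_pulls, total_pulls, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if arm_pulls == 0:
        return math.inf  # untried arms are always tried first
    return mean_reward + c * math.sqrt(math.log(total_pulls) / arm_pulls)

# Hypothetical statistics per arm: (average reward, number of pulls).
arms = {"attack": (0.60, 50), "move": (0.55, 5), "wait": (0.20, 45)}
total = sum(pulls for _, pulls in arms.values())
best = max(arms, key=lambda a: ucb1(arms[a][0], arms[a][1], total))
print(best)  # "move": the rarely tried arm gets the largest exploration bonus
```

With one arm per player-action, the number of arms equals the branching factor, which is why plain UCT degrades as the number of units grows.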

EXPERIMENTAL SETUP
◦ Two sets of parameter values, one for k and one for p:
k = {0, 1, 5, 10, 50, 100, 500, 1000, 5000, 10000}
p = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}
◦ All values of these parameters have been tested
◦ For each value of k or p, the pruning agent played 500 games against a non-pruning version of the same agent
◦ 100 ms per decision
◦ 8x8, 12x12 and 16x16 maps
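The setup above is consistent with the figure of 63000 matches per MCTS approach under one possible reading. The slides do not state how the total was counted, so the decomposition below is an assumption: 500 games per parameter value on each map, counted once for the static (RIP) variants and once for the dynamic (DRIP) variants.

```python
K_VALUES = [0, 1, 5, 10, 50, 100, 500, 1000, 5000, 10000]
P_VALUES = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
MAPS = ["8x8", "12x12", "16x16"]
GAMES_PER_CONFIG = 500
VARIANT_FAMILIES = 2  # assumption: static (RIP) and dynamic (DRIP)

# Matches for one family: every k and p value on every map.
per_family = (len(K_VALUES) + len(P_VALUES)) * len(MAPS) * GAMES_PER_CONFIG
total = per_family * VARIANT_FAMILIES
print(per_family, total)  # 31500 63000
```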

RESULTS (UCT)
[Figure] The score on the vertical axis measures the number of wins of the pruning agent against the standard one.

RESULTS (NaïveMCTS)
[Figure] The score on the vertical axis measures the number of wins of the pruning agent against the standard one.

RESULTS (Best agents)
The best agents per map size and configuration (k or p value) were matched against other state-of-the-art agents.
[Figure: number of wins]

Conclusions and Future Work

CONCLUSIONS
◦ MCTS-based μRTS agents have been improved.
◦ Four move pruning approaches focusing on waiting actions have been proposed and studied.
◦ The results show a good improvement for some configurations.
◦ The best agents are competitive against baseline agents from the μRTS competition.
◦ They perform worse against script-based agents (which use expert knowledge).
◦ However, on one map our best agent obtained better results than MixedBot, one of the top agents in the 2019 competition.

FUTURE WORK
◦ Analyse other possible actions to prune.
◦ Consider larger and more complex scenarios.
◦ Apply IPA pruning with other search methods, such as Strategic Learning + Tactical Search, or Multi-Armed Bandit Tree Search.
◦ Use IPA pruning as part of a scripted agent to gain performance.

Thanks!
Contact us:
abdessamed.ouessai@univ-mascara.dz
salem@univ-mascara.dz
amorag@ugr.es
