Machine Learning Meets Quantitative Planning: Enabling Self-Adaptation in Autonomous Robots

Pooyan Jamshidi, Javier Cámara, Bradley Schmerl, Christian Kästner, David Garlan
https://arxiv.org/abs/1903.03920

Outline
• Self-adaptation of highly configurable systems
• Mobile robotics domain
• Challenges with quantitative planning and scale of search space
• Our approach: use machine learning to identify interesting configurations
• Evaluation: third-party evaluation of a highly configurable robot navigating an indoor space
• Results: using machine learning to limit the configuration search space leads to tractable, high-quality plans synthesized at run time
• Future work
14th Symposium on Software Engineering for Adaptive and Self-Managing Systems, Montreal, CA, 24-25 May 2019

Self-adaptation of Highly-Configurable Systems
• Many cyber-physical systems have many alternative components with hundreds of configuration options
• Many different kinds of sensors
• Alternative software for different robot functions
• Abundant configuration options
• E.g., AMCL, a component for robot localization, has ~40 configuration parameters
• Understanding the effect of parameters on behavior, power consumption, memory, etc. is hard
• Self-adaptation is required to handle dynamic situations

Challenges
• How does self-adaptation deal with this?
• Fixed set of plans developed at design time
• Restricted to a manageable set of conditions known in advance
• Run-time planning that needs to search a large planning space
• Need to simplify the problem to deal with the large search space
• Cyber-physical components → intractable to completely define a ground-truth model
Desire a solution that can deal with a large configuration space and highly dynamic environments

Scenario: Autonomous Service Robot
Go to a series of locations in a building to deliver packages and messages.
Objectives:
• Timeliness (time to completion)
• Success rate (number of targets reached)
Adaptation space:
• Instruction graph (move, charge, etc.)
• Robot's configuration
Adapt: find a new plan and choose a configuration. Sensitive to the power model.

Our approach
1. Off-line machine learning finds Pareto-optimal configurations
2. Planning space restricted to only these configurations
H1: Machine learning can find sufficiently optimal configurations with a limited sampling budget.
H2: Restricting planning to Pareto-optimal solutions makes run-time planning tractable while maintaining high-quality plans.
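A minimal sketch of step 1's output, assuming each configuration has already been given two predicted objective values that are both minimized. The objective pair (energy, time), the configuration names, and the numbers are hypothetical and only illustrate what "Pareto-optimal" filters out.

```python
from typing import Dict, List, Tuple


def pareto_front(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the names of configurations that no other configuration dominates.

    A configuration dominates another when it is no worse on both objectives
    and strictly better on at least one (lower is better for both).
    """
    front = []
    for name, (e, t) in points.items():
        dominated = any(
            (e2 <= e and t2 <= t) and (e2 < e or t2 < t)
            for other, (e2, t2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical predictions: (energy in mWh, time in s) per configuration.
predicted = {
    "c1": (100.0, 30.0),
    "c2": (80.0, 45.0),
    "c3": (120.0, 25.0),
    "c4": (110.0, 40.0),  # worse than c1 on both objectives
    "c5": (90.0, 50.0),   # worse than c2 on both objectives
}
front = pareto_front(predicted)
print(front)
```

Only the configurations in `front` would be handed to the planner, which is what shrinks the planning space in step 2.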

Approach to machine learning
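[Figure: Offline Learning sends queries to the Hidden Power Model and receives values; labels: Polynomial regression model, Exhaustive search]
Example learned model, for three options $o_i$, $o_j$, $o_k$:
$f(\cdot) = 1.2 + 3o_i + 5o_j + 0.9o_k + 0.8\,o_j o_k + 4\,o_i o_j o_k$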

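Background: Configuration Representation
Configuration space: $\mathbb{C} = O_1 \times O_2 \times \cdots \times O_{19} \times O_{20}$
Example options: Kinect, Lidar, GPS, thermometer, Localization
Example configuration: $c_1 = 0 \times 0 \times \cdots \times 0 \times 1$, $c_1 \in \mathbb{C}$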
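A small sketch of this representation, assuming every option is binary (on/off) and that there are 20 of them. The helper names `encode` and `decode` are made up for illustration.

```python
NUM_OPTIONS = 20  # assumption: 20 binary options, O_1 ... O_20


def encode(enabled):
    """Encode a set of enabled option indices (1-based) as a tuple of 0/1 values."""
    return tuple(1 if i in enabled else 0 for i in range(1, NUM_OPTIONS + 1))


def decode(config):
    """Inverse of encode: recover the set of enabled option indices."""
    return {i + 1 for i, bit in enumerate(config) if bit == 1}


c1 = encode({20})              # 0 x 0 x ... x 0 x 1: only the last option is on
space_size = 2 ** NUM_OPTIONS  # number of distinct configurations
print(c1)
print(space_size)
```

Under the binary assumption the space holds 2^20 configurations, which is why exhaustive measurement is impractical and sampling is needed.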

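Our learning approach
A typical approach for understanding performance behavior is sensitivity analysis
Training set: configurations $c_1, c_2, c_3, \ldots, c_n$ sampled from $O_1 \times O_2 \times \cdots \times O_{19} \times O_{20}$ (e.g., $0 \times 0 \times \cdots \times 0 \times 1$, $0 \times 0 \times \cdots \times 1 \times 0$, $0 \times 0 \times \cdots \times 1 \times 1$, $1 \times 1 \times \cdots \times 1 \times 0$, $1 \times 1 \times \cdots \times 1 \times 1$, ...), each with a measurement $y_1 = f(c_1)$, $y_2 = f(c_2)$, $y_3 = f(c_3)$, ..., $y_n = f(c_n)$
Learn: $\hat{f} \sim f(\cdot)$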
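A sketch of how such a training set could be assembled, assuming 20 binary options and a budget of 100 distinct samples. `hidden_f` is a hypothetical stand-in for measuring the real system; its polynomial shape borrows the slides' example coefficients, and tying them to the first three options is an arbitrary choice.

```python
import random

NUM_OPTIONS = 20  # assumption: 20 binary options


def hidden_f(c):
    """Hypothetical stand-in for a real measurement of configuration c."""
    o1, o2, o3 = c[0], c[1], c[2]
    return 1.2 + 3 * o1 + 5 * o2 + 0.9 * o3 + 0.8 * o2 * o3 + 4 * o1 * o2 * o3


def sample_training_set(n, seed=0):
    """Draw n distinct random configurations and pair each with its measurement."""
    rng = random.Random(seed)
    configs = set()
    while len(configs) < n:
        configs.add(tuple(rng.randint(0, 1) for _ in range(NUM_OPTIONS)))
    return [(c, hidden_f(c)) for c in sorted(configs)]


training_set = sample_training_set(100)
print(len(training_set))
```

The learner only ever sees these (configuration, measurement) pairs, never the function behind them.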

Our learning approach: Step-wise linear regression
Training set: configurations $c_1, c_2, c_3, \ldots, c_n$ sampled from $O_1 \times O_2 \times \cdots \times O_{19} \times O_{20}$, each with a measurement $y_1 = f(c_1)$, $y_2 = f(c_2)$, $y_3 = f(c_3)$, ..., $y_n = f(c_n)$
Source (execution time of Program X)
Learn the power model:
1. Fit an initial model
2. Forward selection: add terms iteratively
3. Backward elimination: remove terms iteratively
4. Terminate: when neither (2) nor (3) improves the model
Learned model: $f(\cdot) = 1.2 + 3o_i + 5o_j + 0.9o_k + 0.8\,o_j o_k + 4\,o_i o_j o_k$
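The four steps can be sketched on a toy problem. This is an illustration under strong assumptions, not the authors' implementation: three binary options, noise-free measurements of all eight configurations, exact rational arithmetic, and "improves the model" read as "strictly lowers the squared error" (practical stepwise regression typically uses a statistical test or an information criterion instead). All function names are made up.

```python
from fractions import Fraction
from itertools import combinations, product


def column(X, term):
    """Product of the options in `term` for every row (empty term = intercept)."""
    out = []
    for row in X:
        v = Fraction(1)
        for i in term:
            v *= row[i]
        out.append(v)
    return out


def fit(X, y, terms):
    """Exact least squares via the normal equations; returns (coefficients, SSE)."""
    cols = [column(X, ())] + [column(X, t) for t in terms]
    k = len(cols)
    # Augmented normal equations [A^T A | A^T y].
    M = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(cols[i], y))] for i in range(k)]
    for i in range(k):  # Gauss-Jordan elimination
        p = next(r for r in range(i, k) if M[r][i] != 0)
        M[i], M[p] = M[p], M[i]
        M[i] = [v / M[i][i] for v in M[i]]
        for r in range(k):
            if r != i:
                M[r] = [a - M[r][i] * b for a, b in zip(M[r], M[i])]
    coef = [M[i][k] for i in range(k)]
    pred = [sum(c * col[n] for c, col in zip(coef, cols)) for n in range(len(y))]
    return coef, sum((p - t) ** 2 for p, t in zip(pred, y))


def stepwise(X, y, candidates):
    terms = []                                   # 1. initial model: intercept only
    changed = True
    while changed:
        changed = False
        current = fit(X, y, terms)[1]
        rest = [t for t in candidates if t not in terms]
        if rest:                                 # 2. forward selection
            best = min(rest, key=lambda t: fit(X, y, terms + [t])[1])
            if fit(X, y, terms + [best])[1] < current:
                terms.append(best)
                changed = True
        current = fit(X, y, terms)[1]
        for t in list(terms):                    # 3. backward elimination
            reduced = [u for u in terms if u != t]
            if fit(X, y, reduced)[1] == current:  # term contributes nothing
                terms = reduced
                changed = True
                break
    return terms, fit(X, y, terms)[0]            # 4. neither step changed the model


def f(o1, o2, o3):
    """Toy ground truth with the slides' example coefficients (option order assumed)."""
    F = Fraction
    return F(12, 10) + 3 * o1 + 5 * o2 + F(9, 10) * o3 + F(8, 10) * o2 * o3 + 4 * o1 * o2 * o3


X = list(product([0, 1], repeat=3))
y = [f(*row) for row in X]
candidates = [t for r in (1, 2, 3) for t in combinations(range(3), r)]
terms, coef = stepwise(X, y, candidates)
learned = dict(zip([()] + terms, coef))
print(sorted(learned.items()))
```

Terms are tuples of option indices, so `(1, 2)` is the interaction between the second and third option; candidate interactions that do not occur in the ground truth are added and later eliminated, or never added at all.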

Planning: Approach overview
The set of Pareto-optimal configurations reduces the search space
• But not enough to do planning all in one model
Approach: Divide and conquer
1. Determine valid paths
2. Find the best configuration for each path
3. Pick the path/configuration combination with the best score
The approach finds the best configuration/path combination to satisfy a preference function over quality attributes
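The divide-and-conquer steps can be sketched as follows. The paths, configurations, and the weighted-sum preference function are all hypothetical; the real planner scores candidates through its own models.

```python
def plan(paths, configs, score):
    """Pick the best configuration per path, then the best (path, config) pair."""
    best_per_path = {p: max(configs, key=lambda c: score(p, c)) for p in paths}
    best_path = max(paths, key=lambda p: score(p, best_per_path[p]))
    return best_path, best_per_path[best_path]


# Hypothetical valid paths (length in meters) and Pareto-optimal configurations
# (speed in m/s, power draw in W).
path_length = {"hall": 20.0, "lab": 30.0}
config_props = {"fast": (1.0, 12.0), "eco": (0.5, 4.0)}


def score(path, config):
    """Hypothetical preference function: higher is better."""
    speed, power = config_props[config]
    time = path_length[path] / speed
    energy = power * time
    return -(1.0 * time + 0.1 * energy)


choice = plan(list(path_length), list(config_props), score)
print(choice)
```

Scoring configurations per path keeps each subproblem small, at the cost of evaluating every path separately.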

Planning: Multiple Models
The planner requires information from multiple models
Each stage updates some of the models
[Figure: problem domain models (Task Model, Physical Env. Model, Power Model, Operations Model, Configuration Model), each with its own Model-View Translator]

Planning: Machine learned models
Machine learning produces models for:
• Configuration space to search
• Power consumption of robot operations in those configurations
[Figure: Configuration Machine Learning Pipeline (offline): the Configuration Machine Learner takes system observations and produces Pareto-optimal configs for the problem domain models]

Planning: Find legal paths
Use Dijkstra's algorithm
Considers current knowledge of location, target, and environment.
[Figure: Task Planning Pipeline (online): an Aggregator collects robot location, target location, and space topology from the models; the Path Preprocessor produces legal paths]
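A minimal sketch of Dijkstra's algorithm on a waypoint graph. The map, waypoint names, and distances are hypothetical, and this version returns a single shortest path, whereas the Path Preprocessor on the slide produces a set of legal paths.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (path, distance) for the shortest path, or (None, inf) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]


# Hypothetical map: waypoints and corridor lengths in meters.
corridors = {
    "l1": {"l2": 5.0, "l3": 9.0},
    "l2": {"l1": 5.0, "l3": 3.0, "l4": 10.0},
    "l3": {"l1": 9.0, "l2": 3.0, "l4": 4.0},
    "l4": {"l2": 10.0, "l3": 4.0},
}
route, length = dijkstra(corridors, "l1", "l4")

# An obstacle blocks the l2-l3 corridor: drop that edge and search again.
blocked = {u: {v: w for v, w in nbrs.items() if {u, v} != {"l2", "l3"}}
           for u, nbrs in corridors.items()}
detour, detour_length = dijkstra(blocked, "l1", "l4")
print(route, length)
print(detour, detour_length)
```

Re-running the search on the updated graph mirrors how the path step uses current knowledge of the environment after a perturbation.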

Planning: Quantitative Planning
All models are combined into Prism models
Prism synthesizes plan that…
[Figure: model-view translation and aggregation. A second Aggregator collects task attribute quantifiers, legal paths, preferences, distances, robot operations, robot operations' energy consumption, and Pareto-optimal configs; the Task Planning Model Generator turns them into a Prism spec, from which the Task Planner produces the task plan]
Evaluation: H1
Want to know how accurate a learned model is:
• Sampling the ground-truth model through physical experimentation
• Power model, which is a set of functions, one for each configuration
Approach: Learn from a set of synthetic models
• 100 synthetically generated power models, each with 1,000,000 configurations
• Pick 100 samples from every model and try to learn that model
H1: ML finds Pareto-optimal configurations.

Results: H1
We are able to learn an accurate model that is highly likely to identify Pareto-optimal configurations
H1: ML finds Pareto-optimal configurations.

Evaluation: H2
A range of conditions:
- different missions (sequences of waypoints)
- different adaptation-causing perturbations (obstacle placement and battery depletion)
- different learning budgets (how much machine learning is done)
Actual experiments chosen and executed by a third party
- Lincoln Laboratories, as part of a DARPA project
H2: Good adaptations with just Pareto configurations.

Evaluation Implementation: H2
Start:
- mission
- power model
- learning budget
Perturb:
- obstacle
- battery
Test sequence: choose model, learn model, start mission, perturb system
[Figure: evaluation setup with a Test Driver, a REST Test Adapter, and a Docker Container; components shown: Robot Software, Gazebo Simulator, Offline Learning (find Pareto-optimal configurations), Models Analysis, Planning (path plan, configuration)]

Evaluation Design: H2
Baseline A: No perturbations, no learning, reactive planning
Baseline B: Perturbations, no learning, reactive planning
Challenge (C): Perturbations, learning, quantitative planning
280 test triples (840 runs total)
120 valid triples (where the mission is successful in A and unsuccessful in B)
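The validity rule and the run count can be written down directly. The outcome records below are made up; only the rule (A succeeds, B does not) and the three-runs-per-triple arithmetic come from the slide.

```python
def is_valid(triple):
    """A triple is valid when baseline A succeeded and baseline B did not."""
    return triple["A"] and not triple["B"]


# Hypothetical outcomes (True = mission completed) for three test triples.
triples = [
    {"A": True, "B": False, "C": True},
    {"A": True, "B": True, "C": True},     # B already succeeds: nothing to improve
    {"A": False, "B": False, "C": False},  # fails even without perturbation
]
valid = [t for t in triples if is_valid(t)]
runs_total = 280 * 3  # each test triple is one run each of A, B, and the challenge
print(len(valid), runs_total)
```

Filtering this way keeps only the cases where the perturbation actually broke the reactive baseline, so the challenge run has something to recover.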

Results: H2
Verdicts:
Pass: C completes the mission
Degraded: C completes more tasks in the mission than B
Fail: B better than C
H2: Good adaptations with just Pareto configurations.
[Figure: verdict results for two perturbation types: path obstruction and power depletion]

Results: Summary
H1: Machine learning can find optimal configurations without exploring the entire state space
• Pareto configurations learned even when observing $10^{-4}$% of the configuration space
H2: Restricting planning to Pareto-optimal solutions makes run-time planning tractable while maintaining high-quality plans
• Planning could be done in real time in a robot simulation and beat reactive adaptation

Limitations
Miscommunication in test design led to poor test cases by independent evaluators:
- Multiple battery perturbations drain battery completely
- Did not combine battery and obstacle perturbations
- Only one domain (service robots) and one learned model (power, polynomial)
Future work
- On-line transfer learning to learn and adapt models at run time
- Incorporation of multiple learned models
- More principled approach to model integration


Approach to machine learning
[Figure labels: LL, MARS DAS, Specify, Query, Value, Learn, Polynomial regression model]

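Background: Configuration Representation
Configuration space: $\mathbb{C} = O_1 \times O_2 \times \cdots \times O_{19} \times O_{20}$
Example options: Kinect, Lidar, GPS, thermometer, Localization
Example configuration: $c_1 = 0 \times 0 \times \cdots \times 0 \times 1$, $c_1 \in \mathbb{C}$
[Figure: measurement pipeline with the steps Configure, Compile, and Deploy; labels: Compiled Code, Instrumented Binary, Hardware, Robot, Energy]
Energy, a non-functional measurable/quantifiable aspect: $f(c_1) = 100$ mWh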

Our learning approach
The performance model could take any appropriate black-box form
Training set: configurations $c_1, c_2, c_3, \ldots, c_n$ sampled from $O_1 \times O_2 \times \cdots \times O_{19} \times O_{20}$, each with a measurement $y_1 = f(c_1)$, $y_2 = f(c_2)$, $y_3 = f(c_3)$, ..., $y_n = f(c_n)$
Learn: $\hat{f} \sim f(\cdot)$

Our learning approach: Measuring Accuracy
Training set: configurations $c_1, c_2, c_3, \ldots, c_n$ sampled from $O_1 \times O_2 \times \cdots \times O_{19} \times O_{20}$, each with a measurement $y_1 = f(c_1)$, $y_2 = f(c_2)$, $y_3 = f(c_3)$, ..., $y_n = f(c_n)$
Source (execution time of Program X)
Learn: $\hat{f} \sim f(\cdot)$
Evaluate accuracy: $APE(\hat{f}, f) = \frac{|\hat{f}(c) - f(c)|}{f(c)} \times 100$
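The absolute percentage error can be computed per configuration and then averaged over a held-out set. The averaging step and the two toy models below are assumptions for illustration; the slide defines only the per-configuration formula.

```python
def ape(predicted, actual):
    """Absolute percentage error of one prediction (actual must be non-zero)."""
    return 100.0 * abs(predicted - actual) / actual


def mean_ape(f_hat, f, configs):
    """Average APE of the learned model f_hat against the reference f."""
    return sum(ape(f_hat(c), f(c)) for c in configs) / len(configs)


def f(c):
    """Hypothetical reference model over two binary options."""
    return 10.0 + 5.0 * c[0] + 5.0 * c[1]


def f_hat(c):
    """Hypothetical learned model that overestimates the first option's effect."""
    return 10.0 + 6.0 * c[0] + 5.0 * c[1]


held_out = [(0, 0), (0, 1), (1, 0), (1, 1)]
error = mean_ape(f_hat, f, held_out)
print(error)
```

Because the error is relative to the true value, the same absolute miss counts for more on configurations with a small response.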

Planning: Planning Architecture
[Figure: planning architecture, with regions labeled "Adaptation Planning" and "Model-View Translation and Aggregation": the five domain models with their Model-View Translators; the Configuration Machine Learner (Pareto-optimal configs); an Aggregator and Path Preprocessor (robot location, target location, space topology → legal paths); an Aggregator and Task Planning Model Generator (legal paths, preferences, distances, robot operations, robot operations' energy consumption → Prism spec); and the Task Planner (Prism spec → task plan)]

Different models capture facets of the domain: Task Model, Physical Env. Model, Power Model, Operations Model, and Configuration Model
Each model includes a model-view translator that enables retrieving and inserting information

The Configuration Machine Learner incorporates:
• The set of Pareto-optimal configurations into the configuration model
• The energy consumption of robot operations in those configurations into the power model

Planning: Legal Paths
The Aggregator gathers:
• The robot's location and the target location from the task model
• The topological information of the physical space
The Path Preprocessor generates the legal paths between those locations, and the legal paths are inserted back into the task model

Planning: Prism Gen
The Aggregator gathers:
• Legal paths, task preferences, and task attribute quantifiers from the task model
• Distances from the physical env. model
• Robot operations from the operations model
• Pareto-optimal configs from the config model

Planning: Prism Gen
The Task Planning Model Generator creates a Prism Specification using all the former elements as building blocks

Planning: Planning
The Task Planner uses probabilistic model checking (MDP policy synthesis) in the backend to generate a Task Plan
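To give a feel for what MDP policy synthesis computes, here is a plain value-iteration sketch on a tiny, made-up MDP that minimizes expected cost to reach a goal. It does not use Prism or its input language; the states, actions, costs, and probabilities are hypothetical.

```python
def value_iteration(mdp, goal, eps=1e-9):
    """Minimize expected total cost to reach `goal`; returns (values, policy).

    mdp maps state -> action -> (cost, {next_state: probability}).
    """
    V = {s: 0.0 for s in mdp}
    V[goal] = 0.0  # the goal is absorbing and costs nothing

    def q(s, a):
        cost, outcomes = mdp[s][a]
        return cost + sum(p * V[t] for t, p in outcomes.items())

    while True:
        delta = 0.0
        for s in mdp:
            best = min(q(s, a) for a in mdp[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    policy = {s: min(mdp[s], key=lambda a: q(s, a)) for s in mdp}
    return V, policy


# Hypothetical robot MDP: "fast" is cheap but may fail and leave the robot in place.
robot_mdp = {
    "start": {
        "fast": (2.0, {"mid": 0.8, "start": 0.2}),
        "safe": (3.0, {"mid": 1.0}),
    },
    "mid": {"go": (1.0, {"goal": 1.0})},
}
values, policy = value_iteration(robot_mdp, "goal")
print(policy)
```

Here the risky action wins because its expected cost, including retries, stays below the safe action's fixed cost; changing the failure probability or costs can flip the choice.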

Our learning approach: Option analysis
A power model contains useful information about influential options and interactions
$f(\cdot) = 1.2 + 3o_i + 5o_j + 0.9o_k + 0.8\,o_j o_k + 4\,o_i o_j o_k$
$f: \mathbb{C} \rightarrow \mathbb{R}$
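A sketch of reading influence and interactions off such a polynomial, assuming it is stored as a mapping from terms to coefficients. The option names `o_i`, `o_j`, `o_k` are placeholders for three distinct options, and the helper functions are made up.

```python
# Terms are tuples of option names; the empty tuple is the constant term.
model = {
    (): 1.2,
    ("o_i",): 3.0,
    ("o_j",): 5.0,
    ("o_k",): 0.9,
    ("o_j", "o_k"): 0.8,
    ("o_i", "o_j", "o_k"): 4.0,
}


def predict(model, enabled):
    """Evaluate the polynomial for the set of enabled (value 1) options."""
    return sum(coef for term, coef in model.items() if set(term) <= enabled)


def interactions(model):
    """Terms that involve more than one option."""
    return sorted(t for t in model if len(t) > 1)


def effect(model, option, others):
    """Change in the prediction when `option` is switched on, given `others`."""
    return predict(model, others | {option}) - predict(model, others - {option})


alone = effect(model, "o_i", set())
with_both = effect(model, "o_i", {"o_j", "o_k"})
print(interactions(model))
print(alone, with_both)
```

The effect of one option depends on which others are enabled, which is exactly the interaction information a per-option sensitivity number would hide.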
