XeMPUPiL
A performance-aware power capping
orchestrator for the Xen hypervisor
Marco Arnaboldi, author
marco1.arnaboldi@mail.polimi.it
05 June 2017
2
Problem Definition
[Figure: datacenter tenants share the available power budget through the Xen virtualization layer]
6
Problem Definition
One problem, two points of view:
➡ minimize power consumption given a minimum performance requirement
➡ maximize performance given a maximum power consumption cap
8
Challenges
A performance-aware power capping orchestrator for the Xen hypervisor:
➡ instrumentation-free workload monitoring
➡ power management techniques: HW vs. SW
➡ an open-source virtualization layer adopted by many Fortune companies
12
State of the Art
SOFTWARE APPROACH: ✓ efficiency, ✖ timeliness
HARDWARE APPROACH: ✖ efficiency, ✓ timeliness
HYBRID APPROACH [5]: ✓ efficiency, ✓ timeliness
Techniques surveyed: model-based monitoring [3], thread migration [2], resource management, CPU quota, DVFS [4], RAPL [1]

[1] H. David, E. Gorbatov, U. R. Hanebutte, R. Khanna, and C. Le. RAPL: Memory power estimation and capping. In International Symposium on Low Power Electronics and Design (ISLPED), 2010.
[2] R. Cochran, C. Hankendi, A. K. Coskun, and S. Reda. Pack & Cap: Adaptive DVFS and thread packing under power caps. In International Symposium on Microarchitecture (MICRO), 2011.
[3] M. Ferroni, A. Cazzola, D. Matteo, A. A. Nacci, D. Sciuto, and M. D. Santambrogio. MPower: Gain back your Android battery life! In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, pages 171–174. ACM, 2013.
[4] T. Horvath, T. Abdelzaher, K. Skadron, and X. Liu. Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Transactions on Computers, 2007.
[5] H. Zhang and H. Hoffmann. Maximizing performance under a power cap: A comparison of hardware, software, and hybrid techniques. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2016.
14
Proposed Solution
[Architecture figure: DomainU workloads run on top of the Xen hypervisor and the hardware; Dom0 hosts the orchestrator, which Observes through XeMPower hardware event counters, Decides through the PUPiL-derived logic, and Acts through the CLI tools, buffers, hypercall manager, RAPL interface, and the XL toolstack]
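To make the Observe-Decide-Act structure concrete, here is a minimal sketch of the Dom0 orchestration loop; every helper is a hypothetical stub standing in for xempowermon (observe), the PUPiL-derived decision logic (decide), and the xempower hypercalls plus the XL toolstack (act), not the actual XeMPUPiL code.

```c
/* Minimal sketch of the Dom0 orchestrator loop.
 * All helper functions are hypothetical stubs. */
#include <stdio.h>
#include <unistd.h>

static double observe_ir_rate(int domid, double window_s) {
    /* Stub: would read instructions-retired counters for the domain. */
    (void)domid; (void)window_s;
    return 0.0;
}

static int decide_cores(double ir_rate, int cur_cores, int max_cores) {
    /* Stub: would walk the binary-search tree over resource configurations. */
    (void)ir_rate;
    return cur_cores < max_cores ? cur_cores : max_cores;
}

static void apply_power_cap(double watts) {
    /* Stub: would invoke the xempower write hypercall on the RAPL MSRs. */
    printf("setting power cap to %.1f W\n", watts);
}

static void apply_core_allocation(int domid, int n_cores) {
    /* Stub: would resize the CPU pool and re-pin vCPUs through xl. */
    printf("domain %d -> %d pCPUs\n", domid, n_cores);
}

int main(void) {
    const int    domid     = 1;     /* monitored DomainU          */
    const double cap_watts = 30.0;  /* power cap to enforce       */
    const double window_s  = 2.0;   /* observation window         */
    int cores = 4;

    apply_power_cap(cap_watts);              /* ACT: enforce the cap once   */
    for (int step = 0; step < 10; step++) {  /* bounded loop for the sketch */
        double ir   = observe_ir_rate(domid, window_s);   /* OBSERVE */
        int    next = decide_cores(ir, cores, 4);          /* DECIDE  */
        if (next != cores) {                                /* ACT     */
            apply_core_allocation(domid, next);
            cores = next;
        }
        sleep((unsigned)window_s);   /* let the workload adapt to the new configuration */
    }
    return 0;
}
```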
15
Experimental Setup
u Server setup (aka Sandy)
u 2.8 GHz quad-core Intel Xeon E5-1410 processor, no HT enabled (4 physical cores)
u 32 GB of RAM
u Xen hypervisor version 4.4
u Paravirtualized instance of Ubuntu 14.04 as Dom0, pinned on the first pCPU and with 4 GB of RAM
u Benchmarking
u Embarrassingly Parallel (EP) [1]
u IOzone [3]
u cachebench [2]
u Block Tri-diagonal solver (BT) [1]

              EP    IOzone  cachebench  BT
CPU-bound     YES   NO      NO          YES
IO-bound      NO    YES     NO          YES
memory-bound  NO    NO      YES         YES

[1] NAS Parallel Benchmarks. http://www.nas.nasa.gov/publications/npb.html#url. Accessed: 2017-04-01.
[2] OpenBenchmarking.org. https://openbenchmarking.org/test/pts/cachebench. Accessed: 2017-04-01.
[3] IOzone filesystem benchmark. http://www.iozone.org. Accessed: 2017-04-01.
16
Experimental Results
Baseline Definition via RAPL
[Figure: normalized performance of EP, cachebench, IOzone, and BT with no RAPL cap and with RAPL caps of 40 W, 30 W, and 20 W]
17
Experimental Results
Baseline Definition via RAPL
[Same figure as above] CPU-intensive benchmarks suffer from the reduction of the processor frequency.
18
Experimental Results
Baseline Definition via RAPL
[Same figure as above] The other benchmarks suffer from the reduction of the processor voltage.
19
Experimental Results
XeMPUPiL results compared to the baseline
[Figure: normalized performance of EP, cachebench, IOzone, and BT under PUPiL vs. pure RAPL at 40 W, 30 W, and 20 W caps]
20
Experimental Results
XeMPUPiL results compared to the baseline
[Same figure as above] XeMPUPiL outperforms pure RAPL for IO-, memory-, and mix-bound benchmarks.
21
Experimental Results
XeMPUPiL results compared to the baseline
[Same figure as above] XeMPUPiL loses performance on purely CPU-bound benchmarks, due to developer-transparent optimizations inside Xen.
22
Conclusions
u Performance tuning through an ODA controller under a power cap improves performance
u Hybrid approaches like XeMPUPiL provide:
u Better efficiency than HW approaches
u Better timeliness than SW approaches

Paper: “Towards a performance-aware power capping orchestrator for the Xen hypervisor” @ EWiLi’16, October 6th, 2016, Pittsburgh, USA
23
Future Work
u (Integrating || Moving) the orchestrator logic into the scheduler
u Exploit the new RAPL version on the Haswell family
u Explore new policies regarding:
u Decision
u Resource assignment
24
Thank you!!!
XeMPUPiL
“Towards a performance-aware
power capping orchestrator for the
Xen hypervisor” @ EWiLi’16,
October 6th, 2016, Pittsburgh, USA
25
ODA Details
OBSERVE | DECIDE | ACT

Observe:
u Instructions retired (IR) per-domain metric
u Data gathered from xempowermon
u Use of hardware performance counters (HPCs) and the Xen scheduler in order to map the IR to the respective domain

Decide:
u Exploration of the space of all possible resource configurations, based on a binary search tree
u Policy to distribute the virtual resources over the physical ones

Act:
u Enforce the power cap via RAPL
u Define a CPU pool for the workload
u Launch the workload on the pool
u Change the number of resources in the pool according to the decision phase
u Pin the workload’s vCPUs over pCPUs according to the decided map
The decision phase is similar to the one implemented in PUPiL. The major changes are in how we evaluate the metrics gathered in the previous phase and in how we assign the physical resources to each virtual domain.
The evaluation criterion is based on the average IR rate over a given time window: this allows the workload to adapt to the current configuration before a new decision is taken.
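As a rough illustration of the exploration idea mentioned above (not the actual PUPiL/XeMPUPiL logic), a decide step could narrow a search interval over the number of assigned cores according to the observed IR rate:

```c
/* Sketch of a binary-search-style decide step over the number of cores,
 * guided by the instructions-retired (IR) rate; purely illustrative. */
#include <stdio.h>

struct search_state {
    int lo, hi;        /* current search interval over core counts */
    double best_rate;  /* best IR rate observed so far             */
    int best_cores;    /* configuration that produced it           */
};

/* One decide step: move the interval toward the side that performed better. */
static int decide_step(struct search_state *s, int tried_cores, double ir_rate) {
    if (ir_rate > s->best_rate) {
        s->best_rate  = ir_rate;
        s->best_cores = tried_cores;
        s->lo = tried_cores;          /* this many cores helped: search upward */
    } else {
        s->hi = tried_cores;          /* no improvement: search downward       */
    }
    return (s->lo + s->hi) / 2;       /* next configuration to try             */
}

int main(void) {
    struct search_state s = { .lo = 1, .hi = 4, .best_rate = 0.0, .best_cores = 1 };
    /* Example: pretend the observed IR rates for 2 and then 3 cores were these. */
    int next = decide_step(&s, 2, 1.8e9);
    next     = decide_step(&s, next, 2.1e9);
    printf("next configuration: %d cores (best so far: %d)\n", next, s.best_cores);
    return 0;
}
```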
Concerning the allocation of resources to each domain, we chose to work at core-level granularity: on the one hand, each domain owns a set of virtual CPUs (vCPUs); on the other hand, the machine provides a set of physical CPUs (pCPUs). Each vCPU is mapped onto a pCPU for a certain amount of time, and multiple vCPUs may be mapped onto the same pCPU.
We wanted our allocation policy to be as fair as possible, covering the whole set of pCPUs when possible; given a workload with M virtual resources and an assignment of N physical resources, to each pCPU_i we assign:

vCPUs(i) = \left\lceil \frac{M - \sum_{0 \le j < i} vCPUs(j)}{N - i} \right\rceil    (1)

where i is a number between 0 and N - 1, i.e., it spans the set of pCPUs.
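A small sketch of this allocation can serve as a sanity check of Eq. (1); the function name is illustrative, and the worked example (M = 6 vCPUs over N = 4 pCPUs, yielding 2, 2, 1, 1) is chosen arbitrarily.

```c
/* Sketch of the fair vCPU-to-pCPU allocation of Eq. (1): each remaining pCPU
 * receives the ceiling of (remaining vCPUs / remaining pCPUs). */
#include <stdio.h>

static void allocate_vcpus(int m_vcpus, int n_pcpus, int out[]) {
    int assigned = 0;
    for (int i = 0; i < n_pcpus; i++) {
        int remaining_vcpus = m_vcpus - assigned;   /* M - sum_{j<i} vCPUs(j) */
        int remaining_pcpus = n_pcpus - i;          /* N - i                  */
        out[i] = (remaining_vcpus + remaining_pcpus - 1) / remaining_pcpus; /* ceiling */
        assigned += out[i];
    }
}

int main(void) {
    int alloc[4];
    allocate_vcpus(6, 4, alloc);   /* M = 6 vCPUs, N = 4 pCPUs */
    for (int i = 0; i < 4; i++)
        printf("pCPU%d -> %d vCPUs\n", i, alloc[i]);   /* prints 2, 2, 1, 1 */
    return 0;
}
```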
C. Act
The act phase essentially consists of: 1) setting the chosen power cap and 2) actuating the selected resource configuration.
(Source code available at: https://bitbucket.org/necst/xempower)
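The actuation of the resource configuration (step 2) goes through the XL toolstack; the snippet below is a rough sketch of driving the standard xl cpupool and vcpu-pin commands from C, with the pool and domain names chosen arbitrarily for illustration, and is not XeMPUPiL's actual actuation code.

```c
/* Rough sketch of the resource-actuation step driven through the XL toolstack;
 * pool and domain names are arbitrary examples. */
#include <stdio.h>
#include <stdlib.h>

static void run(const char *cmd) {
    printf("+ %s\n", cmd);
    if (system(cmd) != 0)
        fprintf(stderr, "command failed: %s\n", cmd);
}

int main(void) {
    /* Give the workload pool one more pCPU, as chosen by the decide phase. */
    run("xl cpupool-cpu-add workload-pool 2");

    /* Re-pin the domain's vCPUs according to the decided map. */
    run("xl vcpu-pin workload-domU 0 1");
    run("xl vcpu-pin workload-domU 1 2");
    return 0;
}
```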
For the first step, RAPL exposes a set of MSRs that can be written to set a limit on the power consumption of the CPU socket. In a virtualized environment, these registers are not directly accessible by the virtual domains, not even by the tenant Dom0. However, this limitation can be overcome by invoking custom hypercalls that access the underlying hardware. To the best of our knowledge, the Xen hypervisor does not natively offer hypercalls to interact with the RAPL interface, so we implemented our own custom hypercalls. In order to be generic enough, we defined two of them, "xempower_rdmsr" and "xempower_wrmsr": the first one allows reading, while the second one allows writing, a specified MSR from Dom0.
Each hypercall needs to be registered within the hypervisor, which runs bare-metal: the kernel keeps track of the list of available hypercalls and the input parameters they accept. A handler function has to be declared and registered, to be invoked by the kernel at runtime: our implementation relies on the Xen built-in functions to safely access MSRs, i.e., wrmsr_safe and rdmsr_safe, which return an error if something goes wrong in accessing the register, avoiding critical problems at the hypervisor level.
We then implemented two Command Line Interface (CLI) tools to access these hypercalls from Dom0: xempower_RaplSetPower, to set the power cap, and xempower_RaplPowerMonitor, to monitor the power consumption of the socket. The value of the power cap and the power readings are passed through the whole stack.
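On the hypervisor side, a handler for the write hypercall might look roughly like the sketch below; the argument structure and handler name are invented for illustration, wrmsr_safe stands in for the Xen built-in mentioned above, and the real implementation is the one in the repository linked earlier.

```c
/* Illustrative sketch of a hypervisor-side handler for a custom
 * "write MSR" hypercall; names are invented for the sketch. */
#include <stdint.h>

/* Hypothetical argument layout passed from Dom0. */
struct xempower_wrmsr_args {
    uint32_t msr;   /* MSR index, e.g. the RAPL package power-limit register */
    uint64_t value; /* value to be written                                   */
};

/* Stand-in for Xen's wrmsr_safe(), which returns non-zero on fault. */
static int wrmsr_safe(uint32_t msr, uint64_t value) {
    (void)msr; (void)value;
    return 0;
}

/* Hypothetical handler registered with the hypercall table. A real handler
 * would also check that the caller is Dom0 and copy the arguments from
 * guest memory before using them. */
static long do_xempower_wrmsr(struct xempower_wrmsr_args *args) {
    if (wrmsr_safe(args->msr, args->value))
        return -1;   /* the MSR write faulted: report the error, do not crash */
    return 0;
}

int main(void) {
    /* Demo invocation; in reality the hypervisor dispatcher calls the handler. */
    struct xempower_wrmsr_args a = { .msr = 0x610, .value = 0x80F0 };
    return (int)do_xempower_wrmsr(&a);
}
```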
26
RAPL Details
[Figure components: MSR, Intel RAPL interface, hypercall manager, buffer, XeMPower CLI tool]
u Two CLI tools based on the native xc tools: XEMPOWER_RAPLSETPOWER and XEMPOWER_RAPLPOWERMONITOR
u Tools divided into two parts:
u FRONTEND: manages the user command, gathers information and privileges about the session, and passes the user parameters to the backend
u BACKEND: bakes the hypercall, declaring a hypercall structure and filling it with the user parameters, then invokes the just-defined hypercall
u A buffer is used to map user-space memory to kernel memory, in order to implement a “pass by reference”-like mechanism inside the hypercalls
u Declaration of two custom hypercalls: XEMPOWER_RDMSR and XEMPOWER_WRMSR
u Implementation of the routines that manage the two custom hypercalls
u The routines write to and read from the RAPL-specific MSR registers, in order to set the power cap and to retrieve metrics on the socket power consumption
u Three registers are accessed:
u RAPL_PWR_INFO
u RAPL_PK_POWER_LIMIT
u RAPL_PK_POWER_INFO
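As an illustration of what the write path ultimately produces, the sketch below encodes a power cap in watts into a package power-limit value, assuming the standard Intel RAPL layout (power units in the low four bits of the power-unit MSR, a 15-bit limit field plus an enable bit in the power-limit MSR). The register indexes and helper names are assumptions based on Intel's public documentation, not taken from the XeMPUPiL sources.

```c
/* Sketch of encoding a power cap (in watts) for the RAPL package power-limit
 * MSR, assuming the standard Intel layout: the power-unit MSR gives units of
 * 1/2^u W, the limit MSR holds a 15-bit limit plus an enable bit.
 * Constants and helper names are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

#define MSR_RAPL_POWER_UNIT  0x606   /* assumed register indexes */
#define MSR_PKG_POWER_LIMIT  0x610

static uint64_t encode_pkg_power_limit(double watts, uint64_t power_unit_msr) {
    unsigned unit_bits = (unsigned)(power_unit_msr & 0xF);    /* bits 3:0: power units */
    double   unit_w    = 1.0 / (double)(1u << unit_bits);     /* e.g. 1/8 W per LSB    */
    uint64_t limit     = (uint64_t)(watts / unit_w) & 0x7FFF; /* 15-bit limit field    */
    return limit | (1ull << 15);   /* bit 15: enable the limit */
}

int main(void) {
    uint64_t unit_msr = 0x3;   /* example read of the power-unit MSR: 1/8 W units */
    uint64_t value    = encode_pkg_power_limit(30.0, unit_msr);
    /* In XeMPUPiL this value would be handed to xempower_RaplSetPower, which
     * forwards it to the write hypercall targeting the power-limit register. */
    printf("write 0x%llx to MSR 0x%x\n", (unsigned long long)value, MSR_PKG_POWER_LIMIT);
    return 0;
}
```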
