connect.linaro.org
SFO17-403: Optimizing the Design and Implementation of KVM/ARM
Christoffer Dall
ENGINEERS AND DEVICES

WORKING TOGETHER
“Efficient, isolated duplicate of the real machine”
Popek and Goldberg [Formal requirements for virtualizable third generation architectures ’74]
Virtualization
[Diagram. Native: Apps run on an OS Kernel on Hardware. Virtual Machines: VMs, each with its own Kernel and Apps, run on a Hypervisor on Hardware.]
Hypervisor Design
[Diagram. Type 1 (Standalone): the Hypervisor runs directly on Hardware and hosts VMs, each with its own Kernel and Apps.]
[Diagram. Type 2 (Hosted): the Hypervisor is part of an OS Kernel running on Hardware; the OS runs an ordinary App alongside VMs, each with its own Kernel and Apps.]
Hypervisor Design
[Diagram. Xen: Xen runs on Hardware and hosts Dom0 and DomU, each running Linux and Apps. KVM: KVM is part of Linux running on Hardware; Linux runs an App alongside VMs, each running Linux and Apps.]
ARM Virtualization Extensions
EL0: User
EL1: Kernel
EL2: Hypervisor
ARM VE and Hypervisors
[Diagram. Xen on the ARM exception levels: Apps in EL0, Dom0 and DomU Linux in EL1, Xen in EL2.]
?
KVM/ARM
[Diagram. EL0: host Apps and VM Apps. EL1: Host Linux with KVM, and the VM Kernel. EL2: KVM lowvisor, which switches state. Transitions: 1. Hypercall, 2. Return, 3. Hypercall, 4. Return.]
KVM/ARM
[Diagram. Host (Linux with KVM, Apps) and VM (Kernel, Apps) across EL0, EL1 and EL2, with no separate lowvisor. Transitions: 1. Hypercall, 2. Return.]
ARMv8.1 VHE
• Virtualization Host Extensions

• Supports running unmodified OSes in EL2 without using EL1
[Diagram. Linux runs in EL2 and its Apps in EL0; EL1 is unused.]
VHE: Backwards Compatible
• HCR_EL2.E2H completely enables and disables VHE

• When disabled, completely backwards compatible with ARMv8.0

• Example: Xen disables VHE
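The on/off switch described above is a single bit in HCR_EL2. As a rough illustration, the sketch below treats an HCR_EL2 value as a plain 64-bit integer and toggles E2H (bit 34); it also tests TGE (bit 27), which a later slide introduces. The helper names are hypothetical, and nothing here touches a real system register: actual hypervisor code would read and write HCR_EL2 with mrs/msr while running in EL2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the two HCR_EL2 fields named in this talk. */
#define HCR_EL2_TGE (UINT64_C(1) << 27) /* Trap General Exceptions */
#define HCR_EL2_E2H (UINT64_C(1) << 34) /* EL2 Host: turns VHE on */

/* Hypothetical helper: copy of an HCR_EL2 value with VHE switched on or off. */
uint64_t hcr_set_vhe(uint64_t hcr, bool enable)
{
    return enable ? (hcr | HCR_EL2_E2H) : (hcr & ~HCR_EL2_E2H);
}

/* Hypothetical helper: true when both E2H and TGE are set, the
 * configuration a VHE host uses while running its own userspace. */
bool hcr_is_vhe_host(uint64_t hcr)
{
    const uint64_t both = HCR_EL2_E2H | HCR_EL2_TGE;
    return (hcr & both) == both;
}

/* Self-check: toggling E2H leaves every other bit untouched. */
void hcr_example(void)
{
    uint64_t hcr = HCR_EL2_TGE;

    assert(!hcr_is_vhe_host(hcr));
    hcr = hcr_set_vhe(hcr, true);
    assert(hcr_is_vhe_host(hcr));
    assert(hcr_set_vhe(hcr, false) == HCR_EL2_TGE);
}
```

A hypervisor that does not want VHE, such as Xen in the example above, would simply leave E2H clear.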
VHE: Expands Functionality of EL2
• Expanded EL2 functionality

• New registers: TTBR1_EL2, CONTEXTIDR_EL2 

• New virtual EL2 timer
VHE: Support Userspace in EL0
• TGE: Trap General Exceptions

• Routes all exceptions to EL2

• VHE no longer disables EL0 stage 1 MMU
[Diagram. Linux in EL2 and Apps in EL0; exceptions from EL0 are routed to EL2, skipping EL1.]
VHE: EL2&0 Translation Regime
• Same page table format as EL1

• Used in EL0 with TGE bit set
VHE: System Register Redirection
[Diagram. With HCR_EL2.E2H == 0, mrs x0, TCR_EL1 executed in EL2 accesses TCR_EL1, not TCR_EL2.]
VHE: System Register Redirection
[Diagram. With HCR_EL2.E2H == 1, mrs x0, TCR_EL1 executed in EL2 is redirected to TCR_EL2.]
VHE Register Redirection
[Diagram. mrs x0, TCR_EL12 accesses TCR_EL1.]
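The three redirection slides can be summarized as a small lookup. The sketch below is only a model of which register an EL2 access lands on for the TCR example; the enum and function names are made up for illustration. It assumes, in line with the architecture, that the _EL12 name is undefined when E2H is 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Register name as written in an instruction executed in EL2. */
enum reg_name { NAME_TCR_EL1, NAME_TCR_EL2, NAME_TCR_EL12 };

/* Register the access actually reaches. */
enum reg_target { TARGET_TCR_EL1, TARGET_TCR_EL2, TARGET_UNDEFINED };

/* Hypothetical model of VHE redirection for accesses made from EL2. */
enum reg_target resolve_at_el2(enum reg_name name, bool e2h)
{
    switch (name) {
    case NAME_TCR_EL1:
        /* Redirected to the EL2 register only when VHE is enabled. */
        return e2h ? TARGET_TCR_EL2 : TARGET_TCR_EL1;
    case NAME_TCR_EL2:
        /* The EL2 name always reaches the EL2 register. */
        return TARGET_TCR_EL2;
    case NAME_TCR_EL12:
        /* The new alias reaches the real EL1 register under VHE. */
        return e2h ? TARGET_TCR_EL1 : TARGET_UNDEFINED;
    }
    assert(0 && "unknown register name");
    return TARGET_UNDEFINED;
}
```

This is why an unmodified kernel written against the EL1 names can run in EL2, while the hypervisor part still has a way (the _EL12 names) to reach a VM's EL1 state.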
More VHE Register Redirection
• Some registers change bit positions so that their EL1 and EL2 layouts match

• Example: CNTHCTL_EL2 changes layout to match CNTKCTL_EL1, with extra bits
Legacy KVM/ARM without VHE
[Diagram. Linux with KVM runs in EL1; the Lowvisor is the hypervisor code in EL2. KVM reaches the Lowvisor through a trap to run the VM.]
KVM/ARM with VHE
[Diagram. Linux, KVM and the Lowvisor all run in EL2. KVM reaches the Lowvisor with a function call to run the VM.]
Experimental Setup
• AMD Seattle B0
• 64-bit ARMv8-A
• 2.0 GHz AMD A1100 CPU
• 8-way SMP
• 16 GB RAM
• 10 Gb Ethernet (passthrough)
*Measurements obtained using Linux in EL2. See BKK16 talk.
VHE Performance at First Glance
CPU Clock Cycles   non-VHE    VHE*
Hypercall            3,181   3,045
*Measurements obtained using Linux in EL2. See BKK16 talk.
KVM/ARM Optimization #1
[Diagram. VM (Kernel, Apps) and Host (Linux with KVM, Apps) across EL0, EL1 and EL2.]
• Avoid saving/restoring EL1 register state
KVM/ARM Optimization #2
[Diagram. VM (Kernel, Apps) and Host (Linux with KVM, Apps) across EL0, EL1 and EL2; the KVM Lowvisor disables traps and enables traps on each transition.]
• Legacy KVM/ARM design enabled/disabled virtualization features on every transition

• Virtual/Physical interrupts

• Stage 2 memory translation
KVM/ARM Optimization #2
[Diagram. VM (Kernel, Apps) and Host (Linux with KVM, Apps) across EL0, EL1 and EL2.]
• Leave virtualization features enabled

• Host EL2 never uses stage 2 translations and always has full hardware access
KVM/ARM Optimization #3
• Don’t context switch the timer on every exit from the VM

• Completely reworks the timer code

• 20 patches on list
KVM/ARM Optimization #4
• Reduce run loop work

• Do work in vcpu_load and vcpu_put instead

• Called when entering/exiting run-loop

• Called when preempted/scheduled

• Requires VHE
[Diagram. vcpu_load and vcpu_put bracket the vcpu run loop.]
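To make the effect of this optimization concrete, here is a toy model that counts how often deferred state is moved. It is a sketch under simplifying assumptions: struct sketch_vcpu and the sketch_* functions are hypothetical stand-ins rather than KVM's own types, and preemption (the other point where vcpu_put and vcpu_load run) is not modeled.

```c
#include <assert.h>

/* Hypothetical bookkeeping; not the kernel's struct kvm_vcpu. */
struct sketch_vcpu {
    int loaded;    /* 1 while the deferred register state is on the CPU */
    int restores;  /* times the deferred state was restored */
    int saves;     /* times the deferred state was saved */
    int entries;   /* VM entries performed by the run loop */
};

/* Stand-in for vcpu_load: restore the deferred state once. */
void sketch_vcpu_load(struct sketch_vcpu *v)
{
    assert(!v->loaded);
    v->loaded = 1;
    v->restores++;
}

/* Stand-in for vcpu_put: save the deferred state once. */
void sketch_vcpu_put(struct sketch_vcpu *v)
{
    assert(v->loaded);
    v->loaded = 0;
    v->saves++;
}

/* Run loop: each iteration enters the VM and returns on an exit,
 * without touching the deferred state. */
void sketch_vcpu_run(struct sketch_vcpu *v, int exits)
{
    sketch_vcpu_load(v);
    for (int i = 0; i < exits; i++) {
        assert(v->loaded);
        v->entries++;
    }
    sketch_vcpu_put(v);
}
```

With a zero-initialized struct sketch_vcpu, running the loop for 1000 exits leaves restores and saves at 1 each, whereas a design that moved the same state on every exit would do that work 1000 times.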
KVM/ARM Optimization #5
• Rewrite the world switch code
kvm_arch_vcpu_ioctl_run
{
    ...
    while (1) {
        ...
        if (has_vhe()) /* static key */
            ret = kvm_vcpu_vhe_run(vcpu);
        else
            ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
        ...
    }
    ...
}
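The slide's snippet is kernel pseudocode with elided parts. A self-contained model of the same dispatch, which can be compiled in userspace, might look like the sketch below. All sketch_* names are hypothetical, has_vhe is passed as a plain parameter instead of being a static key, and the "hypercall" is modeled as an indirect call, so it shows only the control flow and not the cost difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum entry_path { PATH_NONE, PATH_FUNCTION_CALL, PATH_HYPERCALL };

/* Hypothetical stand-in for the vcpu; records how the VM was entered. */
struct sketch_vcpu {
    enum entry_path path;
};

typedef int (*sketch_hyp_fn)(struct sketch_vcpu *);

/* World switch used with VHE: an ordinary function. */
int sketch_vhe_run(struct sketch_vcpu *vcpu)
{
    vcpu->path = PATH_FUNCTION_CALL;
    return 0;
}

/* World switch used without VHE: only reached through a hypercall. */
int sketch_nvhe_run(struct sketch_vcpu *vcpu)
{
    vcpu->path = PATH_HYPERCALL;
    return 0;
}

/* Stand-in for the hypercall trampoline; here it just forwards the call. */
int sketch_call_hyp(sketch_hyp_fn fn, struct sketch_vcpu *vcpu)
{
    assert(fn != NULL);
    return fn(vcpu);
}

/* The dispatch from the slide, with has_vhe passed in as a parameter. */
int sketch_enter_vm(struct sketch_vcpu *vcpu, bool has_vhe)
{
    if (has_vhe)
        return sketch_vhe_run(vcpu);
    return sketch_call_hyp(sketch_nvhe_run, vcpu);
}
```

Keeping two separate world switch functions lets each path be tuned on its own, at the price of some duplicated logic.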
Microbenchmark Results
CPU Clock Cycles   non-VHE   VHE OPT*     x86
Hypercall            3,181        752   1,437
I/O Kernel           3,992      1,604   2,565
I/O User             6,665      7,630   6,732
Virtual IPI         14,155      2,526   3,102
*Measurements obtained using Linux in EL2. See BKK16 talk.
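The relative change in each row can be checked with one line of arithmetic. The helper below is a hypothetical convenience function, and it assumes the cycle counts use a thousands separator, so that the non-VHE hypercall figure is 3181 cycles rather than about 3.

```c
#include <assert.h>

/* Percentage by which `after` is lower than `before`;
 * negative when `after` is higher. */
double reduction_percent(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

For the Hypercall row, reduction_percent(3181, 752) comes out a little above 76, and for Virtual IPI a little above 82. The I/O User row gives a negative value, because the VHE OPT figure is higher than the non-VHE one there.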
Application Workloads
Application   Description
Kernbench     Kernel compile
Hackbench     Scheduler stress
Netperf       Network performance
Apache        Web server stress
Memcached     Key-Value store
Application Workloads
[Bar chart. Normalized overhead (lower is better) on a 0.00 to 2.00 scale for Kernbench, Hackbench, TCP_STREAM, TCP_MAERTS, TCP_RR, Apache and Memcached; series: non-VHE and VHE OPT*.]
*Measurements obtained using Linux in EL2. See BKK16 talk.
Conclusions
• Optimize and redesign KVM/ARM for VHE

• Reduce hypercall overhead by more than 75%

• Better cycle counts than x86 for key hypervisor operations

• Network benchmark overhead reduced by 50%

• Key-value store workload overhead reduced by more than 80%
Upstream Status
• Timer patches on list

• Core optimization patches coming soon
