Xen Summit 2010
  Extending Xen into Embedded
and Communications Workloads
Agenda


       •       Embedded Usage Models
       •       Virtual Machine Monitor Requirements
       •       Benchmarking
       •       Cisco Product Range
       •       Embedded Development Requirements
       •       High Availability




Embedded Usage Models

     •   IP Media Phones: Atom based platforms delivering Internet
         connectivity and media content to continuously connected devices.
     •   Robotics: Using Core Micro Architecture for GUI interface with
         real time industrial control.
     •   Routing: Xeon Micro Architecture based platforms implement
         control and data-plane services on high end routers.

Unique VMM requirements across all segments

Virtual Machine Monitor Implementation

     •   Media Phone: Critical partition required to host Cell phone
         application, hypervisor requires Quality of Service.
         [Diagram: Critical Partition and App partition on a Thin vmm]
     •   Industrial: Industrial Control requires determinism. Performance
         is measured in interrupt latency (10 usec or lower).
         [Diagram: Microsoft GUI and RTOS with Shared Memory on a vmm]
     •   Comm’s Appliance: Scalability, Flexibility, RAS and Fail Over are
         a few of the vmm requirements in Comm’s appliance environment.
         [Diagram: RTOS, (Service) Linux and Linux on a vmm]

Embedded Virtualization - Advantages

     •   Consolidation and Preservation: Legacy - Proprietary Single
         Threaded Operating Systems
     •   Rapid Deployment of new services
     •   Integrate Development Environment separate from Critical Services

[Diagram: Dataplane with two Legacy RTOS guests on Core 0 and Core 1, each with rx/tx queues, and a Control Linux guest, on a vmm with VT-d / SRIOV over a Multi-Core Architecture; PF at 10 Gb/s]

Embedded Deployment Requirements

Single Core scheduling
     •   Scheduling control for Guest Quality of Service
     •   Traffic prioritization to avoid packet loss requires (soft) Real
         Time scheduling
     •   Credit based scheduler research in progress
     [Diagram: Dom0, Phone Application and App Development guests on Xen, on Atom with I/O]

Consolidated fast path
     [Diagram: Dom0, Linux Intrusion Detection and Fast Path ip packet Forwarding guests on Xen, on Xeon with I/O; Grant Tables / io rings between the guests]
     •   Consolidate Fast Path with Security Intrusion Detection application
     •   Requires efficient mechanism to share packet data with Linux
         application
     •   Grant tables (io rings) may be an efficient mechanism to meet
         performance requirements (needs to be Lock Free)
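
To make the "Lock Free" requirement concrete, here is a minimal sketch of a single-producer, single-consumer descriptor ring of the kind that could sit in a page shared between the fast path and the intrusion detection guest. It is an illustration only: the names `spsc_ring`, `pkt_desc`, `ring_push`, `ring_pop` and the slot count are hypothetical, the code uses plain C11 atomics rather than Xen's shared-ring macros, and grant-table setup and event notification are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 256u                 /* hypothetical size, must be a power of two */
static_assert((RING_SLOTS & (RING_SLOTS - 1u)) == 0, "RING_SLOTS must be a power of two");

struct pkt_desc {                       /* hypothetical descriptor layout */
    uint64_t addr;                      /* where the packet data lives in shared memory */
    uint32_t len;                       /* packet length in bytes */
    uint32_t flags;
};

struct spsc_ring {
    _Atomic uint32_t prod;              /* advanced only by the producer (fast path) */
    _Atomic uint32_t cons;              /* advanced only by the consumer (IDS guest) */
    struct pkt_desc slot[RING_SLOTS];
};

void ring_init(struct spsc_ring *r)
{
    atomic_init(&r->prod, 0);
    atomic_init(&r->cons, 0);
}

/* Producer side. Returns false when the ring is full; the caller decides
 * whether to drop the packet or retry. Never blocks and takes no lock. */
bool ring_push(struct spsc_ring *r, const struct pkt_desc *d)
{
    uint32_t p = atomic_load_explicit(&r->prod, memory_order_relaxed);
    uint32_t c = atomic_load_explicit(&r->cons, memory_order_acquire);

    if (p - c == RING_SLOTS)
        return false;
    r->slot[p & (RING_SLOTS - 1u)] = *d;
    /* Release: the descriptor is visible before the new producer index. */
    atomic_store_explicit(&r->prod, p + 1u, memory_order_release);
    return true;
}

/* Consumer side. Returns false when the ring is empty. */
bool ring_pop(struct spsc_ring *r, struct pkt_desc *out)
{
    uint32_t c = atomic_load_explicit(&r->cons, memory_order_relaxed);
    uint32_t p = atomic_load_explicit(&r->prod, memory_order_acquire);

    if (p == c)
        return false;
    *out = r->slot[c & (RING_SLOTS - 1u)];
    /* Release: the slot is free for reuse only after it has been copied out. */
    atomic_store_explicit(&r->cons, c + 1u, memory_order_release);
    return true;
}
```

Each index has exactly one writer, so neither side ever waits on the other; that property only holds for one producer and one consumer per ring, which is an assumption of this sketch.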



Embedded Xen Deployment

     •   Power Profile of some edge based appliances is cyclical, potential
         power savings can be substantial (Example Base Station Controller)
     •   ACPI generally not supported in Real Time / Proprietary Operating
         Systems
     •   Hypervisor Power Management could be very useful to control
         overall power budget
     •   “Shelf Manager” Power management research in progress
     •   Intelligent Power Management, balances I/O latency & throughput

[Charts: Data load and Voice load, scale 0 to 120, from 6am to 6pm]
[Diagram: three Multi Core blades, each running Xen with Dom0, a Shf mgr and several Fast Path guests]

Embedded Xen – Direct Cache Access

 DCA (Direct Cache Access) delivers incoming I/O data directly into the CPU
 cache to reduce average memory latency, and attempts to reduce memory
 bandwidth.

 DCA Driver uses get_cpu() to gather APIC_ID, uses this to configure the
 DCA enabled NIC device

 static void igb_update_dca(struct igb_q_vector *q_vector)
 {
      struct igb_adapter *adapter = q_vector->adapter;
      struct e1000_hw *hw = &adapter->hw;
      int cpu = get_cpu();                /* Get the current CPU Id */

      if (q_vector->cpu == cpu)
           goto out_no_update;
      ...

 get_cpu() is required to return the valid APIC ID of the core where the
 guest is executing.

 [Diagram: I/O, DCA, IOH, memory ctrl, memory, CPU and Cache; Dom0 and two Guests on Xen, each CPU with its own Cache]

Benchmarking, 10 GbE perspective
A 64B packet can arrive every 67.2ns
In terms of processor cycles : @ 2.53 GHz, a 64B packet arrives every ~170 cycles (67.2 ns × 2.53 GHz)
Can generate up to 14.88 million Rx and 14.88 million Tx transactions every second
(packets)
Each packet has a 16B descriptor associated with it, that must be written for every
packet that needs to be processed
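
The line-rate figures above follow from the Ethernet framing overhead, and the arithmetic can be sketched as below. The function names are hypothetical; the 20 bytes of per-frame overhead (7B preamble, 1B start-of-frame delimiter, 12B inter-frame gap) are standard Ethernet, and the frame size is taken to include the FCS.

```c
#define ETH_OVERHEAD_BYTES 20.0   /* 7B preamble + 1B SFD + 12B inter-frame gap */

/* Time one frame occupies the wire, in nanoseconds.
 * bits / (Gbit/s) comes out directly in ns. */
double frame_interval_ns(double frame_bytes, double link_gbps)
{
    return (frame_bytes + ETH_OVERHEAD_BYTES) * 8.0 / link_gbps;
}

/* Maximum frames per second in one direction at line rate. */
double packets_per_second(double frame_bytes, double link_gbps)
{
    return 1e9 / frame_interval_ns(frame_bytes, link_gbps);
}

/* CPU cycles available between back-to-back frames at a given clock. */
double cycles_per_packet(double frame_bytes, double link_gbps, double cpu_ghz)
{
    return frame_interval_ns(frame_bytes, link_gbps) * cpu_ghz;
}
```

For a 64B frame on a 10 Gb/s link this gives 67.2 ns and about 14.88 million packets per second; the cycle budget scales linearly with the clock rate passed as `cpu_ghz`.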
[Chart: forwarding rate in packets per second (0 to 16,000,000, axis labelled Mpp/s) versus Packet Size (64 to 1468 bytes)]

The Linux forwarding code takes ~3000 cycles to process a packet.

With enhancement we can reduce the number of cycles per (64 Byte) packet to ~1350 cycles.
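
As a rough illustration of what the per-packet cycle counts mean for throughput, the sketch below converts cycles per packet into a per-core packet rate and a core count for a target rate. It assumes the 2.53 GHz clock quoted for the benchmark and ideal linear scaling across cores, which the slides do not claim; the function names are hypothetical.

```c
/* Packets per second one core can sustain at a given per-packet cycle cost. */
double core_pps(double cpu_hz, double cycles_per_pkt)
{
    return cpu_hz / cycles_per_pkt;
}

/* Cores needed to sustain target_pps, rounded up.
 * Assumes perfect linear scaling, which real systems rarely achieve. */
unsigned cores_for_rate(double target_pps, double cpu_hz, double cycles_per_pkt)
{
    double per_core = core_pps(cpu_hz, cycles_per_pkt);
    unsigned n = (unsigned)(target_pps / per_core);

    return (n * per_core < target_pps) ? n + 1u : n;
}
```

Under those assumptions, ~3000 cycles per packet corresponds to roughly 0.84 million packets per second per core and ~1350 cycles to roughly 1.87 million, so sustaining 14.88 million 64B packets per second would take on the order of 18 versus 8 cores.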


Guest Forwarding Performance

Layer 3 Forwarding, 2-Port (1 Core, 1 Thread)

[Chart: Packets per Second (PPS), Native vs. Virtualized, by Packet Size (bytes): 64, 128, 256, 512, 768, 1024, 1280, 1518]
[Diagram: two Linux forwarding guests on Core 0 and Core 1, each with I/O through VT-d, on a vmm over a Multi-Core Architecture]

Single threaded virtualized environments show promising performance:
     - Near native performance for small packet sizes
     - Native performance for large packet sizes ( >256B ).

Limited performance penalty for consolidation, additional scaling tests
in progress

Cisco Embedded Product Space

   Wide range of products in a number of market segments:

   Segments shown: Service Provider, Data Center, Enterprise, Security,
   Voice & Video, Branch, Home

   Products shown: ASR 9000, CRS, UCS, Nexus 7000, MDS 9222i (SAN),
   ASR 1000, Ironport, ASA 5500, TelePresence, Unified Communications,
   3900 ISR, 2800 ISR, Flip Video, Valet

Embedded Product Environment
Hardware Environment
      General Purpose CPUs, SoCs, ASICs, FPGAs, custom processors, ixp, DSPs, …
      From large multi-core, multi-blade, multi-chassis systems to small single/dual core devices
      Terabit to Gigabit I/O



Software Environment
      Multi-OS: IOS, IOS-XE, IOS-XR, NX-OS
           Proprietary (legacy), Linux, other …
      Single threaded, multi-threaded, pipelined, flow-based, …
      Multiple vm models
           integrated services platform, distributed/load balancing, HA, control & data
           separation, …
      Control plane, data plane, management plane, appliance and service engines, …
           e.g., routing, data, voice, video, deep packet inspection, firewall, security, etc.



Memory, processor, and I/O bandwidth requirements vary by application
and network device location

Embedded Development Requirements
We believe that xen is the right choice for an embedded hypervisor
     Early support for prototype hardware required: In hypervisor and dom0
     Open source xen and linux critical to this effort
     It’s the right architecture and feature set for embedded development



RAS
     High Availability (HA) for guests
          non-disruptive stateful failover, non-disruptive in service software upgrade (ISSU)
     Devices
          hot pluggable/removable (non-disruptive): shared & dedicated (including sr-iov)
     dom0
          Separate device driver domains good, but not enough
          All domains need to be restartable


Deterministic Performance
     QoS control through configuration and scheduling
     I/O linearly scalable across cores and vms
     Low latency interrupts



Embedded Development Requirements
Core allocation/Scheduling: vcpu to pcpu mapping
     (pinned, non-shared):                    deterministic performance
     (pinned, shared), (non-pinned, shared):  scheduled

For pv IOS, I/O workload, 64-byte packets, 2 ports, bidirectional, 64-bit xen, NUMA on

Configuration                                 Result
(pinned, non-shared), HT off                  100% line rate (1Gb) per core;
                                              <0.1% time spent in hypervisor

(non-pinned, shared), HT off                  ~10% decreased throughput

(pinned, non-shared), NUMA-remote, HT off     ~8% decreased throughput

(pinned, non-shared), HT on, one on each      1.5x/1.7x (I/O/cpu) increase in
thread on the core                            throughput (aggregate);
                                              .75x/.85x (I/O/cpu) throughput per
                                              transaction single thread

(pinned, non-shared), HT on, only one         Same as (pinned, non-shared), HT off
thread on the core in use
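
The two HT-on figures are consistent with each other: if each of the two hardware threads runs at .75x (I/O) or .85x (cpu) of the single-thread rate, the core as a whole delivers 1.5x or 1.7x. A tiny sketch of that arithmetic, with a hypothetical function name and the assumption of two hardware threads per core:

```c
#define THREADS_PER_CORE 2u   /* assumption: two hardware threads per core with HT on */

/* Aggregate throughput of one core relative to a single thread with HT off,
 * given the per-thread throughput ratio measured with both threads busy. */
double ht_aggregate(double per_thread_ratio)
{
    return THREADS_PER_CORE * per_thread_ratio;
}
```

The trade-off this illustrates is the one in the table: HT raises aggregate throughput per core while lowering what any single thread achieves.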


 Guest Support
      Both pv and hvm (hybrid!)
      32-bit & 64-bit
      Virtual memory paged and non-paged (single, flat address space)


Embedded Development Requirements
Debug and Performance Monitoring
     multi-guest, simultaneous
     32-bit & 64-bit guests (minimum is gdbsx for both pv & hvm)
     Performance monitoring tools (access to PMU data - xenoprofile & others)
     Required in the field as well as during development

Trusted Systems: Secure Products
     Trusted boot, TPM, Intel TXT/AMD-V
     Trusted guests, sandboxed 3rd party guests, anti-counterfeiting, …
     Manageable

Power Management
     Especially at the edge, branch, and consumer devices
     Policy based, managed by hypervisor
          Cases where guest should not be automatically power managed

“carrier class” xen Development Environment
     Support for rapid prototyping
     Support for production product environment




HA Requirements
Rationale
      HA & ISSU features available on many platforms across our product space today
           Cannot go to market without support in certain product spaces
      Software fails much more often than hardware
           Software-only HA/ISSU at much lower cost very attractive
           Natural fit on multi-core devices

High Availability (HA)
      Active-Standby: stateful, “hot” Standby
      Failure of Active causes non-disruptive failover to Standby
      Reconciliation required on switchover
           Standby progresses through state machine to Active state
      I/O devices always belong to Active and switch to [new] Active without loss of state
           Packet loss ok on switchover – higher level protocols recover
      Downstream end of device connection must not see a “failure”
      Switchover must take place in < 1 sec.
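
The Active-Standby behaviour described above can be pictured as a small state machine on the Standby side. The sketch below is one possible shape, not the product's actual design: the state and event names (`HA_STANDBY`, `HA_RECONCILING`, `EV_ACTIVE_FAILED` and so on) are hypothetical.

```c
#include <stdbool.h>

enum ha_state { HA_STANDBY, HA_RECONCILING, HA_ACTIVE, HA_FAILED };
enum ha_event { EV_ACTIVE_FAILED, EV_RECONCILE_DONE, EV_LOCAL_FAULT };

/* Next state of a unit that starts out as the hot Standby. */
enum ha_state ha_next(enum ha_state s, enum ha_event e)
{
    if (e == EV_LOCAL_FAULT)
        return HA_FAILED;                 /* a local fault takes the unit out of service */

    switch (s) {
    case HA_STANDBY:                      /* peer died: begin reconciliation */
        return e == EV_ACTIVE_FAILED ? HA_RECONCILING : s;
    case HA_RECONCILING:                  /* state reconciled: take over as Active */
        return e == EV_RECONCILE_DONE ? HA_ACTIVE : s;
    case HA_ACTIVE:
    case HA_FAILED:
        return s;                         /* no further transitions in this sketch */
    }
    return s;
}

/* I/O devices belong to the Active only; packets arriving while the
 * Standby is still reconciling are dropped. */
bool ha_owns_io(enum ha_state s)
{
    return s == HA_ACTIVE;
}
```

In this picture the switchover time is the time spent between leaving `HA_STANDBY` and reaching `HA_ACTIVE`, plus however long it took to notice the failure in the first place.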

In Service Software Upgrade (ISSU)
      Built on HA infrastructure
      Automated software upgrade (or downgrade)
      Non disruptive: Fallback if required or requested



HA Requirements
What is needed:
      Reliable fast failure detection mechanism
           Current: hardware uses interrupt pin; backup is heart-beat mechanism (slow)
           Need to emulate/implement fast, reliable failure detection mechanism in xen
      Failover device transparently from Active to Standby
           no loss of [device] state
           Packet traffic dropped until Standby transitions to Active
      Interrupts
           redirected to new Active (old Standby) on failover
           interrupts dropped until Standby transitions to Active
           [new] Active must be able to address outstanding interrupts without complete reset
      Need to be able to run in redundant hardware configuration or on multi-core device
          drivers responsible for appropriate reconciliation protocols
      Minimize the changes to xen kernel and dom0 code
           recovery decisions need to be in the domain of the guest driver
      Support for direct assign devices (including sr-iov) and shared devices
      Non shared memory solution for DMA target memory preferred
          requires ability to either pre-program and switch or reprogram and switch on failover
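
To show why a heart-beat backup is described as slow, here is a minimal sketch of a heartbeat-based failure detector. The struct, the function names and the example numbers are hypothetical; the point is only that detection latency is bounded below by the heartbeat interval times the number of tolerated misses.

```c
#include <stdbool.h>
#include <stdint.h>

struct hb_monitor {
    uint64_t interval_us;     /* expected heartbeat period */
    uint32_t max_missed;      /* consecutive misses tolerated before declaring failure */
    uint64_t last_seen_us;    /* timestamp of the most recent heartbeat */
};

/* Record a heartbeat received from the Active. */
void hb_beat(struct hb_monitor *m, uint64_t now_us)
{
    m->last_seen_us = now_us;
}

/* Polled by the Standby: true once the Active has been silent too long. */
bool hb_peer_failed(const struct hb_monitor *m, uint64_t now_us)
{
    return now_us - m->last_seen_us > m->interval_us * m->max_missed;
}

/* Approximate worst-case time from failure to detection,
 * ignoring the polling granularity of the caller. */
uint64_t hb_detect_budget_us(const struct hb_monitor *m)
{
    return m->interval_us * m->max_missed;
}
```

With, say, a 100 ms heartbeat and three tolerated misses, detection alone takes about 300 ms, a large share of a sub-second switchover budget. Shortening the interval reduces that at the cost of more false positives, which is why an interrupt-like mechanism is preferred.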


“carrier class” xen Development Environment
Needs to support 2 different Environments:

      Rapid prototyping and development of new services
         Work often requires unstable branch, pre-release/prototype hardware
         Straightforward, and accessible to the non xen expert
                    Interest is in getting the prototype/product up and running quickly rather than in xen infrastructure
                    Developer threads, blogs, etc. not a substitute for up-to-date documentation
         Product decisions (go/no go) based on prototype results
                    Failure/missed deadlines will eliminate a prototype as a possible solution
         Corporate networks/labs behind firewalls, use proxies
                    Doesn’t work well with current git-based source control
                    Requires exceptions to corporate IT policy

      Production product
         Uses stable release
         Controlled access to performance & debug tools in customer environment
         Documentation required in field as well
         Auditing requires ability to reproduce image bit-for-bit from local build

Summary


     •   Embedded market provides a great growth
         opportunity
     •   Deployment requires some unique features
     •   Xen is well positioned but requires support for RAS
         features, debug and “Carrier Class” Release





Xen summit 2010 extending xen into embedded

  • 1. Xen Summit 2010 Extending Xen into Embedded and Communications Workloads
  • 2. Agenda • Embedded Usage Models • Virtual Machine Monitor Requirements • Benchmarking • Cisco Product Range • Embedded Development Requirements • High Availability 2 09.14.05
  • 3. Embedded Usage Models Robotics Using Core Micro Architecture for GUI IP Media Phones interface with real time Atom based platforms industrial control. delivering Internet connectivity and media content to continuous connected devices. Routing Xeon Micro Architecture based platforms implement control and data- plane services on high end routers. Unique VMM requirements across all segments 3
  • 4. Virtual Machine Monitor Implementation Scalability, Flexibility, RAS Industrial Control requires and Fail Over are a few of determinism. Performance the vmm requirements in Critical partition Comm’s appliance required to host Cell is measured in interrupt latency (10 usec or lower) environment phone application, hypervisor requires Quality of Service RTOS (Service) Linux Microsoft Linux Critical App GUI Partition partition RTOS Shared Memory vmm vmm Thin vmm Industrial Comm’s Appliance Media Phone 4
  • 5. Embedded Virtualization - Advantages
      • Consolidation and Preservation: Legacy - Proprietary Single Threaded Operating Systems
      • Rapid Deployment of new services
      • Integrate Development Environment separate from Critical Services
      • [Diagram: Dataplane Legacy RTOS, Control Legacy RTOS, and Linux guests on a vmm over a Multi-Core Architecture (Core 0, Core 1); VT-d / SRIOV with rx/tx queues to a PF at 10 Gb/s]
  • 6. Embedded Deployment Requirements
      • Single Core scheduling
          • Scheduling control for Guest Quality of Service
          • Traffic prioritization to avoid packet loss requires (soft) Real Time scheduling
          • Xen Credit based scheduler research in progress
          • [Diagram: Phone App and Dom0 (Application Development) on Xen; Atom; I/O]
      • Grant Tables
          • Consolidate Fast Path with Security application
          • Fast Path requires an efficient mechanism to share IP packet data with the Linux application
          • Grant tables (io rings) may be an efficient mechanism to meet performance requirements (needs to be Lock Free)
          • [Diagram: Consolidated fast path; Intrusion Detection (Linux), Fast Path packet Forwarding, and Dom0 on Xen, connected by io rings; Xeon; I/O]
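The lock-free requirement for sharing packet data between a fast-path guest and a Linux application can be illustrated with a single-producer, single-consumer ring. The following is a minimal sketch using C11 atomics, not Xen's actual io ring implementation; `RING_SIZE`, `struct pkt_desc`, and the `ring_*` functions are hypothetical names chosen for illustration, and the ring is assumed to live in memory that both sides can access (for example, a granted page).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 256u /* hypothetical slot count */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be a power of two");

/* Hypothetical 16-byte packet descriptor: where the packet is and how long. */
struct pkt_desc {
    uint64_t addr;
    uint32_t len;
    uint32_t flags;
};

struct spsc_ring {
    _Atomic uint32_t prod; /* advanced only by the producer */
    _Atomic uint32_t cons; /* advanced only by the consumer */
    struct pkt_desc slots[RING_SIZE];
};

static void ring_init(struct spsc_ring *r)
{
    atomic_init(&r->prod, 0);
    atomic_init(&r->cons, 0);
}

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct spsc_ring *r, const struct pkt_desc *d)
{
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);
    if (prod - cons == RING_SIZE)
        return false;
    r->slots[prod & (RING_SIZE - 1)] = *d;
    /* Release: the slot contents become visible before the new index. */
    atomic_store_explicit(&r->prod, prod + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct spsc_ring *r, struct pkt_desc *out)
{
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_acquire);
    if (prod == cons)
        return false;
    *out = r->slots[cons & (RING_SIZE - 1)];
    /* Release: the slot is free for reuse only after it has been copied out. */
    atomic_store_explicit(&r->cons, cons + 1, memory_order_release);
    return true;
}
```

Because each index has exactly one writer, neither side takes a lock or spins on the other; a full or empty ring is reported to the caller, which can then decide whether to drop, retry, or notify.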
  • 7. Embedded Xen Deployment
      • The power profile of some edge based appliances is cyclical, so potential power savings can be substantial (Example: Base Station Controller)
      • ACPI is generally not supported in Real Time / Proprietary Operating Systems
      • Hypervisor Power Management could be very useful to control overall power budget
      • “Shelf Manager” Power management research in progress
      • Intelligent Power Management balances I/O latency & throughput
      • [Charts: Data and Voice load from 6am to 6pm on a 0 to 120 scale. Diagram: three Multi Core systems, each running Xen with Dom0, Fast Path guests, and a Shf mgr]
  • 8. Embedded Xen – Direct Cache Access
      • DCA (Direct Cache Access) delivers data into the cache to reduce average memory latency and attempts to reduce memory bandwidth
      • The DCA driver uses get_cpu() to gather the APIC_ID and uses this to configure the DCA enabled NIC device
      • Driver excerpt (as shown on the slide, incomplete):
            static void igb_update_dca(struct igb_q_vector *q_vector)
            {
                struct igb_adapter *adapter = q_vector->adapter;
                struct e1000_hw *hw = &adapter->hw;
                int cpu = get_cpu(); /* Get the current CPU Id */
                if (q_vector->cpu == cpu)
                    goto out_no_update;
      • get_cpu() must return the valid APIC ID of the CPU core where the guest is executing
      • [Diagram labels: memory, ctrl, CPU, Cache, IOH, DCA, I/O, Dom0, Guest, Guest, Xen]
  • 9. Benchmarking, 10 GbE perspective
      • A 64B packet can arrive every 67.2ns
      • In terms of processor cycles: @ 2.53 GHz, a 64B packet arrives every ~201 cycles
      • Can generate up to 14.88 million Rx and 14.88 million Tx transactions (packets) every second
      • Each packet has a 16B descriptor associated with it, which must be written for every packet that needs to be processed
      • The Linux forwarding code takes ~3000 cycles to process a packet
      • With enhancement we can reduce the number of cycles per (64 Byte) packet to ~1350 cycles
      • [Chart: packets per second (0 to 16,000,000) versus Packet Size (64 to 1468 bytes)]
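The 67.2 ns and 14.88 million packets per second figures follow from Ethernet framing: each frame occupies an extra 20 bytes on the wire (7B preamble, 1B start-of-frame delimiter, 12B inter-frame gap), so a 64B frame costs 84 bytes, or 672 bits, at 10 Gb/s. The sketch below reproduces that arithmetic; the function names are hypothetical. Applying the same formula at 2.53 GHz gives about 170 cycles per 64B packet, while the ~201-cycle figure corresponds to a clock of about 3.0 GHz.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame wire overhead: 7B preamble + 1B SFD + 12B inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20u

/* Time one frame occupies the wire, in nanoseconds. */
static double frame_interval_ns(uint32_t frame_bytes, double link_bps)
{
    assert(link_bps > 0.0);
    double bits = 8.0 * (double)(frame_bytes + ETH_WIRE_OVERHEAD_BYTES);
    return bits / link_bps * 1e9;
}

/* Maximum frames per second at line rate, in one direction. */
static double frames_per_second(uint32_t frame_bytes, double link_bps)
{
    return 1e9 / frame_interval_ns(frame_bytes, link_bps);
}

/* CPU cycles available between back-to-back frames. */
static double cycles_per_frame(uint32_t frame_bytes, double link_bps, double cpu_hz)
{
    return frame_interval_ns(frame_bytes, link_bps) * 1e-9 * cpu_hz;
}
```

For comparison with the per-packet costs quoted on the slide, a core that spends ~3000 cycles per packet at 2.53 GHz forwards roughly 0.84 million packets per second, and ~1350 cycles per packet gives roughly 1.87 million, both well below the 14.88 million line rate for 64B packets.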
  • 10. Guest Forwarding Performance
      • [Chart: Layer 3 Forwarding, 2-Port (1 Core, 1 Thread), Native versus Virtualized; Packets per Second (PPS) against Packet Size (bytes) from 64 to 1518. Diagram: two Linux forwarding guests on a vmm with VT-d over Core 0 and Core 1 of a Multi-Core Architecture, each with I/O]
      • Single threaded virtualized environments show promising performance:
          - Near native performance for small packet sizes
          - Native performance for large packet sizes (>256B)
      • Limited performance penalty for consolidation; additional scaling tests in progress
  • 11. Cisco Embedded Product Space
      • Wide range of products in a number of market segments
      • Segments shown: Service Provider, Data Center, Voice & Video, Enterprise, Security, Branch, Home
      • Products shown: ASR 9000, CRS, UCS, Nexus 7000, TelePresence, Unified Communications, MDS 9222i (SAN), ASR 1000, Ironport, ASA 5500, 3900 ISR, 2800 ISR, Flip Video, Valet
  • 12. Embedded Product Environment
      • Hardware Environment
          • General Purpose CPUs, SoCs, ASICs, FPGAs, custom processors, ixp, DSPs, …
          • From large multi-core, multi-blade, multi-chassis systems to small single/dual core devices
          • Terabit to Gigabit I/O
      • Software Environment
          • Multi-OS: IOS, IOS-XE, IOS-XR, NX-OS
          • Proprietary (legacy), Linux, other …
          • Single threaded, multi-threaded, pipelined, flow-based, …
      • Multiple vm models
          • Integrated services platform, distributed/load balancing, HA, control & data separation, …
          • Control plane, data plane, management plane, appliance and service engines, …
          • e.g., routing, data, voice, video, deep packet inspection, firewall, security, etc.
      • Memory, processor, and I/O bandwidth requirements vary by application and network device location
  • 13. Embedded Development Requirements
      • We believe that xen is the right choice for an embedded hypervisor
          • Early support for prototype hardware required, in hypervisor and dom0
          • Open source xen and linux critical to this effort
          • It’s the right architecture and feature set for embedded development
      • RAS
          • High Availability (HA) for guests: non-disruptive stateful failover, non-disruptive in service software upgrade (ISSU)
          • Devices hot pluggable/removable (non-disruptive): shared & dedicated (including sr-iov)
          • dom0: separate device driver domains good, but not enough
          • All domains need to be restartable
      • Deterministic Performance
          • QoS control through configuration and scheduling
          • I/O linearly scalable across cores and vms
          • Low latency interrupts
  • 14. Embedded Development Requirements
      • Core allocation/Scheduling: vcpu to pcpu mapping
          • (pinned, non-shared): deterministic performance
          • (pinned, shared), (non-pinned, shared): scheduled
      • Results for pv IOS, I/O workload, 64-byte packets, 2 ports, bidirectional, 64-bit xen, NUMA on
          • (pinned, non-shared), HT off: 100% line rate (1Gb) per core; <0.1% time spent in hypervisor
          • (non-pinned, shared), HT off: ~10% decreased throughput
          • (pinned, non-shared), NUMA-remote, HT off: ~8% decreased throughput
          • (pinned, non-shared), HT on, one on each thread on the core: 1.5x/1.7x (I/O/cpu) increase in throughput (aggregate); .75x/.85x (I/O/cpu) throughput per transaction, single thread
          • (pinned, non-shared), HT on, only one thread on the core in use: same as (pinned, non-shared), HT off
      • Guest Support
          • Both pv and hvm (hybrid!)
          • 32-bit & 64-bit
          • Virtual memory paged and non-paged (single, flat address space)
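The two hyper-threading figures are consistent with each other if the aggregate gain is assumed to be split evenly across the two hardware threads of a core. Under that assumption, with throughput expressed relative to the (pinned, non-shared), HT off case:

```latex
\[
T_{\text{thread}} = \frac{T_{\text{aggregate}}}{2}, \qquad
\frac{1.5}{2} = 0.75 \ \text{(I/O)}, \qquad
\frac{1.7}{2} = 0.85 \ \text{(cpu)}
\]
```

So enabling HT raises the throughput of the core as a whole while each individual thread runs at 75% to 85% of what a dedicated core delivers, which matters when a guest needs deterministic per-vcpu performance.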
  • 15. Embedded Development Requirements
      • Debug and Performance Monitoring
          • Multi-guest, simultaneous 32-bit & 64-bit guests (minimum is gdbsx for both pv & hvm)
          • Performance monitoring tools (access to PMU data - xenoprofile & others)
          • Required in the field as well as during development
      • Trusted Systems: Secure Products
          • Trusted boot, TPM, Intel TXT/AMD-V
          • Trusted guests, sandboxed 3rd party guests, anti-counterfeiting, …
      • Manageable Power Management
          • Especially at the edge, branch, and consumer devices
          • Policy based, managed by hypervisor
          • Cases where guest should not be automatically power managed
      • “carrier class” xen Development Environment
          • Support for rapid prototyping
          • Support for production product environment
  • 16. HA Requirements
      • Rationale
          • HA & ISSU features available on many platforms across our product space today
          • Cannot go to market without support in certain product spaces
          • Software fails much more often than hardware
          • Software-only HA/ISSU at much lower cost very attractive
          • Natural fit on multi-core devices
      • High Availability (HA)
          • Active-Standby: stateful, “hot” Standby
          • Failure of Active causes non-disruptive failover to Standby
          • Reconciliation required on switchover
          • Standby progresses through state machine to Active state
          • I/O devices always belong to Active and switch to [new] Active without loss of state
          • Packet loss ok on switchover – higher level protocols recover
          • Downstream end of device connection must not see a “failure”
          • Switchover must take place in < 1 sec.
      • In Service Software Upgrade (ISSU)
          • Built on HA infrastructure
          • Automated software upgrade (or downgrade)
          • Non disruptive: Fallback if required or requested
  • 17. HA Requirements: What is needed
      • Reliable fast failure detection mechanism
          • Current: hardware uses interrupt pin; backup is heart-beat mechanism (slow)
          • Need to emulate/implement fast, reliable failure detection mechanism in xen
      • Failover device transparently from Active to Standby
          • No loss of [device] state
          • Packet traffic dropped until Standby transitions to Active
      • Interrupts redirected to new Active (old Standby) on failover
          • Interrupts dropped until Standby transitions to Active
          • [new] Active must be able to address outstanding interrupts without complete reset
      • Need to be able to run in redundant hardware configuration or on multi-core device
          • Drivers responsible for appropriate reconciliation protocols
      • Minimize the changes to xen kernel and dom0 code
          • Recovery decisions need to be in the domain of the guest driver
      • Support for direct assign devices (including sr-iov) and shared devices
      • Non shared memory solution for DMA target memory preferred
          • Requires ability to either pre-program and switch or reprogram and switch on failover
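The heart-beat fallback described above can be sketched as a timeout check on the Standby side. This is an illustrative model only, not a Xen interface: `struct ha_monitor`, the `ha_*` functions, and the two-state role enum are hypothetical, and timestamps are assumed to come from a monotonic clock supplied by the caller. Detection latency is bounded below by the timeout, which is why a heart-beat alone is slow compared with an interrupt pin and must be set well inside the < 1 sec switchover budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum ha_role { HA_STANDBY, HA_ACTIVE }; /* hypothetical two-state model */

struct ha_monitor {
    enum ha_role role;
    uint64_t last_beat_ns; /* time the last heart-beat was seen */
    uint64_t timeout_ns;   /* silence longer than this means failure */
};

static void ha_init(struct ha_monitor *m, uint64_t now_ns, uint64_t timeout_ns)
{
    assert(timeout_ns > 0);
    m->role = HA_STANDBY;
    m->last_beat_ns = now_ns;
    m->timeout_ns = timeout_ns;
}

/* Called whenever a heart-beat from the Active arrives. */
static void ha_on_heartbeat(struct ha_monitor *m, uint64_t now_ns)
{
    m->last_beat_ns = now_ns;
}

/*
 * Called periodically on the Standby. Returns true exactly once, at the
 * moment the Standby decides the Active has failed and takes over.
 */
static bool ha_poll(struct ha_monitor *m, uint64_t now_ns)
{
    if (m->role == HA_STANDBY && now_ns - m->last_beat_ns > m->timeout_ns) {
        m->role = HA_ACTIVE;
        return true;
    }
    return false;
}
```

A caller would run `ha_poll` from a timer and, on a `true` return, start the reconciliation and device switchover steps; those steps are outside this sketch.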
  • 18. “carrier class” xen Development Environment
      • Needs to support 2 different environments
      • Rapid prototyping and development of new services
          • Work often requires unstable branch, pre-release/prototype hardware
          • Straightforward, and accessible to the non xen expert
          • Interest is in getting the prototype/product up and running quickly rather than xen infrastructure
          • Developer threads, blogs, etc. not a substitute for up-to-date documentation
          • Product decisions (go/no go) based on prototype results
          • Failure/missed deadlines will eliminate a prototype as a possible solution
          • Corporate networks/labs behind firewalls, use proxies
          • Doesn’t work well with current git-based source control
          • Requires exceptions to corporate IT policy
      • Production product
          • Uses stable release
          • Controlled access to performance & debug tools in customer environment
          • Documentation required in field as well
          • Auditing requires ability to reproduce image bit-for-bit from local build
  • 19. Summary
      • Embedded market provides for a great growth opportunity
      • Deployment requires some unique features
      • Xen is well positioned but requires support for RAS features, debug and “Carrier Class” Release