Service Assurance for
Virtual Network Functions in
Cloud Native Environments
Nikos Anastopoulos, Software Engineer, Intracom Telecom
github.com/anastop
2nd Athens Kubernetes Meetup
Agenda
Motivation
NFV Service Assurance Platform
Internals
Conclusion
Network Function Virtualization (NFV)
What:
› Move Network Functions (NFs) from proprietary & fixed-
function HW («middleboxes») to standard high-volume
servers
› Run on the cloud
Why:
› Lower costs due to reused/lower-cost infrastructure
› Easier deployment of new services
› No vendor lock-in
Quick overview
NF implementations:
› VMs
› Containers
› Native processes
› Unikernels
› HW-offloaded functions
› ...
NFV services (or “appliances”) can be:
› Simple (e.g. NAT, firewalls) or more
complex (e.g. EPC, IMS)
› Standalone or chained
Virtual switches:
› Kernel implementations (Linux bridge, OpenVswitch)
› User-level apps (DPDK-OVS, FD.io, ...)
NFV’s big challenge: performance
Key for widespread adoption. CommSPs want it:
› High
carrier-grade KPIs (high throughput, low latency, low packet loss)
mitigate overheads during NFV transformation
(Transformation stages: Legacy → Softwarization → Virtualization → Cloudification)
NFV’s big challenge: performance
Key for widespread adoption. CommSPs want it:
› Predictable
provide strong guarantees on the delivered service performance
traditionally, we could well understand the performance properties of NFs on fixed-function HW
this is HARD with SW, particularly when multiple NFs need to colocate on the same server
Example: deploying a mixture of NFs on a commodity server
What your favorite orchestrator can do
› Policy-driven & static placement
› Efficient thread-thread/thread-memory comm due to packing within socket
› Good resource isolation due to CPU pinning
› ...but not enough!
Impact of Last Level Cache (LLC) contention on NFV performance*
› NFs executing simultaneously on isolated cores (same socket), throughput of each VM measured
Max Throughput Degradation (%), normalized:
EndRe 2.2 · IPSec 6.2 · Suricata 12.8 · Snort 33.3 · Stats 46.1 · MazuNAT 51.0 · LPM 37.6 · Firewall 2.2 · Efficuts 4.9
*Tootoonchian et al, USENIX NSDI ’18, http://span.cs.berkeley.edu
What a careful admin will do
› Near-optimal isolation for the Critical NF
› Inefficient resource utilization
...or, in the worst case
Problem
How to best allocate server resources both within and across NFV services to achieve:
› High performance
› Predictable performance
› Dense consolidations for high infrastructure utilization & energy efficiency
› Harder as #NFs per server increases
Agenda
Motivation
NFV Service Assurance Platform
Internals
Conclusion
Our premise: fine-grain resource slicing
› Leverage LLC & DRAM B/W as additional resources to partition
› Assign dedicated slices to NFs
› Assign just-the-right amount of resources to NFs
End result: high performance, high predictability, dense consolidations on the same platform
Cache Allocation Technology (CAT)
› Enables an OS, hypervisor, or similar system to specify how much LLC space an app can use
› Allows exclusive use of that space by the application
› Allows dynamic reassignment of the cache space during runtime
› Since Xeon E5 v3
Memory Bandwidth Allocation (MBA) Technology
› Provides a method to control applications that may be monopolizing memory B/W relative to their priority in
a colocated environment
› High-prio apps may be allowed to use all available bandwidth
› Low-prio apps can be throttled down to e.g. 30%, 20% or 10% (model-specific) of the theoretical max B/W of
their cores
› Since Xeon Scalable family
Service Assurance for Virtual Network Functions in Cloud-Native Environments
How it works
› NFVSAP acts on services after they have been deployed on the customer’s DC, no matter how. It is non-intrusive.
› Automatically evaluates the performance of VNFs under thousands of possible HW resource allocation
combinations for every VNF
CPU placement (+NUMA, +Hyper-threading), LLC capacity, mem B/W throttling, CPU frequency
› The user interactively shortlists stress-test results based on their performance & energy constraints
› Creates Optimization Profiles out of shortlisted solutions for permanent use, which it can enforce in real time
Premise: use application-specific KPIs to measure how well a NF is running
Network Service KPIs & Power
Transcoder perf: 2–2.5 Mbps
Power: 150–170 Watts
NF Cache profiling
› Reduce the “LLC” dimension in the automated testing phase → reduce exploration space from 1000s to 10s
› Discover the least amount of LLC each NF needs for near-optimal performance
(Profile chart: throughput (% of max) vs. LLC allocation, 3–27 MB, for the streamer and transcoder NFs)
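The knee-finding step above can be sketched in a few lines of Go. This is a minimal illustration, not the platform's actual code: `ProfilePoint` and `MinLLCForTarget` are hypothetical names, and the sample data only resembles the streamer curve on the slide.

```go
package main

import "fmt"

// ProfilePoint is one measurement from the automated LLC sweep:
// throughput (as % of the unconstrained maximum) at a given LLC allocation.
type ProfilePoint struct {
	LLCMB      int
	Throughput float64 // percent of max
}

// MinLLCForTarget returns the smallest LLC allocation whose measured
// throughput reaches the target percentage, or -1 if none does.
// Points are assumed sorted by ascending LLC size.
func MinLLCForTarget(points []ProfilePoint, targetPct float64) int {
	for _, p := range points {
		if p.Throughput >= targetPct {
			return p.LLCMB
		}
	}
	return -1
}

func main() {
	// Hypothetical profile resembling the "streamer" curve on the slide.
	streamer := []ProfilePoint{
		{3, 55}, {6, 78}, {9, 91}, {12, 97}, {15, 99}, {18, 100},
	}
	fmt.Println(MinLLCForTarget(streamer, 95)) // smallest near-optimal allocation
}
```

With a 95%-of-max target, the sweep collapses the "LLC" axis to a single per-NF value, which is what shrinks the exploration space.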
Broad virtualization support
› Supports NFs implemented as
KVM VMs
Docker containers
Kubernetes Pods
native Linux apps
› Components of each type are treated as a whole
w.r.t. resource management
› Allows resource management for hybrid network
services
› Handles special classes of NFs, like user-level soft
switches
Architecture
› Data-store & knowledge base
› Automated performance tests orchestration
› NF profiling
› Storage & enforcement of Optimization Profiles
› HW topology & capabilities discovery
› NF placement & resource allocation
› NF lifecycle monitoring
› Utilization metrics collection at every level (HW, OS, NFs)
› Performance metrics collection (NF, virtual switch)
IP-TV service prototype (MWC ‘18)*
* http://intracom-telecom.com/downloads/pdf/products/sdn_nfv/Resource_Intelligence_Platform.pdf
Throughput (Mbps): 0.8–2.8 | Average power (Watts): 194 | Mbps per Watt: 0.004–0.014 | Jitter: yes
Throughput (Mbps): 4.8 | Average power (Watts): 220 | Mbps per Watt: 0.021 | Jitter: no
Throughput (Mbps): 1.5 | Average power (Watts): 172 | Mbps per Watt: 0.008 | Jitter: no
Throughput (Mbps): 4.2 | Average power (Watts): 158 | Mbps per Watt: 0.027 | Jitter: no
Scenario labels on the slide: Transcoder Isolation · Operating System · Colocation, No CAT · Colocation, With CAT
Improving NF throughput using CAT
› VM under test isolated using CAT, 4.5MB of LLC exclusively allocated to the corresponding NF
› LLC isolation with CAT restricts throughput variation to less than 3%
Max Throughput Degradation (%), normalized — No CAT → CAT:
EndRe 2.2 → 1.3 · IPSec 6.2 → 0.2 · Suricata 12.8 → 1.2 · Snort 33.3 → 1.9 · Stats 46.1 → 2.3 · MazuNAT 51.0 → 0 · LPM 37.6 → 1.9 · Firewall 2.2 → 0 · Efficuts 4.9 → 0.3
Improving packet processing latency using CAT*
› Traffic scenario: NIC → vswitch → VM (L2 fwd) → vswitch → NIC
› Processing latency in colocated environment is unpredictable with long tail
› CAT significantly improves both processing latency and latency predictability (5–100 μs vs. 5–1000+ μs)
* https://builders.intel.com/docs/networkbuilders/deterministic_network_functions_virtualization_with_Intel_Resource_Director_Technology.pdf
(Histograms: log-scale sample counts, 1 to 1e9, per latency bucket from 5–10 μs up to 1000 μs–max — “Noisy Neighbor & No CAT” vs. “Noisy Neighbor & CAT”)
Agenda
Motivation
NFV Service Assurance Platform
Internals
Conclusion
Allocating resources to NFs (Agent)
Two-level mapping:
1. NF to CPUs (pinning)
2. CPUs to HW Resources
› How many CPUs per NF?
NFs to CPUs mapping: Linux Kernel Control Groups (cgroups)
› A mechanism for applying resource limits and accounting to a process or a collection of processes
› Available in the Linux kernel since 2008
› All manipulation through a virtual fs under /sys/fs/cgroup
› Important subsystems
cpu : CPU time quota
cpuset : CPU placement
memory : memory allocation and limits
blkio : disk I/O speeds, etc.
net_* : packet filtering and QoS
devices : access control to device nodes
› Each subsystem has a hierarchy; hierarchies are independent
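The cpuset manipulation described above is just directory creation and file writes under the virtual fs. The sketch below illustrates this in Go, following the `nla-<pid>-<name>` naming convention shown on the later slides; `PlanNativeNFCgroup`, `CgroupWrite`, and `Apply` are illustrative names, not the agent's real API, and a v1 cpuset layout is assumed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// cpusetRoot is the cpuset hierarchy mount point (cgroups v1 layout,
// as in the trees shown on the following slides).
const cpusetRoot = "/sys/fs/cgroup/cpuset"

// CgroupWrite describes one virtual-fs write needed to place an NF:
// which file under the NF's cgroup, and what value to write.
type CgroupWrite struct {
	Path  string
	Value string
}

// PlanNativeNFCgroup builds the writes that pin a native Linux NF
// (identified by pid) to a CPU list. cpuset.mems must be populated
// before cpuset.cpus can be assigned in a fresh cpuset cgroup.
func PlanNativeNFCgroup(name string, pid int, cpus, mems string) []CgroupWrite {
	dir := filepath.Join(cpusetRoot, fmt.Sprintf("nla-%d-%s", pid, name))
	return []CgroupWrite{
		{filepath.Join(dir, "cpuset.mems"), mems},
		{filepath.Join(dir, "cpuset.cpus"), cpus},
		{filepath.Join(dir, "tasks"), fmt.Sprintf("%d", pid)},
	}
}

// Apply performs the writes; requires root, and the cgroup directory
// must already exist (e.g. created with os.MkdirAll).
func Apply(writes []CgroupWrite) error {
	for _, w := range writes {
		if err := os.WriteFile(w.Path, []byte(w.Value), 0644); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	for _, w := range PlanNativeNFCgroup("myapp1", 12807, "11", "0") {
		fmt.Println(w.Path, "<-", w.Value)
	}
}
```

The same three writes (mems, cpus, tasks) underlie the Docker, Kubernetes, and Libvirt sub-trees as well — only the directory path differs.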
How we use cgroups
› Every NF gets its own cgroup (i.e. node) in the cpuset hierarchy
sub-trees in the hierarchy already exist for: Libvirt, Docker, K8s
cgroups manually created for native Linux NFs
› Allocate CPU subsets to every different NF cgroup recursively and in an isolated manner
› CPU subsets reverted to their prior values in existing cgroups, after agent’s termination
› Powered by an open-source cgroups library, featuring the team’s contributions for dynamic cgroup cpuset management
Sub-tree for Docker containers
/sys/fs/cgroup/cpuset/
├ docker/
│ ├ 6be7d2e79/ # container cgroup
│ │ ├ tasks: 23771,23814,32613
│ │ └ cpus: 0-3
...
├ kubepods/
│ └ besteffort/
│ ├ pod14882238-8fd7/
│ │ ├ 1d1363f93/
│ │ │ ├ tasks: 40280
│ │ │ └ cpus: 4-7
│ │ ├ 48ea3fb34/
│ │ │ ├ tasks: 40274
│ │ │ └ cpus: 4-7
...
...
├ machine/
│ ├ streamer1.libvirt-qemu/
│ │ ├ emulator/
│ │ │ ├ tasks: 23771,23814,32613
│ │ │ └ cpus: 8-10
│ │ ├ vcpu0/
│ │ │ ├ tasks: 5576
│ │ │ └ cpus: 8-10
...
├ nla-12807-myapp1/
│ ├ tasks: 12807
│ └ cpus: 11
...
├ nla-13892-myapp2/
│ ├ tasks: 13892
│ └ cpus: 12
...
Sub-tree for Kubernetes pods
/sys/fs/cgroup/cpuset/
├ docker/
│ ├ 6be7d2e79/
│ │ ├ tasks: 23771,23814,32613
│ │ └ cpus: 0-3
...
├ kubepods/
│ └ besteffort/
│ ├ pod14882238-8fd7/ # pod cgroup
│ │ ├ 1d1363f93/ # pod container1 ID
│ │ │ ├ tasks: 40280
│ │ │ └ cpus: 4-7
│ │ ├ 48ea3fb34/ # pod container2 ID
│ │ │ ├ tasks: 40274
│ │ │ └ cpus: 4-7
...
...
├ machine/
│ ├ streamer1.libvirt-qemu/
│ │ ├ emulator/
│ │ │ ├ tasks: 23771,23814,32613
│ │ │ └ cpus: 8-10
│ │ ├ vcpu0/
│ │ │ ├ tasks: 5576
│ │ │ └ cpus: 8-10
...
├ nla-12807-myapp1/
│ ├ tasks: 12807
│ └ cpus: 11
...
├ nla-13892-myapp2/
│ ├ tasks: 13892
│ └ cpus: 12
...
Sub-tree for Libvirt domains
/sys/fs/cgroup/cpuset/
├ docker/
│ ├ 6be7d2e79/
│ │ ├ tasks: 23771,23814,32613
│ │ └ cpus: 0-3
...
├ kubepods/
│ └ besteffort/
│ ├ pod14882238-8fd7/
│ │ ├ 1d1363f93/
│ │ │ ├ tasks: 40280
│ │ │ └ cpus: 4-7
│ │ ├ 48ea3fb34/
│ │ │ ├ tasks: 40274
│ │ │ └ cpus: 4-7
...
...
├ machine/
│ ├ streamer1.libvirt-qemu/ # Libvirt dom cgroup
│ │ ├ emulator/ # emulator threads
│ │ │ ├ tasks: 23771,23814,32613
│ │ │ └ cpus: 8-10
│ │ ├ vcpu0/ # vcpu0 thread
│ │ │ ├ tasks: 5576
│ │ │ └ cpus: 8-10
...
├ nla-12807-myapp1/
│ ├ tasks: 12807
│ └ cpus: 11
...
├ nla-13892-myapp2/
│ ├ tasks: 13892
│ └ cpus: 12
...
Cgroups for Native Linux applications
/sys/fs/cgroup/cpuset/
├ docker/
│ ├ 6be7d2e79/
│ │ ├ tasks: 23771,23814,32613
│ │ └ cpus: 0-3
...
├ kubepods/
│ └ besteffort/
│ ├ pod14882238-8fd7/
│ │ ├ 1d1363f93/
│ │ │ ├ tasks: 40280
│ │ │ └ cpus: 4-7
│ │ ├ 48ea3fb34/
│ │ │ ├ tasks: 40274
│ │ │ └ cpus: 4-7
...
...
├ machine/
│ ├ streamer1.libvirt-qemu/
│ │ ├ emulator/
│ │ │ ├ tasks: 23771,23814,32613
│ │ │ └ cpus: 8-10
│ │ ├ vcpu0/
│ │ │ ├ tasks: 5576
│ │ │ └ cpus: 8-10
...
├ nla-12807-myapp1/ # custom app cgroup
│ ├ tasks: 12807
│ └ cpus: 11
...
├ nla-13892-myapp2/ # custom app cgroup
│ ├ tasks: 13892
│ └ cpus: 12
...
CPUs to CAT & MBA resources mapping: Classes Of Service (CLOS)
The processor exposes a set of Classes of Service to represent priorities of apps, VMs, or CPUs in using the
controlled resources
› Defined per CPU socket
› A CLOS may have multiple controlled resources attached (e.g. LLC capacity & memory bandwidth)
› Number of CLOSes is limited (e.g. 16 in most recent Xeons)
CLOS ID CPUs Resource 1: LLC Resource 2: Mem B/W
CLOS1 0,1,2 regions 0-7 100%
CLOS2 3 regions 8,9 70%
CLOS3 4,5 region 10 10%
CLOS4 unassigned unassigned unassigned
...
CLOS0 (default) remaining CPUs in the socket remaining regions 100%
Specifying LLC regions via Capacity Bit Masks (CBMs)
Go library to translate high-level capacity & isolation requests to
low-level CBMs
› “give me an available CLOS with 9MB of isolated LLC”
Complexities handled:
› availability checking
› allocate contiguous CBMs & defragment as needed
› consider reserved CBMs
› recycle CLOS IDs
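The core of the contiguity requirement can be sketched as a search for a free run of cache ways. This is a simplified stand-in for the library above (no defragmentation, reservations, or CLOS recycling); `AllocContiguousCBM` and the 11-way cache geometry are illustrative assumptions.

```go
package main

import "fmt"

// AllocContiguousCBM finds a contiguous run of nWays free cache ways
// inside a totalWays-wide mask, avoiding already-used bits, and returns
// the capacity bitmask (or 0 if no run fits). CAT requires CBMs to be
// contiguous, so the allocator must search for runs rather than merely
// count free bits.
func AllocContiguousCBM(used uint64, totalWays, nWays int) uint64 {
	if nWays <= 0 || nWays > totalWays {
		return 0
	}
	run := uint64(1<<nWays) - 1
	for shift := 0; shift+nWays <= totalWays; shift++ {
		cbm := run << shift
		if cbm&used == 0 {
			return cbm
		}
	}
	return 0
}

func main() {
	// Hypothetical 11-way LLC where one way covers ~1.25MB, so a request
	// for "9MB of isolated LLC" rounds up to 8 ways.
	used := uint64(0b00000000011) // ways 0-1 held by the default CLOS
	fmt.Printf("%011b\n", AllocContiguousCBM(used, 11, 8))
}
```

A request that cannot be satisfied returns 0, which is where the real library's defragmentation and availability checking would kick in.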
Software stack
Agenda
Motivation
NFV Service Assurance Platform
Internals
Conclusion
Conclusion
› With fine-grain isolation at the CPU core, cache & memory B/W level, we can deliver high & predictable NF
performance, while saving resources
› The ideal combination of resources is non-obvious beforehand & requires evaluation through exhaustive
testing & profiling
› Use well-known & well-documented kernel interfaces through high-level bindings
Future challenges
› Use cases beyond NFV
› Varying runtime behaviors (input traffic, #colocated NFs, etc.)
› Multi-objective optimization for multiple colocated Network Services
› SLO-driven optimization
Thank you!