Scaling OpenStack Networking
Beyond 4000 Nodes with Dragonflow
Omer Anson (#oanson), Dragonflow PTL @ Huawei
Eshed Gal-Or (#oshidoshi), Chief Architect Open Source @ Huawei
Is Neutron Production-Ready?
Highlights from Mirantis Perf&Scale Test (Dec’16)
• MOS 9.0 with Mitaka-based Neutron
• 3 hardware labs were used for testing
• The largest lab included 378 nodes
• Line-rate throughput was achieved
• Over 24500 VMs were launched on a 200-node lab
• …and yes, Neutron works at scale!
https://www.mirantis.com/blog/openstack-neutron-performance-and-scalability-testing-summary/
Highlights from Mirantis Perf&Scale Test (Dec’16)
Configuration
• ML2 OVS
• VxLAN/L2 POP
• DVR
Behavior
• ARP tables exploded at 16K VMs (had to be increased)
• RabbitMQ & Ceph broke at 20K VMs
• Services and agents broke at 24.5K VMs
• Integrity test: Successful
[Diagram: Compute 1 … Compute n (n <= 378), hosting Heat Stack 1 … Heat Stack 125; each stack contains 196 VMs, a DVR router, and a subnet (125 × 196 = 24,500 VMs)]
https://www.mirantis.com/blog/openstack-neutron-performance-and-scalability-testing-summary/
Is it enough?
Full OpenStack per ~400 servers
Max 24,500 VMs per OpenStack
What if we need Scale?
1,000+ Servers
The Problem:
Network Control & Services Break @ Scale
The Solution (for Networking):
Separate “Reads” from “Updates”
• Add a scalable “Read Replica” of Neutron DB
• Use a well-distributed, well-scaling DB (e.g. Redis)
Lean Distributed Control Plane
• Each controller manages a small number (1) of virtual switches
• Controller should be small (e.g. not OpenDaylight)
Distribute Policy (vs. Flows)
• Small footprint
• Grows with workload (not with infrastructure)
• Transformed to southbound at the edge
Distribute Network Services
• “Run at edge”
• Suppress control messages from going out
• Leverage “predefined” nature of cloud env
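The "Distribute Policy (vs. Flows)" idea can be sketched roughly as follows. This is an illustration only, not Dragonflow code: the `Policy` class, the `compile_local_flows` helper, and the flow strings are hypothetical names chosen for the example. The point is that one compact policy object is shipped to every node, and each node expands it into flow entries only for the ports it hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One high-level rule, distributed unchanged to every controller (hypothetical)."""
    network: str
    allow_tcp_port: int


def compile_local_flows(policy, local_ports):
    """Expand a policy into flow entries for the ports hosted on this node only.

    local_ports maps a local OVS port number to the network it belongs to.
    The returned strings merely imitate flow syntax for readability.
    """
    return [
        f"in_port={port},tcp,tp_dst={policy.allow_tcp_port},actions=NORMAL"
        for port, network in sorted(local_ports.items())
        if network == policy.network
    ]


policy = Policy(network="net-a", allow_tcp_port=22)
# Each node receives the same policy but produces different, local-only flows.
node1 = compile_local_flows(policy, {1: "net-a", 2: "net-b"})
node2 = compile_local_flows(policy, {7: "net-a", 8: "net-a"})
```

The distributed state stays proportional to the number of policies (workload), while the number of flows on a node depends only on what that node hosts.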
Example: Dragonflow + Redis
[Diagram: User → Neutron API → Neutron Server (Dragonflow ML2 Driver) and Neutron DB, labeled “Create/Update/Delete (also some Read)”; an “Update” path feeds “Pub/Sub & Queries” towards multiple DF Controllers; each DF Controller drives OVS over OpenFlow]
Distributed Network Services in Dragonflow
[Architecture diagram.
Neutron Server: Neutron API, ML2 Driver (L2, SG, Trunk Port), Service Plugins (Router, BGP, TAP, LBaaS, FW), Neutron DB, Dragonflow Server.
Dragonflow Network DB: DB Drivers (OVSDB, ETCD, Redis).
Compute Nodes, each with a Dragonflow Controller:
Applications: L2, L3, DHCP, VLAN, SG, LBaaS, Metadata, Flat Net, IGMP, ICMP, Remote Port, Dist. SNAT, Trunk Port, Active Port Detection, TAP, FW.
Pluggable DB Layer: NB DB Drivers (OVSDB, ETCD, Redis), SB DB Drivers (smartNIC, OVSDB), Pub/Sub Drivers (ØMQ, Redis, ETCD).
OVS: vswitchd and OVSDB-Server in User Space, Kernel Datapath Module in Kernel Space, NIC; controlled over OpenFlow; serves VMs and containers.
Legend: New (Ocata), Future (Pike+).]
SNAT
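Brief Overview (SNAT vs. DNAT)
[Diagram: SNAT side: three VMs on compute nodes (10.1.11.5, 10.1.13.8, 10.1.7.7) reach the WAN GW behind one shared external address, 21.3.5.5. DNAT side: three VMs each have their own external address (21.3.5.5, 21.3.5.7, 21.3.8.7) at the WAN GW.]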
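The difference between the two translations can be sketched with the addresses shown on this slide. The sketch assumes a particular pairing of external and internal addresses for DNAT, which the slide does not state, and the function names are hypothetical.

```python
import ipaddress

ip = ipaddress.ip_address

# Internal VM addresses and the shared SNAT address, as shown on the slide.
INTERNAL = [ip("10.1.11.5"), ip("10.1.13.8"), ip("10.1.7.7")]
SNAT_ADDRESS = ip("21.3.5.5")

# Assumption: external addresses are paired with the VMs in listing order.
DNAT_MAP = dict(zip([ip("21.3.5.5"), ip("21.3.5.7"), ip("21.3.8.7")], INTERNAL))


def snat_outbound(src):
    """SNAT: many internal sources are rewritten to one external source."""
    if src not in INTERNAL:
        raise ValueError(f"{src} is not a known internal address")
    return SNAT_ADDRESS


def dnat_inbound(dst):
    """DNAT: each external destination maps to exactly one internal VM."""
    return DNAT_MAP[dst]
```

SNAT is many-to-one and only lets VMs initiate outbound connections, while DNAT is one-to-one and makes a VM reachable from outside.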
SNAT
Implemented in Neutron DVR
https://www.mirantis.com/blog/openstack-neutron-performance-and-scalability-testing-summary/
Distributed SNAT
Implemented in Dragonflow
[Diagram: VMs on multiple Compute Nodes reach the Internet through some vRouters and some WAN Gateways; NAT #1 and NAT #2 mark two translation points along that path]
Distributed SNAT
Implemented in Dragonflow
1 VM sends packet
2 Classify flow as Internet (i.e. not on any of the internal routed networks)
3 Apply NAT function in OVS with the …
4 Forward packet towards the Internet
5 Possibly, Internet Gateway does 2nd NAT on packet
[Diagram: Compute Node with two VMs attached through qvoXXX ports to OVS br-int; the Dragonflow Controller (Abstraction Layer; L2 App, L3 App, Dist. SNAT App, …; Pluggable DB Layer connected to the Distributed DB) programs OVS over OpenFlow; markers 1 to 5 trace the steps above, ending “To the Internet”]
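The classification in step 2 and the translation in step 3 can be modelled as below. This is a sketch of the decision logic only: in Dragonflow the work is done by flows in OVS, not by Python code in the packet path. The internal prefix and the per-host external address are assumptions made for the example.

```python
import ipaddress

# Assumptions for illustration: one internal routed prefix, one external IP per host.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.1.0.0/16")]
HOST_EXTERNAL_IP = "21.3.5.5"


def is_internet_bound(dst):
    """Step 2: a destination outside every internal routed network is 'Internet'."""
    addr = ipaddress.ip_address(dst)
    return not any(addr in net for net in INTERNAL_NETWORKS)


def translate(src, dst):
    """Step 3: rewrite the source only for Internet-bound packets."""
    if is_internet_bound(dst):
        return HOST_EXTERNAL_IP, dst
    return src, dst
```

For example, `translate("10.1.11.5", "8.8.8.8")` rewrites the source, while traffic between two internal addresses passes through unchanged.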
DHCP
DHCP
Implementation in Neutron
Example
• 100 Tenants
• 3 vNet / tenant
= 300 DHCP Servers
[Diagram: Neutron Server connected over the Message Queue to the DHCP Agent on the Network Node, which runs dnsmasq inside a DHCP namespace]
Distributed DHCP
Implemented in Dragonflow
1 VM sends DHCP_DISCOVER
2 Classify flow as DHCP, forward to controller
3 DHCP App sends DHCP_OFFER back to VM
4 VM sends DHCP_REQUEST
5 Classify flow as DHCP, forward to controller
6 DHCP App populates DHCP_OPTIONS from DB/CFG and sends DHCP_ACK
[Diagram: Compute Node with two VMs attached through qvoXXX ports to OVS br-int; the Dragonflow Controller (Abstraction Layer; L2 App, L3 App, DHCP App, …; Pluggable DB Layer connected to the Distributed DB) acts as the DHCP server over OpenFlow; markers 1 to 7 trace the exchange between the VM and the DHCP server]
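The controller side of the exchange above can be sketched as a small handler. It is a simplified model, not the Dragonflow DHCP App: the message dictionaries, the `leases` table, and the `options` mapping are hypothetical stand-ins for what the app would read from the DB or configuration.

```python
def dhcp_app_handle(message, leases, options):
    """Answer a DHCP message that OVS classified as DHCP and sent to the controller.

    message: {"type": ..., "mac": ...}
    leases:  mac -> IP address (stand-in for port data in the distributed DB)
    options: DHCP options to include in the ACK (stand-in for DB/CFG values)
    """
    mac = message["mac"]
    if mac not in leases:
        return None  # unknown port: stay silent
    if message["type"] == "DHCP_DISCOVER":
        # Step 3: offer the address already assigned to this port.
        return {"type": "DHCP_OFFER", "mac": mac, "ip": leases[mac]}
    if message["type"] == "DHCP_REQUEST":
        # Step 6: confirm the address and attach the options.
        return {"type": "DHCP_ACK", "mac": mac, "ip": leases[mac],
                "options": dict(options)}
    return None
```

Because the address is predefined per port, the handler needs no allocation state of its own, and the request never leaves the compute node.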
Dragonflow Benchmark
(Control Plane)
Test Plan
1. Baseline Neutron
– Measure Neutron API-to-DB latency
2. Baseline Dragonflow
– Measure Dragonflow in small environment (1 controller per compute node) – Total 33
3. 4K scale
– Measure Dragonflow in large environment (130 controllers per compute node) – Total 4031
4. Baseline Redis
– Measure Redis in large environment (130 agents per compute node) – Total 4031
Baseline Test
[Diagram: Server 1 … Server 32, each running one Controller on its own OVS br-int (Controller 1/OVS1 … Controller 32/OVS32), plus the DF Server; Server 33 … Server 38 host Redis 1, 2, 3 (Master DB) and Redis 4, 5, 6 (Replica DB)]
4K scale
[Diagram: same servers as the baseline, but compute servers run Controller 1 … Controller 130, each on its own bridge br-int-1 … br-int-130; Total: 4030 DF Local Controllers; the DF Server is also shown; Server 33 … Server 38 host Redis 1, 2, 3 (Master DB) and Redis 4, 5, 6 (Replica DB)]
Benchmark Conclusions
(single script)
Benchmark Conclusions
• Dragonflow performance consistent with scale
• Neutron performance needs to improve (need to profile)
– Multiple scripts against a single Neutron server raise throughput about 2.5× (from 1.06 subnet/sec to 2.63 subnet/sec)
• Current performance is production ready
– Faster than VM spin-up
– Comparable to Container spin-up
– Scale-agnostic
• Redis performance far exceeds the requirements
– ~177 top-level network events per second, fully synchronized to 4161
nodes
Ride the Dragon
https://wiki.openstack.org/wiki/Dragonflow
https://github.com/openstack/dragonflow
https://launchpad.net/dragonflow
IRC: #openstack-dragonflow
Weekly IRC (Mondays 0900 UTC): #openstack-meeting-4
Thank you!
DATA CONSISTENCY
DB Consistency: Common Problem to all SDN Solutions
[Diagram: generic Neutron plus SDN controller deployment. CLI / Dashboard (Horizon) / Orchestration Tool (Heat) call the Neutron API and Nova API. Neutron-server (ML2-Core-Plugin with ML2.Drivers.Mechanism.XXX, Services-Plugin) keeps the Neutron DB and reaches the SDN Controller through a vendor-specific API. SDN Controller: North-bound Interface (REST?), SDN Apps (Load Balancer, Firewall, VPN, L3 Services, Topology Mgr., Overlay Mgr., Security), SDN DB, South-bound Interface (OpenFlow) towards the switches. Nova Compute nodes run VMs on a Virtual Switch with a Neutron Plugin Agent; Neutron-L3-Agent and Neutron-DHCP-Agent attach over the Message Queue (AMQP).]
DB Consistency: Common Problem to all SDN Solutions
Problem 1
• The Neutron DB transaction is committed, but the related operations on the SDN Controller DB have failed
Problem 2
• Concurrent API calls cause multiple transactions on a given Neutron object. Neutron DB handles this well thanks to its ACID nature. What about the SDN Controller DB?
Problem 3
• Nested transactions can be done in Neutron DB. What about the SDN Controller DB?
Problem N…
Consistency Paradigms
BASE
• Basically Available
• Soft-state
• Eventually consistent
ACID
• Atomic
• Consistent
• Isolated
• Durable
Dragonflow Data System vs. Neutron
Neutron DB
• Relational Database
• ACID system
• Stores the whole virtualized network topology for OpenStack
Dragonflow DB
• Key-value Store
• BASE system
• Stores a ‘partial’ virtualized network topology used in Dragonflow
DB Consistency in Dragonflow
• Introduce a distributed lock for coordination
– Guarantee the atomicity of a given API
– Implemented in the Neutron core plugin layer
– Project-based lock allows concurrency
[Diagram: CLI / Dashboard (Horizon) / Orchestration Tool (Heat) → Neutron API → Neutron-server (ML2, Dragonflow Driver) with the Neutron DB; the driver is marked “Obtain distributed lock” before reaching the Dragonflow NB API and DB; Dragonflow: North-bound Interface, SDN Apps (Topology Mgr., Overlay Mgr., Security), South-bound Interface (OpenFlow)]
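A project-based lock can be sketched as follows. This is an in-process stand-in, not the Dragonflow implementation: `threading.Lock` replaces the distributed lock, plain dictionaries replace the two databases, and the `ProjectLocks` class and `create_network` function are hypothetical names.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class ProjectLocks:
    """One lock per project, so different projects never wait on each other."""

    def __init__(self):
        self._guard = threading.Lock()           # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    @contextmanager
    def hold(self, project_id):
        with self._guard:
            lock = self._locks[project_id]
        with lock:
            yield


def create_network(locks, neutron_db, dragonflow_db, project_id, name):
    """Perform both writes of one API call while holding the project's lock."""
    with locks.hold(project_id):
        neutron_db.setdefault(project_id, []).append(name)
        dragonflow_db.setdefault(project_id, []).append(name)
```

Keying the lock by project serializes API calls within a project, which is where conflicting updates occur, while calls for different projects proceed concurrently.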
DB Consistency in Dragonflow
• Introduce an object synchronization mechanism
– All the objects stored in both databases are versioned
– Take advantage of CAS operations of the Dragonflow DB
– Sync the object when something unexpected happens
[Diagram: a network object (Network_ID, Name, Status, MTU, VLAN, Availability Zone, Subnets) in the Neutron DB corresponds to an SDN DB entry with Object_ID = Network_ID and Version = 5; arrows labeled “Read”, “compare & swap <- Version”, and “Notify”; on each Compute Node the Dragonflow Local Controller is a Subscriber and flushes flows on the vSwitch]
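The versioned compare-and-swap idea can be sketched as below. It is a single-process model with hypothetical names (`VersionedStore`, `push_update`, `resync`); a real key-value store would have to perform the compare and the write atomically on the server side.

```python
class VersionedStore:
    """Toy key-value store whose entries are (version, value) pairs."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def compare_and_swap(self, key, expected_version, new_version, value):
        """Write only if the stored version is the one the caller expects."""
        current = self._data.get(key)
        current_version = current[0] if current is not None else None
        if current_version != expected_version:
            return False
        self._data[key] = (new_version, value)
        return True


def push_update(store, key, old_version, new_version, value, resync):
    """Try the versioned write; on a mismatch, fall back to a full object sync."""
    if store.compare_and_swap(key, old_version, new_version, value):
        return True
    resync(key)
    return False
```

A stale writer, one holding an outdated version, fails the swap instead of overwriting newer data, and the failure triggers the "sync the object" path.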

Scaling OpenStack Networking Beyond 4000 Nodes with Dragonflow - Eshed Gal-Or, Omer Anson - OpenStack Day Israel 2017
