Migrating production workloads 
from OVS to LinuxBridge 
Kevin Stevens (kevin.stevens@rackspace.com) 
James Denton (james.denton@rackspace.com)
We’re operators!
Kevin Stevens
• RPC Engineer since 2012 (Essex)
• IRC: k_stev on irc.freenode.net
James Denton
• RPC Network Engineer since 2012 (Essex)
• IRC: busterswt on irc.freenode.net
What are we doing here?
1. A history of networking in Rackspace Private Cloud
2. Our experiences with Open vSwitch
3. Swapping out OVS with LinuxBridge
4. What to expect with each
2011: Started building OpenStack-powered private clouds
2012: Began architecting, building and supporting private clouds in customer DCs
2013: Over 100 customers running Rackspace Private Clouds
2014: Released RPC v9 based on Icehouse. 99.99% API uptime SLA.
The Evolution of Networking in RPC
RPC version | v2.0/3.0 | v4.0/4.1 | v4.2 | v9.0
OpenStack release | Folsom | Grizzly | Havana | Icehouse
Network stack | nova-network | Quantum | Neutron | Neutron
L2 connectivity | flatDHCP | Open vSwitch | Open vSwitch | LinuxBridge (ML2)
L3 agent support | N/A | No | Yes | Yes
Host OS: Ubuntu 12.04 LTS; Ubuntu 12.04 LTS; CentOS 6.5 / RHEL 6.5; Ubuntu 12.04 LTS; Ubuntu 14.04 LTS
Why Neutron?
Why Neutron w/ Open vSwitch?
• Open vSwitch pushed by community
• Open vSwitch pushed by packagers
• Wanted overlay networks
2014 OpenStack Summit - Neutron OVS to LinuxBridge Migration
The problems
• Kernel panics (1.10)
• ovs-vswitchd segfaults (1.11)
• Broadcast storms
• Data corruption (2.01)
Why Linux Bridge?
Why move to LinuxBridge?
• Looking for reliability and stability
• Fewer moving parts
• Easier to troubleshoot
• Supported by the community
What do we lose by moving?
• Flexibility provided by overlay networking (if not using VXLAN)
• Neutron Distributed Virtual Routers (Juno)
• Any customizability provided by OVS not implemented by Neutron itself
Planning
Plan A: Scorch the earth!
• Snapshot and delete all instances
• Delete all networks
• Change from OVS -> LB
• Recreate all networks
• Boot instances
• …
• It works, but…
But wait… these are production environments!
Plan B: Migration Environment
• Deploy LinuxBridge environment
• Snapshot all instances
• Import images into new environment
• Build new instances
• Cutover
• …
• It works, but… $$$
Plan C: Switch it out!
• Stop services
• Update the database
• Change the configuration from OVS -> LB
• Restart services
• …
• Profit!
Issues with migrating
• Neutron OVS DB schema != Neutron LinuxBridge DB schema
– Migration to OVS ML2 DB schema is required first
• Overlay networks may not be supported
– LinuxBridge uses VXLAN rather than GRE
– Requires kernel >= 3.9
• Means GRE networks must be converted to VLAN networks
– Didn’t want to introduce additional complexity
– VLANs easier to troubleshoot if something went wrong
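The kernel requirement above can be checked before deciding whether VXLAN is even an option on a given host. A minimal sketch, assuming the 3.9 minimum stated on the slide and a release string in the usual `major.minor...` form; the function names are illustrative only:

```python
import re

MIN_KERNEL = (3, 9)  # minimum stated on the slide for LinuxBridge VXLAN


def parse_kernel_release(release):
    """Return (major, minor) from a string such as '3.13.0-24-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_meets_minimum(release, minimum=MIN_KERNEL):
    """True when the release is at least the required (major, minor)."""
    return parse_kernel_release(release) >= minimum


# Example release strings (hypothetical hosts).
for example in ("3.2.0-23-generic", "3.13.0-24-generic"):
    verdict = "ok" if kernel_meets_minimum(example) else "too old"
    print(example, verdict)
```

On a live host the release string could come from `platform.release()`.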
The Process
Preparation
• Determine what’s needed:
– Dependencies
– Some method of converting database to ML2 schema
– Some method of converting data to LB from OVS
– Which configuration files need mangling
– Which services need disabling
– Which services need restarting
– Roll-back plan
Define a successful outcome
• Can instances gain a DHCP lease?
• Do instances have internal/external connectivity?
• Are security groups/other functions still operational?
• Were instances placed into the correct bridge?
• Will the changes survive a reboot?
Normal OVS Operation (Network Node)
Normal OVS Operation (Compute Node)
First steps: Database manipulation
• Backup! Backup! Backup!
• Use migrate_to_ml2.py (modified) to change the DB schema
• Update segments, ports and vlan tables
– Change GRE to VLAN
– Change segmentation id to real VLAN ID
– Set a provider bridge
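The segment rewrite can be illustrated against a throwaway database. This is a sketch under stated assumptions, not the script used in the talk: it uses an in-memory SQLite database rather than the production MySQL one, a simplified version of the ML2 `ml2_network_segments` and `ml2_vlan_allocations` tables, and a hypothetical GRE-to-VLAN mapping and `physnet1` label. The ports (binding) table mentioned on the slide is not covered.

```python
import sqlite3

# Hypothetical mapping of GRE tunnel IDs to VLAN IDs trunked on the switches.
GRE_TO_VLAN = {1: 101, 2: 102}
PHYSNET = "physnet1"  # hypothetical provider label


def build_demo_db():
    """Create an in-memory database with simplified ML2-style tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ml2_network_segments ("
        "id TEXT PRIMARY KEY, network_id TEXT, network_type TEXT, "
        "physical_network TEXT, segmentation_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE ml2_vlan_allocations ("
        "physical_network TEXT, vlan_id INTEGER, allocated INTEGER, "
        "PRIMARY KEY (physical_network, vlan_id))"
    )
    conn.executemany(
        "INSERT INTO ml2_network_segments VALUES (?, ?, 'gre', NULL, ?)",
        [("seg-a", "net-a", 1), ("seg-b", "net-b", 2)],
    )
    conn.commit()
    return conn


def convert_gre_segments(conn, gre_to_vlan, physnet):
    """Rewrite GRE segments as VLAN segments; return rows changed."""
    changed = 0
    with conn:  # commits on success, rolls back if any statement fails
        for gre_id, vlan_id in gre_to_vlan.items():
            cur = conn.execute(
                "UPDATE ml2_network_segments "
                "SET network_type = 'vlan', physical_network = ?, "
                "segmentation_id = ? "
                "WHERE network_type = 'gre' AND segmentation_id = ?",
                (physnet, vlan_id, gre_id),
            )
            changed += cur.rowcount
            # Mark the VLAN as allocated (INSERT OR REPLACE is SQLite syntax).
            conn.execute(
                "INSERT OR REPLACE INTO ml2_vlan_allocations "
                "(physical_network, vlan_id, allocated) VALUES (?, ?, 1)",
                (physnet, vlan_id),
            )
    return changed
```

Running the whole conversion inside one transaction means a failure part-way through leaves the original GRE rows untouched, which complements the backup step rather than replacing it.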
Next steps: Install and Configure
• Install the LinuxBridge plugin
• Update SQL connection strings
• Configure ml2_conf.ini / linuxbridge_conf.ini
• Change driver from OVS to ML2 in Neutron and Nova conf files
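As a rough illustration of the configuration step, the sketch below generates a minimal VLAN-only fragment with Python's `configparser`. The section and option names (`[ml2]`, `[ml2_type_vlan]`, `[linux_bridge]`) are recalled from Icehouse-era ML2 and LinuxBridge agent configuration and should be checked against the installed release; the `physnet1` label, `eth1` interface and VLAN range are hypothetical.

```python
import configparser
import io


def build_ml2_conf(physnet, interface, vlan_range):
    """Return the text of a minimal VLAN-only ML2/LinuxBridge config fragment."""
    cfg = configparser.ConfigParser()
    cfg["ml2"] = {
        "type_drivers": "vlan",
        "tenant_network_types": "vlan",
        "mechanism_drivers": "linuxbridge",
    }
    # VLAN IDs Neutron may hand out on this provider label.
    cfg["ml2_type_vlan"] = {
        "network_vlan_ranges": "%s:%s" % (physnet, vlan_range),
    }
    # Maps the provider label to the host data-plane interface.
    cfg["linux_bridge"] = {
        "physical_interface_mappings": "%s:%s" % (physnet, interface),
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


print(build_ml2_conf("physnet1", "eth1", "100:199"))
```

The provider label has to match the one written into the converted database rows, otherwise the agent has no interface to attach the VLAN sub-interfaces to.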
Next steps: Pull ports from bridges
• Stop Neutron services on all nodes
• Remove host data-plane port from the OVS bridge(s)
• Pull instance taps out of the OVS-related linux bridges
• Remove router and dhcp interfaces from OVS integration bridge
• Stop Openvswitch
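The port-removal steps above can be sketched as a dry-run command generator. `ovs-vsctl del-port` and `brctl delif` are the standard tools for these operations; every bridge, tap and port name below is hypothetical, and the service stop steps are left out because service names vary by distribution.

```python
def teardown_commands(provider_bridge, physical_port, qbr_taps,
                      int_bridge_ports, integration_bridge="br-int"):
    """Build (but do not run) the commands that empty the OVS-related bridges.

    qbr_taps is a list of (linux_bridge, tap) pairs; int_bridge_ports lists
    the router and DHCP ports attached to the integration bridge.
    """
    commands = []
    # Host data-plane port out of the OVS provider bridge.
    commands.append(["ovs-vsctl", "del-port", provider_bridge, physical_port])
    # Instance taps out of the per-port 'qbr' linux bridges.
    for bridge, tap in qbr_taps:
        commands.append(["brctl", "delif", bridge, tap])
    # Router and DHCP interfaces out of the integration bridge.
    for port in int_bridge_ports:
        commands.append(["ovs-vsctl", "del-port", integration_bridge, port])
    return commands


# Hypothetical names, printed for review rather than executed.
plan = teardown_commands(
    provider_bridge="br-eth1",
    physical_port="eth1",
    qbr_taps=[("qbr9d8e7f6a-5b", "tap9d8e7f6a-5b")],
    int_bridge_ports=["qr-1a2b3c4d-5e", "tap6f7a8b9c-0d"],
)
for argv in plan:
    print(" ".join(argv))
```

Printing the plan first makes it easy to review per node before anything is passed to `subprocess.run`.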
Interfaces removed from bridges
Stop openvswitch services
Finally: Restart services 
โ€ข Start Neutron services 
โ€ข Restart compute services
Post Service Restart (Network Node)
Post Service Restart (Compute Node)
Failure Scenarios
• Instances unresponsive?
– Check traffic from tap -> bridge -> physical interface
– Verify VLANs are properly trunked through (and VLANs created on the switch)
Failure Scenarios (Cont’d)
• IPs disappear or taps placed in QBR bridges
– Check Nova instance_info_caches table.
– Cache can be regenerated with a hard reboot of instance, or by adding an interface to the instance
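One way to spot taps that landed in the wrong bridge is to parse `brctl show` output and flag any tap that is not in a `brq` bridge. A minimal sketch: the sample output and interface names are made up for illustration, and the parser assumes the usual layout in which extra interfaces appear on indented continuation lines.

```python
SAMPLE_BRCTL_SHOW = "\n".join([
    "bridge name\tbridge id\t\tSTP enabled\tinterfaces",
    "brq7f2a1c3e-9b\t\t8000.001b21a1b2c3\tno\t\teth1.101",
    "\t\t\t\t\t\t\ttap3c1d2e4f-aa",
    "qbr9d8e7f6a-5b\t\t8000.fe163e112233\tno\t\ttap9d8e7f6a-5b",
])


def parse_brctl_show(output):
    """Return {bridge_name: [interfaces]} from 'brctl show' text."""
    bridges = {}
    current = None
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        fields = line.split()
        if line[0].isspace():
            # Continuation line: more interfaces on the current bridge.
            if current is not None:
                bridges[current].extend(fields)
        else:
            # name, bridge id, STP flag, then an optional first interface.
            current = fields[0]
            bridges[current] = fields[3:]
    return bridges


def misplaced_taps(bridges):
    """List tap interfaces attached to anything other than a 'brq' bridge."""
    return sorted(
        iface
        for name, ifaces in bridges.items()
        if not name.startswith("brq")
        for iface in ifaces
        if iface.startswith("tap")
    )


print(misplaced_taps(parse_brctl_show(SAMPLE_BRCTL_SHOW)))
```

Any tap reported here is a candidate for the cache regeneration described above.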
Failure Scenarios (Cont’d Cont’d)
• Unable to boot new instances?
– Usual troubleshooting techniques should be used
• DHCP binding_failed error messages?
– Check that /etc/default/neutron-server references the ML2 configuration file
• BRQ bridges not built?
– Verify that the new agents are checking in
– Verify the LinuxBridge agent is installed and running
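The binding_failed check can be scripted by reading the defaults file and confirming it points at an ML2 configuration. This sketch assumes the file sets a shell-style variable named `NEUTRON_PLUGIN_CONFIG`, which is how the Ubuntu packaging is recalled here; treat the variable name and the sample paths as assumptions to verify on the host.

```python
import shlex


def plugin_config_path(defaults_text, variable="NEUTRON_PLUGIN_CONFIG"):
    """Return the path assigned to `variable`, or None if it is not set."""
    for raw in defaults_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == variable:
            parts = shlex.split(value)  # strips surrounding quotes
            return parts[0] if parts else ""
    return None


def references_ml2(defaults_text):
    """True when the configured plugin file looks like an ML2 config."""
    path = plugin_config_path(defaults_text)
    return path is not None and "ml2" in path


# Hypothetical file contents before and after the change.
BEFORE = 'NEUTRON_PLUGIN_CONFIG="/etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini"\n'
AFTER = 'NEUTRON_PLUGIN_CONFIG="/etc/neutron/plugins/ml2/ml2_conf.ini"\n'
print(references_ml2(BEFORE), references_ml2(AFTER))
```

On a real node the text would be read from /etc/default/neutron-server.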
Benchmarks
Compare all the things
[Chart: iPerf3 Benchmarks (TCP / 1500 MTU / 10G Data) – Intel X520* (ixgbe driver)]
X axis: Aggregate Throughput (Gbps), 0.00 to 10.00. Y axis: # of Threads (1, 2, 4, 8).
Series: Open vSwitch (VXLAN), LinuxBridge (VXLAN), Open vSwitch (GRE), Open vSwitch (VLAN), LinuxBridge (VLAN)
* Host-to-host testing; no virtualization. Longer is better.
Compare all the things
[Chart: SCP File Transfers (10G file)*]
X axis: Transfer Speed (MBps), 0.00 to 200.00.
Series: OVS VXLAN, LB VXLAN, OVS GRE, OVS VLAN, LB VLAN
Bar labels (transfer times): 115.00, 104.00, 61.50, 59.75 and 110.50 seconds
* Host-to-host testing; no virtualization. Longer is better.
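The transfer times on the chart translate into transfer speeds with simple arithmetic. The sketch below assumes "10G" means 10 GiB (10240 MiB), which the slide does not state. The extracted text does not preserve which time belongs to which configuration, so the values are left unlabeled.

```python
FILE_SIZE_MB = 10 * 1024  # assumption: "10G" = 10 GiB, expressed in MiB

# Transfer times in seconds, in the order they appear in the slide text.
DURATIONS = [115.00, 104.00, 61.50, 59.75, 110.50]


def transfer_speed(seconds, size_mb=FILE_SIZE_MB):
    """Average speed in MB/s for a transfer of size_mb taking `seconds`."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return size_mb / seconds


for seconds in DURATIONS:
    print("%7.2f s -> %6.1f MB/s" % (seconds, transfer_speed(seconds)))
```

Under that assumption every result falls inside the chart's 0 to 200 MBps axis, with the two shortest transfers roughly twice as fast as the three longest.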
In Summary
• OVS provides a great deal of functionality
• Network stability more important for our customers than being on the cutting edge
• Linux bridge provides almost all of the features we might want to use
• How to migrate existing environments to LinuxBridge
• Improved stability and comparable performance with OVS achieved
Questions?
Download @ https://github.com/busterswt/openstackparis2014


Editor's Notes

  • #3: Who are we: Kevin since 2006, James since 2008
  • #7: Why did we originally go with Neutron?
  • #8: Why did we go with OVS?
  • #10: The problems in detail:
      - Kernel panics: physical hosts would crash with a familiar stack trace.
      - Segfaults: ovs-vswitchd segfaults and restarts itself, resulting in empty flow tables because the Neutron OVS plugin would not rebuild the flows. Instances on an affected node would lose connectivity. Still a problem in 1.11 if pushing enough traffic.
      - Broadcast storms: if servers come up around the same time and the Neutron OVS plugin does not start for some reason, the servers begin forwarding everything to each other. Easily fixed by patching the startup script for Openvswitch, but again extra complexity.
      - Data corruption: e.g. MySQL transaction ‘BEGIN’ seen as ‘BEGie’ on the receiving end.
      - Relying on the packaged version of OVS cripples your ability to stay current on something as fast-moving as OVS.
      - Compiling OVS and kernel modules adds unnecessary complexity and requires frequent downtime.
  • #11: What made us choose to move to Linux Bridge?
  • #12: Tried and true. Been around for over a decade (2000). Less complexity. Allows us to take advantage of all of the features we wanted to use anyway.
  • #13: We lose the flexibility provided by using overlay networking like GRE, which for the most part has been pretty solid in itself. We wouldn’t be able to take advantage of the DVR functionality in Juno. QoS customizations?
  • #14: How do we move to Linux Bridge?
  • #20: *How* we went about the migration
  • #21: Extensive testing, no cowboy shit
  • #23: A glance at how our networking looks with OVS and GRE tenant overlay networks
  • #24: A glance at how our networking looks with OVS and GRE tenant overlay networks
  • #27: Give the linuxbridge agent a clean slate to work with
  • #28: Look at the removal of the tap interfaces from the ‘qbr’ linux bridges, and removal of the physical interface from the OVS provider bridge
  • #29: When we stop the openvswitch service, the OVS bridges are no longer available
  • #31: Restart LinuxBridge agent, L3 agent and DHCP agent
  • #32: Restart linuxbridge and compute
  • #35: Cache: this cache is used by nova when connecting instances to bridges. To rebuild, hard reboot or add an interface.
  • #36: Binding failed: check /etc/default/neutron-server (Ubuntu)
  • #37: Does it actually pay off? In the stability and supportability department, definitely, but how about performance-wise?
  • #38: Benchmarks should be taken with a grain of salt, given the disparity of hardware and software being evaluated. YMMV.
  • #39: Here we have a simple file transfer speed comparison (more is better in this graph)
  • #40: GRE on OVS
  • #41: …and VLAN on linux bridge