Routing on The Host: Concepts and Case Studies
Ivan Pepelnjak, Dinesh G Dutt
July 21, 2016
Agenda
Introduction
Use Cases
Detailed Design with Case Studies
Summary
July 26, 2016 cumulusnetworks.com 2
Routing on the Host
• Extend the L3 fabric all the way down to the host by running Cumulus’ routing app (Cumulus Quagga) on the host
• Advertise a /32 from the host
• Automatic segmentation and L3 convergence for reachability
[Figure: leaf-spine fabric with Layer 3 routing and ECMP (OSPF or BGP) extended down to the hosts; a host advertises 1.1.1.1/32]
• Applications and Servers Are the Last Bastion of Bridging
How Bridging Plays A Role in Application Design
Service or node discovery relies on broadcast
Cluster heartbeat uses multicast
Assumptions about being in a single subnet
VM Mobility continued this trend
Reasons Why Bridging Is How Compute Folks Think About Networks
In the wild west days, IP routing was a low-performance, high-cost solution, since L2 switching was done in hardware
Vendors still charge extra for L3 licenses on the same box:
• BGP costs even more money than OSPF
No good routing protocol stack on the host
L3 is considered complex to configure and troubleshoot compared to (mythical) plug-and-play L2
So What’s Changed?
Two Key Factors
• Modern DC applications
• Open networking
Routing Protocol Suite on Host
Many high-quality open source routing suites are now available for the host
• Cumulus Quagga
• BIRD
Commercial offerings are also arriving:
• Windows Server 2012
•Use Cases
10 © ipSpace.net 2016 Routing on Hosts
Basics: Routing on Servers
Run a routing protocol on servers
• Advertise loopback IP addresses or host-specific prefixes to the network
• Easy to migrate IP addresses between servers
[Figure: a server advertises its loopback IP address to the network with BGP]
Routing on Servers: Use Cases
• Anycast services (examples: DNS, logging, load balancing…)
• A cluster member advertises service-specific IP address
• Large-scale virtual environments (Project Calico, Mesos, containers…)
• Workload mobility and disaster recovery scenarios
Routing on Servers: Anycast
• All servers advertise the same IP address
• ECMP load balancing of network-to-server traffic
• Direct server return to avoid NAT and centralized state
• Ideal solution for stateless services (example: UDP)
• Potential disruption of individual TCP sessions after topology changes
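The last bullet can be illustrated with a small simulation. This is a minimal sketch, assuming the network picks a next hop by hashing the flow 5-tuple modulo the number of equal-cost paths; real switches use their own hash functions, and the server names and addresses below are hypothetical.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Pick one next hop for a flow by hashing its 5-tuple (illustrative ECMP)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:8], "big") % len(next_hops)]

# Hypothetical servers that all advertise the same anycast address.
servers = ["server1", "server2", "server3", "server4"]

# 5-tuples: (src IP, dst IP, protocol, src port, dst port); example values only.
flows = [("192.0.2.10", "10.1.20.11", "tcp", 40000 + i, 53) for i in range(1000)]

before = {flow: pick_next_hop(flow, servers) for flow in flows}

# server4 fails and withdraws its route, so the ECMP group shrinks.
survivors = servers[:-1]
after = {flow: pick_next_hop(flow, survivors) for flow in flows}

# Flows that change servers even though their original server is still alive.
moved = [f for f in flows if before[f] != "server4" and before[f] != after[f]]
print(f"{len(moved)} of {len(flows)} flows moved between surviving servers")
```

With a plain modulo hash, many flows that were never on the failed server are remapped as well, which is why established TCP sessions can break while stateless request/response traffic is unaffected.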
Routing on Servers: Clustering
Assumption: cluster services are tied to fixed IP addresses (bad idea, but still commonly used)
• Cluster members advertise IP addresses of active services
• Routing protocol announcement is revoked (or removed) after cluster member or service failure
• Another cluster member takes over and advertises service IP address
Routing on Servers: Large-Scale Virtualization
Each host acts as a PE-router
• Advertises prefix assigned to it (numerous containers per prefix)
• Alternative: advertise a host route for every VM/container
• Example: Project Calico
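The trade-off between the two advertisement styles can be made concrete by counting routes. This is a rough sketch with a hypothetical addressing plan (a /20 per pod, a /26 per host, 50 containers per host); none of these numbers come from the slides.

```python
import ipaddress

# Hypothetical addressing plan: one /26 per host, carved from a pod-wide /20.
pod_block = ipaddress.ip_network("10.1.0.0/20")
host_prefixes = list(pod_block.subnets(new_prefix=26))
containers_per_host = 50

# Option 1: each host advertises only the prefix assigned to it.
routes_with_prefixes = len(host_prefixes)

# Option 2: each host advertises a /32 host route for every container.
container_routes = [
    ipaddress.ip_network(f"{ip}/32")
    for prefix in host_prefixes
    for ip in list(prefix.hosts())[:containers_per_host]
]
routes_with_host_routes = len(container_routes)

print("hosts:", len(host_prefixes))
print("routes when advertising per-host prefixes:", routes_with_prefixes)
print("routes when advertising per-container host routes:", routes_with_host_routes)
```

Per-host prefixes keep the fabric routing table proportional to the number of hosts, while per-container host routes make it proportional to the number of containers but let an address move freely between hosts.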
Routing Protocol Selection
Don’t use a link-state routing protocol across administrative boundaries
• No filtering or policy enforcement
• Updates generated by any node are flooded across the network before they’re evaluated
• OSPF areas don’t really help (exception: stub or NSSA areas)
Technology options: RIP or BGP (preferred)
Routing Protocol Selection: BGP
• Use BGP between servers and leaf switches
• Easy-to-implement filters and prefix limits on ToR switches
Ideal implementations
• Dynamic BGP neighbors on leaf switches
• BGP over unnumbered links (available in recent Quagga code)
•Configuration Deep Dive
Supported Servers
Red Hat 7
• Includes CentOS and Fedora
Ubuntu
• 12.04, 14.04 and 16.04 LTS releases
Docker
• Run as a container
• Easy consumption model for other server OSes
BGP Choices
Use of ASN
• One per server OR one shared by all servers
Use of remote-as external
Defending the ToR
BGP was designed to work across trust boundaries
• Multiple knobs to control this
Use the following knobs on the ToR:
• Accept only locally originated routes from the server
• Limit the number of routes accepted from the server
• Announce only the default route to the server, along with the loopback of the ToR
  • This is more to keep the routing tables on the server intact
• Further limit which prefixes are accepted if you know the list
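The inbound knobs can be modelled in a few lines to show how they combine. This is a minimal sketch of the policy logic only, not how any BGP daemon implements it; the `Route` type, the function name, and the choice to raise an exception when the limit is exceeded are all illustrative assumptions.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str      # e.g. "10.1.20.11/32"
    as_path: tuple   # e.g. (65535,)

class PrefixLimitExceeded(Exception):
    """Raised when a server sends more acceptable routes than allowed."""

def filter_server_routes(routes, server_asn, max_prefixes, allowed_blocks=None):
    """Apply the ToR's inbound policy to routes received from one server."""
    accepted = []
    for route in routes:
        # Accept only routes the server originated: the AS path is just its ASN.
        if route.as_path != (server_asn,):
            continue
        # Optionally accept only prefixes that fall inside known address blocks.
        net = ipaddress.ip_network(route.prefix)
        if allowed_blocks is not None and not any(
            net.subnet_of(ipaddress.ip_network(block)) for block in allowed_blocks
        ):
            continue
        accepted.append(route)
    # Limit the number of routes accepted from the server.
    if len(accepted) > max_prefixes:
        raise PrefixLimitExceeded(f"{len(accepted)} routes exceed limit {max_prefixes}")
    return accepted

received = [
    Route("10.1.20.11/32", (65535,)),        # server loopback: accepted
    Route("0.0.0.0/0", (65535, 64512)),      # not locally originated: dropped
    Route("192.0.2.0/24", (65535,)),         # outside the known block: dropped
]
accepted = filter_server_routes(received, 65535, 10, allowed_blocks=["10.1.20.0/24"])
print([route.prefix for route in accepted])
```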
Case Study 1: Replacing MLAG
• Replace multiple protocols with BGP
• MLAG, STP variations, FHRP
• Fewer ports
• No peer link required
• Simpler failure model
• Switch maintenance windows need no coordination with servers
• Routing protocols support graceful turning off of links
• Ability to connect to more than 2 ToRs
[Figure: two diagrams of server 10.1.20.11/32 dual-attached to ToR switches 10.254.0.1 and 10.254.0.2]
Sample Configs; Dual Attach Servers
Server (loopback 10.1.20.11/32):
log file /var/log/quagga/quagga.log
router bgp 65535
 bgp router-id 10.1.20.11
 bgp bestpath as-path multipath-relax
 neighbor TOR peer-group
 neighbor TOR remote-as external
 neighbor TOR capability extended-nexthop
 neighbor eth1 interface peer-group TOR
 neighbor eth2 interface peer-group TOR
 network 10.1.20.11/32

ToR (loopback 10.254.0.2):
log file /var/log/quagga/quagga.log
! Placeholder ASN (assumption): the slide shows 65535 here as well, but
! remote-as external requires the ToR ASN to differ from the server's 65535
router bgp 65534
 bgp router-id 10.254.0.2
 neighbor SRVR peer-group
 neighbor SRVR remote-as external
 neighbor SRVR capability extended-nexthop
 neighbor SRVR route-map INROUTE in
 neighbor SRVR route-map OUTROUTE out
 neighbor SRVR maximum-prefix 10
 neighbor SRVR default-originate
 ! The slide attaches swp1 to peer-group TOR, which is not defined on this switch
 neighbor swp1 interface peer-group SRVR
 network 10.254.0.2/32
!
ip prefix-list DEFONLY seq 5 permit 0.0.0.0/0
! Matches routes originated by the server's AS and nothing else
ip as-path access-list 1 permit ^65535$
!
route-map INROUTE permit 10
 match as-path 1
route-map OUTROUTE permit 10
 match ip address prefix-list DEFONLY
Some Observations on Dual-Attach
Can’t use the dynamic BGP neighbor model, since the server has a single IP address
• BGP runs over TCP and so needs a separate IP address per interface to peer with each neighbor
BGP unnumbered simplifies the configuration
Caveats on Dual-Attach Servers
Works well with static IP addressing
With DHCP, ensure the DHCP server hands out the same IP address to both interfaces
• Or else use dhclient-exit-hooks to configure the loopback
PXE boot doesn’t work yet
• Solution in the works
Case Study: Dual-Attach Servers Without Vendor Lock-in
Dual-attach an Ubuntu server to Cumulus and Arista switches
Uses IPv4 link-local addresses
Advertises loopback IP
Receives default route from both
[Figure: server1 (loopback 10.1.20.11/32) connected to ToRs 10.254.0.1 and 10.254.0.2 over point-to-point links addressed 169.254.0.2/31 and 169.254.0.3/31, and 169.254.0.4/31 and 169.254.0.5/31]
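The addressing in this case study can be sanity-checked with Python's standard `ipaddress` module. This is a small sketch: the pairing of addresses into links follows from /31 arithmetic, while the link names and the helper function are made up for illustration, and the slide does not say which end of each link is the server.

```python
import ipaddress

# Interface addresses taken from the diagram; link names are hypothetical.
links = {
    "link A": ("169.254.0.2/31", "169.254.0.3/31"),
    "link B": ("169.254.0.4/31", "169.254.0.5/31"),
}

def same_p2p_link(a, b):
    """True if two interface addresses are the two ends of one /31 link."""
    ia, ib = ipaddress.ip_interface(a), ipaddress.ip_interface(b)
    return ia.network == ib.network and ia.ip != ib.ip

for name, (end1, end2) in links.items():
    link_local = all(ipaddress.ip_interface(e).ip.is_link_local for e in (end1, end2))
    print(name, "same /31:", same_p2p_link(end1, end2), "link-local:", link_local)
```

Because the link addresses come from 169.254.0.0/16, they can be reused on every server-to-ToR link; only the loopback /32 needs to be unique and advertised.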
Case Study: Use with Containers
Replace Docker’s NAT model
• Improve performance
• Improve transparency
Use case valid within the DC
• Docker’s model is geared towards cloud deployment
Use higher-performance Linux drivers
• IPVLAN and MACVLAN instead of bridge or overlay
Case Study: Anycast
It is common in the DC to use distributed virtual services
Distributed services use a common anycast address
How do you announce this address?
• Current model is to have the ToR or some other entity announce it, and use a set nexthop route-map to ensure
Sysctls: The Missing Manual For OSPF on Servers
If OSPF is the chosen routing protocol, set the following sysctls to allow ospfd to receive multicast frames:
• net.ipv4.conf.all.rp_filter = 0
• net.ipv4.conf.default.rp_filter = 0
• net.ipv4.conf.lo.rp_filter = 0
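One way to keep these settings consistent across servers is to generate them. This is a minimal sketch, assuming the settings are written to a sysctl drop-in file; the function names are hypothetical, and the simple key-to-path mapping only holds for keys like the three above (interface names containing dots would need extra handling).

```python
def render_rp_filter_conf(scopes=("all", "default", "lo")):
    """Build sysctl lines that disable reverse-path filtering for each scope."""
    lines = [f"net.ipv4.conf.{scope}.rp_filter = 0" for scope in scopes]
    return "\n".join(lines) + "\n"

def proc_path(key):
    """Map a sysctl key to its /proc/sys path (dots become slashes)."""
    return "/proc/sys/" + key.replace(".", "/")

conf_text = render_rp_filter_conf()
print(conf_text, end="")

# Where the running kernel exposes each value, for verification on a host.
for line in conf_text.splitlines():
    key = line.split("=")[0].strip()
    print(proc_path(key))
```

The rendered text could be saved as a file under /etc/sysctl.d/ (the file name is up to you) and loaded with `sysctl --system`.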
Scaling
Summarization is difficult in a Clos network with BGP
~32K containers in production today
Modern switching silicon has upwards of 128K IPv4 routing entries
[Figure: spine and leaf layers]
Connectivity To Outside World
[Figure: pods, a border pod, and the Internet, with the annotation "Summarize here"]
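The effect of summarizing where the fabric meets the outside world can be sketched as follows. This assumes, hypothetically, that each pod draws its host routes from one aggregate block (10.1.0.0/16 here); the addresses and the function name are illustrative and not taken from the slides.

```python
import ipaddress

# Hypothetical aggregate assigned to one pod.
pod_aggregates = [ipaddress.ip_network("10.1.0.0/16")]

# Host routes learned inside the fabric: 4 racks x 40 servers, one /32 each,
# plus one route that no aggregate covers.
host_routes = [
    ipaddress.ip_network(f"10.1.{rack}.{host}/32")
    for rack in range(4)
    for host in range(1, 41)
]
host_routes.append(ipaddress.ip_network("10.2.0.5/32"))

def summarize_at_border(routes, aggregates):
    """Replace every route covered by an aggregate with the aggregate itself."""
    advertised = set()
    for route in routes:
        covering = [agg for agg in aggregates if route.subnet_of(agg)]
        advertised.add(covering[0] if covering else route)
    return sorted(advertised)

outside_view = summarize_at_border(host_routes, pod_aggregates)
print(len(host_routes), "routes inside the fabric")
print([str(net) for net in outside_view], "advertised to the outside")
```

Inside the pod the /32 routes stay visible so that ECMP and host mobility keep working; only the routes sent outward are collapsed.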
Running Quagga As Container
• Cumulus Quagga can also be run as a container
• docker run -t -i -d --net=host --privileged=true --name=quagga_docker cumulusnetworks/quagga:xenial-latest
• Runs as a privileged container to modify routing tables, receive netlink notifications, etc.
• --net=host allows full access to host routing tables, interfaces, etc.
• https://github.com/CumulusNetworks/cldemo-docker-quagga
• For sample Ansible playbooks
Resources
Try the setup out on your laptop with these resources:
• https://cumulusnetworks.com/routing-on-the-host/
• https://cumulusnetworks.com/cumulus-vx/
• https://www.virtualbox.org/
• https://www.vagrantup.com/
Summary
A new breed of applications is ushering in a rethink of how networking is done in the DC
Building pure L3 fabrics is real
• Networks, compute and applications are showing how to do this
• Standards-based, robust, scalable design
Cumulus Quagga is a high-quality routing suite used in production in many places
• Community and enterprise supported
• Available for Docker, Ubuntu and Red Hat
Next Month’s Webinar
• Tips and Tricks of Network Automation
• Guest speaker: to be announced
• When: August 30
© 2016 Cumulus Networks. Cumulus Networks, the Cumulus Networks Logo, and Cumulus Linux are trademarks or registered trademarks of Cumulus Networks, Inc. or its affiliates in
the U.S. and other countries. Other names may be trademarks of their respective owners. The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive
licensee of Linus Torvalds, owner of the mark on a world-wide basis.
Thank You!
