Jeong Wook-jae
wjjung11@gmail.com
Data Center Network Architecture:
Towards a Cloud Data Center
Contents
- The Conventional Architecture & Problem
- The New Architecture
- The Monsoon Architecture
- The VL2 Architecture
- The SEATTLE Architecture
- The PortLand Architecture
- The TRILL
- Related Works
- Summary
- The CDCN (Cloud Data Center Network) Architecture Proposal
- Trend
Confidential
The Conventional Architecture
The conventional architecture for data centers (adapted from a figure by Cisco, 2004)
The Problems of a Conventional DC
Ethernet is hard to scale out
- STP
- Broadcast (ARP, RARP, DHCP…)
- Packet flooding in switches (for MAC learning)
Fragmentation of resources
No Performance Isolation
Poor server to server connectivity
Need very high reliability near top of the tree (Single Point of Failure)
The Problems of a Conventional DC
Fragmentation of Resources
- VLANs used to isolate properties from each other
- IP addresses topologically determined by ARs
- Reconfiguration of IPs and VLAN trunks
• painful, error-prone, slow, often manual
The Problems of a Conventional DC
No Performance Isolation
- VLANs typically provide only reachability isolation
- One service sending/receiving too much traffic hurts all services sharing its
subtree
The Problems of a Conventional DC
Poor server to server connectivity
- Data centers run two kinds of applications:
• Outward facing (serving web pages to users)
• Internal computation
- 70~80% of the packets stay inside the data center
The Problems of a Conventional DC
Monsoon
Albert Greenberg and four co-authors
(Microsoft Research)
The Monsoon Architecture
Monsoon
- A new network architecture, which scales and commoditizes data center networking.
Abstract
- Scale-out instead of Scale-up
- A single large Layer 2 domain
- Using programmable commodity layer 2 switches and servers.
- Two-level switch hierarchy:
• TOR(Top-Of-Rack) Switch => Access Switch
• LB(Load Balancing) Switch => Core Switch
- Scale to 100,000 servers or more.
The Monsoon Architecture
Objectives
- Low-Cost & Scale-out
- Uniform high capacity
• Capacity between two servers limited only by their NICs
• No need to consider topology when adding servers
- Performance isolation
• Traffic of one service should be unaffected by others
- Layer-2 semantics
• Flat addressing, so any server can have any IP address
• Server configuration is the same as in a LAN
• Legacy applications depending on broadcast must work
The Monsoon Architecture
Server-to-Server Forwarding
- An Example Monsoon Topology (Clos Network)
• A scale-out design with broad layers
- Same bisection BW at each layer -> no oversubscription
- Extensive path diversity -> Graceful degradation under failure
Switch      Up-link ports    Down-link ports    Number of switches
Inter. SW   N/A              144 × 10 Gbps      72
Aggr. SW    72 × 10 Gbps     72 × 10 Gbps       144
TOR SW      2 × 10 Gbps      20 × 1 Gbps        5,184
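The port counts in the example topology can be multiplied out as a sanity check. The sketch below assumes that every port listed is in use and that each server occupies one 1 Gbps TOR down-link port; the variable names are illustrative and not taken from the Monsoon paper.

```python
# Switch counts and port counts from the example Monsoon topology.
INTERMEDIATE_SWITCHES = 72    # 144 x 10 Gbps down-link ports each
AGGREGATION_SWITCHES = 144    # 72 up-link and 72 down-link ports each (10 Gbps)
TOR_SWITCHES = 5184           # 2 x 10 Gbps up-links, 20 x 1 Gbps down-links each

# Links between the intermediate and aggregation layers, counted from both ends.
inter_down = INTERMEDIATE_SWITCHES * 144
aggr_up = AGGREGATION_SWITCHES * 72

# Links between the aggregation and TOR layers, counted from both ends.
aggr_down = AGGREGATION_SWITCHES * 72
tor_up = TOR_SWITCHES * 2

# Assumption: one server per 1 Gbps TOR down-link port.
servers = TOR_SWITCHES * 20

# Per-TOR capacity in Gbps: up-links versus server-facing down-links.
tor_up_gbps = 2 * 10
tor_down_gbps = 20 * 1

print(inter_down, aggr_up)         # 10368 10368
print(aggr_down, tor_up)           # 10368 10368
print(servers)                     # 103680
print(tor_up_gbps, tor_down_gbps)  # 20 20
```

Under these assumptions the layers match port for port, each TOR has as much up-link as down-link capacity (no oversubscription), and the total is consistent with the "100,000 servers or more" target.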
The Monsoon Architecture
Clos Network Topology
- A multistage (e.g., 3-stage) switching network.
- The advantage
• The connection between a large number of input and output ports can be made by
using only small-sized switches.
• It can be shown that with k ≥ n (n inputs per first-stage switch, k middle-stage switches),
the Clos network can be non-blocking like a crossbar switch, provided existing connections
may be rearranged.
- Clos theorem: if k ≥ 2n − 1, then a new connection can always be added
without rearrangement.
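The two conditions can be written as a small classifier. This is a sketch that assumes the usual notation for a 3-stage Clos network (n inputs per first-stage switch, k middle-stage switches); the function name is made up for illustration.

```python
def clos_blocking_class(n: int, k: int) -> str:
    """Classify a 3-stage Clos network.

    n: input ports per first-stage switch (assumed notation)
    k: number of middle-stage switches (assumed notation)
    """
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive")
    if k >= 2 * n - 1:
        # Clos theorem: a new connection never needs a rearrangement.
        return "strict-sense non-blocking"
    if k >= n:
        # Non-blocking only if existing connections may be rerouted.
        return "rearrangeably non-blocking"
    return "blocking"


print(clos_blocking_class(4, 7))  # strict-sense non-blocking
print(clos_blocking_class(4, 4))  # rearrangeably non-blocking
print(clos_blocking_class(4, 3))  # blocking
```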
The Monsoon Architecture
Server-to-Server Forwarding
Valiant Load Balancing
• Every flow “bounced” off a random intermediate switch
• Provably hotspot-free for any admissible traffic matrix
• Servers could randomize flow-lets if needed
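The "bounce" can be illustrated with a toy path selector. This is only a sketch: the switch names, the per-flow uniform random choice, and the function name are assumptions for illustration, not the Monsoon implementation.

```python
import random


def vlb_path(src_tor: str, dst_tor: str, intermediates: list[str],
             rng: random.Random) -> list[str]:
    """Return a switch-level path for one flow under Valiant Load Balancing."""
    # Pick the intermediate switch uniformly at random, independent of the
    # destination, so that load spreads evenly for any traffic pattern.
    bounce = rng.choice(intermediates)
    return [src_tor, bounce, dst_tor]


rng = random.Random(0)
intermediates = [f"int-{i}" for i in range(72)]  # hypothetical switch names
paths = [vlb_path("tor-1", "tor-2", intermediates, rng) for _ in range(1000)]
used = {p[1] for p in paths}
print(f"{len(used)} distinct intermediate switches used by 1000 flows")
```

Each flow keeps a single intermediate switch, so packets within a flow stay in order while different flows between the same pair of TORs take different paths.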
The Monsoon Architecture
Valiant Load Balancing
The Monsoon Architecture
Server-to-Server Forwarding
- Encapsulation used to transfer complexity to servers
• Commodity switches have simple forwarding primitives
• Complexity moved to computing the headers
- Encapsulation available
• IEEE 802.1ah defines MAC-in-MAC encapsulation
Frame processing when packets go from one server to another in the same data center.
The Monsoon Architecture
Server-to-Server Forwarding
- Data center OSes already heavily modified for VMs, storage, etc.
• A thin shim for network support is no big deal
- Applications work with Application Addresses
• AAs are flat names; infrastructure addresses are invisible to apps
- No change to applications or clients outside DC
Figure: the networking stack of a host. The Monsoon Agent looks up remote IPs in the central directory.
The Monsoon Architecture
External Connection & Full Topology (Example)
- Routers do not support the Monsoon functions
- An Ingress Server is paired with each Access Router (AR)
• Implements the Monsoon functionality and acts as a gateway (GW) to the DC.
• Two interfaces: one to the AR and one to a TOR switch
• Default GW
Figure: each Access Router (AR) is paired with an Ingress Server.
The Monsoon Architecture
Directory System Performance
- Key issues:
• Lookup latency
• How many servers needed to handle a DC’s lookup traffic?
• Update latency
• Convergence latency
VL2
Albert Greenberg, Changhoon Kim, and seven co-authors
(Microsoft Research)
The VL2 Architecture
VL2 uses
- flat addressing to allow service instances to be placed anywhere in the network
- Valiant Load Balancing to spread traffic uniformly across network paths
- end system-based address resolution to scale to large server pools without introducing
complexity to the network control plane.
Objectives
- Uniform high capacity
- Performance isolation
- Layer-2 semantics
Topology
- Low-cost switches arranged in a Clos topology.
Traffic engineering
- Valiant Load Balancing
The VL2 Architecture
Building on proven networking technology
- Link-state routing
• To maintain the Switch-level topology
• Not end hosts’ information
- ECMP to enable VLB
Separating names from locators
- Hosting any service on any server.
- Addressing scheme
• AAs (application-specific addresses) and LAs (location-specific addresses)
• Directory system: maps names to locators.
• VL2 agent (in the host): a layer-2.5 shim that invokes the directory system’s resolution service.
Embracing end systems
- VL2 agent in host
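The split between names (AAs) and locators (LAs) can be sketched with a toy directory and host agent. Everything here is hypothetical: the class names, the in-memory dictionary, the agent-side cache, and the example addresses are assumptions for illustration, not the VL2 implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy stand-in for the directory system: maps an AA to an LA."""
    mappings: dict[str, str] = field(default_factory=dict)

    def update(self, aa: str, la: str) -> None:
        self.mappings[aa] = la

    def resolve(self, aa: str) -> str:
        return self.mappings[aa]


@dataclass
class Agent:
    """Toy host agent: resolves AAs via the directory and caches the result."""
    directory: Directory
    cache: dict[str, str] = field(default_factory=dict)

    def encapsulate(self, dst_aa: str, payload: bytes) -> tuple[str, str, bytes]:
        if dst_aa not in self.cache:  # caching is an assumption of this sketch
            self.cache[dst_aa] = self.directory.resolve(dst_aa)
        # The outer header carries the LA; the inner header keeps the AA.
        return (self.cache[dst_aa], dst_aa, payload)


directory = Directory()
directory.update("20.0.0.55", "10.0.0.6")  # hypothetical AA -> LA mapping
agent = Agent(directory)
print(agent.encapsulate("20.0.0.55", b"hello"))
# ('10.0.0.6', '20.0.0.55', b'hello')
```

Because applications only ever see the AA, moving a service to another server only requires updating the directory entry.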
The VL2 Architecture
Addressing
The VL2 Architecture
Routing
The VL2 Architecture
Potential issue for both ECMP and VLB
- Transient congestion on some links.
- To counter it, the agent can change the hash used to create the source address periodically, or
whenever TCP detects a severe congestion event (e.g., a full window loss) or an
Explicit Congestion Notification.
- Switches at the time of the VL2 paper (2009) supported only up to 16-way ECMP, with 256-way ECMP being
released by some vendors that year.
- Some inexpensive switches cannot correctly retrieve the five-tuple values when
a packet is encapsulated with multiple IP headers. Thus, the agent at the source
computes a hash of the five-tuple values and writes that value into the source
IP address field, which all switches do use in making ECMP forwarding
decisions.
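The hashing step described above can be sketched as follows. The choice of SHA-256, the `salt` parameter used to re-hash a flow, and the function name are assumptions made for this illustration; the slide does not say which hash the agent uses.

```python
import hashlib
import ipaddress


def five_tuple_hash_ip(src_ip: str, dst_ip: str, proto: int,
                       src_port: int, dst_port: int, salt: int = 0) -> str:
    """Map a flow's five-tuple to a 32-bit value rendered as an IPv4 address."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}|{salt}".encode()
    digest = hashlib.sha256(key).digest()
    # Keep the first 32 bits so the value fits the source IP address field.
    value = int.from_bytes(digest[:4], "big")
    return str(ipaddress.IPv4Address(value))


flow = ("20.0.0.55", "20.0.0.56", 6, 40000, 80)  # hypothetical TCP flow
first = five_tuple_hash_ip(*flow)
again = five_tuple_hash_ip(*flow)
rehashed = five_tuple_hash_ip(*flow, salt=1)  # e.g. after a congestion event

print(first == again)  # True: packets of one flow always hash the same way
print(first, rehashed)
```

Switches that hash only on the outer source address then spread flows across ECMP paths, and changing the salt moves a flow to a (very likely) different path.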
The VL2 Architecture
Discussion
- Cost & Scale
• The VL2 topology can scale to create networks with no oversubscription.
• Switches with 144 ports (D = 144) were available for $150K at the time of the VL2 paper (2009).
• Switches with 24 ports (D = 24) were available for $8K.
• Building a conventional network with no oversubscription would cost roughly 14× the
cost of an equivalent VL2 network with no oversubscription.
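The quoted prices are easier to compare per port. The sketch below only divides the prices on the slide by the port counts; it does not attempt to reproduce the 14× figure, which depends on topology details not given here.

```python
# Prices and port counts as quoted on the slide.
switches = {
    "D = 144": {"ports": 144, "price_usd": 150_000},
    "D = 24": {"ports": 24, "price_usd": 8_000},
}

for name, spec in switches.items():
    per_port = spec["price_usd"] / spec["ports"]
    print(f"{name}: ${per_port:,.2f} per port")
# D = 144: $1,041.67 per port
# D = 24: $333.33 per port
```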
SEATTLE
Changhoon Kim and two co-authors
(Princeton University)
The SEATTLE Architecture
Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises.
- In SIGCOMM, 2008.
Flat addressing of end-hosts
- Switches use hosts’ MAC addresses for routing
- Ensures zero-configuration and backwards-compatibility
Automated host discovery at the edge
- Switches detect the arrival/departure of hosts
- Obviates flooding and ensures scalability
Hash-based on-demand resolution
- Hash deterministically maps a host to a switch
- Switches resolve end-hosts’ location and address via hashing
- Ensures scalability
Shortest-path forwarding between switches
- Switches run link-state routing to maintain only switch-level topology (i.e., do
not disseminate end-host information)
- Ensures data-plane efficiency
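Hash-based resolution can be sketched with a consistent-hash ring over the switches. The ring construction, the use of SHA-256, and all names below are assumptions made for this illustration rather than details taken from the slide.

```python
import bisect
import hashlib


def _h(value: str) -> int:
    """Hash a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class ResolverRing:
    """Maps a host MAC to the switch that stores the host's location."""

    def __init__(self, switches: list[str]) -> None:
        if not switches:
            raise ValueError("at least one switch is required")
        self._ring = sorted((_h(s), s) for s in switches)
        self._keys = [k for k, _ in self._ring]

    def resolver_for(self, host_mac: str) -> str:
        # First switch clockwise from the host's hash, wrapping around.
        i = bisect.bisect_right(self._keys, _h(host_mac)) % len(self._ring)
        return self._ring[i][1]


ring = ResolverRing(["sw-a", "sw-b", "sw-c", "sw-d"])  # hypothetical switches
mac = "00:1a:2b:3c:4d:5e"
print(ring.resolver_for(mac) == ring.resolver_for(mac))  # True: deterministic
```

Any switch can compute the same resolver for a given MAC from the switch-level topology alone, which is what lets lookups replace flooding.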
The SEATTLE Architecture
Packet forwarding & Lookup
The SEATTLE Architecture
Packet forwarding & Lookup
PortLand
R. N. Mysore and seven co-authors
(University of California, San Diego)
The PortLand Architecture
Add a new host
Transfer a packet
Key features
- Layer 2 protocol based on a tree topology
- The PMAC (pseudo MAC) encodes position information
- Data forwarding proceeds based on the PMAC
- The edge switch is responsible for mapping between
PMAC and AMAC (actual MAC) by rewriting
- The fabric manager is responsible for address resolution
- The edge switch makes the PMAC invisible to the end host
- Each switch can identify its own position
- The fabric manager keeps information about the overall topology.
When a fault occurs, it notifies the affected nodes.
- PMAC (48 bits): pod(16).position(8).port(8).vmid(16)
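The PMAC layout can be made concrete with a small encoder and decoder. This sketch follows the pod(16).position(8).port(8).vmid(16) field widths given above; the function names and the example values are illustrative.

```python
def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack pod(16).position(8).port(8).vmid(16) into a 48-bit MAC string."""
    for name, value, bits in (("pod", pod, 16), ("position", position, 8),
                              ("port", port, 8), ("vmid", vmid, 16)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    raw = (pod << 32) | (position << 24) | (port << 16) | vmid
    return ":".join(f"{b:02x}" for b in raw.to_bytes(6, "big"))


def decode_pmac(pmac: str) -> tuple[int, int, int, int]:
    """Recover (pod, position, port, vmid) from a PMAC string."""
    raw = int(pmac.replace(":", ""), 16)
    return (raw >> 32, (raw >> 24) & 0xFF, (raw >> 16) & 0xFF, raw & 0xFFFF)


pmac = encode_pmac(pod=2, position=1, port=0, vmid=1)
print(pmac)               # 00:02:01:00:00:01
print(decode_pmac(pmac))  # (2, 1, 0, 1)
```

Because the pod and position sit in the high-order bits, a switch can forward on a prefix of the PMAC instead of keeping one entry per host.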
TRILL (RFC 5556)
Radia Perlman
(Sun Microsystems)
The TRILL
TRILL: Transparent Interconnection of Lots of Links
- TRILL is a new standard protocol to perform Layer 2 bridging with IS-IS link state routing
technology.
A simple idea
- Encapsulate native frames in a transport header providing a hop count.
- Route the encapsulated frames using IS-IS.
- Decapsulate the native frame before delivery.
Definitions
- RBridge - Routing Bridge
• A device which implements TRILL
- RBridge Campus
• A network of RBridges, links, and any intervening bridges, bounded by end stations and layer 3
routers.
The TRILL
Encapsulation & Header
TRILL Header – 64 bits
Nicknames - auto-configured 16-bit campus local names for RBridges
V = Version (2 bits)
R = Reserved (2 bits)
M = Multi-Destination (1 bit)
OpLng = Length of TRILL Options (5 bits)
Hop = Hop Limit (6 bits)
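The header fields can be packed into bytes to check that they add up to 64 bits. This is a sketch under stated assumptions: the 64-bit total is taken to include a 16-bit TRILL Ethertype, the egress nickname is placed before the ingress nickname, and the OpLng width of 5 bits follows from the other flag fields filling a 16-bit word. These details come from the published TRILL base protocol as understood here, not from the slide.

```python
import struct

TRILL_ETHERTYPE = 0x22F3  # assumption: Ethertype value assigned to TRILL


def pack_trill_header(egress: int, ingress: int, hop_limit: int,
                      multi_dest: bool = False, version: int = 0,
                      op_len: int = 0) -> bytes:
    """Pack Ethertype, flag word, egress nickname and ingress nickname."""
    if not (0 <= version < 4 and 0 <= op_len < 32 and 0 <= hop_limit < 64):
        raise ValueError("flag field out of range")
    if not (0 <= egress < 0x10000 and 0 <= ingress < 0x10000):
        raise ValueError("nickname does not fit in 16 bits")
    # V(2) | R(2) | M(1) | OpLng(5) | Hop(6); reserved bits stay zero.
    flags = (version << 14) | (int(multi_dest) << 11) | (op_len << 6) | hop_limit
    return struct.pack("!HHHH", TRILL_ETHERTYPE, flags, egress, ingress)


hdr = pack_trill_header(egress=0x0002, ingress=0x0001, hop_limit=20)
print(len(hdr) * 8)  # 64
print(hdr.hex())     # 22f3001400020001
```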
The TRILL
Packet Routing
- ESADI (End Station Address Distribution Information protocol)
Related Works & Summary
Related Works
OpenFlow
- Shares the idea of simple switches controlled by external software
- Monsoon and VL2 are philosophies for how to use the switches
Brocade: Brocade One (TRILL, Clos network, DCB)
Cisco: FabricPath (TRILL)
Juniper: QFabric (HW & FC)
Summary
Comparison of Data Center Network Architectures

Architecture  Org.                            Publishing              Authors
Monsoon       MS Research                     SIGCOMM 2008            Albert Greenberg…
VL2           MS Research                     SIGCOMM 2009            Albert Greenberg, Changhoon Kim…
SEATTLE       Princeton University            SIGCOMM 2008            Changhoon Kim…
Fat-Tree      Univ. of California, San Diego  SIGCOMM 2008            M. Al-Fares…
PortLand      Univ. of California, San Diego  SIGCOMM 2009            R.N. Mysore…
SPAIN         HP                              NSDI 2010               J. Mudigonda, M. Al-Fares…
MOOSE         Univ. of Cambridge              DC CAVES Workshop 2009  M. Scott…
TRILL         -                               RFC 5556 (2009)         Radia Perlman
DCell         MS Research Asia                SIGCOMM 2008            C. Guo…
BCube         MS Research Asia                SIGCOMM 2009            C. Guo…
MDCube        MS Research Asia                CoNEXT 2009             H. Wu, C. Guo…

Remaining rows, with values in slide order from left to right (empty cells omitted):
Topology: Clos Network, Clos Network, N/A, Fat-Tree, Fat-Tree, N/A, N/A, N/A, BCube Topology
Packetizing: MAC-in-MAC (802.1ah PBB), IP-in-IP, IP-in-IP(?), IP rewriting, MAC rewriting (PMAC), MAC rewriting, TRILL Hdr
Load Spreading: MAC-Rotation, ECMP, ECMP, ECMP, ECMP
Multi-path: O, O, X, O, O, O, X, O
Mod. of End-Host?: O, O, X, X, X, O, X, X, O
Mod. of switches?: O, X, O, O (Special HW), O (Special HW), X, O (RBridge), △
ARP: Directory Server, Directory Server, DHT on the switches, Fabric Manager, ESADI
Traffic Engineering is …
Thank you.


Editor's Notes

  • #20: RSM : Replication Server Manager