Designing Multi-Tenant Data Centers using EVPN-IRB
Neeraj	Malhotra,	Principal	Engineer,	Cisco
nmalhotr@cisco.com
Objectives
Architecture	Objectives	– Evolving	DC	Requirements
• Operational	simplicity	via	uniform	control,	data	plane	across	L2,	L3,	DC,	WAN
• Flexible	workload	placement	and	mobility	within	DC	and	across	DCs
• Efficient	bandwidth	utilization	within	DC	– no	flood	and	learn,	ECMP
• Traffic	engineering	- traffic	steering,	ECMP,	FRR
• Horizontal	Scaling
• Multi-tenancy	with	L2	and	L3	VPN	in	DC
• Interworking	with	Legacy	L3VPN	/	L2VPN	WAN
A DC network fabric must be seamless and act like a single switch / router.
[Figure: leaf-spine fabric. Leaf-1 through Leaf-5 (and Leaf-x, Leaf-x+1) connect to the spines; VMs attach to the leafs in BD-1 and BD-2.]
Why	not	VPLS?
Why not use VPLS in DC? It was simply not designed for the DC use-case:
• L2 only
• No all-active redundancy
• No per-flow ECMP load-balancing
• Flood-and-learn MAC learning is sub-optimal
What	is	the	Solution?
Fabric	Solution	Components
EVPN IRB DC Fabric:
• BGP-EVPN overlay
• Overlay distributed IRB
• VM mobility and any-cast L3 GW
• MPLS or IP underlay
IP	or	MPLS	Underlay
Underlay	vs.	Overlay
Underlay	=	Transport
Physical	Network
IP,	MPLS	/	SR	Transport
Traffic	Steering,	ECMP,	FRR,.....
Overlay =	VPN	(L2+L3)
Control	Plane	– EVPN
Data	Plane	– MPLS,	VXLAN,.....
Policy	Driven
Overlay	Control	Plane	– BGP	EVPN
BGP	EVPN	– EVI
EVI: An EVPN instance extends Layer 2 between the Leafs.
• An EVI is extended over the BGP-EVPN fabric to all the Leafs belonging to the EVI
• Leafs that don’t belong to a specific EVI will not have a MAC-VRF for that EVI, providing efficient scalability
[Figure: leaf-spine fabric with VMs attached to leafs in EVI 10 and EVI 20.]
BGP	EVPN	– Host	Connectivity	Options,	ESI
Single Home Device (SHD): single-homed host
• Ethernet Segment Identifier (ESI) ‘0’
• No DF election
Multi-home Device (MHD), All-Active (per-flow) LB: multi-homing with link bundling
• Identical ESI on Leafs
• Per-VLAN DF election
[Figure: leaf-spine fabric with single-homed VMs on ESI-0 and a multi-homed VM on ESI-1.]
BGP	EVPN	– MAC	and	IP	Learning
• MAC/IP addresses are advertised along with L2 and L3 VPN encap (MPLS label or VNID) to the rest of the Leafs via MAC+IP RT-2
• IP prefix routes are advertised over BGP EVPN via RT-5
• Leafs learn hosts from the data plane, ARP, and ND
[Figure: VMs attached to leafs; the spines act as route reflectors (RR).]
EVPN Route Type 2 carries MAC and IP reachability with L2+L3 VPN encapsulation and L2+L3 RTs. Its fields:
• RD
• Ethernet Segment Identifier
• Ethernet Tag ID
• MAC Address Length
• MAC Address
• IP Address Length
• IP Address
• MPLS Label1
• MPLS Label2
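The RT-2 fields listed on this slide can be pictured as a simple record. The Python sketch below is illustrative only: the class name `MacIpRoute`, the field names, and all sample values are assumptions made for this example, and it models the route's contents rather than its on-the-wire encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MacIpRoute:
    """Illustrative model of an EVPN MAC/IP Advertisement route (RT-2)."""
    rd: str                       # Route Distinguisher of the advertising leaf
    esi: int                      # Ethernet Segment Identifier (0 = single-homed)
    ethernet_tag: int             # Ethernet Tag ID
    mac: str                      # host MAC address
    ip: Optional[str]             # host IP address; None for a MAC-only route
    label1: int                   # L2 VPN encapsulation (MPLS label or VNI)
    label2: Optional[int] = None  # L3 VPN encapsulation, when advertised

    def is_mac_ip(self) -> bool:
        """True when the route carries both MAC and IP reachability."""
        return self.ip is not None


# Hypothetical values for a host behind one leaf.
route = MacIpRoute(rd="192.0.2.4:10", esi=0, ethernet_tag=0,
                   mac="00:00:5e:00:53:0b", ip="10.0.2.11",
                   label1=24001, label2=24100)
print(route.is_mac_ip())
```

Keeping both labels in one record mirrors the point of the slide: a single route carries the information needed for bridging (MAC, Label1) and for routing (IP, Label2).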
BGP	EVPN	– Load	Balancing	via	Aliasing
Challenge: How to load-balance traffic towards a multi-homed device across multiple Leafs when MAC addresses are learnt by only a single Leaf?
EVPN Route Type 1 advertises ESI reachability per-EVI to enable MAC ECMP without an explicit MAC route advertisement. Its fields:
• RD
• Ethernet Segment Identifier
• Ethernet Tag ID
• MPLS VPN Label
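Aliasing can be sketched as a lookup that resolves a MAC through its ESI rather than through the leaf that advertised the MAC. The following Python sketch is a simplified model, not a BGP implementation; the table layout and the function names (`learn_ad_route`, `learn_mac_route`, `ecmp_next_hops`) are assumptions for illustration.

```python
ad_per_evi = {}   # (evi, esi) -> set of leafs that advertised a per-EVI RT-1
mac_routes = {}   # (evi, mac) -> (esi, leaf that advertised the RT-2)


def learn_ad_route(evi, esi, leaf):
    """Record a per-EVI Ethernet A-D route (RT-1) from a leaf."""
    ad_per_evi.setdefault((evi, esi), set()).add(leaf)


def learn_mac_route(evi, mac, esi, leaf):
    """Record a MAC route (RT-2) together with the ESI it was learned on."""
    mac_routes[(evi, mac)] = (esi, leaf)


def ecmp_next_hops(evi, mac):
    """Return the leafs usable to reach a MAC, applying aliasing on multi-homed ESIs."""
    esi, advertiser = mac_routes[(evi, mac)]
    if esi == 0:
        # Single-homed host: only the advertising leaf is a valid path.
        return [advertiser]
    # Multi-homed host: every leaf that advertised the ESI for this EVI is a
    # valid path, even if it never advertised the MAC itself.
    return sorted(ad_per_evi.get((evi, esi), set()))


learn_ad_route(10, 1, "Leaf-1")
learn_ad_route(10, 1, "Leaf-2")
learn_mac_route(10, "MAC-1", 1, "Leaf-1")   # only Leaf-1 learned the MAC
learn_mac_route(10, "MAC-2", 0, "Leaf-3")
print(ecmp_next_hops(10, "MAC-1"))
```

In this model, MAC-1 resolves to both Leaf-1 and Leaf-2 although only Leaf-1 advertised it, which is the load-balancing behaviour the slide describes.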
BGP	EVPN	– Fast	Convergence	via	Mass-Withdraw
Challenge: How to inform other Leafs of a failure affecting many MAC addresses quickly while the control-plane re-converges?
EVPN Route Type 1 also advertises ESI reachability globally for ALL EVIs to enable MAC-independent convergence on ESI failure. Its fields:
• RD
• Ethernet Segment Identifier
• Ethernet Tag ID = ALL FF
• MPLS Label
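The effect of mass-withdraw on a remote leaf can be sketched as one update that touches every MAC behind the failed segment. This Python sketch uses a hypothetical MAC table layout and a hypothetical function name (`withdraw_es_route`); real implementations typically update a shared path list instead of walking MAC entries.

```python
# MAC table on a remote leaf: mac -> (esi, set of next-hop leafs).
mac_table = {
    "MAC-1": (1, {"Leaf-1", "Leaf-2"}),
    "MAC-2": (1, {"Leaf-1", "Leaf-2"}),
    "MAC-3": (0, {"Leaf-3"}),
}


def withdraw_es_route(table, esi, failed_leaf):
    """Handle withdrawal of the per-ES RT-1 from one leaf.

    The failed leaf is removed from every MAC on that ESI in one pass,
    without waiting for individual MAC route withdrawals.
    Returns the number of MAC entries that were updated.
    """
    touched = 0
    for mac_esi, next_hops in table.values():
        if mac_esi == esi and failed_leaf in next_hops:
            next_hops.discard(failed_leaf)
            touched += 1
    return touched


touched = withdraw_es_route(mac_table, esi=1, failed_leaf="Leaf-1")
print(touched, sorted(mac_table["MAC-1"][1]))
```

A single withdrawal moves all traffic for ESI 1 to the surviving leaf, independent of how many MACs sit behind the segment.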
BGP	EVPN	– Multi-destination	traffic
Challenge: How to distribute BUM traffic across an EVPN instance?
[Figure: VMs attached to Leaf-1 through Leaf-4.]
EVPN Route Type 3 + PMSI attribute: an Inclusive Multicast route with a PMSI attribute signals participation in an EVPN’s flood list.
Route Type 3 fields:
• RD
• Ethernet Tag ID
• IP Address Length
• Originating Router’s IP address
PMSI attribute fields:
• Flags
• Tunnel Type
• BUM VPN Label
• Tunnel ID / TEP IP
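A flood list built from RT-3 routes can be sketched as a per-EVI map of remote leafs. The Python sketch below assumes ingress replication; the function names (`on_imet_route`, `on_imet_withdraw`, `replicate_bum`) and the label values are hypothetical.

```python
flood_list = {}   # evi -> {originating leaf IP: BUM label advertised in the PMSI attribute}


def on_imet_route(evi, originator_ip, bum_label):
    """Add a leaf to the EVI's flood list when its RT-3 is received."""
    flood_list.setdefault(evi, {})[originator_ip] = bum_label


def on_imet_withdraw(evi, originator_ip):
    """Remove a leaf from the EVI's flood list when its RT-3 is withdrawn."""
    flood_list.get(evi, {}).pop(originator_ip, None)


def replicate_bum(evi, frame):
    """Ingress replication: one copy of the frame per remote leaf in the flood list."""
    return [(ip, label, frame)
            for ip, label in sorted(flood_list.get(evi, {}).items())]


on_imet_route(10, "192.0.2.1", 30010)
on_imet_route(10, "192.0.2.3", 30030)
on_imet_route(20, "192.0.2.4", 30040)
copies = replicate_bum(10, "ARP request")
print(copies)
```

Only leafs that advertised an RT-3 for EVI 10 receive a copy; the leaf that participates only in EVI 20 is not flooded to.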
BGP	EVPN	- Designated	Forwarder	(DF)
Challenge: How to prevent duplicate copies of flooded traffic from being delivered to a multi-homed Ethernet Segment?
EVPN Route Type 4 enables ESI discovery and DF election. Its fields:
• RD
• Ethernet Segment Identifier
• IP Address Length
• Originating Router’s IP address
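The slide does not say which DF election algorithm is used. As an illustration, the Python sketch below assumes the default modulo-based procedure from RFC 7432: the leafs discovered on a segment via RT-4 are ordered by IP address, and the leaf at position (VLAN mod N) becomes the DF for that VLAN. The function name `elect_df` and the addresses are hypothetical.

```python
import ipaddress


def elect_df(pe_ips, vlan):
    """Return the DF for a VLAN on one Ethernet Segment.

    pe_ips: originator IPs of all leafs that advertised an RT-4 for the ESI.
    The leafs are ordered by numeric IP value and the DF is the one at
    index (vlan mod N), where N is the number of leafs on the segment.
    """
    ordered = sorted(pe_ips, key=ipaddress.ip_address)
    return ordered[vlan % len(ordered)]


leafs_on_esi1 = ["192.0.2.10", "192.0.2.9"]
print(elect_df(leafs_on_esi1, 100), elect_df(leafs_on_esi1, 101))
```

Sorting by numeric value (not as strings) matters here: 192.0.2.9 orders before 192.0.2.10. Because the result depends on the VLAN, the DF role is spread across the leafs on the segment, which matches the per-VLAN DF election mentioned earlier in the deck.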
BGP	EVPN	- Split	Horizon	Group	Filtering
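Challenge: How to prevent flooded traffic from echoing back to a multi-homed Ethernet Segment?
[Figure: a VM multi-homed on ESI-1 to Leaf-1 and Leaf-2; without filtering, traffic flooded by one leaf is echoed back to the VM through the other leaf.]
The per-ESI SHG Label EXT-COMM carried with EVPN RT-1 enables SHG filtering to cut potential loops back to the same ESI.
Label stack on flooded frames: BUM Label, SH Label.
ESI Label EXT-COMM fields:
• Type 0x06
• Sub-type 0x01
• Flags
• Reserved
• ESI MPLS Label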
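Split-horizon filtering on the receiving leaf can be sketched as a port filter keyed by the SH label. This Python sketch assumes MPLS encapsulation, where the ingress leaf pushes the SH label that the receiving leaf advertised for the source segment; the function name `egress_ports_for_bum`, the port names, and the label value are hypothetical.

```python
def egress_ports_for_bum(local_ports, sh_label, esi_label_map):
    """Return the local ports that may receive a flooded frame from the fabric.

    local_ports:   port name -> ESI of the attached segment (0 for single-homed)
    sh_label:      split-horizon label carried by the frame, or None
    esi_label_map: SH label this leaf advertised -> ESI it identifies
    """
    source_esi = esi_label_map.get(sh_label) if sh_label is not None else None
    return sorted(port for port, esi in local_ports.items()
                  if source_esi is None or esi != source_esi)


leaf2_ports = {"BE1": 1, "Gi0/1": 0}   # BE1 attaches ESI-1, Gi0/1 is single-homed
leaf2_labels = {30001: 1}              # SH label advertised by this leaf for ESI-1

# Frame that entered the fabric on ESI-1 at the other leaf: must not return to ESI-1.
print(egress_ports_for_bum(leaf2_ports, 30001, leaf2_labels))
# Frame from a segment this leaf does not share: flooded on all local ports.
print(egress_ports_for_bum(leaf2_ports, None, leaf2_labels))
```

The sketch covers only the SHG check; DF filtering, which also gates flooding towards a multi-homed segment, is left out for brevity.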
VM	Mobility	– MAC	+	IP
Challenge: How to detect the correct location of a MAC after a host moves from one Ethernet Segment to another (also called a “MAC move”)?
The Mobility EXT-COMM carried with EVPN RT-2 holds the MAC+IP route sequence number to enable MAC mobility. Its fields:
• Type 0x06
• Sub-type 0x00
• Reserved
• Sequence Number
Before the host move (host IP-1 / MAC-1 behind Leaf-1):
MAC     IP     ESI   Seq.   Next-Hop
MAC-1   IP-1   0     0      Leaf-1
After the host move (host IP-1 / MAC-1 behind Leaf-3), the sequence number is incremented and the next-hop is changed to Leaf-3:
MAC     IP     ESI   Seq.   Next-Hop
MAC-1   IP-1   0     1      Leaf-3
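The sequence-number handling in the tables on this slide can be sketched in a few lines. The Python sketch below is a simplified model with hypothetical function names (`on_local_learn`, `best_route`); it ignores tie-breaking between routes with equal sequence numbers.

```python
def on_local_learn(installed, local_leaf):
    """Build the route a leaf advertises after locally learning a MAC it knew as remote.

    The sequence number is one higher than that of the installed route.
    """
    return {"mac": installed["mac"], "ip": installed["ip"],
            "seq": installed["seq"] + 1, "next_hop": local_leaf}


def best_route(current, candidate):
    """Keep the route with the higher mobility sequence number."""
    return candidate if candidate["seq"] > current["seq"] else current


before = {"mac": "MAC-1", "ip": "IP-1", "seq": 0, "next_hop": "Leaf-1"}
after = on_local_learn(before, "Leaf-3")   # host moves behind Leaf-3
print(best_route(before, after))
```

Every leaf that compares the two routes selects the one with sequence number 1, so traffic follows the host to Leaf-3, and the original leaf learns that its own advertisement is stale and can withdraw it.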
Overlay Integrated Routing and Bridging (IRB)
How	do	we	do	inter-subnet	routing?
Overlay	Routing	Architectures
• Centralized	Routing
• Distributed	Routing	– Asymmetric	IRB
• Distributed	Routing	– Symmetric	IRB
Centralized Routing
[Figure: Leaf-1 through Leaf-5 only bridge VLAN-1 and VLAN-2 (L2); the spines host IRB-1 and IRB-2 with their GW MACs and perform centralized routing (L3).]
• East<->west routed traffic traverses to centralized L3 gateways
• Scale bottleneck:
• Centralized GWs hold full ARP/MAC state for the DC
• Centralized GW needs to host all DC subnets
Distributed	Routing	– Asymmetric	IRB
[Figure: the ingress leaf routes between IRB-1 and IRB-2 in the VRF and bridges to the remote VM over the IP or MPLS transport (underlay routing); the egress leaf only bridges to the local VM MAC in VLAN-2.]
• Egress subnet is always local
• Inter-subnet packets are routed directly to the destination VM’s DMAC
• Scale bottleneck:
• All egress subnets need to be present on the ingress leaf
• Ingress leaf maintains ARP/ND state for the hosts behind every egress leaf
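The state an ingress leaf needs under asymmetric IRB can be sketched as two chained lookups. The tables, names (`arp_table`, `mac_vrf_vlan2`, `asymmetric_forward`), addresses, and label value in this Python sketch are hypothetical sample data.

```python
# State held on the INGRESS leaf for the egress subnet (VLAN-2).
arp_table = {"10.0.2.11": "MAC-b"}             # one entry per remote host
mac_vrf_vlan2 = {"MAC-b": ("Leaf-4", 24001)}   # MAC -> (remote leaf, L2 VPN label)


def asymmetric_forward(dst_ip):
    """Ingress leaf routes into the egress subnet, then bridges to the remote host MAC."""
    dmac = arp_table[dst_ip]               # routing step: host adjacency on ingress
    leaf, label = mac_vrf_vlan2[dmac]      # bridging step: MAC-VRF of the egress subnet
    return {"dmac": dmac, "next_hop": leaf, "l2_label": label}


print(asymmetric_forward("10.0.2.11"))
```

Both tables live on the ingress leaf, which is the scale bottleneck the slide points out: the leaf needs the egress subnet's MAC-VRF and an ARP/ND entry for every remote host it routes to.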
Distributed Routing – Symmetric IRB
[Figure: the ingress leaf routes in the VRF to the remote leaf over the IP or MPLS transport (underlay routing); the egress leaf routes to the local VM through IRB-2.]
• Remote VM IP is installed like a VPN IP route, resolved recursively over the remote leaf next-hop
• No adjacencies to remote hosts even if the subnet is local
• Subnet does not need to be local on the ingress leaf unless there are local hosts
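Symmetric IRB can be sketched as a routing lookup on each side of the fabric. In this Python sketch the table contents, names (`ingress_ip_vrf`, `egress_ip_vrf`, `symmetric_forward`), addresses, and label value are hypothetical.

```python
# IP-VRF on the ingress leaf: host route learned from the remote leaf's RT-2.
ingress_ip_vrf = {"10.0.2.11/32": {"next_hop": "Leaf-4", "l3_label": 24100}}
# IP-VRF on the egress leaf: local adjacency learned via ARP.
egress_ip_vrf = {"10.0.2.11/32": {"irb": "IRB-2", "dmac": "00:00:5e:00:53:0b"}}


def symmetric_forward(dst_ip):
    """Return the two routing steps a packet takes under symmetric IRB."""
    key = dst_ip + "/32"
    remote = ingress_ip_vrf[key]   # ingress: VPN-style route, no host adjacency
    local = egress_ip_vrf[key]     # egress: routed again onto the local subnet
    return [("ingress", remote["next_hop"], remote["l3_label"]),
            ("egress", local["irb"], local["dmac"])]


print(symmetric_forward("10.0.2.11"))
```

The ingress leaf holds only a host route pointing at the remote leaf; the ARP adjacency and the destination MAC exist only on the egress leaf, so the ingress leaf does not need the egress subnet configured at all.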
Overlay	Distributed	Any-cast GW
How	do	we	let	hosts	move?
Symmetric IRB – Distributed Any-cast GW
[Figure: leafs with subnet-a host GW IP-a / GW MAC-a and leafs with subnet-b host GW IP-b / GW MAC-b, all in the same VRF.]
• Any-cast GW IP and any-cast GW MAC are configured on ALL leafs with the local subnet
• Essentially, the subnet GW is distributed across ALL leafs with the local subnet
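Why an any-cast gateway lets hosts move can be sketched with an ARP responder that gives the same answer on every leaf. The addresses and names (`ANYCAST_GW`, `arp_reply`) in this Python sketch are hypothetical.

```python
# Any-cast gateway per subnet: (GW IP, GW MAC), identical on every leaf.
ANYCAST_GW = {
    "subnet-a": ("10.0.1.1", "00:00:5e:00:53:a1"),
    "subnet-b": ("10.0.2.1", "00:00:5e:00:53:b1"),
}


def arp_reply(leaf_subnets, target_ip):
    """Return the GW MAC a leaf answers with, or None if it does not own the target IP."""
    for subnet in leaf_subnets:
        gw_ip, gw_mac = ANYCAST_GW[subnet]
        if gw_ip == target_ip:
            return gw_mac
    return None


old_leaf = ["subnet-a", "subnet-b"]
new_leaf = ["subnet-b"]
print(arp_reply(old_leaf, "10.0.2.1") == arp_reply(new_leaf, "10.0.2.1"))
```

A host in subnet-b that moves between these two leafs keeps a valid gateway ARP entry, because both leafs answer for the same GW IP with the same GW MAC.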
Control	and	Data	Plane	Call	Flow
Putting	it	all	together.....
Host Learning - ARP REQUEST contd.
[Figure: a packet from VM-a with DIP VM-b triggers an ARP request for VM-b, which is flooded across the fabric to the leafs with VLAN-2; an RT-2 for VM-a is also shown.]
1. IP packet destined to VM-b triggers ARP for VM-b on Leaf-1 from any-cast GW IP-b and any-cast GW MAC-b
2. ARP to VM-b is flooded to all remote leafs where VLAN-b is stretched (via EVPN RT-3 enabled IR)
3. Leafs flood on local BD ports
Host Learning – ARP REPLY, MAC+IP RT-2
[Figure: VM-b answers the ARP request; Leaf-4 advertises an RT-2 for VM-b through the spine route reflector (Spine-RR).]
• ARP REPLY from VM-b (VM-b-MAC) to GW MAC-b is consumed on Leaf-4 and installed in the ARP table
• EVPN MAC+IP RT-2 is advertised to remote leafs via RR
EVPN RT-2 advertised by Leaf-4:
• RD: Leaf-4:
• VM-b-MAC
• VM-b-IP
• L3 VPN LABEL / VNI
• L2 VPN LABEL / VNI
• NH-Leaf-4
• L3-RT, L2-RT
VM-b MAC reachability is installed in the MAC-VRF across remote leafs.
VM-b IP reachability is installed in the IP-VRF across remote leafs as a BGP L3VPN route, independent of the subnet being local or not.
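How a received RT-2 lands in the MAC-VRF and the IP-VRF can be sketched as an import step driven by route targets. The Python sketch below is a simplified model; the function name `import_rt2`, the dictionary layout, and all values are hypothetical.

```python
def import_rt2(route, mac_vrfs, ip_vrfs, l2_rt_to_evi, l3_rt_to_vrf):
    """Install a received RT-2 into the tables selected by its route targets."""
    for rt in route["rts"]:
        if rt in l2_rt_to_evi:
            # MAC reachability goes into the MAC-VRF of the matching EVI.
            evi = l2_rt_to_evi[rt]
            mac_vrfs[evi][route["mac"]] = (route["next_hop"], route["l2_label"])
        if rt in l3_rt_to_vrf:
            # Host route goes into the IP-VRF whether or not the subnet is local.
            vrf = l3_rt_to_vrf[rt]
            ip_vrfs[vrf][route["ip"] + "/32"] = (route["next_hop"], route["l3_label"])


route = {"mac": "MAC-b", "ip": "10.0.2.11", "next_hop": "Leaf-4",
         "l2_label": 24001, "l3_label": 24100, "rts": ["L2-RT", "L3-RT"]}

# A leaf with the subnet configured imports both route targets.
leaf_with_subnet = {"mac_vrfs": {"evi-2": {}}, "ip_vrfs": {"vrf-a": {}}}
import_rt2(route, leaf_with_subnet["mac_vrfs"], leaf_with_subnet["ip_vrfs"],
           {"L2-RT": "evi-2"}, {"L3-RT": "vrf-a"})

# A leaf with only the tenant VRF imports just the L3 route target.
leaf_without_subnet = {"mac_vrfs": {}, "ip_vrfs": {"vrf-a": {}}}
import_rt2(route, leaf_without_subnet["mac_vrfs"], leaf_without_subnet["ip_vrfs"],
           {}, {"L3-RT": "vrf-a"})

print(leaf_with_subnet, leaf_without_subnet)
```

The second leaf ends up with the /32 host route but no MAC entry, which is the "independent of the subnet being local or not" behaviour stated on the slide.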
Inter-subnet traffic to VM-b
[Figure: traffic from VM-a is routed in VRF-a on the ingress leaf, carried over the IP or MPLS transport (underlay routing), and routed again on Leaf-4 through IRB-2 to VM-b.]
• Ingress leaf: routed to remote leaf. IP VRF-a: IP-b/32 -> Leaf-4, L3VPN Label
• Egress leaf: routed to local VM-b. IP VRF-a: IP-b/32 -> BVI ARP adjacency
Intra-subnet traffic to VM-b
[Figure: traffic within VLAN-2 is bridged on the ingress leaf, carried over the IP or MPLS transport (underlay routing), and bridged on Leaf-4 to VM-b.]
• Ingress leaf: bridged to remote leaf next-hop. MAC-VRF: MAC-b -> Leaf-4, L2VPN Label
• Egress leaf: bridged to local VM-b MAC. MAC-VRF: MAC-b -> BE1.1
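The difference between the inter-subnet and intra-subnet cases comes down to the destination MAC of the frame arriving at the ingress leaf. The Python sketch below models that decision with hypothetical table contents and names (`GW_MAC`, `mac_vrf`, `ip_vrf`, `forward`).

```python
GW_MAC = "00:00:5e:00:53:01"   # hypothetical any-cast GW MAC on the ingress leaf

mac_vrf = {"MAC-b": ("Leaf-4", "L2VPN Label")}    # bridging table
ip_vrf = {"IP-b/32": ("Leaf-4", "L3VPN Label")}   # routing table


def forward(dmac, dip):
    """Route frames addressed to the gateway MAC; bridge everything else."""
    if dmac == GW_MAC:
        return ("routed",) + ip_vrf[dip + "/32"]
    return ("bridged",) + mac_vrf[dmac]


print(forward(GW_MAC, "IP-b"))    # host in another subnet sends to its gateway
print(forward("MAC-b", "IP-b"))   # host in the same subnet sends to VM-b directly
```

Both cases reach Leaf-4, but with different encapsulations: the routed case uses the L3 VPN label from the IP-VRF and the bridged case uses the L2 VPN label from the MAC-VRF.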
Summary
• Unified	control,	data	plane	across	L2,	L3,	DC,	WAN
• Flexible	workload	placement	and	mobility	across	L2	Overlay
• Optimal	bandwidth	utilization	– no	flood	and	learn,	ECMP	in	overlay,	underlay
• Traffic	engineering	with	MPLS	fabric	- traffic	steering,	ECMP,	FRR
• Horizontal	Scaling	with	distributed	symmetric	IRB
• Multi-tenancy	with	L2	and	L3	VPN
• Interworking	with	Legacy	L3VPN	/	L2VPN	WAN
Thank	You
nmalhotr@cisco.com
