1
Segment Routing
• Rasoul Mesghali CCIE#34938
• Vahid Tavajjohi
MENOG 18
From HAMIM Corporation
2
• Introduction
• Technology Overview
• Use Cases
• Closer look at the Control and Data Plane
• Traffic Protection
• Traffic engineering
• SRv6
Agenda
3
Introduction
4
The “classic” MPLS control plane (LDP and RSVP-TE) was too complex and
lacked scalability.
LDP is redundant to the IGP: it is better to distribute labels
bound to IGP-signaled prefixes in the IGP itself than to use an
independent protocol (LDP) to do it.
LDP-IGP synchronization issue, RFC 5443, RFC 6138
Overall, we would estimate that 10% of the SP market and likely 0% of the
Enterprise market have used RSVP-TE and that among these deployments,
the vast majority did it for FRR reasons.
The point is to look at the applicability of traditional technology
(LDP/RSVP-TE) in IP networks in 2018.
Does it fit the needs of modern IP networks?
MPLS Historical Perspective
5
In RSVP-TE and classic MPLS TE, the objective was to create circuits
whose state would be signaled hop-by-hop along the circuit path.
Bandwidth would be booked hop-by-hop. Each hop’s state would be
updated. The available bandwidth of each link would be flooded
throughout the domain using IGP to enable distributed TE computation.
First, RSVP-TE is not ECMP-friendly.
Second, to accurately book the used bandwidth,
RSVP-TE requires all the IP traffic to run within so-
called “RSVP-TE tunnels”. This leads to much
complexity and lack of scale in practice.
MPLS Historical Perspective
6
1. If the network has enough capacity to carry the traffic without
congestion, traffic engineering to avoid congestion is not needed. It seems
obvious to write it but, as we will see further, this is not the case
for an RSVP-TE network.
2.In the rare cases where the traffic is larger than expected or a
non-expected failure occurs, congestion occurs and a traffic
engineering solution may be needed. We write “may” because it
depends on the capacity planning process.
3.Some other operators may not tolerate even these rare
congestions and then require a tactical traffic-engineering
process.
A tactical traffic-engineering solution is a solution that is used
only when needed.
7
The classic RSVP-TE solution is an “always-on” solution.
This is the reason for the infamous full mesh of RSVP-TE
tunnels.
N^2 * K tunnels: while no traffic engineering is required in the
most likely situation of an IP network, the classical MPLS TE
solution always requires all the IP traffic to be switched
not as IP, but as MPLS TE circuits.
The result is complexity and limited scale,
most of the time without any gain.
An analogy would be that one
needs to wear his raincoat and
boots every day while it rains
only a few days a year.
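The N^2 * K scaling can be made concrete with a little arithmetic. The sketch below assumes that a full mesh means one unidirectional tunnel between every ordered pair of N head-end routers, and that K is the number of tunnels per pair; the slide does not define K, so both the interpretation and the function name `full_mesh_tunnels` are illustrative assumptions.

```python
def full_mesh_tunnels(n_routers: int, k_per_pair: int = 1) -> int:
    """Unidirectional tunnels in a full mesh: N * (N - 1) * K (roughly N^2 * K)."""
    return n_routers * (n_routers - 1) * k_per_pair


# The tunnel count grows quadratically with the number of head-ends.
for n in (10, 100, 500):
    print(n, full_mesh_tunnels(n, k_per_pair=2))
```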
8
• Make things easier for operators
Improve scale, simplify operations
Minimize introduction complexity/disruption
• Enhance service offering potential through programmability
• Leverage the efficient MPLS dataplane that we have today
Push, swap, pop
Maintain existing label structure
• Leverage all the services supported over MPLS
Explicit routing, FRR, VPNv4/6, VPLS, L2VPN, etc
• IPv6 dataplane a must, and should share parity with MPLS
Goals and Requirements
9
• Simplicity
fewer protocols to operate
fewer protocol interactions to troubleshoot
avoid directed LDP sessions between core routers
deliver automated FRR for any topology
• Scale
avoid millions of labels in LDP database
avoid millions of TE LSP’s in the network
avoid millions of tunnels to configure
Operators Ask For Drastic LDP/RSVP
Improvement
10
• Applications must be able to interact with the network
cloud based delivery
internet of everything
• Programmatic interfaces and Orchestration
Necessary but not sufficient
• The network must respond to application interaction
Rapidly-changing application requirements
Virtualization
Guaranteed SLA and Network Efficiency
Operators Ask For A Network Model
Optimized For Application Interaction
11
• Simple to deploy and operate
Leverage MPLS services & hardware
straightforward ISIS/OSPF extension to distribute labels
LDP/RSVP not required
• Provide for optimum scalability, resiliency and virtualization
• SDN enabled
simple network, highly programmable
highly responsive
Segment Routing
12
Technology Overview
13
CE1 PE1 P2 P3 P4
P5 P6 P7 CE2
Segment 1
Segment
2
Segment 3
What is the meaning of Segment
Routing?
PE2
10 10
20 10
Default Cost is 100
14
CE1 PE1 P1 P2 P3 P4
P5 P6 P7 CE2
PE2
16099
24001
16007
16007
24001
16007
Segment 1
Segment
2
Segment 3
16007
24001
16007
Adj-SID Label
Prefix-SID Label
Service: L3VPN, L2VPN, 6PE, 6VPE
Prefix-SID
Loopback0
Label 16099
Prefix-SID
Loopback0
Label 16007
Adj Label 24001
Default: PHP at each segment
Prefix-SIDs are global Labels
Adj-SIDs are local labels
Deviate from shortest path-Source Routing:
Traffic Engineering based on SR
SR in one Slide
PE2
15
Let’s take a closer look
16
• Source Routing
the source chooses a path and encodes it in the packet header as an ordered
list of segments
• the rest of the network executes the encoded instructions (In Stack of
labels/IPv6 EH)
• Segment: an identifier for any type of instruction
• forwarding or service
• Forwarding state (segment) is established by IGP
LDP and RSVP-TE are not required
Agnostic to forwarding data plane: IPv6 or MPLS
• MPLS Data plane is leveraged without any modification
push, swap and pop: all that we need
segment = label
17
Segment Routing – Overview
• MPLS: an ordered list of segments
is represented as a stack of labels
• IPv6: an ordered list of segments is
encoded in a routing extension header
• This presentation: MPLS data plane
Segment → Label
Basic building blocks distributed
by the IGP or BGP
Path options
Dynamic
(SPT computation)
Explicit
(expressed in the
packet)
Control Plane
Routing protocols with
extensions
(IS-IS,OSPF, BGP)
SDN controller
Data Plane
MPLS
(segment ID = label)
IPv6
(segment ID = V6 address)
Strict or loose path
18
• Global Segment
Any node in SR domain understands associated instruction
Each node in SR domain installs the associated instruction in its
forwarding table
MPLS: global label value in Segment Routing Global Block
(SRGB)
• Local Segment
Only originating node understands associated instruction
MPLS: locally allocated label
Global and Local Segments
19
• Global Segments always distributed as a label range
(SRGB) + Index
Index must be unique in Segment Routing Domain
• Best practice: same SRGB on all nodes
“Global model”, requested by all operators
Global Segments are global label values, simplifying network
operations
Default SRGB: 16,000 – 23,999
Other vendors also use this label range
Global Segments – Global Label Indexes
20
Types of Segment
21
IGP Segment
Two Basic building blocks distributed by IGP:
-Prefix Segment
-Adjacency Segment
Prefix-SID (Node-SID)
Adjacency-SID
4
3
2
11 1
Segment
22
6
5
2
1
7
16006
1.1.1.6/32
16006
16006
16006
16006
6
5
2
1
7
16005
16005
16005
16005
16005
1.1.1.5/32
IGP Prefix Segment
Node-SID
• Shortest-path to the IGP
prefix
Equal Cost Multipath (ECMP)-
aware
• Global Segment
• Label = 16000 + Index
Advertised as index
• Distributed by ISIS/OSPF
Default SRGB 16000-23,999
23
• E advertises its node segment
Simple ISIS/OSPF sub-TLV extension
• All remote nodes install the node segment to E in the MPLS Data Plane
Node Segment
16065
FEC Z
push 16065
swap 16065
to 16065
swap 16065
to 16065 pop 16065
A packet injected
anywhere with top
label 16065 will reach
E via shortest-path
A B C D
E
24
• E advertises its node segment
Simple ISIS/OSPF sub-TLV extension
• All remote nodes install the node segment to E in the MPLS dataplane
Node Segment
16065
FEC Z
push 16065
swap 16065
to 16065
swap 16065
to 16065 pop 16065
A packet injected
anywhere with top
label 16065 will reach
E via shortest-path
Packet
to E
Packet
to E
16065
Packet
to E
16065
Packet
to E
16065
Packet
to E
A B C D
E
25
• C allocates a local label and forwards on the IGP adjacency
• C advertises the adjacency label
Distributed by OSPF/ISIS
simple sub-TLV extension
(https://datatracker.ietf.org/doc/draft-ietf-isis-segment-routing-extensions/)
https://www.iana.org/assignments/isis-tlv-codepoints/isis-tlv-codepoints.xhtml
• C is the only node to install the adjacency segment in MPLS dataplane
Adjacency Segment
A packet injected at
node 5 with label
24056 is forced
through datalink 5-6
6
5
2
1
7
24057
Adj to 7
Adj to 6
24056
26
• Adjacency segment represents a specific datalink to an adjacent node
• Adjacency segment represents a set of datalinks to the adjacent node
Datalink and Bundle
Pop 9003
9001 switches on blue member
9002 switches on green member
9003 load-balances on any
member of the adj
Pop 9001
Pop 9002
Pop 9003
BA
27
• Source routing along any explicit path
stack of adjacency labels
• SR provides for entire path control
A path with Adjacency Segments
9101 9105
9107
9103 9105
9101
9105
9107
9103
9105
9105
9107
9103
9105
9107
9103
9105
9103
9105
9105
6
5
4
3
2
1
77
28
Combining Segments
24078
Packet to Z
16011
24078
Packet to Z
16011
Packet to Z
Packet to Z
16011
Packet to Z
16011
24078
16007
Packet to Z
16011
24078
16007
1600716007
16011
16011
Adj-SID
Prefix-SID
1 5 9
3 6 8
7
11
• Steer traffic on any path
through the network
• Path is specified by list of
segments in packet header, a
stack of labels
• No path is signaled
• No per-flow state is created
• For IGP – single protocol, for
BGP – AF LS
10
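How a node consumes the stack 16007 / 24078 / 16011 can be sketched as a small simulation. The next-hop tables below are an assumed topology (the figure's links are not recoverable from the text), PHP is ignored for simplicity, and all names (`NEXT_HOP`, `ADJ_SIDS`, `forward`) are hypothetical.

```python
SRGB_BASE = 16000

# Assumed shortest-path next hops: NEXT_HOP[target][current_node] -> next node.
NEXT_HOP = {
    7: {1: 3, 3: 6, 6: 7},
    11: {8: 9, 9: 11},
}

# Local adjacency segments: node 7 owns Adj-SID 24078 for its link to node 8.
ADJ_SIDS = {7: {24078: 8}}


def forward(node, stack):
    """Return the nodes visited while the label stack is consumed."""
    path = [node]
    stack = list(stack)
    while stack:
        top = stack[0]
        if top in ADJ_SIDS.get(node, {}):
            # Adjacency segment: pop the label and use that specific link.
            stack.pop(0)
            node = ADJ_SIDS[node][top]
        else:
            target = top - SRGB_BASE
            if node == target:
                # Prefix segment completed at its endpoint: pop, stay here.
                stack.pop(0)
                continue
            # Prefix segment: swap to the same label, follow the shortest path.
            node = NEXT_HOP[target][node]
        path.append(node)
    return path


print(forward(1, [16007, 24078, 16011]))
```

No per-flow state exists on the transit nodes in this model: the only inputs are the IGP-derived tables and the stack carried by the packet.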
29
P1 P2 P3 P4 10.20.34.0/24
GE 0/0/0/0
Prefix attached to P4 Outgoing label in CEF?
Entry in LFIB?
Prefix-SID P4 (10.100.1.4/32) Y
Prefix-SID P4 without Node flag
(10.100.3.4/32)
Y
loopback prefix without prefix-sid
(10.100.4.4/32)
N
link prefix connected to P4 (10.1.45.0/24) N
Labeling Which prefixes?
• This is the equivalent of LDP label prefix filtering: labels are only
assigned/advertised for /32 prefixes (loopback prefixes used by
services such as L3VPN, i.e. the BGP next-hop IP addresses)
• Traffic to link prefixes is not labeled!
30
Data 7
Dynamic path
Explicit path
High cost
Low latency
Adj SID: 46
R1
SID: 1
R2
SID: 2
SID: Segment ID
R4
SID: 4
R6
SID: 6
R7
SID: 7
R3
SID: 3
R5
SID: 5
Data 7 46 4 Explicit loose path for low latency app
No LDP, no RSVP-TE
31
10 20
30 40
50 60
70
80
90
Anycast SID: 200
70
200
Packet
70
200
Packet
70
200
Packet
70
Packet
Packet
• A group of nodes shares the same SID
• Works as a “single” router, single label
• Same prefix advertised by multiple
nodes
• Traffic is forwarded to one of the Anycast-Prefix-SID
originators based on best IGP path
• If the primary node fails, traffic is automatically
re-routed to the other node
• Application
– ABR Protection
– Seamless MPLS
– ASBR inter-AS protection
Any-Cast SID for Node Redundancy
32
1 2 3 4 5 6 7 8 9 10
Node 10
30410
16004
16003
Node 10
30410
16004
Node 10
30410
Node 10
30710
16007
16006
Node 10
30710
16007
Node 10
30710
Node 10
16010
16009
Node 10
16010
Node 10
14
410
BSID:
30410
SID:
30710
Binding-SIDs can be used in the following cases:
• Multi-Domain (inter-domain, inter-autonomous system)
• Large-Scale within a single domain
• Label stack compression
• BGP SR-TE Dynamic
• Stitching SR-TE Policies Using Binding SID
Binding-SID All Nodes SRGB [16000-23999]
Prefix-SID NodeX: 1600X
Binding-SID X->Y: 300XY
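Label stack compression with Binding-SIDs can be illustrated with a short sketch. The policy contents below are one reading of the figure (BSID 30410 on node 4, BSID 30710 on node 7) and should be treated as assumptions, as should the names `BSID_POLICIES` and `expand_at`.

```python
# (node, Binding-SID) -> SID list of the SR Policy that the BSID stands for.
BSID_POLICIES = {
    (4, 30410): [16006, 16007, 30710],
    (7, 30710): [16009, 16010],
}


def expand_at(node, stack):
    """If the top label is a Binding-SID local to this node, replace it
    with the SID list of the associated policy; otherwise leave the stack."""
    if stack and (node, stack[0]) in BSID_POLICIES:
        return BSID_POLICIES[(node, stack[0])] + stack[1:]
    return stack


# The head-end only pushes three labels; nodes 4 and 7 expand the rest.
head_end_stack = [16003, 16004, 30410]
print(len(head_end_stack), expand_at(4, [30410]), expand_at(7, [30710]))
```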
33
12
11
Node SID:
16001
10
1
13
14
3
BGP Prefix Segment
BGP-Connections
• Shortest-Path to the BGP Prefix
• Global
• 16000 + Index
• Signaled by BGP
34
12
11
7
AS1 AS2
Node SID:
16001
BGP-Peering-SID
SID: 30012
5.5.5.5/32
10
1
13
14
5
2
3
Packet
18005
30012
16001
Packet
18005
30012
Packet
18005
Packet
Node SID:
18005
BGP Peering Segment
Egress Peering Engineering
• Pop and Forward to the BGP Peer
• Local
• Signaled by BGP-LS (Topology Information) to the controller
• Local Segment- Like an adjacency SID external to the IGP
Dynamically allocated but persistent
BGP-Peering-SID
35
12
11
7
IGP-1 IGP-2
BGP-LS
5.5.5.5/32
10
1
13
14
5
2
3
Node SID:
18005
WAN Controller
SR
PCE
BGP-LS
SR PCE Collects via BGP-LS
• IGP Segments
• BGP Segments
• Topology
Collects information
from network
36
50
Low latency
Low bandwidth
12
11 5
IGP-1 IGP-2
10
1
13
14
42
3
An end-to-end path as a list of segments
SR
PCE
PCEP,Netconf,
BGP
7
Peering
SID: 147
Node SID:
16001
Node SID:
16002 Adj SID: 124
{16002,
124,
147}
{124,
147}
{147}
{16001,
16002,
124,
147}
BGP-Peer
• Controller learns the
network topology and
usage dynamically
• Controller calculates the
optimized path for
different applications:
low latency, or high
bandwidth
• Controller just programs a
list of labels on the
source routers. The rest
of the network is not
aware: no signaling, no
state information
→ simple and scalable
Default ISIS cost metric: 10
High latency
High bandwidth
37
38
Segment Routing
Segment Routing Global Block
39
• Segment Routing Global Block
Range of labels reserved for Segment Routing Global
Segments
Default SRGB is 16,000 – 23,999
• A prefix-SID is advertised as a domain-wide unique
index
• The Prefix-SID index points to a unique label within
the SRGB
Index is zero based, i.e. first index = 0
Label = Prefix-SID index + SRGB base
E.g. Prefix 1.1.1.65/32 with prefix-SID index 65 gets label
16065
index 65 --> SID is 16000 + 65 =16065
• Multiple IGP instances can use the same SRGB or use
different non-overlapping SRGBs
Segment Routing Global Block (SRGB)
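The index-to-label rule above is simple enough to check in a few lines. This is a minimal sketch assuming the default SRGB of 16,000 to 23,999 (8,000 labels); the function name `prefix_sid_label` is illustrative.

```python
def prefix_sid_label(index, srgb_base=16000, srgb_size=8000):
    """Label = SRGB base + Prefix-SID index (index is zero based)."""
    if not 0 <= index < srgb_size:
        raise ValueError(f"index {index} does not fit in the SRGB")
    return srgb_base + index


print(prefix_sid_label(65))                   # node using the default SRGB
print(prefix_sid_label(65, srgb_base=24000))  # same index, different SRGB base
```

The second call shows why the same SRGB on all nodes is recommended: with different bases, one index maps to different label values on different nodes.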
40
SRGB
16000-23999
1 2 3 4
SRGB
16000-23999
SRGB
24000-31999
Recommended SRGB allocation:
Same SRGB for all
Same SRGB for all:
Simple
Predictable
easier to troubleshoot
simplifies SDN Programming
41
Segment Routing
IGP Control and Data Plane
42
MP-BGP
PE-1
PE-2
IPv4 IPv6
IPv4
VPN
IPv6
VPN
VPWS VPLS
LDP RSVP BGP Static IS-IS OSPF
MPLS Forwarding
PE-1 PE-2
IGP
MPLS Control and Forwarding Operation with Segment
Routing
No changes to
control or
forwarding plane
IGP label
distribution for
IPv4 and IPv6.
Forwarding plane
remains the same
Services
Packet
Transport
43
• IPv4 and IPv6 control plane
• Level 1, level 2 and multi-level routing
• Prefix Segment ID (Prefix-SID) for host prefixes on loopback interfaces
• Adjacency SIDs for adjacencies
• Prefix-to-SID mapping advertisements (mapping server)
• MPLS penultimate hop popping (PHP) and explicit-null label signaling
SR IS-IS Control Plane overview
44
ISIS TLV Extensions
• SR for IS-IS introduces support for the following (sub-)TLVs:
– SR Capability sub-TLV (2) IS-IS Router Capability TLV (242)
– Prefix-SID sub-TLV (3) Extended IP reachability TLV (135)
– Prefix-SID sub-TLV (3) IPv6 IP reachability TLV (236)
– Prefix-SID sub-TLV (3) Multitopology IPv6 IP reachability TLV (237)
– Prefix-SID sub-TLV (3) SID/Label Binding TLV (149)
– Adjacency-SID sub-TLV (31) Extended IS Reachability TLV (22)
– LAN-Adjacency-SID sub-TLV (32) Extended IS Reachability TLV (22)
– Adjacency-SID sub-TLV (31) Multitopology IS Reachability TLV (222)
– LAN-Adjacency-SID sub-TLV (32) Multitopology IS Reachability TLV (222)
– SID/Label Binding TLV (149)
• Implementation based on draft-ietf-isis-segment-routing-extensions
45
SR OSPF Control Plane Overview
• OSPFv2 control plane
• Multi-area
• IPv4 Prefix Segment ID (Prefix-SID) for host prefixes on loopback interfaces
• Adjacency SIDs for adjacencies
• MPLS penultimate hop popping (PHP) and explicit-null label signaling
SR OSPF Control Plane overview
46
• OSPF adds to the Router Information Opaque LSA (type 4):
– SR-Algorithm TLV (8)
– SID/Label Range TLV (9)
• OSPF defines new Opaque LSAs to advertise the SIDs
– OSPFv2 Extended Prefix Opaque LSA (type 7)
>OSPFv2 Extended Prefix TLV (1)
• Prefix SID Sub-TLV (2)
– OSPFv2 Extended Link Opaque LSA (type 8)
>OSPFv2 Extended Link TLV (1)
• Adj-SID Sub-TLV (2)
• LAN Adj-SID Sub-TLV (3)
• Implementation is based on
– draft-ietf-ospf-prefix-link-attr and draft-ietf-ospf-segment-routing-
extensions
OSPF Extensions
47
TLV 135
TLV 22
48
TLV 242
49
Sub-TLV 3
Prefix-SID
SID-Index
16
TLV 135
50
TLV 22
Sub-TLV 32
LAN-Adj-SID
LAN-Adj-SID
24001
51
Use Cases
52
Do More With Less
IGP
LDP
RSVP
BGP-LU
LDP BGP
BGP-LU
LDP BGP BGP
IGP With SR
IGP With SR
Unified MPLS
PCE
Intra-Domain CP
FRR or TE
Intra-Domain CP
L2/L3VPN Services
EPN 5.0 Metro Fabric
Netconf
Yang
Netconf
Yang
Programmability
Provisioning
53
• IGP only
• No LDP, No RSVP-TE
• ECMP multi-hop shortest-path
IPv4/v6 VPN/Service transport
5
7
7
5
VPN
7
5
Packet to Z
VPN
7
5
Packet to Z
VPN
7
5
Packet to Z
Packet to Z
7
VPN
Packet to Z
7
VPN
Site-1
VPN
Site-2
VPN
VPN
7
5
Packet to Z
Packet to Z
VPN
Packet to Z
PHP
PE-1 2
4
3
5 6
PE-7
54
Interworking With LDP
55
1
3
2
4
5 6
LDP
LDP
LDP
LDP
LDP
LDP
LDP
LDP Domain
Initial state: All nodes run LDP, not SR
Step1: All nodes are upgraded to SR
• in no particular order
• Default label imposition preference = LDP
• Leave default LDP label imposition
preference
Step2: All PEs are configured to prefer SR
Label imposition
• in no particular order
Step3: LDP is removed from the nodes in the
network
• in no particular order
Final State: All nodes run SR, not LDP
Simplest Migration: LDP to SR
LDP+SR
LDP+SR
LDP+SR
LDP+SR
LDP+SR
LDP+SR
SR
segment-routing mpls sr-prefer
SR
SR
SR
SR
SR
SR
56
Local/in lbl Out lbl | Local/in lbl Out lbl | Local/in lbl Out lbl | Local/in lbl Out lbl
SRGB
1 2 3 4 5
segment-routing mpls sr-prefer
segment-routing mpls (default)
57
CB D
A Z
LDP SR
• When a node is LDP capable but its next-hop along the SPT to the destination
is not LDP capable
• no LDP outgoing label
• In this case, the LDP LSP is connected to the prefix segment
• C installs the following LDP-to-SR FIB entry:
• incoming label: label bound by LDP for FEC Z
• outgoing label: prefix segment bound to Z
• outgoing interface: D
• This entry is derived and installed
automatically , no config required
Prefix Out Label (LDP),
Interface
Z 16, 0
Input
label(LDP)
Out Label (SID),
Interface
32 16006, 1
LDP/SR Interworking - LDP to SR
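The LDP-to-SR stitching rule on node C can be sketched as follows. The label values (incoming LDP label 32, prefix-SID 16006) come from the table on this slide; the `FibEntry` type and the function name are hypothetical and do not reflect any vendor implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FibEntry:
    in_label: int
    out_label: int
    out_interface: str


def ldp_to_sr_entry(ldp_local_label: int,
                    ldp_out_label: Optional[int],
                    prefix_sid_label: int,
                    next_hop: str) -> FibEntry:
    """Build the forwarding entry for a FEC on an LDP-capable node.

    If the next hop advertised an LDP label, keep the plain LDP swap.
    If it did not (next hop is not LDP capable), connect the LDP LSP
    to the prefix segment by swapping to the prefix-SID label instead.
    """
    if ldp_out_label is not None:
        return FibEntry(ldp_local_label, ldp_out_label, next_hop)
    return FibEntry(ldp_local_label, prefix_sid_label, next_hop)


# Node C: local LDP label 32 for FEC Z, no LDP label from D, prefix-SID 16006.
print(ldp_to_sr_entry(32, None, 16006, "D"))
```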
58
Local/in lbl Out lbl | Local/in lbl Out lbl | Local/in lbl Out lbl | Local/in lbl Out lbl
SRGB
1 2 3 4 5
LDP SR
Copy
LDP LSP ????16005
1.1.1.5/32
90007
1.1.1.5/32
lbl 90100
SID 16005
1.1.1.5/32
59
CB D
A Z
Prefix Out Label (SID),
Interface
Z ?, 0
Input
Label(SID)
Out Label (LDP),
Interface
? 16, 1
16006
LDP/SR Interworking - SR to LDP
• When a node is SR capable but its next-hop along the SPT to the destination
is not SR capable
• no SR outgoing label available
• In this case, the prefix segment is connected to the LDP LSP
• Any node on the SR/LDP border installs SR-to-LDP FIB entry(ies)
LDPSR
60
CB D
A Z
Prefix Out Label (SID),
Interface
Z 16006, 0
Input
Label(SID)
Out Label (LDP),
Interface
16006 16, 1
Z(16006)
LDP/SR Interworking - Mapping Server
• A wants to send traffic to Z, but
• Z is not SR-capable, Z does not advertise any Prefix-SID
→ which label does A have to use?
• The Mapping Server advertises the SID mappings for the non-SR routers
• for example, it advertises that Z is 16006
• A and B install a normal SR prefix segment for 16006
• C realizes that its next hop along the SPT to Z is not SR-capable, hence C installs
an SR-to-LDP FIB entry
• incoming label: prefix-SID bound
to Z (16006)
• outgoing label: LDP binding
from D for FEC Z
• A sends a frame to Z with a
single label: 16006
LDPSR
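The SR-to-LDP entry that C derives from the mapping-server advertisement can be sketched as below. It uses the values from the table on this slide (prefix-SID label 16006 in, LDP label 16 out); the dictionary layout and the function name are illustrative assumptions.

```python
SRGB_BASE = 16000

# Assumed mapping-server advertisement: prefix -> Prefix-SID index
# for routers that are not SR-capable.
MAPPING_SERVER = {"Z": 6}


def sr_to_ldp_entry(prefix, ldp_bindings_from_next_hop):
    """Return (incoming SR label, outgoing LDP label) for the border node.

    The incoming label is the prefix-SID the mapping server bound to the
    prefix; the outgoing label is the LDP binding learned from the next hop.
    """
    in_label = SRGB_BASE + MAPPING_SERVER[prefix]
    out_label = ldp_bindings_from_next_hop[prefix]
    return in_label, out_label


# D advertised LDP label 16 for FEC Z.
print(sr_to_ldp_entry("Z", {"Z": 16}))
```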
61
Local/in lbl Out lbl | Local/in lbl Out lbl | Local/in lbl Out lbl
1 2 3 4 5
LDPSR
Mapping-Server
1.1.1.5
1.1.1.5/32
Imp-null
1.1.1.5/32
lbl 90090
Local/in lbl Out lbl
NA ?
Copy
90090
62
Traffic Protection
63
• Classic LFA has disadvantages:
– Incomplete coverage, topology dependent
– Not always providing most optimal backup path
Topology Independent LFA (TI-LFA) solves these
issues
Classic Per-Prefix LFA – disadvantages
64
Classic LFA Rules
65
1 2 3
5
6 7 5
Default Metric : 10
Initial
Classic LFA FRR
TI-LFA FRR
Post-Convergence
Dest-1
Dest-2
Classic LFA has partial coverage
X
Classic LFA is topology dependent: not all
topologies provide LFA for all destinations
– Depends on network topology and metrics
– E.g. Node6 is not an LFA for Dest1 (Node5) on Node2:
packets would loop since Node6 uses Node2 to reach Dest1 (Node5)
Node2 does not have an LFA for this destination
(no backup path in topology)
Topology Independent LFA (TI-LFA)
provides 100% coverage
20
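The reason Node6 fails as an LFA can be expressed with the basic loop-free inequality from RFC 5286: a neighbor N of S protects destination D only if dist(N, D) < dist(N, S) + dist(S, D). The sketch below checks it with Dijkstra on two assumed topologies; the link lists and metrics are illustrative and are not taken from the figure.

```python
import heapq


def make_graph(links):
    """Build an undirected weighted graph from (a, b, metric) tuples."""
    graph = {}
    for a, b, w in links:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def dijkstra(graph, src):
    """Shortest-path distance from src to every reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def is_lfa(graph, s, n, d):
    """True if neighbor n is a loop-free alternate of s for destination d."""
    dist_n, dist_s = dijkstra(graph, n), dijkstra(graph, s)
    return dist_n[d] < dist_n[s] + dist_s[d]


# Node6 reaches Node5 through Node2 (high metric on 7-5): not an LFA.
loops = make_graph([(2, 3, 10), (3, 5, 10), (2, 6, 10), (6, 7, 10), (7, 5, 30)])
# Lower metric on 7-5: Node6 now reaches Node5 without Node2, so it is an LFA.
works = make_graph([(2, 3, 10), (3, 5, 10), (2, 6, 10), (6, 7, 10), (7, 5, 10)])
print(is_lfa(loops, s=2, n=6, d=5), is_lfa(works, s=2, n=6, d=5))
```

The same topology changes outcome when a single metric changes, which is the topology dependence the slide describes.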
66
1 2 3
5
6 7 5
Default Metric : 10
Initial
Classic LFA FRR
TI-LFA FRR
Post-Convergence
Dest-1
Dest-2
PE-4
100 100
X
Classic LFA and suboptimal path
Classic LFA may provide a suboptimal FRR backup path:
– This backup path may not be planned for capacity, e.g. P
node 2 would use PE4 to protect a core link, while a
common planning rule is to avoid using Edge nodes for
transit traffic
– Additional case-specific LFA configuration would be needed
to avoid selecting undesired backup paths
– Operator would prefer to use the post-convergence path as
FRR backup path, aligned with the regular IGP convergence
→ TI-LFA uses the post-convergence
path as FRR backup path
67
A
1 2
5
34
Z
Default metric: 10
Packet to Z
Prefix-SID Z
Packet to Z
Prefix-SID Z
Packet to Z
1000
P-Space
Q-Space
• TI-LFA for link R1R2 on R1
• Calculate LFA(s)
- Compute post-convergence
SPT
- Encode post-convergence
path in a SID-list
- In this example R1 forwards
the packets towards R5
TI-LFA – Zero-Segment Example
68
A
1 2
34
Z
Default metric: 10
Packet to Z
Prefix-SID Z
Packet to Z
Packet to Z
Prefix-SID Z
Prefix-SID (R4) P-Space
Q-Space
• TI-LFA for link R1R2 on R1
- Compute post-convergence SPT
- Encode post-convergence path in
a SID-list
- In this example R1 imposes the
SID-list <Prefix-SID(R4)> and
sends packets towards R5
TI-LFA – Single-Segment Example
5
Packet to Z
Prefix-SID Z
69
A
1 2
34
Z
Default metric: 10
Packet to Z
Prefix-SID Z
Packet to Z
1000
Packet to Z
Prefix-SID Z
Prefix-SID (R4)
Adj-SID (R4-R3)
Packet to Z
Prefix-SID Z
Adj-SID (R4-R3)
P-Space
Q-Space
Packet to Z
Prefix-SID Z
5
TI-LFA – Double-Segment Example
TI-LFA for link R1R2 on R1
- Compute post-convergence SPT
- Encode post-convergence path in a
SID-list
- In this example R1 imposes the SID-
list <Prefix-SID(R4), Adj-SID(R4-R3)>
and sends packets towards R5
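The single-segment and double-segment outcomes can be reproduced with a simplified P-space/Q-space computation. This is a sketch under stated assumptions: the topology is taken to be a ring R1-R2-R3-R4-R5-R1 (the figure's exact links are not recoverable), P and Q are the plain remote-LFA style sets rather than a full post-convergence SPT computation, and all function names are hypothetical.

```python
import heapq


def make_graph(links):
    graph = {}
    for a, b, w in links:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def dijkstra(graph, src):
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def repair_segments(graph, s, e):
    """Segment list protecting link s-e on node s (simplified)."""
    ds, de, w = dijkstra(graph, s), dijkstra(graph, e), graph[s][e]
    # P-space: nodes s reaches with no shortest path over link s-e.
    p = {n for n in graph if ds[n] < w + de[n]}
    # Q-space: nodes that reach e with no shortest path over link s-e.
    q = {n for n in graph if de[n] < ds[n] + w}
    pq = sorted(p & q)
    if pq:
        # Arbitrary pick; a real implementation follows the post-convergence path.
        return [("prefix-sid", pq[0])]
    for a in sorted(p):
        for b in sorted(graph[a]):
            if b in q and (a, b) != (s, e):
                return [("prefix-sid", a), ("adj-sid", (a, b))]
    return None


ring = [(1, 2, 10), (2, 3, 10), (4, 5, 10), (5, 1, 10)]
single = make_graph(ring + [(3, 4, 10)])    # all metrics 10
double = make_graph(ring + [(3, 4, 1000)])  # high metric on link R4-R3
print(repair_segments(single, 1, 2))
print(repair_segments(double, 1, 2))
```

With all metrics equal, R4 is in both P and Q, so one Prefix-SID suffices; raising the R4-R3 metric moves R4 out of Q, and an Adj-SID is needed to cross from P to Q.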
70
A
1 2
5
34
Z
Default metric: 10
Packet to Z
LDP (1,Z)
Packet to Z
1000
Packet to Z
Prefix-SID Z
LDP (5,4)
Adj-SID (R4-R3)
Packet to Z
Prefix-SID Z
Adj-SID (R4-R3)
P-Space
Q-Space
Packet to Z
Prefix-SID Z
TI-LFA for LDP Traffic
71
Traffic Engineering
72
• Little deployment and many issues
• Not scalable
– Core states in k × n^2
– No inter-domain
• Complex configuration
– Tunnel interfaces
• Complex steering
– PBR, autoroute
• Does not support ECMP
RSVP-TE
73
• Simple, Automated and Scalable
– No core state: state in the packet header
– No tunnel interface: “SR Policy”
– No head-end a-priori configuration: on-demand policy instantiation
– No head-end a-priori steering: automated steering
• Multi-Domain
– SDN Controller for compute
– Binding-SID (BSID) for scale
• Lots of Functionality
– Designed with lead operators along their use-cases
• Provides explicit routing
• Supports constraint-based routing
• Supports centralized admission control
• No RSVP-TE to establish LSPs
• Uses existing ISIS / OSPF extensions to advertise link attributes
• Supports ECMP
• Disjoint Path
SRTE
74
5
14
22
9 23
3
13
7
2 21
10 11
T:30
T:30
VRF Blue VRF Blue
Router-id of NodeX: 1.1.1.X
Prefix-SID index of NodeX: X
Link address XY: 99.X.Y.X/24 with X < Y
Adj-SID XY: 240XY
Default IGP Metric: I:10
Default TE Metric: T:10
TE Metric used to express latency
1.1.1.11 1.1.1.10
1.1.1.3
16003
1.1.1.5
16005
1.1.1.22
16022
1.1.1.7
16007
1.1.1.9
16009
1.1.1.23
16023
Domain-1
IS-IS/SR
Domain-2
IS-IS/SR
PCC
PCC PCC
PCC
SR
PCE
Domain-1
IS-IS/SR
Domain-2
IS-IS/SR
BGP-LS
SR
PCE
PCEP
PCEP
PCEP
PCEP
BGP
BGP
BGP
BGP
RR
1.1.1.2 1.1.1.21
75
5
14
22
9 23
3
13
7
2 21
10 11
T:30
T:30
VRF Blue VRF Blue
SR
PCE RR
BGP:
1.1.1.21/32
via 21
MAP: 1.1.1.21/32 in vrf BLUE must receive
low latency service → tag with
community (100:777)
VPN Label : 99999
MAP: Community (100:777) means
“minimize TE Metric” and
“compute at PCE”
PCreq/reply
BSID:
30022
COMPUTE: minimize TE Metric to Node22
RESULT: SID list: OIF: to3
Automated Steering uses color extended communities and nexthop to match with the color and
end-point of an SR Policy
E.g. BGP route 2/8 with nexthop 1.1.1.1 and color 100
will be steered into an SR Policy with color 100 and end-point 1.1.1.1
If no such SR Policy exists, it can be instantiated automatically (ODN)
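The matching rule for automated steering can be sketched as a lookup keyed on (color, end-point). The policy table, the returned policy names and the `odn_colors` parameter are illustrative assumptions, not an actual router API.

```python
def steer(route_nexthop, route_color, policies, odn_colors=frozenset()):
    """Return the SR Policy a BGP route is steered into, or None.

    policies maps (color, end-point) -> policy name. If no policy matches
    and the color is enabled for on-demand next hop (ODN), a policy is
    instantiated automatically.
    """
    key = (route_color, route_nexthop)
    if key in policies:
        return policies[key]
    if route_color in odn_colors:
        policies[key] = f"odn-{route_color}-{route_nexthop}"
        return policies[key]
    return None  # no steering: the route follows normal forwarding


policies = {(100, "1.1.1.1"): "policy-100-to-1.1.1.1"}
# BGP route 2/8 with nexthop 1.1.1.1 and color 100 matches the existing policy.
print(steer("1.1.1.1", 100, policies))
# No policy towards 1.1.1.2 yet; with ODN enabled for color 100 it is created.
print(steer("1.1.1.2", 100, policies, odn_colors={100}))
```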
76
SRv6
77
SRv6 for underlay
IPv6 for reach
RSVP for FRR/TE: Horrendous states scaling in k*N^2
SRv6 for Underlay: Simplification, FRR, TE, SDN
78
• Multiplicity of protocols and states hinder network economics
Opportunity for further simplification
IPv6 for reach
SRv6 for Underlay: Simplification, FRR, TE, SDN
UDP+VxLAN Overlay: Additional Protocol just for tenant ID
NSH for NFV: Additional Protocol and State
79
• IPv6 Header
• Next Header (NH)
• Indicate what comes next
80
• NH=IPv6
• NH=IPv4
81
• Generic routing extension header
– Defined in RFC 2460
– Next Header: UDP, TCP, IPv6…
– Hdr Ext Len: Any IPv6 device can skip
this header
– Segments Left: Ignore extension
header if equal to 0
• Routing Type field:
> 0 Source Route (deprecated since
2007)
> 1 Nimrod (deprecated since 2009)
> 2 Mobility (RFC 6275)
> 3 RPL Source Route (RFC 6554)
> 4 Segment Routing
• NH=Routing Extension
82
• NH=SRv6
NH=43,Type=4
83
NH=43
Routing Extension
RT = 4
Segment-List
84
SRH Processing
85
• Source node is SR-capable
• SR Header (SRH) is created with
Segment list in reversed order of the path
Segment List [ 0 ] is the LAST segment
Segment List [ 𝑛 − 1 ] is the FIRST segment
Segments Left is set to 𝑛 − 1
First Segment is set to 𝑛 − 1
• IP DA is set to the first segment
• Packet is sent according to the IP DA
Normal IPv6 forwarding
Source Node
4
A4::
1
A1::
SR Hdr
IPv6 Hdr SA = A1::, DA = A2::
( A4::, A3::, A2:: ) SL=2
Payload
2
A2::
3
A3::
IPv6 Hdr:
Version | Traffic Class | Flow Label
Payload Length | Next = 43 | Hop Limit
Source Address = A1::
Destination Address = A2::
SR Hdr:
Next Header | Len = 6 | Type = 4 | SL = 2
First = 2 | Flags | TAG
Segment List [ 0 ] = A4::
Segment List [ 1 ] = A3::
Segment List [ 2 ] = A2::
Payload
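The source-node rules can be checked with a short sketch that builds the SRH fields for the path A2:: → A3:: → A4::. The dictionary layout and the function name `build_srh` are illustrative; only the field semantics come from the slide.

```python
def build_srh(path):
    """Build SRH fields from the segments in travel order.

    The segment list is stored in reverse order, so index 0 holds the last
    segment and index n - 1 holds the first one.
    """
    n = len(path)
    segment_list = list(reversed(path))
    return {
        "segment_list": segment_list,
        "segments_left": n - 1,
        "first_segment": n - 1,
        "dst": segment_list[n - 1],  # IP DA is set to the first segment
    }


srh = build_srh(["A2::", "A3::", "A4::"])
print(srh)
```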
86
SR Hdr
IPv6 Hdr SA = A1::, DA = A2::
( A4::, A3::, A2:: ) SL=2
Payload
• Plain IPv6 forwarding
• Solely based on IPv6 DA
• No SRH inspection or update
Non-SR Transit Node
4
A4::
1
A1::
2
A2::
3
A3::
87
SR Hdr
IPv6 Hdr SA = A1::, DA = A3::
( A4::, A3::, A2:: ) SL=1
Payload
• SR Endpoints: SR-capable nodes
whose address is in the IP DA
• SR Endpoints inspect the SRH and do:
IF Segments Left > 0, THEN
Decrement Segments Left ( -1 )
Update DA with Segment List [ Segments Left ]
Forward according to the new IP DA
SR Segment Endpoints
IPv6 Hdr:
Version | Traffic Class | Flow Label
Payload Length | Next = 43 | Hop Limit
Source Address = A1::
Destination Address = A3::
SR Hdr:
Next Header | Len = 6 | Type = 4 | SL = 1
First = 2 | Flags | TAG
Segment List [ 0 ] = A4::
Segment List [ 1 ] = A3::
Segment List [ 2 ] = A2::
Payload
4
A4::
1
A1::
2
A2::
3
A3::
88
SR Hdr
IPv6 Hdr SA = A1::, DA = A4::
( A4::, A3::, A2:: ) SL=0
Payload
• SR Endpoints: SR-capable nodes
whose address is in the IP DA
• SR Endpoints inspect the SRH and do:
IF Segments Left > 0, THEN
Decrement Segments Left ( -1 )
Update DA with Segment List [ Segments Left ]
Forward according to the new IP DA
ELSE (Segments Left = 0)
Remove the IP and SR header
Process the payload:
Inner IP: Lookup DA and forward
TCP / UDP: Send to socket
…
SR Segment Endpoints
IPv6 Hdr:
Version | Traffic Class | Flow Label
Payload Length | Next = 43 | Hop Limit
Source Address = A1::
Destination Address = A4::
SR Hdr:
Next Header | Len = 6 | Type = 4 | SL = 0
First = 2 | Flags | TAG
Segment List [ 0 ] = A4::
Segment List [ 1 ] = A3::
Segment List [ 2 ] = A2::
Payload
4
A4::
1
A1::
2
A2::
3
A3::
Standard IPv6 processing
The final destination does
not have to be SR-capable.
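The endpoint pseudocode above translates almost directly into a runnable sketch. The packet is modelled as a plain dictionary, which is an illustrative simplification, and the function name `process_at_endpoint` is hypothetical.

```python
def process_at_endpoint(packet):
    """Apply the SR endpoint rule to a packet whose DA is this node."""
    if packet["segments_left"] > 0:
        packet["segments_left"] -= 1
        packet["dst"] = packet["segment_list"][packet["segments_left"]]
        return "forward"
    # Segments Left = 0: remove the IP and SR headers, process the payload.
    return "decapsulate"


# Packet as sent by the source node A1:: for the path A2:: -> A3:: -> A4::.
packet = {
    "dst": "A2::",
    "segment_list": ["A4::", "A3::", "A2::"],
    "segments_left": 2,
}
trace = []
while True:
    here = packet["dst"]
    action = process_at_endpoint(packet)
    trace.append((here, action, packet["dst"], packet["segments_left"]))
    if action == "decapsulate":
        break
for step in trace:
    print(step)
```

The trace follows the three slides: A2:: rewrites the DA to A3:: with SL=1, A3:: rewrites it to A4:: with SL=0, and A4:: removes the headers.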
89
Deployments around the world
• Bell in Canada
• Orange
• Microsoft
• SoftBank
• Alibaba
• Vodafone
• Comcast
• China Unicom
90
Rasoul Mesghali : rasoul.mesghali@gmail.com
Vahid Tavajjohi : vahid.tavajjohi@gmail.com

NewMind AI Weekly Chronicles - August'25 Week I
The Rise and Fall of 3GPP – Time for a Sabbatical?
Mobile App Security Testing_ A Comprehensive Guide.pdf
Dropbox Q2 2025 Financial Results & Investor Presentation
The AUB Centre for AI in Media Proposal.docx

MENOG-Segment Routing Introduction

  • 1. 1 Segment Routing • Rasoul Mesghali CCIE#34938 • Vahid Tavajjohi MENOG 18 From HAMIM Corporation
  • 2. 2 • Introduction • Technology Overview • Use Cases • Closer look at the Control and Data Plane • Traffic Protection • Traffic engineering • SRv6 Agenda
  • 4. 4 MPLS “classic” (LDP and RSVP-TE) control-plane was too complex and lacked scalability. LDP is redundant to the IGP: it is better to distribute labels bound to IGP-signaled prefixes in the IGP itself rather than using an independent protocol (LDP) to do it. LDP-IGP synchronization issue, RFC 5443, RFC 6138 Overall, we would estimate that 10% of the SP market and likely 0% of the Enterprise market have used RSVP-TE and that among these deployments, the vast majority did it for FRR reasons. The point is to look at traditional technology (LDP/RSVP-TE) applicability in IP networks in 2018. Does it fit the needs of modern IP networks? MPLS Historical Perspective
  • 5. 5 In RSVP-TE and classic MPLS TE, the objective was to create circuits whose state would be signaled hop-by-hop along the circuit path. Bandwidth would be booked hop-by-hop. Each hop’s state would be updated. The available bandwidth of each link would be flooded throughout the domain using the IGP to enable distributed TE computation. First, RSVP-TE is not ECMP-friendly. Second, to accurately book the used bandwidth, RSVP-TE requires all the IP traffic to run within so-called “RSVP-TE tunnels”. This leads to much complexity and lack of scale in practice. MPLS Historical Perspective
  • 6. 6 1. If the network has enough capacity to accommodate the traffic without congestion, traffic engineering to avoid congestion is not needed. It seems obvious to write it but, as we will see further, this is not the case for an RSVP-TE network. 2. In the rare cases where the traffic is larger than expected or an unexpected failure occurs, congestion occurs and a traffic engineering solution may be needed. We write “may” because it depends on the capacity planning process. 3. Some other operators may not tolerate even this rare congestion and then require a tactical traffic-engineering process. A tactical traffic-engineering solution is a solution that is used only when needed.
  • 7. 7 The classic RSVP-TE solution is an “always-on” solution. This is the reason for the infamous full mesh of RSVP-TE tunnels: N^2*K tunnels. While no traffic engineering is required in the most likely situation of an IP network, the classical MPLS TE solution always requires all the IP traffic to be switched not as IP but as MPLS TE circuits. The result is complexity and limited scale, most of the time without any gain. An analogy would be that one needs to wear a raincoat and boots every day while it rains only a few days a year.
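As a rough illustration of the full-mesh scaling argument, the sketch below counts head-end tunnels. It assumes one unidirectional tunnel per ordered pair of edge routers and K parallel tunnels per pair, which gives N*(N-1)*K and grows like the N^2*K figure quoted on the slide. The function name and the sample values are hypothetical and not taken from the deck.

```python
def full_mesh_tunnels(n_routers: int, k_per_pair: int = 1) -> int:
    """Count unidirectional tunnels in a full mesh.

    Assumption: every ordered pair of routers gets k_per_pair tunnels.
    """
    if n_routers < 0 or k_per_pair < 0:
        raise ValueError("counts must be non-negative")
    return n_routers * (n_routers - 1) * k_per_pair


# Hypothetical network sizes, chosen only to show the quadratic growth.
for n in (10, 100, 1000):
    print(n, full_mesh_tunnels(n, k_per_pair=2))
```

With segment routing the same set of paths needs no per-tunnel state in the core, since the path is carried in the packet header.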
  • 8. 8 • Make things easier for operators Improve scale, simplify operations Minimize introduction complexity/disruption • Enhance service offering potential through programmability • Leverage the efficient MPLS dataplane that we have today Push, swap, pop Maintain existing label structure • Leverage all the services supported over MPLS Explicit routing, FRR, VPNv4/6, VPLS, L2VPN, etc • IPv6 dataplane a must, and should share parity with MPLS Goals and Requirements
  • 9. 9 • Simplicity fewer protocols to operate fewer protocol interactions to troubleshoot avoid directed LDP sessions between core routers deliver automated FRR for any topology • Scale avoid millions of labels in LDP database avoid millions of TE LSPs in the network avoid millions of tunnels to configure Operators Ask For Drastic LDP/RSVP Improvement
  • 10. 10 • Applications must be able to interact with the network cloud based delivery internet of everything • Programmatic interfaces and Orchestration Necessary but not sufficient • The network must respond to application interaction Rapidly-changing application requirements Virtualization Guaranteed SLA and Network Efficiency Operators Ask For A Network Model Optimized For Application Interaction
  • 11. 11 • Simple to deploy and operate Leverage MPLS services & hardware straightforward ISIS/OSPF extension to distribute labels LDP/RSVP not required • Provide for optimum scalability, resiliency and virtualization • SDN enabled simple network, highly programmable highly responsive Segment Routing
  • 13. 13 CE1 PE1 P2 P3 P4 P5 P6 P7 CE2 Segment 1 Segment 2 Segment 3 What is the meaning of Segment Routing? PE2 10 10 20 10 Default Cost is 100
  • 14. 14 CE1 PE1 P1 P2 P3 P4 P5 P6 P7 CE2 PE2 16099 24001 16007 16007 24001 16007 Segment 1 Segment 2 Segment 3 16007 24001 16007 Adj-SID Label Prefix-SID Label Service: L3VPN,L2VPN,6PE,6 VPE Prefix-SID Loopback0 Label 16099 Prefix-SID Loopback0 Label 16007 AdjLabel24001 Default: PHP at each segment Prefix-SIDs are global Labels Adj-SIDs are local labels Deviate from shortest path-Source Routing: Traffic Engineering based on SR SR in one Slide PE2
  • 15. 15 Let’s take a closer look
  • 16. 16 • Source Routing the source chooses a path and encodes it in the packet header as an ordered list of segments • the rest of the network executes the encoded instructions (In Stack of labels/IPv6 EH) • Segment: an identifier for any type of instruction • forwarding or service • Forwarding state (segment) is established by IGP LDP and RSVP-TE are not required Agnostic to forwarding data plane: IPv6 or MPLS • MPLS Data plane is leveraged without any modification push, swap and pop: all that we need segment = label
  • 17. 17 Segment Routing – Overview • MPLS: an ordered list of segments is represented as a stack of labels • IPv6: an ordered list of segments is encoded in a routing extension header • This presentation: MPLS data plane Segment → Label Basic building blocks distributed by the IGP or BGP Paths options Dynamic (SPT computation) Explicit (expressed in the packet) Control Plane Routing protocols with extensions (IS-IS,OSPF, BGP) SDN controller Data Plane MPLS (segment ID = label) IPv6 (segment ID = V6 address) Strict or loose path
  • 18. 18 • Global Segment Any node in SR domain understands associated instruction Each node in SR domain installs the associated instruction in its forwarding table MPLS: global label value in Segment Routing Global Block (SRGB) • Local Segment Only originating node understands associated instruction MPLS: locally allocated label Global and Local Segments
  • 19. 19 • Global Segments always distributed as a label range (SRGB) + Index Index must be unique in Segment Routing Domain • Best practice: same SRGB on all nodes “Global model”, requested by all operators Global Segments are global label values, simplifying network operations Default SRGB: 16,000 – 23,999 Other vendors also use this label range Global Segments – Global Label Indexes
  • 21. 21 IGP Segment Two basic building blocks distributed by the IGP: - Prefix Segment: Prefix-SID (Node-SID) - Adjacency Segment: Adjacency-SID
  • 22. 22 6 5 2 1 7 16006 1.1.1.6/32 16006 16006 16006 16006 6 5 2 1 7 16005 16005 16005 16005 16005 1.1.1.5/32 IGP Prefix Segment Node-SID • Shortest-path to the IGP prefix Equal Cost Multipath (ECMP)- aware • Global Segment • Label = 16000 + Index Advertised as index • Distributed by ISIS/OSPF Default SRGB 16000-23,999
  • 23. 23 • E advertises its node segment Simple ISIS/OSPF sub-TLV extension • All remote nodes install the node segment to E in the MPLS Data Plane Node Segment 16065 FEC Z push 16065 swap 16065 to 16065 swap 16065 to 16065 pop 16065 A packet injected anywhere with top label 16065 will reach E via shortest-path A B C D E
  • 24. 24 • E advertises its node segment simple ISIS/OSPF sub-TLV extension • All remote nodes install the node segment to E in the MPLS dataplane Node Segment 16065 FEC Z push 16065 swap 16065 to 16065 swap 16065 to 16065 pop 16065 A packet injected anywhere with top label 16065 will reach E via shortest-path Packet to E Packet to E 16065 Packet to E 16065 Packet to E 16065 Packet to E A B C D E
  • 25. 25 • C allocates a local label and forwards on the IGP adjacency • C advertises the adjacency label Distributed by OSPF/ISIS simple sub-TLV extension (https://datatracker.ietf.org/doc/draft-ietf-isis-segment-routing-extensions/) https://www.iana.org/assignments/isis-tlv-codepoints/isis-tlv-codepoints.xhtml • C is the only node to install the adjacency segment in MPLS dataplane Adjacency Segment A packet injected at node 5 with label 24056 is forced through datalink 5-6 6 5 2 1 7 24057 Adj to 7 Adj to 6 24056
  • 26. 26 • Adjacency segment represents a specific datalink to an adjacent node • Adjacency segment represents a set of datalinks to the adjacent node Datalink and Bundle Pop 9003 9001 switches on blue member 9002 switches on green member 9003 load-balances on any member of the adj Pop 9001 Pop 9002 Pop 9003 BA
  • 27. 27 • Source routing along any explicit path stack of adjacency labels • SR provides for entire path control A path with Adjacency Segments 9101 9105 9107 9103 9105 9101 9105 9107 9103 9105 9105 9107 9103 9105 9107 9103 9105 9103 9105 9105 6 5 4 3 2 1 77
  • 28. 28 Combining Segments 24078 Packet to Z 16011 24078 Packet to Z 16011 Packet to Z Packet to Z 16011 Packet to Z 16011 24078 16007 Packet to Z 16011 24078 16007 1600716007 16011 16011 Adj-SID Prefix-SID 1 5 9 3 6 8 7 11 • Steer traffic on any path through the network • Path is specified by list of segments in packet header, a stack of labels • No path is signaled • No per-flow state is created • For IGP – single protocol, for BGP – AF LS 10
  • 29. 29 P1 P2 P3 P4 10.20.34.0/24 GE 0/0/0/0 Prefix attached to P4 Outgoing label in CEF? Entry in LFIB? Prefix-SID P4 (10.100.1.4/32) Y Prefix-SID P4 without Node flag (10.100.3.4/32) Y loopback prefix without prefix-sid (10.100.4.4/32) N link prefix connected to P4 (10.1.45.0/24) N Labeling Which prefixes? • So, this is the equivalent of LDP label prefix filtering: only assigning/advertising labels to /32 prefixes (loopback prefixes, used by service, (e.g. L3VPN), so BGP next hop IP addresses) • Traffic to link prefixes is not labeled!
  • 30. 30 Data 7 Dynamic path Explicit path High cost Low latency Adj SID: 46 R1 SID: 1 R2 SID: 2 SID: Segment ID R4 SID: 4 R6 SID: 6 R7 SID: 7 R3 SID: 3 R5 SID: 5 Data 7 46 4 Explicit loose path for low latency app No LDP, no RSVP-TE
  • 31. 31 10 20 30 40 50 60 70 80 90 Anycast SID: 200 70 200 Packet 70 200 Packet 70 200 Packet 70 Packet Packet • A group of nodes share the same SID • Work as a “single” router, single label • Same prefix advertised by multiple nodes • Traffic forwarded to one of the Anycast-Prefix-SID nodes based on best IGP path • If the primary node fails, traffic is automatically re-routed to another node • Application – ABR Protection – Seamless MPLS – ASBR inter-AS protection Anycast SID for Node Redundancy
  • 32. 32 1 2 3 4 5 6 7 8 9 10 Node 10 30410 16004 16003 Node 10 30410 16004 Node 10 30410 Node 10 30710 16007 16006 Node 10 30710 16007 Node 10 30710 Node 10 16010 16009 Node 10 16010 Node 10 14 410 BSID: 30410 SID: 30710 Binding-SIDs can be used in the following cases: • Multi-Domain (inter-domain, inter-autonomous system) • Large-Scale within a single domain • Label stack compression • BGP SR-TE Dynamic • Stitching SR-TE Policies Using Binding SID Binding-SID All Nodes SRGB [16000-23999] Prefix-SID NodeX: 1600X Binding-SID X->Y: 300XY
  • 33. 33 12 11 Node SID: 16001 10 1 13 14 3 BGP Prefix Segment BGP-Connections • Shortest-Path to the BGP Prefix • Global • 16000 + Index • Signaled by BGP
  • 34. 34 12 11 7 AS1 AS2 Node SID: 16001 BGP-Peering-SID SID: 30012 5.5.5.5/32 10 1 13 14 5 2 3 Packet 18005 30012 16001 Packet 18005 30012 Packet 18005 Packet Node SID: 18005 BGP Peering Segment Egress Peering Engineering • Pop and Forward to the BGP Peer • Local • Signaled by BGP-LS (Topology Information) to the controller • Local Segment- Like an adjacency SID external to the IGP Dynamically allocated but persistent BGP-Peering-SID
  • 35. 35 12 11 7 IGP-1 IGP-2 BGP-LS 5.5.5.5/32 10 1 13 14 5 2 3 Node SID: 18005 WAN Controller SR PCE BGP-LS SR PCE Collects via BGP-LS • IGP Segments • BGP Segments • Topology Collects information from network
  • 36. 36 50 Low latency Low bandwidth 12 11 5 IGP-1 IGP-2 10 1 13 14 42 3 An end-to-end path as a list of segments SR PCE PCEP, Netconf, BGP 7 Peering SID: 147 Node SID: 16001 Node SID: 16002 Adj SID: 124 {16002, 124, 147} {124, 147} {147} {16001, 16002, 124, 147} BGP-Peer • Controller learns the network topology and usage dynamically • Controller calculates the optimized path for different applications: low latency, or high bandwidth • Controller just programs a list of labels on the source routers. The rest of the network is not aware: no signaling, no state information → simple and scalable Default ISIS cost metric: 10 High latency High bandwidth
  • 37. 37
  • 38. 38 Segment Routing Segment Routing Global Block MENOG 18
  • 39. 39 • Segment Routing Global Block Range of labels reserved for Segment Routing Global Segments Default SRGB is 16,000 – 23,999 • A prefix-SID is advertised as a domain-wide unique index • The Prefix-SID index points to a unique label within the SRGB Index is zero based, i.e. first index = 0 Label = Prefix-SID index + SRGB base E.g. Prefix 1.1.1.65/32 with prefix-SID index 65 gets label 16065 index 65 --> SID is 16000 + 65 =16065 • Multiple IGP instances can use the same SRGB or use different non-overlapping SRGBs Segment Routing Global Block (SRGB)
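The index-to-label rule on this slide (Label = Prefix-SID index + SRGB base) can be sketched in a few lines. This is a minimal illustration, assuming the default SRGB of 16000 to 23999 stated above; the function name and the range check are hypothetical additions rather than any vendor's API.

```python
DEFAULT_SRGB = (16000, 23999)  # default SRGB quoted on the slide


def prefix_sid_label(index: int, srgb: tuple = DEFAULT_SRGB) -> int:
    """Map a zero-based Prefix-SID index to an MPLS label inside the SRGB."""
    base, last = srgb
    label = base + index
    # Assumption: an index that falls outside the block is treated as an error.
    if index < 0 or label > last:
        raise ValueError(f"index {index} does not fit in SRGB {base}-{last}")
    return label


# Slide example: prefix 1.1.1.65/32 with Prefix-SID index 65.
print(prefix_sid_label(65))                  # 16065
# The same index on a node using a different (non-default) SRGB.
print(prefix_sid_label(65, (24000, 31999)))  # 24065
```

The second call shows why the same SRGB on all nodes is recommended: with different blocks, one index maps to different label values on different nodes.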
  • 40. 40 SRGB 16000-23999 1 2 3 4 SRGB 16000-23999 SRGB 24000-31999 Recommended SRGB allocation: Same SRGB for all Same SRGB for all: Simple Predictable easier to troubleshoot simplifies SDN Programming
  • 41. 41 Segment Routing IGP Control and Data Plane MENOG 18
  • 42. 42 MP-BGP PE-1 PE-2 IPv4 IPv6 IPv4 VPN IPv6 VPN VPWS VPLS LDP RSVP BGP Static IS-IS OSPF MPLS ForwardingPE-1 PE-2 IGP MPLS Control and Forwarding Operation with Segment Routing No changes to control or forwarding plane IGP label distribution for IPv4 and IPv6. Forwarding plane remains the same Services Packet Transport
  • 43. 43 • IPv4 and IPv6 control plane • Level 1, level 2 and multi-level routing • Prefix Segment ID (Prefix-SID) for host prefixes on loopback interfaces • Adjacency SIDs for adjacencies • Prefix-to-SID mapping advertisements (mapping server) • MPLS penultimate hop popping (PHP) and explicit-null label signaling SR IS-IS Control Plane overview
  • 44. 44 ISIS TLV Extensions • SR for IS-IS introduces support for the following (sub-)TLVs: – SR Capability sub-TLV (2) IS-IS Router Capability TLV (242) – Prefix-SID sub-TLV (3) Extended IP reachability TLV (135) – Prefix-SID sub-TLV (3) IPv6 IP reachability TLV (236) – Prefix-SID sub-TLV (3) Multitopology IPv6 IP reachability TLV (237) – Prefix-SID sub-TLV (3) SID/Label Binding TLV (149) – Adjacency-SID sub-TLV (31) Extended IS Reachability TLV (22) – LAN-Adjacency-SID sub-TLV (32) Extended IS Reachability TLV (22) – Adjacency-SID sub-TLV (31) Multitopology IS Reachability TLV (222) – LAN-Adjacency-SID sub-TLV (32) Multitopology IS Reachability TLV (222) – SID/Label Binding TLV (149) • Implementation based on draft-ietf-isis-segment-routing-extensions
  • 45. 45 SR OSPF Control Plane Overview • OSPFv2 control plane • Multi-area • IPv4 Prefix Segment ID (Prefix-SID) for host prefixes on loopback interfaces • Adjacency SIDs for adjacencies • MPLS penultimate hop popping (PHP) and explicit-null label signaling SR OSPF Control Plane overview
  • 46. 46 • OSPF adds to the Router Information Opaque LSA (type 4): – SR-Algorithm TLV (8) – SID/Label Range TLV (9) • OSPF defines new Opaque LSAs to advertise the SIDs – OSPFv2 Extended Prefix Opaque LSA (type 7) >OSPFv2 Extended Prefix TLV (1) • Prefix SID Sub-TLV (2) – OSPFv2 Extended Link Opaque LSA (type 8) >OSPFv2 Extended Link TLV (1) • Adj-SID Sub-TLV (2) • LAN Adj-SID Sub-TLV (3) • Implementation is based on – draft-ietf-ospf-prefix-link-attr and draft-ietf-ospf-segment-routing-extensions OSPF Extensions
  • 52. 52 Do More With Less IGP LDP RSVP BGP-LU LDP BGP BGP-LU LDP BGP BGP IGP With SR IGP With SR Unified MPLS PCE Intra-Domain CP FRR or TE Intra-Domain CP L2/L3VPN Services EPN 5.0 Metro Fabric Netconf Yang Netconf Yang Programmability Provisioning
  • 53. 53 • IGP only • No LDP, No RSVP-TE • ECMP multi-hop shortest-path IPv4/v6 VPN/Service transport 5 7 7 5 VPN 7 5 Packet to Z VPN 7 5 Packet to Z VPN 7 5 Packet to Z Packet to Z 7 VPN Packet to Z 7 VPN Site-1 VPN Site-2 VPN VPN 7 5 Packet to Z Packet to Z VPN Packet to Z PHP PE-1 2 4 3 5 6 PE-7
  • 55. 55 1 3 2 4 5 6 LDP LDP LDP LDP LDP LDP LDP LDP Domain Initial state: All nodes run LDP, not SR Step1: All nodes are upgraded to SR • in no particular order • Default label imposition preference = LDP • Leave default LDP label imposition preference Step2: All PEs are configured to prefer SR Label imposition • in no particular order Step3: LDP is removed from the nodes in the network • in no particular order Final State: All nodes run SR,Not LDP Simplest Migration: LDP to SR LDP+SR LDP+SR LDP+SR LDP+SR LDP+SR LDP+SR SR segment-routing mpls sr-prefer SR SR SR SR SR SR
  • 56. 56 Local/in lbl Out lblLocal/in lbl Out lblLocal/in lbl Out lblLocal/in lbl Out lbl SRGB 1 2 3 4 5 segment-routing mpls sr-prefer segment-routing mpls (default)
  • 57. 57 CB D A Z LDP SR • When a node is LDP capable but its next-hop along the SPT to the destination is not LDP capable • no LDP outgoing label • In this case, the LDP LSP is connected to the prefix segment • C installs the following LDP-to-SR FIB entry: • incoming label: label bound by LDP for FEC Z • outgoing label: prefix segment bound to Z • outgoing interface: D • This entry is derived and installed automatically , no config required Prefix Out Label (LDP), Interface Z 16, 0 Input label(LDP) Out Label (SID), Interface 32 16006, 1 LDP/SR Interworking - LDP to SR
  • 58. 58 Local/in lbl Out lblLocal/in lbl Out lblLocal/in lbl Out lblLocal/in lbl Out lbl SRGB 1 2 3 4 5 LDP SR Copy LDP LSP ????16005 1.1.1.5/32 90007 1.1.1.5/32 lbl 90100 SID 16005 1.1.1.5/32
  • 59. 59 CB D A Z Prefix Out Label (SID), Interface Z ?, 0 Input Label(SID) Out Label (LDP), Interface ? 16, 1 16006 LDP/SR Interworking - SR to LDP • When a node is SR capable but its next-hop along the SPT to the destination is not SR capable • no SR outgoing label available • In this case, the prefix segment is connected to the LDP LSP • Any node on the SR/LDP border installs SR-to-LDP FIB entry(ies) LDPSR
  • 60. 60 CB D A Z Prefix Out Label (SID), Interface Z 16006, 0 Input Label(SID) Out Label (LDP), Interface 16006 16, 1 Z(16006) LDP/SR Interworking - Mapping Server • A wants to send traffic to Z, but • Z is not SR-capable, Z does not advertise any Prefix-SID → which label does A have to use? • The Mapping Server advertises the SID mappings for the non-SR routers • for example, it advertises that Z is 16006 • A and B install a normal SR prefix segment for 16006 • C realizes that its next hop along the SPT to Z is not SR capable hence C installs an SR-to-LDP FIB entry • incoming label: prefix-SID bound to Z (16006) • outgoing label: LDP binding from D for FEC Z • A sends a frame to Z with a single label: 16006 LDPSR
  • 61. 61 Local/in lbl Out lblLocal/in lbl Out lbl Local/in lbl Out lbl 1 2 3 4 5 LDPSR Mapping-Server 1.1.1.5 1.1.1.5/32 Imp-null 1.1.1.5/32 lbl 90090 Local/in lbl Out lbl NA ? Copy 90090
  • 63. 63 • Classic LFA has disadvantages: – Incomplete coverage, topology dependent – Not always providing most optimal backup path Topology Independent LFA (TI-LFA) solves these issues Classic Per-Prefix LFA – disadvantages
  • 65. 65 1 2 3 5 6 7 5 Default Metric : 10 Initial Classic LFA FRR TI-LFA FRR Post-Convergence Dest-1 Dest-2 Classic LFA has partial coverage X Classic LFA is topology dependent: not all topologies provide LFA for all destinations – Depends on network topology and metrics – E.g. Node6 is not an LFA for Dest1 (Node5) on Node2, packets would loop since Node6 uses Node2 to reach Dest1 (Node5) → Node2 does not have an LFA for this destination (no backup path in topology) Topology Independent LFA (TI-LFA) provides 100% coverage 20
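The "Node6 is not an LFA" case can be checked with the basic loop-free condition from RFC 5286: a neighbor N of S is a loop-free alternate for destination D when dist(N, D) < dist(N, S) + dist(S, D). The sketch below applies that inequality with a small Dijkstra implementation. The link costs are a hypothetical topology chosen to reproduce the behaviour described on the slide, not the exact figure, and all function names are illustrative.

```python
import heapq


def make_graph(links):
    """Build a symmetric adjacency map from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_distances(graph, source):
    """Plain Dijkstra: shortest IGP distance from source to every node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def is_loop_free_alternate(graph, s, n, d):
    """RFC 5286 basic inequality: dist(N,D) < dist(N,S) + dist(S,D)."""
    dist_n = shortest_distances(graph, n)
    dist_s = shortest_distances(graph, s)
    return dist_n[d] < dist_n[s] + dist_s[d]


# Hypothetical costs: Node6 reaches Node5 through Node2, so it is not an LFA.
no_lfa = make_graph([(2, 5, 10), (2, 6, 10), (6, 7, 10), (7, 5, 20)])
print(is_loop_free_alternate(no_lfa, s=2, n=6, d=5))

# Lowering the 7-5 cost gives Node6 a path that avoids Node2.
has_lfa = make_graph([(2, 5, 10), (2, 6, 10), (6, 7, 10), (7, 5, 5)])
print(is_loop_free_alternate(has_lfa, s=2, n=6, d=5))
```

The result depends entirely on the metrics, which is the topology dependence the slide refers to. TI-LFA avoids it by encoding the post-convergence path as a SID list.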
  • 66. 66 1 2 3 5 6 7 5 Default Metric : 10 Initial Classic LFA FRR TI-LFA FRR Post-Convergence Dest-1 Dest-2 PE-4 100 100 X Classic LFA and suboptimal path Classic LFA may provide a suboptimal FRR backup path: – This backup path may not be planned for capacity, e.g. P node 2 would use PE4 to protect a core link, while a common planning rule is to avoid using Edge nodes for transit traffic – Additional case specific LFA configuration would be needed to avoid selecting undesired backup paths – Operator would prefer to use the post-convergence path as FRR backup path, aligned with the regular IGP convergence → TI-LFA uses the post-convergence path as FRR backup path
  • 67. 67 A 1 2 5 34 Z Default metric: 10 Packet to Z Prefix-SID Z Packet to Z Prefix-SID Z Packet to Z1000 P-Space Q-Space • TI-LFA for link R1R2 on R1 • Calculate LFA(s) - Compute post-convergence SPT - Encode post-convergence path in a SID-list - In this example R1 forwards the packets towards R5 TI-LFA – Zero-Segment Example
  • 68. 68 A 1 2 34 Z Default metric: 10 Packet to Z Prefix-SID Z Packet to Z Packet to Z Prefix-SID Z Prefix-SID (R4) P-Space Q-Space • TI-LFA for link R1R2 on R1 - Compute post-convergence SPT - Encode post-convergence path in a SID-list - In this example R1 imposes the SID-list <Prefix-SID(R4)> and sends packets towards R5 TI-LFA – Single-Segment Example 5 Packet to Z Prefix-SID Z
  • 69. 69 A 1 2 34 Z Default metric: 10 Packet to Z Prefix-SID Z Packet to Z 1000 Packet to Z Prefix-SID Z Prefix-SID (R4) Adj-SID (R4-R3) Packet to Z Prefix-SID Z Adj-SID (R4-R3) P-Space Q-Space Packet to Z Prefix-SID Z 5 TI-LFA – Double-Segment Example TI-LFA for link R1R2 on R1 - Compute post-convergence SPT - Encode post-convergence path in a SID-list - In this example R1 imposes the SID-list <Prefix-SID(R4), Adj-SID(R4-R3)> and sends packets towards R5
  • 70. 70 A 1 2 5 34 Z Default metric: 10 Packet to Z LDP (1,Z) Packet to Z 1000 Packet to Z Prefix-SID Z LDP (5,4) Adj-SID (R4-R3) Packet to Z Prefix-SID Z Adj-SID (R4-R3) P-Space Q-Space Packet to Z Prefix-SID Z TI-LFA for LDP Traffic
  • 72. 72 • Little deployment and many issues • Not scalable – Core states in k×n^2 – No inter-domain • Complex configuration – Tunnel interfaces • Complex steering – PBR, autoroute • Does not support ECMP RSVP-TE
  • 73. 73 • Simple, Automated and Scalable – No core state: state in the packet header – No tunnel interface: “SR Policy” – No head-end a-priori configuration: on-demand policy instantiation – No head-end a-priori steering: automated steering • Multi-Domain – SDN Controller for compute – Binding-SID (BSID) for scale • Lots of Functionality – Designed with lead operators along their use-cases • Provides explicit routing • Supports constraint-based routing • Supports centralized admission control • No RSVP-TE to establish LSPs • Uses existing ISIS / OSPF extensions to advertise link attributes • Supports ECMP • Disjoint Path SRTE
  • 74. 74 5 14 22 9 23 3 13 7 2 21 10 11 T:30 T:30 VRF Blue VRF Blue Router-id of NodeX: 1.1.1.X Prefix-SID index of NodeX: X Link address XY: 99.X.Y.X/24 with X < Y Adj-SID XY: 240XY Default IGP Metric: I:10 Default TE Metric: T:10 TE Metric used to express latency 1.1.1.11 1.1.1.10 1.1.1.3 16003 1.1.1.5 16005 1.1.1.22 16022 1.1.1.7 16007 1.1.1.9 16009 1.1.1.23 16023 Domain-1 IS-IS/SR Domain-2 IS-IS/SR PCC PCC PCC PCC SR PCE Domain-1 IS-IS/SR Domain-2 IS-IS/SR BGP-LS SR PCE PCEP PCEP PCEP PCEP BGP BGP BGP BGP RR 1.1.1.2 1.1.1.21
  • 75. 75 5 14 22 9 23 3 13 7 2 21 10 11 T:30 T:30 VRF Blue VRF Blue SR PCE RR BGP: 1.1.1.21/32 via 21 MAP: 1.1.1.21/32 in vrf BLUE must receive low latency service → tag with community (100:777) VPN Label : 99999 MAP: Community (100:777) means “minimize TE Metric” and “compute at PCE” PCreq/reply BSID: 30022 COMPUTE: minimize TE Metric to Node22 RESULT: SID list: OIF: to3 Automated Steering uses color extended communities and nexthop to match with the color and end-point of an SR Policy E.g. BGP route 2/8 with nexthop 1.1.1.1 and color 100 will be steered into an SR Policy with color 100 and end-point 1.1.1.1 If no such SR Policy exists, it can be instantiated automatically (ODN)
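The automated-steering rule described on this slide is a lookup keyed on (color, end-point). The sketch below is a simplified model of that matching under stated assumptions: the `SRPolicy` type, the `steer` function and the `on_demand` flag are hypothetical names used only to illustrate the behaviour, including on-demand instantiation (ODN) when no policy exists. It is not a router implementation.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class SRPolicy(NamedTuple):
    color: int
    endpoint: str


PolicyTable = Dict[Tuple[int, str], SRPolicy]


def steer(policies: PolicyTable, nexthop: str, color: int,
          on_demand: bool = False) -> Optional[SRPolicy]:
    """Return the SR Policy whose (color, end-point) matches the BGP route.

    Assumption: with on_demand=True a missing policy is created on the fly,
    standing in for on-demand next-hop (ODN) instantiation.
    """
    key = (color, nexthop)
    policy = policies.get(key)
    if policy is None and on_demand:
        policy = SRPolicy(color, nexthop)
        policies[key] = policy
    return policy


# Slide example: route 2/8 with nexthop 1.1.1.1 and color 100.
policies: PolicyTable = {(100, "1.1.1.1"): SRPolicy(100, "1.1.1.1")}
print(steer(policies, "1.1.1.1", 100))

# Hypothetical color 777 towards 1.1.1.21: no policy until ODN creates one.
print(steer(policies, "1.1.1.21", 777))
print(steer(policies, "1.1.1.21", 777, on_demand=True))
```

A route that matches no policy and is not eligible for ODN simply follows the normal shortest path to its next hop.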
  • 77. 77 SRv6 for underlay IPv6 for reach RSVP for FRR/TE Horrendous states scaling in k*N^2 SRv6 for Underlay Simplification, FRR, TE, SDN
  • 78. 78 • Multiplicity of protocols and states hinder network economics Opportunity for further simplification IPv6 for reach Simplification, FRR, TE, SDN SRv6 for Underlay Additional Protocol just for tenant ID UDP+VxLAN Overlay Additional Protocol and State NSH for NFV
  • 79. 79 • IPV6 Header • Next Header (NH) • Indicate what comes next
  • 81. 81 • Generic routing extension header – Defined in RFC 2460 – Next Header: UDP, TCP, IPv6… – Hdr Ext Len: Any IPv6 device can skip this header – Segments Left: Ignore extension header if equal to 0 • Routing Type field: > 0 Source Route (deprecated since 2007) > 1 Nimrod (deprecated since 2009) > 2 Mobility (RFC 6275) > 3 RPL Source Route (RFC 6554) > 4 Segment Routing • NH=Routing Extension
  • 85. 85 • Source node is SR-capable • SR Header (SRH) is created with Segment list in reversed order of the path Segment List [ 0 ] is the LAST segment Segment List [ 𝑛 − 1 ] is the FIRST segment Segments Left is set to 𝑛 − 1 First Segment is set to 𝑛 − 1 • IP DA is set to the first segment • Packet is sent according to the IP DA Normal IPv6 forwarding Source Node 4 A4:: 1 A1:: SR Hdr IPv6 Hdr SA = A1::, DA = A2:: ( A4::, A3::, A2:: ) SL=2 Payload 2 A2:: 3 A3:: Version Traffic Class Next = 43 Hop Limit Payload Length Source Address = A1:: Destination Address = A2:: Segment List [ 0 ] = A4:: Segment List [ 1 ] = A3:: Next Header Len= 6 Type = 4 SL = 2 First = 2 Flags TAG IPv6Hdr Segment List [ 2 ] = A2:: SRHdr Payload Flow Label
  • 86. 86 SR Hdr IPv6 Hdr SA = A1::, DA = A2:: ( A4::, A3::, A2:: ) SL=2 Payload • Plain IPv6 forwarding • Solely based on IPv6 DA • No SRH inspection or update Non-SR Transit Node 4 A4:: 1 A1:: 2 A2:: 3 A3::
  • 87. 87 SR Hdr IPv6 Hdr SA = A1::, DA = A3:: ( A4::, A3::, A2:: ) SL=1 Payload • SR Endpoints: SR-capable nodes whose address is in the IP DA • SR Endpoints inspect the SRH and do: IF Segments Left > 0, THEN Decrement Segments Left ( -1 ) Update DA with Segment List [ Segments Left ] Forward according to the new IP DA SR Segment Endpoints Version Traffic Class Next = 43 Hop LimitPayload Length Source Address = A1:: Destination Address = A3:: Segment List [ 0 ] = A4:: Segment List [ 1 ] = A3:: Next Header Len= 6 Type = 4 SL = 1 First = 2 Flags TAG IPv6Hdr Segment List [ 2 ] = A2:: SRHdr Payload Flow LabelFlow Label 4 A4:: A A1:: 2 A2:: 3 A3::
  • 88. 88 SR Hdr IPv6 Hdr SA = A1::, DA = A4:: ( A4::, A3::, A2:: ) SL=0 Payload • SR Endpoints: SR-capable nodes whose address is in the IP DA • SR Endpoints inspect the SRH and do: IF Segments Left > 0, THEN Decrement Segments Left ( -1 ) Update DA with Segment List [ Segments Left ] Forward according to the new IP DA ELSE (Segments Left = 0) Remove the IP and SR header Process the payload: Inner IP: Lookup DA and forward TCP / UDP: Send to socket … SR Segment Endpoints Version Traffic Class Next = 43 Hop LimitPayload Length Source Address = A1:: Destination Address = A4:: Segment List [ 0 ] = A4:: Segment List [ 1 ] = A3:: Next Header Len= 6 Type = 4 SL = 0 First = 2 Flags TAG IPv6Hdr Segment List [ 2 ] = A2:: SRHdr Payload Flow LabelFlow Label 4 A4:: 1 A1:: 2 A2:: 3 A3:: Standard IPv6 processing The final destination does not have to be SR-capable.
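The source-node and endpoint rules from the last four slides can be traced with a small simulation. This is a sketch only: packets are modeled as plain dictionaries, the field names are hypothetical, and real SRH processing involves checks that are omitted here. It follows the slide's example path A2::, A3::, A4:: from source A1::.

```python
def build_sr_packet(source, path, payload):
    """Source node: encode `path` (given in traversal order) into an SRH."""
    last_index = len(path) - 1
    return {
        "sa": source,
        "da": path[0],  # IP DA is set to the first segment
        "srh": {
            # Segment List[0] is the LAST segment, so the path is reversed.
            "segment_list": list(reversed(path)),
            "segments_left": last_index,
            "first_segment": last_index,
        },
        "payload": payload,
    }


def process_at_endpoint(packet):
    """SR endpoint (node whose address is the IP DA): apply the SRH rules."""
    srh = packet["srh"]
    if srh["segments_left"] > 0:
        srh["segments_left"] -= 1
        packet["da"] = srh["segment_list"][srh["segments_left"]]
        return "forward"  # forward according to the new IP DA
    return "deliver"      # Segments Left = 0: process the payload


packet = build_sr_packet("A1::", ["A2::", "A3::", "A4::"], "data")
trace = [(packet["da"], packet["srh"]["segments_left"])]
while process_at_endpoint(packet) == "forward":
    trace.append((packet["da"], packet["srh"]["segments_left"]))

print(packet["srh"]["segment_list"])  # ['A4::', 'A3::', 'A2::']
print(trace)  # [('A2::', 2), ('A3::', 1), ('A4::', 0)]
```

Non-SR transit nodes do not appear in the trace because they forward on the IPv6 destination address alone and never touch the SRH.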
  • 89. 89 Deployments around the world • Bell in Canada • Orange • Microsoft • SoftBank • Alibaba • Vodafone • Comcast • China Unicom
  • 90. 90 Rasoul Mesghali : rasoul.mesghali@gmail.com Vahid Tavajjohi : vahid.tavajjohi@gmail.com

Editor's Notes

  • #86: Here we assume that the source node is SR-capable. But of course we are also able to handle cases where the source is either not SR capable or not aware that SR is used at some point in the network. In these cases the SR information can be added later on the path. We'll see that in the use-case section Summarized representation of the headers
  • #87: Before reaching 2, the packet may traverse a non-SR transit node. This is not a problem at all: since the IP DA is A2::, the SRH will not even be looked at before reaching 2.
  • #88: At 2 we have reached the IP Destination Address At this point the forwarding engine will look at whatever is next in the packet