15-441 Computer Networking
Lecture 8 – IP Packets, Routers
Lecture 8: 9-20-01 2
Outline
• IP Packet Format
• NAT
• IPv6
• Router Internals
• Route Lookup
IPv4 Header – RFC791 (1981)
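Bit offsets across each 32-bit word: 0, 4, 8, 16, 19, 24, 32
Word 1: ver (4 bits) | header length (4 bits) | type of service (8 bits) | length (16 bits)
Word 2: 16-bit identifier | flags (3 bits) | fragment offset (13 bits)
Word 3: time to live (8 bits) | Protocol (8 bits) | Header checksum (16 bits)
Word 4: 32 bit source IP address
Word 5: 32 bit destination IP address
Options (if any) | Padding (if any)
data (variable length, typically a TCP or UDP segment)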
IP Header Fields
• Version: 4 for IPv4
• Header length (in 32 bit words)
• Minimum value is 5 (header without any options)
• Length of entire IP packet in octets (including header)
• Identifier, flags, fragment offset: used primarily for fragmentation
• Time to live
• Must be decremented at each router
• Packets with TTL=0 are thrown away
• Ensures packets eventually exit the network (e.g., when caught in a routing loop)
IP Header Fields
• Protocol
• Demultiplexing to higher layer protocols
• TCP = 6, ICMP = 1, UDP = 17…
• Header checksum
• Ensures some degree of header integrity
• Relatively weak – 16 bit
• Source/Dest address
• Options
• E.g. Source routing, record route, etc.
• Performance issues
• Poorly supported
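The header fields above can be made concrete with a small parser. This is a minimal sketch, not a production decoder: the function names `internet_checksum` and `parse_ipv4_header` and the sample bytes in `SAMPLE` are illustrative assumptions, and options are not decoded.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of all 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, length, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    ihl = ver_ihl & 0x0F                    # header length in 32-bit words
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": ihl * 4,
        "tos": tos,
        "total_length": length,
        "id": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": proto,                  # 1 = ICMP, 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        # Summing a valid header including its checksum field yields 0.
        "checksum_ok": internet_checksum(packet[:ihl * 4]) == 0,
    }

# Illustrative header: UDP datagram from 192.168.0.1 to 192.168.0.199, TTL 64.
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")

if __name__ == "__main__":
    for name, value in parse_ipv4_header(SAMPLE).items():
        print(f"{name}: {value}")
```

The checksum covers only the header, which is why a router that decrements the TTL has to update it at every hop.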
IP Type of Service
• Typically ignored
• Values
• 3 bits of precedence
• 1 bit of delay requirements
• 1 bit of throughput requirements
• 1 bit of reliability requirements
• Replaced by DiffServ
ICMP: Internet Control
Message Protocol
• Used by hosts, routers, gateways to communicate network-level information
• Error reporting: unreachable host, network, port, protocol
• Echo request/reply (used by ping)
• Network-layer “above” IP:
• ICMP msgs carried in IP datagrams
• ICMP message: type, code plus first 8 bytes of IP datagram causing error
Type  Code  Description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     router advertisement
10    0     router discovery
11    0     TTL expired
12    0     bad IP header
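As an illustration of the type/code layout, here is a sketch that builds an echo request (type 8, code 0) and looks up a message's meaning. The helper names `build_echo_request` and `describe` are hypothetical, and `ICMP_NAMES` holds only a few rows from the table above. Nothing is sent on the network.

```python
import struct

# (type, code) -> description, a subset of the table above
ICMP_NAMES = {
    (0, 0): "echo reply (ping)",
    (3, 0): "dest. network unreachable",
    (3, 1): "dest host unreachable",
    (3, 3): "dest port unreachable",
    (8, 0): "echo request (ping)",
    (11, 0): "TTL expired",
}

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, the same algorithm the IP header uses."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """ICMP echo request: type, code, checksum, identifier, sequence, data."""
    unsummed = struct.pack("!BBHHH", 8, 0, 0, ident, seq) + payload
    checksum = internet_checksum(unsummed)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload

def describe(message: bytes) -> str:
    """Map the first two bytes (type, code) to a human-readable name."""
    return ICMP_NAMES.get((message[0], message[1]), "unknown")

if __name__ == "__main__":
    msg = build_echo_request(ident=1, seq=1, payload=b"ping")
    print(describe(msg), msg.hex())
```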
Fragmentation
• IP packets can be up to 64KB
• Different link-layers have different MTUs
• Split IP packet into multiple fragments
• IP header on each fragment
• Intermediate router may fragment as needed
IP Fragmentation & Reassembly
• Network links have MTU (maximum transmission unit): largest possible link-level frame
• different link types, different MTUs
• Large IP datagram divided (“fragmented”) within net
• one datagram becomes several datagrams
• IP header bits used to identify, order related fragments
[Figure: fragmentation (in: one large datagram, out: 3 smaller datagrams), followed by reassembly]
Reassembly
• Where to do reassembly?
• End nodes
• Avoids unnecessary work where large packets
are fragmented multiple times
• Dangerous to do at intermediate nodes
• How much buffer space required at routers?
• What if routes in network change?
• Multiple paths through network
• Fragments are only guaranteed to pass through the destination
Fragmentation Related Fields
• Length
• Length of IP fragment
• Identification
• To match up with other fragments
• Flags
• Don’t fragment flag
• More fragments flag
• Fragment offset
• Where this fragment lies in entire IP datagram
• Measured in 8 octet units (13 bit field)
IP Fragmentation and Reassembly
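Original datagram: ID=x, offset=0, fragflag=0, length=4000
Fragment 1: ID=x, offset=0, fragflag=1, length=1500
Fragment 2: ID=x, offset=1480, fragflag=1, length=1500
Fragment 3: ID=x, offset=2960, fragflag=0, length=1040
(Offsets are shown in bytes; the header field stores them in 8-octet units: 0, 185, 370.)
One large datagram becomes several smaller datagrams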
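The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes a 20-byte header with no options; the function name `fragment` and the dictionary keys are illustrative choices.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20) -> list:
    """Split a datagram of total_length bytes into fragments that fit the MTU.

    Every fragment except the last must carry a multiple of 8 data bytes,
    because the offset field counts 8-octet units.
    """
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8   # round down to a multiple of 8
    fragments = []
    offset = 0
    while offset < payload:
        data = min(max_data, payload - offset)
        fragments.append({
            "offset_bytes": offset,
            "offset_field": offset // 8,     # value stored in the header
            "more_fragments": int(offset + data < payload),
            "length": data + header_len,     # each fragment has its own header
        })
        offset += data
    return fragments

if __name__ == "__main__":
    for frag in fragment(4000, 1500):
        print(frag)
```

For a 4000-byte datagram and a 1500-byte MTU this gives three fragments of length 1500, 1500, and 1040, matching the example.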
Fragmentation is Harmful
• Uses resources poorly
• Forwarding costs per packet
• Best if we can send large chunks of data
• Worst case: packet just bigger than MTU
• Poor end-to-end performance
• Loss of a fragment
• Reassembly is hard
• Buffering constraints
Path MTU Discovery
• Hosts dynamically discover minimum MTU of path
• Algorithm:
• Initialize MTU to MTU for first hop
• Send datagrams with Don’t Fragment bit set
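• If ICMP “pkt too big” msg (destination unreachable: fragmentation needed and DF set), decrease MTU
• What happens if path changes?
• Periodically (no sooner than 5 min after a “pkt too big” msg, or 1 min after a previous increase), try a larger MTU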
• Some routers will return proper MTU
• MTU values cached in routing table
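The algorithm can be sketched against a simulated path. No real packets are sent: `send_with_df` and `discover_path_mtu` are hypothetical helpers, the path is just a list of per-link MTUs, and every router is assumed to report its next-hop MTU in the error message (which, as noted above, only some routers do).

```python
def send_with_df(size: int, link_mtus: list):
    """Simulate one datagram sent with Don't Fragment set.

    Returns None if it crosses every link, otherwise the MTU of the first
    link that was too small (as reported in the ICMP error).
    """
    for mtu in link_mtus:
        if size > mtu:
            return mtu
    return None

def discover_path_mtu(link_mtus: list) -> tuple:
    """Return (path MTU, number of probes sent)."""
    mtu = link_mtus[0]            # initialize to the first-hop MTU
    probes = 0
    while True:
        probes += 1
        reported = send_with_df(mtu, link_mtus)
        if reported is None:      # delivered without an error
            return mtu, probes
        mtu = reported            # decrease MTU and try again

if __name__ == "__main__":
    print(discover_path_mtu([1500, 1400, 576, 1500]))
```

On the path in the usage example the host needs three probes (1500, 1400, 576) before it settles on the minimum MTU of 576.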
Outline
• IP Packet Format
• NAT
• IPv6
• Router Internals
• Route Lookup
IP Address Utilization (‘98)
• Address space depletion
• In danger of running out of classes A and B
• 32-bit address space projected to be completely allocated by 2008
• Two solutions
• NAT
• IPv6
Network Address Translation
(NAT)
• Possible solution to address space exhaustion
• Kludge (but useful)
• Sits between your network and the Internet
• Translates local network layer addresses to global IP addresses
• Has a pool of global IP addresses (fewer than the number of hosts on your network)
• Uses reserved private addresses (RFC 1597, since superseded by RFC 1918) locally
• 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
NAT Illustration
[Figure: a NAT box holding a pool of global IP addresses sits between the private network (P) and the global Internet (G). An outgoing packet (Dg, Sp, Data) from the source leaves the NAT as (Dg, Sg, Data) toward the destination.]
• Operation: Source (S) wants to talk to Destination (D):
• Create Sg-Sp mapping
• Replace Sp with Sg for outgoing packets
• Replace Sg with Sp for incoming packets
• How many hosts can have active transfers at one time?
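The Sg-Sp mapping can be sketched as a pair of dictionaries. The class name `PoolNat`, its methods, and the addresses in the usage example are assumptions made for illustration; mapping timeouts and checksum rewriting are left out.

```python
class PoolNat:
    """Basic NAT: one global address per active private host."""

    def __init__(self, global_pool):
        self.free = list(global_pool)
        self.private_to_global = {}
        self.global_to_private = {}

    def outbound(self, src_private: str, dst: str):
        """Rewrite (Sp, Dg) to (Sg, Dg); None means the pool is exhausted."""
        if src_private not in self.private_to_global:
            if not self.free:
                return None                      # no global address left: drop
            src_global = self.free.pop(0)        # create the Sg-Sp mapping
            self.private_to_global[src_private] = src_global
            self.global_to_private[src_global] = src_private
        return (self.private_to_global[src_private], dst)

    def inbound(self, src: str, dst_global: str):
        """Rewrite (Dg, Sg) back to (Dg, Sp); None if no mapping exists."""
        private = self.global_to_private.get(dst_global)
        return None if private is None else (src, private)

if __name__ == "__main__":
    nat = PoolNat(["198.51.100.1", "198.51.100.2"])
    print(nat.outbound("10.0.0.5", "203.0.113.9"))
    print(nat.inbound("203.0.113.9", "198.51.100.1"))
```

In this sketch the number of hosts with active transfers is bounded by the pool size: a third private host is refused once both global addresses are in use.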
Problems with NAT
• What if we only have few (or just one) IP address?
• Use Network Address & Port Translator (NAPT)
• NAPT translates:
• addr_private + flow info to addr_global + new flow info
• Uses TCP/UDP port numbers
• Potentially thousands of simultaneous connections with one global IP address
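A port-translating table might look like the sketch below. The class name `Napt`, the starting port 40000, and the addresses are illustrative assumptions; real translators also expire idle mappings and fix up transport checksums.

```python
class Napt:
    """Map (protocol, private address, private port) to a port on one global address."""

    def __init__(self, global_ip: str, first_port: int = 40000):
        self.global_ip = global_ip
        self.next_port = first_port
        self.out = {}    # (proto, private ip, private port) -> global port
        self.back = {}   # (proto, global port) -> (private ip, private port)

    def outbound(self, proto: str, src_ip: str, src_port: int):
        """Return the (global address, global port) to write into the packet."""
        key = (proto, src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[(proto, self.next_port)] = (src_ip, src_port)
            self.next_port += 1
        return (self.global_ip, self.out[key])

    def inbound(self, proto: str, dst_port: int):
        """Return the private (address, port), or None for unsolicited traffic."""
        return self.back.get((proto, dst_port))

if __name__ == "__main__":
    napt = Napt("198.51.100.1")
    print(napt.outbound("tcp", "10.0.0.5", 5000))
    print(napt.outbound("tcp", "10.0.0.6", 5000))
    print(napt.inbound("tcp", 40001))
```

Two private hosts using the same source port receive different global ports, which is how one global address can carry many connections. A packet arriving on a port with no mapping has nowhere to go.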
Problems with NAT
• Hides the internal network structure
• Some consider this an advantage
• Some protocols carry addresses
• E.g., FTP carries addresses in text
• What is the problem?
• Must update transport protocol headers
(port number & checksum)
• Encryption
• No inbound connections
Outline
• IP Packet Format
• NAT
• IPv6
• Router Internals
• Route Lookup
IPv6
• Primary objective: bigger addresses
• Addresses are 128 bits: what about header size?
• Simplification
• Header format helps speed processing/forwarding
• Header changes to facilitate QoS
• Removes infrequently used parts of header
• 40 byte fixed size vs. 20+ byte variable
IPv6 Changes
• IPv6 removes checksum
• Relies on upper layer protocols to provide integrity
• IPv6 eliminates fragmentation at routers (only the sending host may fragment)
• Requires path MTU discovery
• Requires every link to support an MTU of at least 1280 bytes
IPv6 Header
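Word 1: Version (4 bits) | Class (8 bits) | Flow Label (20 bits)
Word 2: Payload Length (16 bits) | Next Header (8 bits) | Hop Limit (8 bits)
Words 3-6: Source Address (128 bits)
Words 7-10: Destination Address (128 bits)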
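Because the IPv6 header has a fixed 40-byte layout, decoding it takes only a few shifts and masks. This is a sketch with a hypothetical function name, `parse_ipv6_header`; extension headers are not followed, and the sample packet in the usage example is made up.

```python
import ipaddress
import struct

def parse_ipv6_header(packet: bytes) -> dict:
    """Decode the fixed 40-byte IPv6 header."""
    first_word, payload_len, next_header, hop_limit = struct.unpack(
        "!IHBB", packet[:8])
    return {
        "version": first_word >> 28,                 # top 4 bits
        "traffic_class": (first_word >> 20) & 0xFF,  # next 8 bits
        "flow_label": first_word & 0xFFFFF,          # low 20 bits
        "payload_length": payload_len,
        "next_header": next_header,                  # e.g. 6 = TCP, 17 = UDP
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(packet[8:24])),
        "dst": str(ipaddress.IPv6Address(packet[24:40])),
    }

if __name__ == "__main__":
    sample = (
        struct.pack("!IHBB", (6 << 28) | 0x12345, 20, 6, 64)
        + ipaddress.IPv6Address("2001:db8::1").packed
        + ipaddress.IPv6Address("2001:db8::2").packed
    )
    print(parse_ipv6_header(sample))
```

There is no header length field and no checksum to verify, which is part of what makes forwarding simpler than in IPv4.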
IPv6 Changes
• TOS replaced with traffic class octet
• Flow label
• Identify datagrams in same “flow” (concept of “flow” not well defined)
• Help soft state systems
• Maps well onto TCP connection or stream of UDP packets on host-port pair
• Easy configuration
• Provides auto-configuration using hardware MAC address to provide unique base
IPv6 Changes
• Protocol field replaced by next header field
• Support for protocol demultiplexing as well as option
processing
• Option processing
• Options are added using next header field
• Options header does not need to be processed by
every router
• Large performance improvement
• Makes options practical/useful
• Additional requirements
• Support for security
• Support for mobility
Transition From IPv4 To IPv6
• Not all routers can be upgraded simultaneously
• No “flag days”
• How will the network operate with mixed IPv4 and IPv6 routers?
• Two proposed approaches:
• Dual Stack: some routers with dual stack (v6, v4) can “translate” between formats
• Tunneling: IPv6 carried as payload in IPv4 datagram among IPv4 routers
Dual Stack Approach
Tunneling
IPv6 inside IPv4 where needed
Outline
• IP Packet Format
• NAT
• IPv6
• Router Internals
• Route Lookup
Router Architecture Overview
Two key router functions:
• Run routing algorithms/protocol (RIP, OSPF, BGP)
• Switching datagrams from incoming to outgoing link
What Does a Router Look Like?
• Line cards
• Network interface cards
• Forwarding engine
• Fast path routing (hardware vs. software)
• Usually on line card
• Backplane
• Switch or bus interconnect
• Processor
• Handles routing protocols, error conditions
Router Processing
• Packet arrives at inbound line card
• Header processed by forwarding engine
• Forwarding engine determines output line card/destination
• Checksum updated but not checked
• Packet copied to outbound line card
• Odd situations sent to network processor
Network Processor
• Runs routing protocol and downloads
forwarding table to forwarding engines
• Performs “slow” path processing
• ICMP error messages
• IP option processing
• Fragmentation
• Packets destined to router
Three Types of Switching Fabrics
Switching Via Memory
First generation routers:
• Packet copied by system’s (single) CPU
• Speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: input port and output port connected to memory over the system bus]
Modern routers:
• Input port processor performs lookup, copy into memory
• Cisco Catalyst 8500
Switching Via Bus
• Datagram from input port
memory to output port
memory via a shared bus
• Bus contention: switching
speed limited by bus
bandwidth
• 1 Gbps bus, Cisco 1900:
sufficient speed for access
and enterprise routers (not
regional or backbone)
Switching Via An Interconnection
Network
• Overcome bus bandwidth limitations
• Crossbar provides full NxN interconnect
• Expensive
• Banyan networks, other interconnection nets
initially developed to connect processors in
multiprocessor
• Typically less capable than complete crossbar
• Cisco 12000: switches Gbps through the
interconnection network
Switch Design Issues
• Suppose we have N inputs and M outputs
• Multiple packets for same output – output contention
• Switch contention – switching fabric cannot support
arbitrary set of transfers
• I.e., not a full crossbar
• Solution – buffer packets when/where needed
• What happens when these buffers fill up?
• Packets are THROWN AWAY!! This is where packet
loss comes from
Input Port Functions
Decentralized switching:
• Given datagram dest., look up output port using routing table in input port memory
• Goal: complete input port processing at ‘line speed’
• Queuing needed if datagrams arrive faster than forwarding rate into switch fabric
Physical layer:
bit-level reception
Data link layer:
e.g., Ethernet
Output Ports
• Queuing required when datagrams arrive from
fabric faster than the line transmission rate
Switch Buffering
• 3 types of switch buffering
• Input buffering
• Fabric slower than input ports combined: queuing may occur at input queues
• Can avoid any input queuing by making switch speed = N x link speed
• Output buffering
• Buffering when arrival rate via switch exceeds output line
speed
• Internal buffering
• Can have buffering inside switch fabric to deal with limitations
of fabric
Input Port Queuing
• Which inputs are processed each slot –
schedule?
• Head-of-the-Line (HOL) blocking: datagram at
front of queue prevents others in queue from
moving forward
Output Port Queuing
• Scheduling discipline chooses among queued
datagrams for transmission
• Can be simple (e.g., first-come first-serve) or more
clever (e.g., weighted round robin)
Virtual Output Queuing
• Maintain a per-output buffer at each input
• Solves head of line blocking problem
• Each of the MxN input buffers places a bid for its output
• Challenge: map bids to schedule of interconnect transfers
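A toy simulation shows the difference between a single FIFO per input and virtual output queues. It is a sketch under strong assumptions: time is slotted, each output accepts one packet per slot, and the scheduler is a simple greedy pass in input order rather than any particular matching algorithm. The function names `fifo_slots` and `voq_slots` are hypothetical.

```python
from collections import deque

def fifo_slots(queues) -> int:
    """Slots needed to drain the switch with one FIFO queue per input.

    queues[i] lists the output ports wanted by input i's packets, in order.
    """
    fifos = [deque(q) for q in queues]
    slots = 0
    while any(fifos):
        busy = set()                         # outputs already used this slot
        for fifo in fifos:                   # fixed priority: lowest input first
            if fifo and fifo[0] not in busy:
                busy.add(fifo.popleft())     # only the head packet may move
        slots += 1
    return slots

def voq_slots(queues, n_outputs: int) -> int:
    """Same traffic, but each input keeps one queue per output."""
    voqs = [[deque() for _ in range(n_outputs)] for _ in queues]
    for i, wanted in enumerate(queues):
        for out in wanted:
            voqs[i][out].append(out)
    slots = 0
    while any(q for inp in voqs for q in inp):
        busy = set()
        for inp in voqs:                     # each input sends at most one packet
            for out, q in enumerate(inp):
                if q and out not in busy:
                    q.popleft()
                    busy.add(out)
                    break
        slots += 1
    return slots

if __name__ == "__main__":
    traffic = [[0], [0, 1]]   # input 1's packet for output 1 sits behind one for output 0
    print(fifo_slots(traffic), voq_slots(traffic, n_outputs=2))
```

With the traffic in the usage example, the FIFO switch needs three slots because input 1's packet for the idle output 1 waits behind a blocked head packet, while the virtual-output-queue switch finishes in two.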
Outline
• IP Packet Format
• NAT
• IPv6
• Router Internals
• Route Lookup
How To Do Variable Prefix Match
[Figure: Patricia tree over the routes default 0/0, 128.2/16, 128.32/16, 128.32.130/24, and 128.32.150/24. Internal nodes test bits 0, 10, 16, and 19. Bit to test: 0 = left child, 1 = right child.]
• Traditional method – Patricia Tree
• Arrange route entries into a series of bit tests
• Worst case = 32 bit tests
• Problem: memory speed is a bottleneck
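The idea of walking bit tests to the longest match can be sketched with a plain binary trie. This is a simplification, not a Patricia tree: a Patricia tree skips over bits where the stored routes do not differ, while the trie below tests every bit. The class name `BinaryTrie` and the port labels are hypothetical; the prefixes are the ones from the figure.

```python
import ipaddress

class BinaryTrie:
    """Uncompressed binary trie for IPv4 longest-prefix match."""

    def __init__(self):
        self.root = {}   # node keys: "0" / "1" for children, "port" for a route

    def insert(self, prefix: str, port: str) -> None:
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for bit in bits:
            node = node.setdefault(bit, {})
        node["port"] = port

    def lookup(self, address: str):
        bits = format(int(ipaddress.ip_address(address)), "032b")
        node = self.root
        best = node.get("port")              # default route, if any
        for bit in bits:                     # at most 32 bit tests
            node = node.get(bit)
            if node is None:
                break
            best = node.get("port", best)    # remember the longest match so far
        return best

if __name__ == "__main__":
    trie = BinaryTrie()
    trie.insert("0.0.0.0/0", "default")
    trie.insert("128.2.0.0/16", "A")
    trie.insert("128.32.0.0/16", "B")
    trie.insert("128.32.130.0/24", "C")
    trie.insert("128.32.150.0/24", "D")
    print(trie.lookup("128.32.130.7"), trie.lookup("128.32.1.1"))
```

Each step down the trie is a dependent memory access, which is why the slide names memory speed as the bottleneck.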
Speeding up Prefix Match -
Alternatives
• Content addressable memory (CAM)
• Hardware based route lookup
• Input = tag, output = value associated with tag
• Requires exact match with tag
• Multiple cycles (1 per prefix searched) with single
CAM
• Multiple CAMs (1 per prefix) searched in parallel
• Ternary CAM
• 0,1,don’t care values in tag match
• Priority (i.e., longest prefix) by order of entries in CAM
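A ternary CAM can be emulated in software as a list of (value, mask) pairs ordered by prefix length. The loop below stands in for hardware that compares every entry in parallel and reports the first hit, so it illustrates only the matching rule, not the speed. The names `make_tcam` and `tcam_lookup` and the port labels are assumptions.

```python
import ipaddress

def make_tcam(routes):
    """Build entries (value, mask, prefix length, port), longest prefix first."""
    entries = []
    for prefix, port in routes:
        net = ipaddress.ip_network(prefix)
        # Mask bits set to 0 are the "don't care" positions of the entry.
        entries.append((int(net.network_address), int(net.netmask),
                        net.prefixlen, port))
    entries.sort(key=lambda entry: entry[2], reverse=True)
    return entries

def tcam_lookup(tcam, address: str):
    """Return the port of the first (highest-priority) matching entry."""
    key = int(ipaddress.ip_address(address))
    for value, mask, _, port in tcam:
        if key & mask == value:
            return port
    return None

if __name__ == "__main__":
    tcam = make_tcam([
        ("0.0.0.0/0", "default"),
        ("128.32.0.0/16", "B"),
        ("128.32.130.0/24", "C"),
    ])
    print(tcam_lookup(tcam, "128.32.130.7"), tcam_lookup(tcam, "128.32.1.1"))
```

Storing longer prefixes earlier is what turns "first match" into "longest prefix match".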
Speeding up Prefix Match
• Cut prefix tree at 16/24/32 bit depth
• Fill in prefix tree entries by creating extra entries
• Entries contain output interface for route
• Add special value to indicate that there are deeper tree
entries
• Only keep 24/32 bit cuts as needed
• Example cut prefix tree at 16 bit depth
• 64K entries!!
• Use a variety of clever techniques to compress space
taken
Prefix Tree
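[Figure: binary prefix tree whose leaves are Port 1, Port 5, Port 7, Port 3, Port 9, and Port 5, expanded into a 16-entry table]
Entry: 1 1 1 1 5 5 X 7 3 3 3 3 X X 9 5
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15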
Prefix Tree
Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Entry: 1 1 1 1 5 5 X 7 3 3 3 3 X X 9 5
[Figure: the X entries point to deeper subtrees: Subtree 1, Subtree 2, Subtree 3]
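Filling a cut-off table by "creating extra entries" can be sketched as prefix expansion. The example cuts at 4 bits instead of 16 so that it reproduces a 16-entry table like the one above. The prefix set is a guess reverse-engineered from that table, not taken from the slides, and `None` stands where the slides show X (the slides use X as a pointer to a deeper subtree, which this sketch does not build). The names `expand` and `lookup` are hypothetical.

```python
def expand(prefixes, depth: int) -> list:
    """Expand (bit string, port) prefixes into a direct-indexed table of 2**depth slots."""
    table = [None] * (1 << depth)
    # Write shorter prefixes first so longer ones overwrite them.
    for bits, port in sorted(prefixes, key=lambda p: len(p[0])):
        span = 1 << (depth - len(bits))          # slots covered by this prefix
        start = int(bits, 2) * span if bits else 0
        for index in range(start, start + span):
            table[index] = port
    return table

def lookup(table, address_bits: str, depth: int):
    """One memory access: index the table with the first depth bits."""
    return table[int(address_bits[:depth], 2)]

if __name__ == "__main__":
    # Assumed prefixes chosen to reproduce the table shown above.
    prefixes = [("00", 1), ("010", 5), ("0111", 7),
                ("10", 3), ("1110", 9), ("1111", 5)]
    table = expand(prefixes, depth=4)
    print(table)
    print(lookup(table, "0101", depth=4))
```

A cut at 16 bits works the same way but needs 2**16 = 64K slots, which is why the slides go on to mention compression.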
Speeding up Prefix Match
• Scaling issues
• How would it handle IPv6?
• Other possibilities
• Why were the cuts done at 16/24/32 bits?
Speeding up Prefix Match -
Alternatives
• Route caches
• Packet trains: groups of packets belonging to same flow
• Temporal locality
• Many packets to same destination
• Other algorithms
• Bremler-Barr – Sigcomm 99
• Clue = prefix length matched at previous hop
• Why is this useful?
