DPDK Summit - San Jose - 2017
#DPDKSummit
Accelerating Packet Processing with
GRO and GSO in DPDK
Packet Processing Overheads
• Much of the work in packet processing applications (e.g. a firewall or a TCP/IPv4 stack) is performed on a per-packet basis
• Per-packet routines (e.g. header processing) dominate packet processing overheads
• Reducing the number of packets to be processed mitigates the per-packet overhead
Methods to Reduce the Packet Number
• Use a large Maximum Transmission Unit (MTU)
  • However, the MTU size depends on the physical links
• Use network interface card (NIC) features: Large Receive Offload (LRO), TCP Segmentation Offload (TSO) and UDP Fragmentation Offload (UFO)
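[Figure: in the ingress direction, the NIC uses LRO to hand fewer, larger packets to the application; in the egress direction, the application hands large packets to the NIC, which splits them with TSO/UFO.]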
Drawbacks of the NIC features:
• They only support TCP and UDP
• Not all NICs support LRO/TSO/UFO
GRO and GSO
• Software methods:
  • Generic Receive Offload (GRO) merges small packets into larger ones
  • Generic Segmentation Offload (GSO) splits large packets into smaller ones
• Advantages:
  • They don't rely on physical link support
  • They don't rely on NIC hardware support
  • They can support various kinds of protocols, like TCP, VxLAN and GRE
GRO and GSO in DPDK
• In DPDK, GRO and GSO are two standalone libraries
[Figure: ingress direction: NIC -> DPDK PMD -> DPDK GRO Library -> DPDK Application. Egress direction: DPDK Application -> DPDK GSO Library -> DPDK PMD -> NIC.]
Library Framework
GRO API:
    rte_gro_ctx_create()
    rte_gro_ctx_destroy()
    rte_gro_reassemble…()
    rte_gro_timeout_flush()
    rte_gro_get_pkt_count()
GRO types:
• TCP/IPv4 GRO, for packet type IPv4-TCP
• IPv4-GRE-TCP/IPv4 GRO, for packet type IPv4-GRE-IPv4-TCP
• …
• One GRO type handles one kind of packet
• The kind of packet is indicated by MBUF->packet_type
GSO API:
    rte_gso_segment(pkt)
GSO types:
• TCP/IPv4 GSO, for ol_flags: PKT_TX_TCP_SEG | PKT_TX_IPV4
• IPv4-GRE-TCP/IPv4 GSO, for ol_flags: PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE
• ...
• One GSO type handles one kind of packet
• The kind of packet is indicated by MBUF->ol_flags
API: How Do Applications Use It?
GRO Sample
    nb_pkts = rte_eth_rx_burst(…, pkts, …);
    nb_segs = rte_gro_reassemble_burst(pkts, nb_pkts, …);
    rte_eth_tx_burst(…, pkts, nb_segs);
GSO Sample
    struct rte_mbuf *gso_segs[N];
    nb_segs = rte_gso_segment(…, pkt, gso_segs, …);
    rte_eth_tx_burst(…, gso_segs, nb_segs);
API: Two Sets in GRO
Heavyweight Mode API
    ctx = rte_gro_ctx_create()
    rte_gro_reassemble(pkts, ctx)
    rte_gro_timeout_flush(ctx)
• Supports merging a large number of packets with fine-grained control
Lightweight Mode API
    rte_gro_reassemble_burst()
• Supports merging a small number of packets rapidly
How to Merge and Split Packets?
• The GRO reassembly algorithm merges packets
• The GSO segmentation scheme splits packets
GRO Algorithm: Challenges
• Challenge 1: a high-cost algorithm or implementation would cause packet drops in a high-speed network
  • Goal: the algorithm is lightweight enough to scale with fast network speeds
• Challenge 2: packet reordering makes it hard to merge packets
  • Linux GRO fails to merge packets when it encounters packet reordering
  • Goal: the algorithm is capable of handling packet reordering
GRO Algorithm: Key-based Approach
For each input packet:
1. Try to categorize the packet into an existing "flow"
   • If no flow is found: insert a new "flow" and store the packet
2. If a flow is found, search for a "neighbor" in the "flow"
   • If a neighbor is found: merge the packets
   • If no neighbor is found: store the packet in the "flow"
For TCP/IPv4 packets:
• Header fields representing a "flow" (same value): src/dst MAC, IP and port; ACK number
• Header fields deciding a "neighbor" (incremental): IP ID; sequence number
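The key-based lookup above can be sketched in plain C. This is a minimal illustration, not the DPDK implementation: the struct layouts, the table sizes and the names `gro_process`, `same_flow` and `is_neighbor` are all assumptions made for this sketch, and only the "append" direction of merging is handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_FLOWS 4 /* illustrative table sizes, not DPDK defaults */
#define MAX_ITEMS 8

/* Hypothetical flow key: header fields that must have the same value. */
struct flow_key {
    uint8_t  src_mac[6], dst_mac[6];
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t ack;
};

/* Hypothetical stored packet: header fields that decide a "neighbor". */
struct item {
    uint16_t ip_id;
    uint32_t seq;
    uint32_t payload_len;
};

struct flow {
    struct flow_key key;
    struct item items[MAX_ITEMS];
    int nb_items;
};

struct gro_table {
    struct flow flows[MAX_FLOWS];
    int nb_flows;
};

enum gro_result { GRO_MERGED, GRO_STORED, GRO_NEW_FLOW, GRO_TABLE_FULL };

static bool same_flow(const struct flow_key *a, const struct flow_key *b)
{
    return memcmp(a->src_mac, b->src_mac, 6) == 0 &&
           memcmp(a->dst_mac, b->dst_mac, 6) == 0 &&
           a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->ack == b->ack;
}

/* True if pkt directly follows prev: IP ID and sequence number are incremental. */
static bool is_neighbor(const struct item *prev, const struct item *pkt)
{
    return pkt->ip_id == (uint16_t)(prev->ip_id + 1) &&
           pkt->seq == prev->seq + prev->payload_len;
}

static enum gro_result gro_process(struct gro_table *t,
                                   const struct flow_key *k, struct item pkt)
{
    assert(t != NULL && k != NULL);
    for (int f = 0; f < t->nb_flows; f++) {
        struct flow *fl = &t->flows[f];
        if (!same_flow(&fl->key, k))
            continue;
        /* Flow found: search it for a neighbor. */
        for (int i = 0; i < fl->nb_items; i++) {
            if (is_neighbor(&fl->items[i], &pkt)) {
                /* Merge: extend the stored packet, remember the last IP ID. */
                fl->items[i].payload_len += pkt.payload_len;
                fl->items[i].ip_id = pkt.ip_id;
                return GRO_MERGED;
            }
        }
        /* No neighbor: keep the packet so it can be merged later. */
        if (fl->nb_items == MAX_ITEMS)
            return GRO_TABLE_FULL;
        fl->items[fl->nb_items++] = pkt;
        return GRO_STORED;
    }
    /* No flow found: insert a new flow and store the packet. */
    if (t->nb_flows == MAX_FLOWS)
        return GRO_TABLE_FULL;
    struct flow *fl = &t->flows[t->nb_flows++];
    fl->key = *k;
    fl->items[0] = pkt;
    fl->nb_items = 1;
    return GRO_NEW_FLOW;
}
```

A caller would zero a `struct gro_table` and feed it one packet at a time. Comparing the key field by field, rather than with one `memcmp` over the whole struct, avoids depending on padding bytes.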
GRO Algorithm: Key-based Approach (continued)
Two characteristics:
• Lightweight: classifying packets into "flows" to accelerate packet aggregation is simple
• More merging opportunities: storing out-of-order packets makes it possible to merge them later
These address challenges 1 and 2.
GRO Algorithm: TCP/IPv4 Example

Flow table (each cell lists the src/dst pair):
Flow  MAC                              IP                 Port        ACK number
F1    1:2:3:4:5:6, 11:22:33:44:55:66   1.1.1.1, 1.1.1.2   5041, 5043  1
F2    1:2:3:4:5:6, 11:22:33:44:55:66   1.1.1.1, 1.1.1.2   5001, 5002  1

Packets stored in flow F1:
Packet  IP ID  Sequence number  Payload length
P1      7      701              1500

Packets stored in flow F2:
Packet  IP ID  Sequence number  Payload length
P0      1      1                100
P2      3      301              100

Incoming packet:
Packet  Flow  IP ID  Sequence number  Payload length
P3      F2    4      401              100

1. P3 is categorized into flow F2.
2. In F2, P2 is a neighbor of P3: the IP ID and the sequence number are incremental (IP ID 3 -> 4; sequence number 301 + 100 = 401).
3. P3 is merged into P2.

Packets stored in flow F2 after the merge:
Packet  IP ID  Sequence number  Payload length
P0      1      1                100
P2      4      301              200

Flow F1 is unchanged (P1: IP ID 7, sequence number 701, payload length 1500).
GSO Scheme: Overview
• Segmentation workflow for an input packet:
  1. Split the payload into smaller parts
  2. Add the packet header to each payload part
  3. Update the packet headers for all GSO segments
• How to organize a GSO segment?
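The three workflow steps can be sketched in plain C with a simplified header. This is an illustration only: `struct hdr`, `struct segment`, `gso_split` and the fixed 40-byte header length are assumptions for this sketch, and checksum updates and TCP flag handling are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN 40u /* assumed: 20-byte IPv4 header + 20-byte TCP header */

/* Simplified header (hypothetical): only fields rewritten per segment. */
struct hdr {
    uint16_t ip_total_len; /* headers + payload, in bytes */
    uint16_t ip_id;
    uint32_t tcp_seq;
};

/* One output segment: its own header plus a slice of the input payload. */
struct segment {
    struct hdr hdr;
    size_t payload_off; /* offset of this slice in the input payload */
    size_t payload_len;
};

/*
 * Split payload_len bytes into slices of at most seg_payload bytes.
 * Returns the number of segments written, or 0 if out[] is too small.
 */
static size_t gso_split(const struct hdr *in, size_t payload_len,
                        size_t seg_payload, struct segment *out,
                        size_t max_out)
{
    assert(in != NULL && out != NULL && seg_payload > 0);
    size_t n = 0;
    for (size_t off = 0; off < payload_len; off += seg_payload) {
        if (n == max_out)
            return 0;
        size_t left = payload_len - off;
        size_t len = left < seg_payload ? left : seg_payload;

        /* Step 1: take the next payload part. */
        out[n].payload_off = off;
        out[n].payload_len = len;
        /* Step 2: add the packet header to the payload part. */
        out[n].hdr = *in;
        /* Step 3: update the header fields that differ per segment. */
        out[n].hdr.ip_total_len = (uint16_t)(HDR_LEN + len);
        out[n].hdr.ip_id = (uint16_t)(in->ip_id + n);
        out[n].hdr.tcp_seq = in->tcp_seq + (uint32_t)off;
        n++;
    }
    return n;
}
```

For a 3000-byte payload and a 1460-byte segment size, this yields three segments carrying 1460, 1460 and 80 bytes, with the sequence number advancing by the bytes already emitted.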
GSO Scheme: Zero-copy Based Approach
• A "two-part" MBUF is used to organize a GSO segment
  • Direct MBUF: holds the packet header
  • Indirect MBUF: a "pointer" to a payload part of the packet being segmented
[Figure: a GSO segment as an MBUF chain. A direct MBUF (MBUF metadata plus data room) is linked via next to indirect MBUF 1 … indirect MBUF N (MBUF metadata only).]
Forming a GSO segment just needs to copy the header!
GSO Scheme: Example
[Figure: the input packet PKT is one MBUF (MBUF metadata, HDR, PYLD 0, PYLD 1). Each GSO segment is a direct MBUF holding a copy of HDR, linked via next to an indirect MBUF that points to PYLD 0 or PYLD 1 inside PKT.]
1. Copy HDR to the direct MBUF
2. "Attach" the indirect MBUF to PKT and make it point to the correct payload part
3. Reduce the reference counter of PKT's MBUF by 1
   • PKT will be freed automatically when all indirect MBUFs are freed
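The reference-counting idea behind the zero-copy scheme can be modeled in plain C. This is a toy model, not the rte_mbuf API: `struct pkt_buf`, `struct gso_seg`, `gso_attach`, `seg_free`, the 4-byte header and the `pkts_freed` counter are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_LEN 4 /* toy header size, chosen only for illustration */

static int pkts_freed; /* counts released packets, for demonstration */

/* Toy stand-in for the original packet's buffer (not the rte_mbuf layout). */
struct pkt_buf {
    int refcnt;
    size_t len;
    uint8_t *data; /* header followed by payload */
};

/* Toy "two-part" segment: private header copy + pointer into the parent. */
struct gso_seg {
    uint8_t hdr[HDR_LEN];   /* "direct" part */
    const uint8_t *payload; /* "indirect" part, no payload copy */
    size_t payload_len;
    struct pkt_buf *parent;
};

static struct pkt_buf *pkt_create(const uint8_t *bytes, size_t len)
{
    struct pkt_buf *p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    p->data = malloc(len);
    if (p->data == NULL) {
        free(p);
        return NULL;
    }
    memcpy(p->data, bytes, len);
    p->len = len;
    p->refcnt = 1;
    return p;
}

/* Drop one reference; the buffer is freed when the last one goes away. */
static void pkt_release(struct pkt_buf *p)
{
    if (--p->refcnt == 0) {
        free(p->data);
        free(p);
        pkts_freed++;
    }
}

/* Builds up to max_out segments; the caller must size out[] to fit. */
static size_t gso_attach(struct pkt_buf *pkt, size_t seg_payload,
                         struct gso_seg *out, size_t max_out)
{
    assert(pkt != NULL && out != NULL && seg_payload > 0);
    assert(pkt->len >= HDR_LEN);
    size_t n = 0;
    for (size_t off = HDR_LEN; off < pkt->len && n < max_out;
         off += seg_payload, n++) {
        size_t left = pkt->len - off;
        memcpy(out[n].hdr, pkt->data, HDR_LEN); /* step 1: copy HDR */
        out[n].payload = pkt->data + off;       /* step 2: attach */
        out[n].payload_len = left < seg_payload ? left : seg_payload;
        out[n].parent = pkt;
        pkt->refcnt++;
    }
    pkt_release(pkt); /* step 3: drop the caller's own reference */
    return n;
}

static void seg_free(struct gso_seg *s)
{
    pkt_release(s->parent);
}
```

After `gso_attach`, only the segments keep the packet alive, so freeing the last segment releases the original buffer. Only the header bytes are ever copied.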
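Experiment: Physical Topology
[Figure: topology on Server 0. Components: the host kernel with NIC 0 and iperf; a switch (testpmd) with NIC 1 and a vhost-user port; a VM with virtio-net and iperf. The NICs are physically connected; vhost-user and virtio-net are logically connected. iperf exchanges TCP/IPv4 packets.]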
Experiment: Setup
• GRO test: the iperf client in the host kernel sends small TCP/IPv4 packets; testpmd performs GRO and forwards them over vhost-user and virtio-net to the iperf server in the VM
• GSO test: the iperf client in the VM sends large TCP/IPv4 packets over virtio-net and vhost-user; testpmd performs GSO and forwards them to the iperf server in the host kernel
• CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
• NIC: Ethernet Controller XL710 for 40GbE QSFP+
• Compared: DPDK GRO and Linux GRO; DPDK GSO and Linux GSO
Experiment: GRO Performance
[Figure 1. Iperf throughput improvement of DPDK-GRO over No-GRO. Y axis: speedup; X axis: 1, 2 and 4 TCP connections.]
[Figure 2. Iperf throughput improvement of DPDK-GRO over Linux-GRO. Y axis: speedup; X axis: 1, 2 and 4 TCP connections.]
Speedups called out on the slide: 2.7x and 1.9x
Experiment: GSO Performance
[Figure 1. Iperf throughput improvement of DPDK-GSO over No-GSO. Y axis: speedup; X axis: 1, 2 and 4 TCP connections.]
[Figure 2. Iperf throughput improvement of DPDK-GSO over Linux-GSO. Y axis: speedup; X axis: 1, 2 and 4 TCP connections.]
Speedups called out on the slide: 2.2x and 1.5x
Thanks!
Jiayu Hu
jiayu.hu@intel.com

LF_DPDK17_GRO/GSO Libraries: Bring Significant Performance Gains to DPDK-based Applications
