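Network and TCP Performance Relationship (網路與 TCP 效能關聯探討)
TWNOG Workshop, 2010/7/2, Taipei: Common Network Operations Problems, Their Causes, and Troubleshooting Techniques
許至凱 (Kae Hsu), CCIE/JNCIE, 智匯亞洲有限公司
kaeatforum [at] gmail.com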
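Objectives
- Audience: network equipment operators and operations staff.
- Understand which network environment factors affect TCP performance, so that network operations can be tied to application performance and used as a reference for improving the network environment.
- Understand how TCP works.
- Which network events affect TCP performance?
- Countermeasures.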
Agenda
- TCP Briefing
- TCP Performance Factors
- Network Event Impact
- Improvement: Network approach
- Improvement: Appliance approach
- Reference
TCP Briefing: TCP/IP stack in a computer system (Linux)
Diagram components:
- Application
- Socket Layer (net/socket.c)
- Inet Layer (net/ipv4/af_inet.c)
- IP Layer (various ip files in net/ipv4)
- TCP Layer (net/ipv4/tcp.c)
- UDP Layer (net/ipv4/udp.c)
- Ethernet Device Driver, Ethernet Card
- Other Drivers, Parallel/Serial/Other Interface Drivers
TCP Briefing: TCP/IP stack in a computer system (Windows)
Diagram components:
- User mode: Windows Sockets Applications, Windows Sockets
- Kernel mode: AFD, WSK Clients, WSK, NetBT and other TDI clients, TDI, TDX
- TCP/IP Stack (Tcpip.sys): TCP, UDP, RAW; IPv6, IPv4; 802.3, PPP, 802.11, Loopback, IPv4 Tunnel
- NDIS
TCP Briefing: Position of TCP/IP in the computer and network environment (figure)
TCP Briefing: TCP header format (RFC793) (figure)
TCP Briefing: TCP header format (updated by RFC3168) (figure)
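
The header layout on the two slides above can be decoded with a few lines of Python. This is a minimal sketch, not a full parser: it reads only the fixed 20-byte header, ignores options, and the function name `parse_tcp_header` and the hand-built sample segment are illustrative assumptions rather than anything from the slides.

```python
import struct

# Flag bits in the low byte of the 16-bit offset/flags word.
# CWR and ECE are the two bits added by RFC 3168; the others come from RFC 793.
FLAG_NAMES = {
    0x80: "CWR", 0x40: "ECE", 0x20: "URG", 0x10: "ACK",
    0x08: "PSH", 0x04: "RST", 0x02: "SYN", 0x01: "FIN",
}

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # data offset counts 32-bit words
        "flags": [name for bit, name in FLAG_NAMES.items() if offset_flags & bit],
        "window": window,                        # receiver's advertised window
        "checksum": checksum,
        "urgent": urgent,
    }

# Hypothetical SYN segment built by hand, for illustration only.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["flags"], header["header_len"], header["window"])
```

The `window` value is the field that the flow control slides refer to.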
TCP Performance Factors
- Monitoring tools
- Flow control
- Congestion control
TCP Performance Factors: Measurement tools
- Monitoring tools: tcpdump (on Windows platforms: Wireshark), tcpstat
- Benchmarking tools: ttcp, Netperf, NetPIPE, DBS (Distributed Benchmark System)
TCP Performance Factors: Flow control, sliding window (window size = 6 in the example)
Figure: segments 0 to 13 shown at four points in time (Step 1 to Step 4). Legend: ACK received, awaiting ACK, sendable range, not-yet-sendable range.
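
The four regions in the sliding window figure can be reproduced with a short sketch. The state names mirror the figure legend; the function name `classify` and the particular values of `base` and `next_seq` are assumptions chosen for illustration, not values read off the slide.

```python
def classify(base: int, next_seq: int, window: int, total: int) -> list:
    """Label every segment number as seen by the sender.

    base     -- oldest segment not yet acknowledged
    next_seq -- next segment the sender has not transmitted yet
    window   -- window size in segments
    total    -- number of segments in the stream
    """
    states = []
    for seq in range(total):
        if seq < base:
            states.append("ack_received")
        elif seq < next_seq:
            states.append("awaiting_ack")
        elif seq < base + window:
            states.append("sendable")
        else:
            states.append("not_sendable")
    return states

# Window of 6 as on the slide: segments 0-1 acknowledged, 2-4 in flight.
for seq, state in enumerate(classify(base=2, next_seq=5, window=6, total=14)):
    print(seq, state)
```

Each time an ACK advances `base`, the right edge `base + window` moves with it, which is what makes the window slide.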
TCP Performance Factors: Flow control, window size adjustment
- The receiver adjusts the window through the "Receiver window size" field in the TCP header.
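
A worked example may help show why the window size matters for performance: a sender can have at most one window of data in flight per round trip, so throughput is bounded by window / RTT. The sketch below assumes the plain 16-bit window field (maximum 65535 bytes, no window scaling) and a 100 ms RTT; both the function name and the RTT value are illustrative assumptions.

```python
def max_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput: one full window delivered per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms

# Largest window a 16-bit field can advertise, over an assumed 100 ms path.
print(max_throughput_bps(65535, 100))
```

Under these assumptions the bound is roughly 5.2 Mbit/s regardless of link speed, so a small advertised window can limit a transfer even on an uncongested path.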
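TCP Performance Factors: Congestion control
- Flow control lets the receiver limit incoming traffic so that its buffer does not overflow.
  - It adjusts the sender's window size through AdvertisedWindow.
  - It cannot reflect the state of the network path.
  - It cannot prevent buffer-overflow-like conditions inside the network along the path.
- To detect possible network congestion, TCP uses congestion control.
  - The adjustment is made through CongestionWindow (cwnd).
- Congestion control consists of four main algorithms (RFC5681):
  - Slow start
  - Congestion avoidance
  - Fast retransmit
  - Fast recovery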
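
The two windows described above combine in a simple way: the sender may have no more unacknowledged data outstanding than the smaller of cwnd and the advertised window. A minimal sketch, with the function name and the example numbers being assumptions:

```python
def usable_window(cwnd: int, advertised_window: int, in_flight: int) -> int:
    """Segments the sender may still transmit right now.

    The effective window is min(cwnd, advertised_window); data already
    in flight counts against it.
    """
    return max(min(cwnd, advertised_window) - in_flight, 0)

print(usable_window(cwnd=4, advertised_window=10, in_flight=1))    # network-limited
print(usable_window(cwnd=20, advertised_window=10, in_flight=4))   # receiver-limited
```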
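TCP Performance Factors: Slow start
- When a TCP connection is first established, it uses a small window and enlarges it gradually as ACKs arrive.
- The initial value of cwnd is 1.
  - The goal is to probe the available network bandwidth.
- cwnd increases by 1 for each ACK received.
  - As a result, after each round-trip time (RTT) cwnd is twice its value in the previous RTT: exponential growth.
- To keep cwnd from growing too fast, once cwnd exceeds the slow start threshold (ssthresh) it increases by only 1 per RTT: linear growth.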
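
The step from "one more segment per ACK" to "doubling per RTT" can be checked with a small sketch. It assumes cwnd is counted in segments, every segment sent in a round is acknowledged individually (no delayed ACKs), and nothing is lost; the function names are illustrative.

```python
def slow_start_round(cwnd: int) -> int:
    """Advance cwnd by one RTT of slow start."""
    acks = cwnd              # one ACK comes back for each segment sent this RTT
    for _ in range(acks):
        cwnd += 1            # the per-ACK rule from the slide
    return cwnd

def slow_start_history(rounds: int, initial_cwnd: int = 1) -> list:
    """cwnd at the start and after each of `rounds` RTTs."""
    history = [initial_cwnd]
    for _ in range(rounds):
        history.append(slow_start_round(history[-1]))
    return history

print(slow_start_history(4))
```

Because a window of n segments produces n ACKs, the per-ACK increment adds n to cwnd each round, which is the exponential growth described above.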
TCP Performance Factors: Congestion avoidance
- In this phase:
  - cwnd > ssthresh
  - cwnd + 1 for each RTT
- When packet loss occurs:
  - ssthresh -> cwnd/2
  - cwnd -> 1
  - The lost packet is retransmitted.
- Once packet loss occurs, TCP performance is severely affected.
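
The combined behaviour of slow start, congestion avoidance, and the loss reaction can be traced per RTT with the sketch below. It is a simplification under stated assumptions: cwnd is counted in segments, growth is applied once per RTT, a loss is modelled as happening in a whole round, and ssthresh is not allowed to drop below 2 (the floor used in RFC 5681). The function name and the example parameters are made up for illustration.

```python
def simulate_cwnd(rounds: int, loss_rounds: set, ssthresh: int, initial_cwnd: int = 1) -> list:
    """Return cwnd at the start and after every RTT."""
    cwnd = initial_cwnd
    history = [cwnd]
    for rnd in range(1, rounds + 1):
        if rnd in loss_rounds:
            ssthresh = max(cwnd // 2, 2)   # ssthresh -> cwnd/2
            cwnd = 1                       # cwnd -> 1, back to slow start
        elif cwnd < ssthresh:
            cwnd *= 2                      # slow start: exponential growth
        else:
            cwnd += 1                      # congestion avoidance: linear growth
        history.append(cwnd)
    return history

# Loss in round 5 of 8, starting with ssthresh = 8.
print(simulate_cwnd(rounds=8, loss_rounds={5}, ssthresh=8))
```

The trace shows why a single loss is expensive: the window collapses to 1 and the exponential phase now ends at half the previous window.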
TCP Performance Factors: Slow start and congestion avoidance characteristics (figure)
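TCP Performance Factors: Fast retransmit and fast recovery
- Fast retransmit (Tahoe)
  - Still applies slow start + congestion avoidance.
  - The sender retransmits the packet as soon as it has received 3 duplicate ACKs.
  - This avoids the severe TCP performance drop caused by having to adjust ssthresh/cwnd after a sender timeout.
- Fast recovery (Reno)
  - Applies fast retransmit first.
  - After receiving the duplicate ACKs, the sender goes directly to congestion avoidance and performs fast recovery:
    - ssthresh -> cwnd/2
    - Retransmit the packet.
    - cwnd -> ssthresh + 3
- NewReno, SACK, Vegas, and others all improve performance on the TCP endpoint side.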
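
The duplicate-ACK logic above can be sketched as a small state machine. This is a simplified model, not a complete Reno implementation: ACK numbers count segments rather than bytes, the class name `RenoSender` is hypothetical, cwnd is not inflated for duplicate ACKs beyond the third, and deflating cwnd back to ssthresh on the next new ACK follows RFC 5681 rather than anything stated on the slide.

```python
class RenoSender:
    """Tracks duplicate ACKs and applies fast retransmit / fast recovery."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, cwnd: int, ssthresh: int):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.last_ack = -1         # highest ACK number seen so far
        self.dup_acks = 0
        self.retransmitted = []    # segment numbers resent by fast retransmit

    def on_ack(self, ack_no: int) -> None:
        if ack_no == self.last_ack:
            self.dup_acks += 1
            if self.dup_acks == self.DUP_ACK_THRESHOLD:
                self.ssthresh = max(self.cwnd // 2, 2)   # ssthresh -> cwnd/2
                self.retransmitted.append(ack_no)        # resend the missing segment
                self.cwnd = self.ssthresh + 3            # cwnd -> ssthresh + 3
        elif ack_no > self.last_ack:
            if self.dup_acks >= self.DUP_ACK_THRESHOLD:
                self.cwnd = self.ssthresh                # leave fast recovery
            self.last_ack = ack_no
            self.dup_acks = 0

sender = RenoSender(cwnd=10, ssthresh=64)
for ack in [1, 2, 3, 3, 3, 3]:     # segment 3 is missing at the receiver
    sender.on_ack(ack)
print(sender.retransmitted, sender.ssthresh, sender.cwnd)
```

Compared with the timeout path, the window is halved rather than reset to 1, which is the performance gain the slide attributes to Reno.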
Network Event Impact: Packet loss
- Under TCP congestion control, packet loss triggers TCP retransmission.
- However well TCP congestion control performs, packet loss always degrades TCP performance.
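Network Event Impact: Packet out-of-order
- When packets arrive out of order, TCP can put them back in sequence, but if TCP fast recovery is triggered, resources may be wasted.
- Reno starts retransmitting once it receives duplicate ACKs and stops only when it receives a partial ACK.
  - If a packet is merely late rather than lost, the sender ends up retransmitting packets that did not need retransmission, which wastes resources.
- NewReno, designed to improve on Reno's efficiency, stops retransmitting lost packets only after it receives the final ACK.
  - NewReno may therefore send more duplicate packets than Reno.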
Improvement (Network approach): Reduce packet loss
- Packet loss has a large impact on TCP performance; every source of packet loss in the network should be eliminated as far as possible.
- Layer 1 and layer 2 errors
  - Unqualified physical media
  - CRC, P3 error, etc.
- Layer 3
  - Router/switch hardware or software errors
  - Congestion
    - Reduce the impact of congestion by deploying QoS.
    - Avoid packet drops for highly sensitive TCP applications.
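Improvement (Network approach): Packet forwarding without QoS
- Tail-drop
  - When link congestion fills a network device's hardware queue and it cannot hold any more packets awaiting transmission, the device simply discards them.
  - The hardware queue cannot judge packet priority; once it is full, packets are dropped indiscriminately.
  - This situation is called tail-drop.
- Tail-drop should be avoided as far as possible.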
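
Tail-drop is easy to model: a single FIFO with a fixed capacity that rejects whatever arrives while it is full. In the sketch below the queue capacity, the traffic class labels, and the function name are illustrative assumptions.

```python
from collections import deque

def enqueue_tail_drop(queue: deque, packet, capacity: int) -> bool:
    """Append the packet unless the queue is full; return False on a drop."""
    if len(queue) >= capacity:
        return False           # queue full: the arriving packet is discarded
    queue.append(packet)
    return True

queue = deque()
arrivals = [("bulk", 1), ("bulk", 2), ("bulk", 3), ("voice", 4), ("bulk", 5)]
dropped = [p for p in arrivals if not enqueue_tail_drop(queue, p, capacity=3)]
print(dropped)
```

The drop decision never looks at the packet itself, so the "voice" packet is lost purely because it arrived after the queue filled.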
Improvement (Network approach): Packet forwarding with QoS
- Packets of different priority are first stored in separate logical queues and then placed into the hardware queue. Before the hardware queue fills up, some packets waiting in the low-priority queues are dropped proactively, which prevents tail-drop.
- RED: Random Early Detection
- WRED: Weighted Random Early Detection
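
The early-drop idea behind RED and WRED can be sketched as a drop probability that rises linearly between two queue thresholds, with WRED using a different threshold set per priority. The thresholds, the `max_p` values, and the profile names below are made-up illustration values, and the average queue depth is assumed to be computed elsewhere (real implementations use a moving average).

```python
def red_drop_probability(avg_queue: float, min_th: float, max_th: float, max_p: float) -> float:
    """Classic RED curve: no drops below min_th, all dropped at or above max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Hypothetical WRED profiles: (min_th, max_th, max_p) per priority class.
WRED_PROFILES = {
    "low": (10, 30, 0.2),    # dropped earlier and more aggressively
    "high": (25, 40, 0.05),  # protected until the queue is much deeper
}

def wred_drop_probability(avg_queue: float, priority: str) -> float:
    min_th, max_th, max_p = WRED_PROFILES[priority]
    return red_drop_probability(avg_queue, min_th, max_th, max_p)

for depth in (5, 20, 35, 45):
    print(depth, wred_drop_probability(depth, "low"), wred_drop_probability(depth, "high"))
```

At an average depth of 20 only low-priority packets face any drop probability, which is how the scheme shields sensitive TCP flows before the queue is full.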
Improvement (Network approach): Reduce out-of-order packets
- Avoid sending the same TCP session over different paths.
- Load-sharing methods:
  - Per-packet load-sharing
  - Load-sharing by destination IP only
  - Per-flow load-sharing: load-sharing by a hash value computed from the IP packet. The hash input includes:
    - Source IP, Destination IP
    - Protocol
    - Source Port, Destination Port
- Packets with the same hash value take the same next-hop interface, which avoids out-of-order delivery.
- On the TCP side: implement Selective Acknowledgements (RFC2018, RFC2883).
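
Per-flow load-sharing can be sketched as hashing the five fields listed above and using the result to pick a next hop. Real routers use their own hardware hash functions; CRC32 here is only a stand-in, and the function name, interface names, and addresses are illustrative assumptions.

```python
import zlib

def pick_next_hop(src_ip: str, dst_ip: str, protocol: int,
                  src_port: int, dst_port: int, next_hops: list) -> str:
    """Map a flow's 5-tuple to one next hop; the same flow always maps the same way."""
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

links = ["ge-0/0/0", "ge-0/0/1"]   # hypothetical equal-cost interfaces

# Every packet of this TCP session (protocol 6) hashes to the same interface.
first = pick_next_hop("192.0.2.10", "198.51.100.5", 6, 49152, 80, links)
later = pick_next_hop("192.0.2.10", "198.51.100.5", 6, 49152, 80, links)
print(first, later)
```

Different sessions may land on different links, so the load is still shared, but no single session is split across paths.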
Improvement (Appliance approach)
- The operating system has to handle routine TCP session processing.
  - This work depends on CPU and memory.
  - A large number of TCP sessions occupies system resources such as CPU cycles and memory, leaving less CPU and memory for the actual service processes.
- Reduce the system resources consumed by TCP session handling:
  - TCP Offload
  - TCP Optimization
Improvement (Appliance approach): TCP Offload
- Move TCP handling out of the kernel.
  - Use dedicated hardware to handle TCP.
  - Save system resources for the actual service processes.
- TOE (TCP Offload Engine) NIC
  - Handles TCP/IP on the NIC.
Improvement (Appliance approach): TCP Offload, comparison of a NIC without TOE and a NIC with TOE (figure)
Improvement (Appliance approach): TCP Offload
- TOE is widely deployed in iSCSI environments.
- iSCSI:
Improvement (Appliance approach): TCP Optimization
- Move the handling of large numbers of TCP sessions off the server.
- Every TCP session requires a 3-way handshake and a 4-way handshake:
  - 3-way handshake for TCP connection establishment
  - 4-way handshake for TCP connection termination
- Reducing the number of TCP connections reduces connection "overhead".
- Deploy dedicated hardware in front of the servers.
Improvement (Appliance approach): TCP Optimization, regular TCP connection
Figure: message sequence between Client and Server: SYN, SYN+ACK, ACK, GET, Data (x3), FIN, ACK, FIN, ACK.
Improvement (Appliance approach): TCP Optimization, reducing the number of server TCP connections
- Only ONE 3-way handshake is necessary, in the early stage.
Figure: message sequence among Client, TCP Proxy, and Server. Segments shown: SYN, SYN+ACK, ACK, GET, Data (x3), GET, Data (x3), FIN, ACK, FIN, ACK.
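
The saving from connection reuse can be estimated with simple arithmetic. The sketch below counts only the control segments the server sees (3 for setup plus 4 for teardown per connection) and assumes, as an idealization, that the proxy carries every request over a single long-lived server-side connection; the request count is an arbitrary example.

```python
HANDSHAKE_SEGMENTS = 3   # SYN, SYN+ACK, ACK
TEARDOWN_SEGMENTS = 4    # FIN, ACK, FIN, ACK

def server_control_segments(requests: int, reuse_connection: bool) -> int:
    """Setup and teardown segments on the server side for a batch of requests."""
    connections = 1 if reuse_connection else requests
    return connections * (HANDSHAKE_SEGMENTS + TEARDOWN_SEGMENTS)

print(server_control_segments(100, reuse_connection=False))  # one connection per request
print(server_control_segments(100, reuse_connection=True))   # one reused connection
```

The proxy still terminates one connection per client, so the overhead is moved to the dedicated hardware rather than eliminated.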
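Improvement (Appliance approach): TCP Optimization
- In practice it is rarely deployed only to improve TCP performance.
  - It is usually combined with other functions.
  - L4~L7 load balancing
- Because the client's end-to-end TCP connection terminates on the TCP proxy, more functions can be added:
  - SSL acceleration
  - Reverse cache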
Reference
- Books
  - High-Speed Networks and Internets: Performance and Quality of Service, 2nd Ed. By William Stallings; Prentice Hall
  - High Performance TCP/IP Networking: Concepts, Issues and Solutions. By Mahbub Hassan and Raj Jain; Pearson Prentice Hall
  - TCP/IP Illustrated, Volume 1. By W. Richard Stevens; Addison Wesley
- Articles
  - TCP Performance. By Geoff Huston; The Internet Protocol Journal, Volume 3, No. 2
  - A very good "sliding window" description: http://www.it.uu.se/edu/course/homepage/datakom/civinght04/schema/sliding_window.pps
Q & A
