BREAKING THE DATA TRANSFER BOTTLENECK
UDT: A High Performance Data Transport Protocol
Yunhong Gu
Laboratory for Advanced Computing, National Center for Data Mining
University of Illinois at Chicago
October 10, 2005
udt.sourceforge.net

Outline
- INTRODUCTION
- PROTOCOL DESIGN & IMPLEMENTATION
- CONGESTION CONTROL
- PERFORMANCE EVALUATION
- COMPOSABLE UDT
- CONCLUSIONS

>> INTRODUCTION

Motivations
- The widespread use of high-speed networks (1 Gb/s, 10 Gb/s, etc.) has enabled many new distributed data-intensive applications
  - Inexpensive fiber and advanced optical networking technologies (e.g., DWDM, Dense Wavelength Division Multiplexing)
  - 10 Gb/s is common in high-speed network testbeds; 40 Gb/s is emerging
- Large volumetric datasets
  - Satellite weather data
  - Astronomy observations
  - Network monitoring
- The Internet transport protocol (TCP) does NOT scale well as the network bandwidth-delay product (BDP) increases
- A new transport protocol is needed!

Data Transport Protocol
- Functionalities
  - Streaming, messaging
  - Reliability
  - Timeliness
  - Unicast vs. multicast
- Congestion control
  - Efficiency
  - Fairness
  - Convergence
  - Distributedness
[Figure: protocol stack showing Applications, Transport Layer, Network Layer, Data Link Layer, Physical Layer]

TCP
- Reliable, data streaming, unicast
- Congestion control
  - Increase the congestion window size (cwnd) by one full-sized packet per RTT
  - Halve cwnd per loss event
- Poor efficiency in high bandwidth-delay product networks
- Bias against flows with larger RTT
[Figure annotation: ½ Bandwidth * RTT]

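To make the efficiency problem concrete, the sketch below estimates how long the one-packet-per-RTT increase needs to regain full rate after a single halving. It assumes the window at the moment of loss equals the bandwidth-delay product and that packets are 1500 bytes; the function name and the link parameters are illustrative, not taken from the slides.

```python
def aimd_recovery_time(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Seconds for TCP-style AIMD to climb back to full rate after one halving.

    Assumes cwnd at the loss equals the bandwidth-delay product (no queueing).
    """
    bdp_packets = bandwidth_bps * rtt_s / (8 * packet_bytes)
    rtts_needed = bdp_packets / 2   # regain half the window, one packet per RTT
    return rtts_needed * rtt_s


# Hypothetical links: a LAN and two long-haul paths.
for label, bw, rtt in [("1 Gb/s, 1 ms", 1e9, 0.001),
                       ("1 Gb/s, 100 ms", 1e9, 0.1),
                       ("10 Gb/s, 100 ms", 1e10, 0.1)]:
    print(f"{label}: {aimd_recovery_time(bw, rtt):.2f} s to recover")
```

Because the recovery time grows with bandwidth times RTT squared, a path that is 100 times longer at the same bandwidth takes 10,000 times longer to recover.
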
TCP
[Figure: TCP throughput (Mb/s, up to 1000) as a function of packet loss (0.01% to 0.5%) and round trip time (1 to 400 ms); RTT ranges labeled LAN, US, US-EU, US-ASIA]

Related Work
- TCP variants
  - HighSpeed, Scalable, BiC, FAST, H-TCP, L-TCP
- Parallel TCP
  - PSockets, GridFTP
- Rate-based reliable UDP
  - RBUDP, Tsunami, FOBS, FRTP (based on SABUL), Hurricane (based on UDT)
- XCP
- SABUL

Problems of Existing Work
- Hard to deploy
  - TCP variants and XCP
  - Need modifications in OS kernel and/or routers
- Cannot be used in shared networks
  - Most reliable UDP-based protocols
- Poor fairness
  - Intra-protocol fairness
  - RTT fairness
- Manual parameter tuning

A New Protocol
[Figure: throughput (Mb/s, up to 1000) as a function of packet loss (0.01% to 0.5%) and round trip time (1 to 400 ms); RTT ranges labeled LAN, US, US-EU, US-ASIA]

UDT (UDP-based Data Transfer Protocol)
- Application level, UDP-based
- Similar functionalities to TCP
  - Connection-oriented, reliable, duplex, unicast data streaming
- New protocol design and implementation
- New congestion control algorithm
- Configurable congestion control framework

Objective & Non-objective
- Objective
  - For distributed data-intensive applications in high-speed networks
  - A small number of flows share the abundant bandwidth
  - Efficient, fair, and friendly
  - Configurable
  - Easily deployable and usable
- Non-objective
  - Replace TCP on the Internet

UDT Project
- Open source (udt.sourceforge.net)
- Design and implement the UDT protocol
- Design the UDT congestion control algorithm
- Evaluate experimentally the performance of UDT
- Design and implement a configurable protocol framework based on UDT (Composable UDT)

>> PROTOCOL DESIGN & IMPLEMENTATION

UDT Overview
- Two orthogonal elements
  - The UDT protocol
  - The UDT congestion control algorithm
- Protocol design & implementation
  - Functionality
  - Efficiency
- Congestion control algorithm
  - Efficiency, fairness, friendliness, and stability

UDT Overview
[Figure: layering diagram with Applications, Socket API, UDT Socket, UDT, TCP, and UDP]

Functionality
- Reliability
  - Packet-based sequencing
  - Acknowledgment and loss report from receiver
  - ACK sub-sequencing
  - Retransmission (based on loss report and timeout)
- Streaming
  - Buffer/memory management
- Connection maintenance
  - Handshake, keep-alive message, teardown message
- Duplex
  - Each UDT instance contains both a sender and a receiver

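The reliability bullets (packet sequencing, loss reports from the receiver, retransmission) can be illustrated with a small receiver-side sketch. This is a simplified illustration rather than UDT's implementation: the class and method names are hypothetical, sequence-number wraparound is ignored, and the loss list is a plain set.

```python
class ReceiverLossList:
    """Tracks gaps in the arriving sequence numbers (illustrative only)."""

    def __init__(self, first_seq=0):
        self.next_expected = first_seq
        self.lost = set()

    def on_data(self, seq):
        """Handle one data packet; return the sequence numbers to report as lost."""
        if seq >= self.next_expected:
            # Everything skipped between the expected and the received
            # sequence number is considered lost and would go into a NAK.
            newly_lost = list(range(self.next_expected, seq))
            self.lost.update(newly_lost)
            self.next_expected = seq + 1
            return newly_lost
        # A retransmitted or reordered packet fills an earlier gap.
        self.lost.discard(seq)
        return []

    def ack_number(self):
        """All packets with a smaller sequence number have been received."""
        return min(self.lost) if self.lost else self.next_expected


receiver = ReceiverLossList()
for seq in [0, 1, 4, 2, 3]:
    nak = receiver.on_data(seq)
    print(f"recv {seq}: NAK {nak}, ACK up to {receiver.ack_number()}")
```
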
Protocol Architecture
[Figure: hosts A and B, each with a Sender and a Receiver, connected by a UDP channel; data packets carry Seq. No, TS, and Payload; control packets carry ACK Seq. No and NAK Loss List]

Software Architecture
[Figure: components are the CC API, UDP Channel, Sender, Receiver, Sender's Buffer, Receiver's Buffer, Sender's Loss List, Receiver's Loss List, and Listener]

Efficiency Considerations
- Fewer packets
  - Timer-based acknowledging
- Less CPU time
  - Reduce per-packet processing time
  - Reduce memory copying
  - Reduce loss list processing time
  - Light ACK vs. regular ACK
- Parallel processing
  - Threading architecture
- Less bursty processing
  - Distribute the processing time evenly

Application Programming Interface (API)
- Socket API
- New functionalities
  - sendfile/recvfile
  - Overlapped IO support
- Transparent to existing applications
  - Recompilation needed
  - Certain limitations exist
- XIO support (in Globus Toolkit 4.0)
- Wrappers for other programming languages
  - Java, Python

>> CONGESTION CONTROL

Overview
- Congestion control vs. flow control
  - Congestion control: effectively utilize the network bandwidth
  - Flow control: prevent the receiver from being overwhelmed by incoming packets
- Window-based vs. rate-based
  - Window-based: tune the maximum number of in-flight packets (TCP)
  - Rate-based: tune the inter-packet sending time (UDT)
- AIMD: additive increase, multiplicative decrease
- Feedback
  - Packet loss (most TCP variants, UDT)
  - Delay (Vegas, FAST)

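The two control styles tune different knobs for the same target rate. The sketch below converts a target rate into an inter-packet sending time (the rate-based knob) and a window size into the throughput it allows (the window-based knob). The 1500-byte packet size and the example numbers are assumptions for illustration.

```python
def inter_packet_interval(rate_bps, packet_bytes=1500):
    """Rate-based control: seconds to wait between packets at a target rate."""
    return packet_bytes * 8 / rate_bps


def window_throughput(window_packets, rtt_s, packet_bytes=1500):
    """Window-based control: bits per second allowed by a window of packets per RTT."""
    return window_packets * packet_bytes * 8 / rtt_s


# Hypothetical 1 Gb/s target over a 100 ms path.
print(f"interval: {inter_packet_interval(1e9) * 1e6:.1f} microseconds")
print(f"8333-packet window: {window_throughput(8333, 0.1) / 1e6:.2f} Mb/s")
```

The rate-based interval does not depend on RTT, whereas the window needed for the same rate grows linearly with RTT.
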
AIMD with Decreasing Increases
- AIMD
  - $x = x + \alpha(x)$, for every constant interval (e.g., RTT)
  - $x = (1 - \beta)\,x$, when there is a packet loss event
  - where $x$ is the packet sending rate
- TCP
  - $\alpha(x) \equiv 1$, and the increase interval is RTT
  - $\beta = 0.5$
- AIMD with Decreasing Increases
  - $\alpha(x)$ is non-increasing, and $\lim_{x \to +\infty} \alpha(x) = 0$

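The two update rules translate directly into a single step function with a pluggable increase term. In the sketch below, `tcp_alpha` is the constant increase from the slide, while `decreasing_alpha` is only a hypothetical example of a non-increasing function that tends to zero; it is not the function UDT uses.

```python
def aimd_step(x, alpha, beta, loss):
    """One AIMD update of the sending rate x.

    alpha: function giving the additive increase at rate x
    beta:  multiplicative decrease factor applied on a loss event
    """
    return (1 - beta) * x if loss else x + alpha(x)


def tcp_alpha(x):
    """TCP: constant increase of one packet per interval."""
    return 1.0


def decreasing_alpha(x):
    """Hypothetical decreasing increase: non-increasing, tends to 0 as x grows."""
    return 10.0 / (1.0 + x)


rate = 10.0
for interval in range(5):
    rate = aimd_step(rate, decreasing_alpha, beta=0.5, loss=(interval == 3))
    print(f"interval {interval}: rate = {rate:.3f}")
```
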
AIMD with Decreasing Increases
[Figure: $\alpha(x)$ versus $x$ for AIMD (TCP NewReno), UDT, HighSpeed TCP, and Scalable TCP]

UDT Control Algorithm
- Increase
  - $\alpha(x) = f(B - x) \cdot c$, where $B$ is the link capacity (bandwidth) and $c$ is a constant parameter
  - Constant rate control interval (SYN), independent of RTT
  - SYN = 0.01 seconds
- Decrease
  - Randomized decrease factor $\beta = 1 - (8/9)^n$
[Figure: $\alpha(x)$ versus $x$]

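The decrease factor can be read as the cumulative effect of multiplying the rate by 8/9 a total of n times. The slide does not define n; the sketch below assumes it counts how many times the 8/9 step is applied during one congestion event, with the randomization deciding that count.

```python
def total_decrease_factor(n, step=8 / 9):
    """Cumulative decrease factor beta after applying the 8/9 step n times.

    Assumption: n is the number of rate decreases within one congestion event.
    """
    return 1 - step ** n


for n in range(1, 7):
    print(f"n = {n}: beta = {total_decrease_factor(n):.3f}")
```

Under this reading a single decrease removes about 11% of the rate, far gentler than TCP's 50%, and it takes roughly six steps to approach a halving.
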
The Increase Formula: an Example
Bandwidth (B) = 10 Gbps, packet size = 1500 bytes

x (Mbps)          B - x (Mbps)     Increment (pkts/SYN)
[0, 9000)         (1000, 10000]    10
[9000, 9900)      (100, 1000]      1
[9900, 9990)      (10, 100]        0.1
[9990, 9999)      (1, 10]          0.01
[9999, 9999.9)    (0.1, 1]         0.001
9999.9+           <0.1             0.00067

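The increments in this example follow a staircase: each tenfold drop in spare capacity B - x lowers the increment tenfold, down to a floor of one byte-equivalent per SYN (1/1500 of a packet). The sketch below reproduces those values under two assumptions that are inferred from the numbers rather than stated on the slide: f rounds the spare capacity in bits per second up to the next power of ten, and the constant c is 1.5e-6.

```python
import math


def udt_increment(available_mbps, packet_bytes=1500, c=1.5e-6):
    """Packets added per SYN for a given spare capacity B - x in Mb/s.

    Assumed form: 10^ceil(log10(spare bits/s)) * c / packet size,
    with a floor of 1 / packet size. Both are inferred from the example table.
    """
    floor = 1.0 / packet_bytes
    if available_mbps <= 0:
        return floor
    bits_per_s = available_mbps * 1e6
    step = 10 ** math.ceil(math.log10(bits_per_s)) * c / packet_bytes
    return max(step, floor)


# One sample point inside each row of the example.
for spare in [5000, 500, 50, 5, 0.5, 0.05]:
    print(f"B - x = {spare:>7} Mb/s -> {udt_increment(spare):.5f} pkts/SYN")
```
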
Dealing with Packet Loss
- Loss synchronization
  - Randomization method
- Non-congestion loss
  - Do not decrease sending rate for the first packet loss
- Packet reordering
[Figure panels: M=5, N=2; M=8, N=3]

Bandwidth Estimation
- Packet pair
  - Packet size / spacing ≈ bottleneck bandwidth
- Filters
  - Cross traffic
  - Interrupt coalescence
- Robust to estimation errors
  - Randomized interval to send packet pair
[Figure: packet pairs P1 and P2 before, at, and after the bottleneck link]

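The packet-pair relation can be sketched in a few lines: two back-to-back packets leave the bottleneck separated by the time it takes to transmit one of them, so packet size divided by the arrival gap estimates the bottleneck bandwidth. The slide mentions filters for cross traffic and interrupt coalescence without naming them; the median used below is an assumption chosen for simplicity, and the gap samples are made up.

```python
from statistics import median


def packet_pair_estimate(gaps_s, packet_bytes=1500):
    """Estimate bottleneck bandwidth in bits/s from packet-pair arrival gaps.

    The median discards gaps stretched by cross traffic and gaps
    compressed by interrupt coalescence (an assumed filter).
    """
    return packet_bytes * 8 / median(gaps_s)


# Hypothetical samples on a 1 Gb/s bottleneck: 12 microseconds is the true
# spacing, 30 is inflated by cross traffic, 2 is compressed by coalescing.
gaps = [12e-6, 12e-6, 30e-6, 12e-6, 2e-6]
print(f"estimated bottleneck: {packet_pair_estimate(gaps) / 1e6:.0f} Mb/s")
```
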
>> PERFORMANCE EVALUATION

Performance Characteristics
- Efficiency
  - Higher bandwidth utilization, less CPU usage
- Intra-protocol fairness
  - Max-min fairness
  - Jain's fairness index
- TCP friendliness
  - Bulk TCP flow vs. bulk UDT flow
  - Short-lived TCP flow (slow start phase) vs. bulk UDT flow
- Stability (oscillations)
  - Stability index (standard deviation)

Evaluation Strategies
- Simulations vs. experiments
  - NS2 network simulator, NCDM teraflow testbed
- Setup
  - Network topology, bandwidth, distance, queuing, link error rate, etc.
  - Concurrency (number of parallel flows)
  - Comparison (against TCP)
- Real-world applications
  - SDSS data transfer, high performance mining of streaming data, etc.
- Independent evaluation
  - SLAC, JGN2, UvA, Unipmn (Italy), etc.

Efficiency, Fairness, & Stability
[Figure: Flows 1 to 4 over time (0 to 700 sec); hosts 206.220.241.13 to 206.220.241.16 and 145.146.98.78 to 145.146.98.81; StarLight, Chicago and SARA, Amsterdam; 1 Gb/s bandwidth, 106 ms RTT]

Efficiency, Fairness, & Stability
Columns are consecutive intervals of the experiment; flows join one at a time and then leave.

Interval      1      2      3      4      5      6      7
Flow 1        902    466    313    215    301    452    885
Flow 2               446    308    216    310    452
Flow 3                      302    202    307
Flow 4                             197
Efficiency    902    912    923    830    918    904    885
Fairness      1      0.999  0.999  0.998  0.999  1      1
Stability     0.11   0.11   0.08   0.16   0.04   0.02   0.04

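The fairness row can be checked with Jain's fairness index, which is the square of the summed throughputs divided by n times the sum of their squares; it equals 1 when all flows get the same share. The sketch below applies it to the four per-flow values reported while all four flows were active (215, 216, 202, 197).

```python
def jain_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in the range (0, 1]."""
    total = sum(throughputs)
    return total * total / (len(throughputs) * sum(t * t for t in throughputs))


four_flows = [215, 216, 202, 197]
print(sum(four_flows))                    # 830, the reported efficiency
print(round(jain_index(four_flows), 3))   # 0.998, the reported fairness
```
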
TCP Friendliness
- 500 1 MB TCP flows vs. 0 to 10 bulk UDT flows
- 1 Gb/s between Chicago and Amsterdam
[Figure: TCP throughput (Mb/s) versus number of UDT flows (0 to 10)]

>> COMPOSABLE UDT

Composable UDT - Objectives
- Easy implementation and deployment of new control algorithms
- Easy evaluation of new control algorithms
- Application awareness support and dynamic configuration

Composable UDT - Methodologies
- Packet sending control
  - Window-based, rate-based, and hybrid
- Control event handling
  - onACK, onLoss, onTimeout, onPktSent, onPktRecved, etc.
- Protocol parameters access
  - RTT, loss rate, RTO, etc.
- Packet extension
  - User-defined control packets

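The event-handler idea can be sketched as a base class whose callbacks a new algorithm overrides. The Python below is an illustration of the pattern only: the class names, attribute names, and snake_case handlers are hypothetical stand-ins for the onACK, onLoss, and onTimeout events listed above, and this is not the actual C++ interface of the UDT library. The subclass implements a simplified TCP-like window rule.

```python
class CongestionControl:
    """Hypothetical base class: the protocol core calls these handlers."""

    def __init__(self):
        self.cwnd = 2.0          # window-based knob, in packets
        self.send_period = 0.0   # rate-based knob, seconds between packets

    def on_ack(self, acked_packets):
        pass

    def on_loss(self, lost_seqs):
        pass

    def on_timeout(self):
        pass


class RenoLike(CongestionControl):
    """Simplified TCP-like control built only from the event handlers."""

    def __init__(self, ssthresh=64.0):
        super().__init__()
        self.ssthresh = ssthresh

    def on_ack(self, acked_packets):
        for _ in range(acked_packets):
            if self.cwnd < self.ssthresh:
                self.cwnd += 1.0              # slow start
            else:
                self.cwnd += 1.0 / self.cwnd  # congestion avoidance

    def on_loss(self, lost_seqs):
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = self.ssthresh             # halve the window

    def on_timeout(self):
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = 1.0                       # restart from one packet


cc = RenoLike(ssthresh=4.0)
cc.on_ack(3)
print(cc.cwnd)
cc.on_loss([7])
print(cc.cwnd)
```

Keeping the algorithm in a handful of overridden handlers is what lets each variant be written in a small number of lines and swapped without touching the protocol core.
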
Composable UDT - Evaluation
- Simplicity
  - Can it be easily used?
- Expressiveness
  - Can it be used to implement most control protocols?
- Similarity
  - Can Composable UDT based implementations reproduce the performance of their native implementations?
- Overhead
  - Will the overhead added by Composable UDT be too large?

Simplicity & Expressiveness
- Eight event handlers, four protocol control functions, and one performance monitoring function
- Supports a large variety of protocols
  - Reliable UDP blast
  - TCP and its variants (both loss-based and delay-based)
  - Group transport protocols

Simplicity & Expressiveness
[Figure: class hierarchy rooted at CCC (Base Congestion Control Class)]
- CTCP: TCP NewReno
- CGTP: Group Transport Protocol
- CUDPBlast: Reliable UDP Blast
- CFAST: FAST TCP
- CVegas: TCP Vegas
- CScalable: Scalable TCP
- CHS: HighSpeed TCP
- CBiC: BiC TCP
- CWestwood: TCP Westwood
[Numeric annotations in the figure, in extraction order: 28; 73 / +132-6; 11 / +192-29; 8 / +27-1; 11 / +192-29; 27 / +145-2; 37 / +351-2]

Similarity and Overhead
- Similarity
  - How well Composable UDT based implementations can reproduce their native implementations
  - CTCP vs. Linux TCP
- CPU usage
  - Sender: CTCP uses about 100% more CPU time than Linux TCP
  - Receiver: CTCP uses about 20% more CPU time than Linux TCP

          Throughput       Fairness         Stability
Flow #    TCP     CTCP     TCP     CTCP     TCP     CTCP
1         112     122      1       1        0.517   0.415
2         191     208      0.997   0.999    0.476   0.426
4         322     323      0.949   0.999    0.484   0.492
8         378     422      0.971   0.999    0.633   0.550
16        672     642      0.958   0.985    0.502   0.482
32        877     799      0.988   0.997    0.491   0.470
64        921     716      0.994   0.996    0.569   0.529

>> CONCLUSIONS

Contributions
- A high performance data transport protocol and associated implementation
  - The UDT protocol
  - Open source UDT library (udt.sourceforge.net)
  - Users include ANL, ORNL, PNNL, etc.
- An efficient and fair congestion control algorithm
  - DAIMD & the UDT control algorithm
  - Packet loss handling techniques
  - Using a bandwidth estimation technique in congestion control
- A configurable transport protocol framework
  - Composable UDT

Publications
- Papers on the UDT Protocol
  - Supporting Configurable Congestion Control in Data Transport Services, Yunhong Gu and Robert L. Grossman, SC 2005, Nov 12-18, Seattle, WA.
  - Optimizing UDP-based Protocol Implementation, Yunhong Gu and Robert L. Grossman, PFLDNet 2005, Lyon, France, Feb. 2005.
  - Experiences in Design and Implementation of a High Performance Transport Protocol, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, SC 2004, Nov 6-12, Pittsburgh, PA.
  - An Analysis of AIMD Algorithms with Decreasing Increases, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA.
  - SABUL: A Transport Protocol for Grid Computing, Yunhong Gu and Robert L. Grossman, Journal of Grid Computing, 2003, Volume 1, Issue 4, pp. 377-386.
- Internet Draft
  - UDT: A Transport Protocol for Data Intensive Applications, Yunhong Gu and Robert L. Grossman, draft-gg-udt-01.txt.

Publications
- Papers on Data Transfer Service using UDT
  - Experimental Studies of Data Transport and Data Access of Earth Science Data over Networks with High Bandwidth Delay Products, Robert Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong, and Parthasarathy Krishnaswamy, Computer Networks, Volume 46, Issue 3, Oct. 2004, pp. 411-421.
  - Teraflows over Gigabit WANs with UDT, Robert Grossman, Yunhong Gu, Xinwei Hong, Antony Antony, Johan Blom, Freek Dijkstra, and Cees de Laat, Future Generation Computer Systems, Vol. 21, Issue 4, pp. 501-513, April 2005.
  - The Photonic TeraStream: Enabling Next Generation Applications Through Intelligent Optical Networking at iGrid 2002, J. Mambretti, J. Weinberger, J. Chen, E. Bacon, F. Yeh, D. Lillethun, R. Grossman, Y. Gu, and M. Mazzucco, Future Generation Computer Systems, Volume 19, Number 6, pp. 897-908.
  - Experimental Studies Using Photonic Data Services at iGrid 2002, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, Future Generation Computer Systems, 2003, Volume 19, Number 6, pp. 945-955.

Publications
- Papers on Applications using UDT
  - Open DMIX: High Performance Web Services for Distributed Data Mining, R. Grossman, Y. Gu, C. Gupta, D. Hanley, X. Hong, and P. Krishnaswamy, 7th International Workshop on High Performance and Distributed Mining.
  - Open DMIX - Data Integration and Exploration Services for Data Grids, R. Grossman, Y. Gu, D. Hanley, X. Hong, and G. Rao, First International Workshop on Knowledge Grid and Grid Intelligence (KGGI 2003).
  - Global Access to Large Distributed Data Sets using Photonic Data Services, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2003), Los Alamitos, CA.
  - Data Webs for Earth Science Data, Asvin Ananthanarayan, Rajiv Balachandran, Yunhong Gu, Robert Grossman, Xinwei Hong, Jorge Levera, and Marco Mazzucco, Parallel Computing, Volume 29, 2003, pp. 1363-1379.

Achievements
- SC 2002 Bandwidth Challenge "Best Use of Emerging Network Infrastructure" Award
- SC 2003 Bandwidth Challenge "Application Foundation" Award
- SC 2004 Bandwidth Challenge "Best Replacement for FedEx / UDP Fairness" Award
- SC 2005 ? Nov. 12-18, Seattle, WA
  - High Performance Mining of Streaming Data using UDT
- iGrid 2005
  - Exploring and mining remote data at 10 Gb/s

Vision
- Short-term
  - A practical solution for distributed data-intensive applications in high-BDP environments
- Long-term
  - Evolve with new technologies (open source & open standard)
  - More functionalities and support for more use scenarios
  - Network research platform (e.g., fast prototyping and evaluation of new control algorithms)

The End Thank You! Yunhong Gu, October 10, 2005

More Related Content

PPT
UDT
PPTX
PPT
Sania rtp
PPT
RTP.ppt
PPTX
RIP RTCP RTSP
PDF
HIGH SPEED NETWORKS
UDT
Sania rtp
RTP.ppt
RIP RTCP RTSP
HIGH SPEED NETWORKS

What's hot (20)

PPTX
PPT
Real-Time Streaming Protocol
PPTX
PDF
HIGH SPEED NETWORKS
PPTX
TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
PDF
Optimization of Low-efficiency Traffic in OpenFlow Software Defined Networks
PDF
Lecture04 H
PPT
Chap05 gtp 03_kh
PPT
presentation
PPT
PDF
Computer network (13)
PDF
Improving Network Efficiency with Simplemux
PPT
Phasor data concentrator
PDF
IRJET- Modeling a New Startup Algorithm for TCP New Reno
PDF
Precision Time Synchronization
PDF
RTSP Protocol - Explanation to develop API of RTSP Protocol
PPT
RTSP Analysis Wireshark
PDF
DTS_4138-timeserver
PDF
Introduction to SCTP and it's benefits over TCP and UDP
PDF
Ieee 1588 ptp
Real-Time Streaming Protocol
HIGH SPEED NETWORKS
TCP-FIT: An Improved TCP Congestion Control Algorithm and its Performance
Optimization of Low-efficiency Traffic in OpenFlow Software Defined Networks
Lecture04 H
Chap05 gtp 03_kh
presentation
Computer network (13)
Improving Network Efficiency with Simplemux
Phasor data concentrator
IRJET- Modeling a New Startup Algorithm for TCP New Reno
Precision Time Synchronization
RTSP Protocol - Explanation to develop API of RTSP Protocol
RTSP Analysis Wireshark
DTS_4138-timeserver
Introduction to SCTP and it's benefits over TCP and UDP
Ieee 1588 ptp
Ad

Viewers also liked (18)

PPT
NAGARA: SRB and iRODS
PPTX
Green Shoots: Research Data Management Pilot at Imperial College London
PDF
Research Data Management en bibliotheken
PDF
Data Management for Grown Ups
PDF
iRODS/Dataverse Project by Jonathan Crabtree
PDF
iRODS User Group Meeting 2016 - MUMC+
PPTX
ODSC and iRODS
PDF
iRODS Rule Language Cheat Sheet
PPTX
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
PPTX
iRODS: Interoperability in Data Management
PPTX
Fast and secure protocol (fasp)
PPTX
Intel aspera-medical-v1
ODP
Private Cloud Architecture
PPT
File management ppt
PDF
I rods분석(20170313,01,김선태)
PDF
White Paper: Life Sciences at RENCI, Big Data IT to Manage, Decipher and Info...
 
PPTX
Operating Systems - File Management
NAGARA: SRB and iRODS
Green Shoots: Research Data Management Pilot at Imperial College London
Research Data Management en bibliotheken
Data Management for Grown Ups
iRODS/Dataverse Project by Jonathan Crabtree
iRODS User Group Meeting 2016 - MUMC+
ODSC and iRODS
iRODS Rule Language Cheat Sheet
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
iRODS: Interoperability in Data Management
Fast and secure protocol (fasp)
Intel aspera-medical-v1
Private Cloud Architecture
File management ppt
I rods분석(20170313,01,김선태)
White Paper: Life Sciences at RENCI, Big Data IT to Manage, Decipher and Info...
 
Operating Systems - File Management
Ad

Similar to UDT (20)

PPT
Learn TransportLayer of the OSI model to day with me.
PPTX
Transport Layer
PPTX
4th Module (1).pptx internet of things..
PPT
transport layer
PDF
UDT.pptx
PDF
TCP Theory
PPT
Chapter3 transport
PDF
SECURING DATA TRANSFER IN THE CLOUD THROUGH INTRODUCING IDENTIFICATION PACKET...
PPTX
Protocols for Fast Delivery of Large Data Volumes
PDF
A dynamic performance-based_flow_control
PPT
PPTX
Networking essentials lect3
PPT
Transport Layer
PPTX
Online TCP-IP Networking Assignment Help
PDF
Command Transfer Protocol (CTP) for Distributed or Parallel Computation
PPT
Troubleshooting TCP/IP
PPTX
transport layer pptxdkididkdkdkddjjdjffkfif
PPT
lec 3 4 Core Delays Thruput Net Arch.ppt
PDF
TCP Congestion Control
PDF
RIPE 80: Buffers and Protocols
Learn TransportLayer of the OSI model to day with me.
Transport Layer
4th Module (1).pptx internet of things..
transport layer
UDT.pptx
TCP Theory
Chapter3 transport
SECURING DATA TRANSFER IN THE CLOUD THROUGH INTRODUCING IDENTIFICATION PACKET...
Protocols for Fast Delivery of Large Data Volumes
A dynamic performance-based_flow_control
Networking essentials lect3
Transport Layer
Online TCP-IP Networking Assignment Help
Command Transfer Protocol (CTP) for Distributed or Parallel Computation
Troubleshooting TCP/IP
transport layer pptxdkididkdkdkddjjdjffkfif
lec 3 4 Core Delays Thruput Net Arch.ppt
TCP Congestion Control
RIPE 80: Buffers and Protocols

Recently uploaded (20)

PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Electronic commerce courselecture one. Pdf
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
cuic standard and advanced reporting.pdf
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Approach and Philosophy of On baking technology
PDF
Empathic Computing: Creating Shared Understanding
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PPT
Teaching material agriculture food technology
DOCX
The AUB Centre for AI in Media Proposal.docx
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Encapsulation theory and applications.pdf
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Per capita expenditure prediction using model stacking based on satellite ima...
Electronic commerce courselecture one. Pdf
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Building Integrated photovoltaic BIPV_UPV.pdf
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Programs and apps: productivity, graphics, security and other tools
cuic standard and advanced reporting.pdf
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Approach and Philosophy of On baking technology
Empathic Computing: Creating Shared Understanding
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Review of recent advances in non-invasive hemoglobin estimation
MIND Revenue Release Quarter 2 2025 Press Release
Teaching material agriculture food technology
The AUB Centre for AI in Media Proposal.docx
Understanding_Digital_Forensics_Presentation.pptx
Encapsulation theory and applications.pdf
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows

UDT

  • 1. BREAKING THE DATA TRANSFER BOTTLENECK Yunhong GU [email_address] Laboratory for Advanced Computing National Center for Data Mining University of Illinois at Chicago October 10, 2005 udt.sourceforge.net UDT: A High Performance Data Transport Protocol
  • 2. Outline INTRODUCTION PROTOCOL DESIGN & IMPLEMENTATION CONGESTION CONTROL PERFORMANCE EVALUATION COMPOSABLE UDT CONCLUSIONS
  • 3. >> INTRODUCTION PROTOCOL DESIGN & IMPLEMENTATION CONGESTION CONTROL PERFORMANCE EVALUATION COMPOSABLE UDT CONCLUSIONS
  • 4. Motivations The widespread use of high-speed networks (1Gb/s, 10Gb/s, etc.) has enabled many new distributed data intensive applications Inexpensive fibers and advanced optical networking technologies (e.g., DWDM - Dense Wavelength Division Multiplexing) 10Gb/s is common in high speed network testbeds, 40 Gb/s is emerging Large volumetric datasets Satellite weather data Astronomy observation Network monitoring The Internet transport protocol (TCP) does NOT scale well as network bandwidth-delay product (BDP) increases New transport protocol is needed!
  • 5. Data Transport Protocol Functionalities Streaming, messaging Reliability Timeliness Unicast vs. multicast Congestion control Efficiency Fairness Convergence Distributedness Physical Layer Applications Data link Layer Network Layer Transport Layer
  • 6. TCP Reliable, data streaming, unicast Congestion control Increase congestion window size ( cwnd ) one full sized packet per RTT Halve the cwnd per loss event Poor efficiency in high bandwidth-delay product networks Bias on flows with larger RTT ½ Bandwidth * RTT
  • 7. TCP Throughput (Mb/s) Throughput (Mb/s) Packet Loss Round Trip Time (ms) 0.01% 0.05% 0.1% 0.1% 0.5% 1000 800 600 400 200 1 10 100 200 400 1000 800 600 400 200 LAN US-EU US-ASIA US
  • 8. Related Work TCP variants HighSpeed, Scalable, BiC, FAST, H-TCP, L-TCP Parallel TCP PSockets, GridFTP Rate-based reliable UDP RBUDP, Tsunami, FOBS, FRTP (based on SABUL), Hurricane (based on UDT) XCP SABUL
  • 9. Problems of Existing Work Hard to deploy TCP variants and XCP Need modifications in OS kernel and/or routers Cannot be used in shared networks Most reliable UDP-based protocols Poor fairness Intra-protocol fairness RTT fairness Manual parameter tuning
  • 10. A New Protocol Throughput (Mb/s) Throughput (Mb/s) Packet Loss Round Trip Time (ms) 0.01% 0.05% 0.1% 0.1% 0.5% 1000 800 600 400 200 1 10 100 200 400 1000 800 600 400 200 LAN US-EU US-ASIA US
  • 11. UDT (UDP-based Data Transfer Protocol) Application level, UDP-based Similar functionalities to TCP Connection-oriented reliable duplex unicast data streaming New protocol design and implementation New congestion control algorithm Configurable congestion control framework
  • 12. Objective & Non-objective Objective For distributed data intensive applications in high speed networks A small number of flows share the abundant bandwidth Efficient, fair, and friendly Configurable Easily deployable and usable Non-objective Replace TCP on the Internet
  • 13. UDT Project Open source (udt.sourceforge.net) Design and implement the UDT protocol Design the UDT congestion control algorithm Evaluate experimentally the performance of UDT Design and implement a configurable protocol framework based on UDT (Composable UDT)
  • 14. >> PROTOCOL DESIGN & IMPLEMENTATION INTRODUCTION CONGESTION CONTROL PERFORMANCE EVALUATION COMPOSABLE UDT CONCLUSIONS
  • 15. UDT Overview Two orthogonal elements The UDT protocol The UDT congestion control algorithm Protocol design & implementation Functionality Efficiency Congestion control algorithm Efficiency, fairness, friendliness, and stability
  • 16. UDT Overview UDP Socket API Applications TCP Socket API Applications Applications UDT UDT Socket
  • 17. Functionality Reliability Packet-based sequencing Acknowledgment and loss report from receiver ACK sub-sequencing Retransmission (based on loss report and timeout) Streaming Buffer/memory management Connection maintenance Handshake, keep-alive message, teardown message Duplex Each UDT instance contains both a sender and a receiver
  • 18. Protocol Architecture UDP Channel Sender Sender Receiver  Sender Receiver  UDP Seq. No TS Payload ACK Seq. No NAK Loss List A B
  • 19. Software Architecture CC API UDP Channel Sender Receiver Sender's Buffer Receiver's Buffer Sender's Loss List Receiver's Loss List Listener
  • 20. Efficiency Consideration Less packets Timer-based acknowledging Less CPU time Reduce per packet processing time Reduce memory copy Reduce loss list processing time Light ACK vs. regular ACK Parallel processing Threading architecture Less burst in processing Evenly distribute the processing time
  • 21. Application Programming Interface (API) Socket API New functionalities sendfile/recvfile Overlapped IO support Transparent to existing applications Recompilation needed Certain limitations exist XIO support (in Globus Toolkit 4.0) Wrapper for other programming languages Java, Python
  • 22. >> CONGESTION CONTROL INTRODUCTION PROTOCOL DESIGN & IMPLEMENTATION PERFORMANCE EVALUATION COMPOSABLE UDT CONCLUSIONS
  • 23. Overview Congestion control vs. flow control Congestion control: effectively utilize the network bandwidth Flow control: prevent the receiver from being overwhelmed by incoming packets Window-based vs. rate-based Window-based: tune the maximum number of on-flight packets (TCP) Rate-based: tune the inter-packet sending time (UDT) AIMD: additive increases multiplicative decreases Feedback Packet loss (Most TCP variants, UDT) Delay (Vegas, FAST)
  • 24. AIMD with Decreasing Increases AIMD x = x +  (x) , for every constant interval (e.g., RTT) x = (1 -  ) x , when there is a packet loss event where x is the packet sending rate. TCP  (x)  1 , and the increase interval is RTT .  = 0.5 AIMD with Decreasing Increase  (x) is non-increasing, and lim x->+   (x) = 0.
  • 25. AIMD with Decreasing Increases  ( x ) x AIMD (TCP NewReno) UDT HighSpeed TCP Scalable TCP
  • 26. Increase  (x) = f( B - x ) * c where B is the link capacity (Bandwidth), c is a constant parameter Constant rate control interval ( SYN ), irrelevant to RTT SYN = 0.01 seconds Decrease Randomized decrease factor  = 1 – (8/9) n UDT Control Algorithm  ( x ) x
  • 27. The Increase Formula: an Example Bandwidth ( B ) = 10 Gbps, Packet size = 1500 bytes 0.00067 <0.1 9999.9+ 0.001 (0.1, 1] [9999, 9999.9) 0.01 (1, 10] [9990, 9999) 0.1 (10, 100] [9900, 9990) 1 (100, 1000] [9000, 9900) 10 (1000, 10000] [0, 9000) Increment (pkts/SYN) B - x (Mbps) x (Mbps)
  • 28. Dealing with Packet Loss Loss synchronization Randomization method Non-congestion loss Do not decrease sending rate for the first packet loss Packet reordering M=5, N=2 M=8, N=3
  • 29. Bandwidth Estimation Packet Pair Filters Cross traffic Interrupt Coalescence Robust to estimation errors Randomized interval to send packet pair P2 P1 Packet Size / Space  Bottleneck Bandwidth P2 P1 P2 P1
  • 30. >> PERFORMANCE EVALUATION INTRODUCTION PROTOCOL DESIGN & IMPLEMENTATION CONGESTION CONTROL COMPOSABLE UDT CONCLUSIONS
  • 31. Performance Characteristics Efficiency Higher bandwidth utilization, less CPU usage Intra-protocol fairness Max-min fairness Jain's fairness index TCP friendliness Bulk TCP flow vs Bulk UDT flow Short-lived TCP flow (slow start phase) vs Bulk UDT flow Stability (oscillations) Stability index (standard deviation)
  • 32. Evaluation Strategies Simulations vs. experiments NS2 network simulator, NCDM teraflow testbed Setup Network topology, bandwidth, distance, queuing, Link error rate, etc. Concurrency (number of parallel flows) Comparison (against TCP) Real world applications SDSS data transfer, high performance mining of streaming data, etc. Independent evaluation SLAC, JGN2, UvA, Unipmn (Italy), etc.
  • 33. Efficiency, Fairness, & Stability Flow 1 Flow 2 Flow 3 Flow 4 0 100 200 300 400 500 600 700 Time (sec) 206.220.241.16 206.220.241.15 206.220.241.14 206.220.241.13 145.146.98.81 145.146.98.80 145.146.98.79 145.146.98.78 1Gb/s bandwidth, 106 ms RTT, StarLight, Chicago SARA, Amsterdam
  • 34. Efficiency, Fairness, & Stability 0.04 0.02 0.04 0.16 0.08 0.11 0.11 Stability 1 1 0.999 0.998 0.999 0.999 1 Fairness 885 904 918 830 923 912 902 Efficiency 197 Flow 4 307 202 302 Flow 3 452 310 216 308 446 Flow 2 885 452 301 215 313 466 902 Flow 1
  • 35. TCP Friendliness 500 1MB TCP flows vs. 0 – 10 bulk UDT flows 1Gb/s between Chicago and Amsterdam 0 1 2 3 4 5 6 7 8 9 10 20 30 40 50 60 70 80 Number of UDT flows TCP Throughput (Mb/s)
  • 36. >> COMPOSABLE UDT INTRODUCTION PROTOCOL DESIGN & IMPLEMENTATION CONGESTION CONTROL PERFORMANCE EVALUATION CONCLUSIONS
  • 37. Composable UDT - Objectives Easy implementation and deployment of new control algorithms Easy evaluation of new control algorithms Application awareness support and dynamic configuration
  • 38. Composable UDT - Methodologies Packet sending control Window-based, rate-based, and hybrid Control event handling onACK, onLoss, onTimeout, onPktSent, onPktRecved, etc. Protocol parameters access RTT, loss rate, RTO, etc. Packet extension User-defined control packets
  • 39. Composable UDT - Evaluation Simplicity Can it be easily used? Expressiveness Can it be used to implement most control protocols? Similarity Can Composable UDT based implementations reproduce the performance of their native implementations? Overhead Will the overhead added by Composable UDT be too large?
  • 40. Simplicity & Expressiveness Eight event handlers, four protocol control functions, and one performance monitoring function. Support a large variety of protocols Reliable UDT blast TCP and its variants (both loss and delay based) Group transport protocols
  • 41. Simplicity & Expressiveness CCC Base Congestion Control Class CTCP TCP NewReno CGTP Group Transport Protocol CUDPBlast Reliable UDP Blast CFAST FAST TCP CVegas TCP Vegas CScalable Scalable TCP CHS HighSpeed TCP CBiC BiC TCP CWestwood TCP Westwood 28 73 / +132-6 11 / +192-29 8 / +27-1 11 / +192-29 27 / +145-2 37 / +351-2
  • 42. Similarity and Overhead Similarity How Composable UDT based implementations can simulate their native implementations CTCP vs. Linux TCP CPU usage Sender: CTCP uses about 100% more times of CPU as Linux TCP Receiver: CTCP uses about 20% more CPU than Linux TCP Flow # Throughput Fairness Stability TCP CTCP TCP CTCP TCP CTCP 1 112 122 1 1 0.517 0.415 2 191 208 0.997 0.999 0.476 0.426 4 322 323 0.949 0.999 0.484 0.492 8 378 422 0.971 0.999 0.633 0.550 16 672 642 0.958 0.985 0.502 0.482 32 877 799 0.988 0.997 0.491 0.470 64 921 716 0.994 0.996 0.569 0.529
  • 43. >> CONCLUSIONS INTRODUCTION PROTOCOL DESIGN & IMPLEMENTATION CONGESTION CONTROL PERFORMANCE EVALUATION COMPOSABLE UDT
  • 44. Contributions
      • A high performance data transport protocol and associated implementation: the UDT protocol; the open source UDT library (udt.sourceforge.net); users include ANL, ORNL, PNNL, etc.
      • An efficient and fair congestion control algorithm: DAIMD and the UDT control algorithm; packet loss handling techniques; use of bandwidth estimation in congestion control.
      • A configurable transport protocol framework: Composable UDT.
  • 45. Publications: Papers on the UDT Protocol
      • Supporting Configurable Congestion Control in Data Transport Services, Yunhong Gu and Robert L. Grossman, SC 2005, Nov 12 - 18, Seattle, WA.
      • Optimizing UDP-based Protocol Implementation, Yunhong Gu and Robert L. Grossman, PFLDNet 2005, Lyon, France, Feb. 2005.
      • Experiences in Design and Implementation of a High Performance Transport Protocol, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, SC 2004, Nov 6 - 12, Pittsburgh, PA.
      • An Analysis of AIMD Algorithms with Decreasing Increases, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, First Workshop on Networks for Grid Applications (Gridnets 2004), Oct. 29, San Jose, CA.
      • SABUL: A Transport Protocol for Grid Computing, Yunhong Gu and Robert L. Grossman, Journal of Grid Computing, 2003, Volume 1, Issue 4, pp. 377-386.
      • Internet Draft: UDT: A Transport Protocol for Data Intensive Applications, Yunhong Gu and Robert L. Grossman, draft-gg-udt-01.txt.
  • 46. Publications: Papers on Data Transfer Service using UDT
      • Experimental Studies of Data Transport and Data Access of Earth Science Data over Networks with High Bandwidth Delay Products, Robert Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong, and Parthasarathy Krishnaswamy, Computer Networks, Volume 46, Issue 3, Oct. 2004, pp. 411-421.
      • Teraflows over Gigabit WANs with UDT, Robert Grossman, Yunhong Gu, Xinwei Hong, Antony Antony, Johan Blom, Freek Dijkstra, and Cees de Laat, Future Generation Computer Systems, Volume 21, Issue 4, pp. 501-513, April 2005.
      • The Photonic TeraStream: Enabling Next Generation Applications Through Intelligent Optical Networking at iGrid 2002, J. Mambretti, J. Weinberger, J. Chen, E. Bacon, F. Yeh, D. Lillethun, R. Grossman, Y. Gu, and M. Mazzucco, Future Generation Computer Systems, Volume 19, Number 6, pp. 897-908.
      • Experimental Studies Using Photonic Data Services at IGrid 2002, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, Future Generation Computer Systems, 2003, Volume 19, Number 6, pp. 945-955.
  • 47. Publications: Papers on Applications using UDT
      • Open DMIX: High Performance Web Services for Distributed Data Mining, R. Grossman, Y. Gu, C. Gupta, D. Hanley, X. Hong, and P. Krishnaswamy, 7th International Workshop on High Performance and Distributed Mining.
      • Open DMIX - Data Integration and Exploration Services for Data Grids, R. Grossman, Y. Gu, D. Hanley, X. Hong, and G. Rao, First International Workshop on Knowledge Grid and Grid Intelligence (KGGI 2003).
      • Global Access to Large Distributed Data Sets using Photonic Data Services, R. Grossman, Y. Gu, D. Hanley, X. Hong, D. Lillethun, J. Levera, J. Mambretti, M. Mazzucco, and J. Weinberger, 20th IEEE/11th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2003), Los Alamitos, CA.
      • Data Webs for Earth Science Data, Asvin Ananthanarayan, Rajiv Balachandran, Yunhong Gu, Robert Grossman, Xinwei Hong, Jorge Levera, and Marco Mazzucco, Parallel Computing, Volume 29, 2003, pp. 1363-1379.
  • 48. Achievements
      • SC 2002 Bandwidth Challenge: “Best Use of Emerging Network Infrastructure” Award
      • SC 2003 Bandwidth Challenge: “Application Foundation” Award
      • SC 2004 Bandwidth Challenge: “Best Replacement for FedEx / UDP Fairness” Award
      • SC 2005 ? (Nov. 12 – 18, Seattle, WA): High Performance Mining of Streaming Data using UDT
      • iGrid 2005: Exploring and mining remote data at 10Gb/s
  • 49. Vision. Short-term: a practical solution for distributed data-intensive applications in high-BDP environments. Long-term: evolve with new technologies (open source and open standard); more functionality and support for more use scenarios; a network research platform (e.g., fast prototyping and evaluation of new control algorithms).
  • 50. The End Thank You! Yunhong Gu, October 10, 2005