2. Objectives & Outline
▪ Principles of reliable data transfer
▪ Pipeline protocol concepts
▪ Go back N
▪ Selective repeat
▪ TCP message format
▪ Read:
▪ Chapter 3.4 and 3.5
4. Principles of reliable data transfer
[Figure: reliable service abstraction. A sending process and a receiving process in the application layer exchange data over a reliable channel provided by the transport layer]
5. Principles of reliable data transfer
[Figure: reliable service abstraction (left) versus reliable service implementation (right). In the implementation, the sender-side and the receiver-side of the reliable data transfer protocol sit in the transport layer and communicate over an unreliable channel provided by the network layer]
6. Principles of reliable data transfer
[Figure: reliable service implementation. Sender-side and receiver-side of the reliable data transfer protocol in the transport layer, communicating over an unreliable channel in the network layer]
Complexity of reliable data transfer protocol will depend (strongly) on characteristics of unreliable channel (lose, corrupt, reorder data?)
7. Principles of reliable data transfer
[Figure: reliable service implementation. Sender-side and receiver-side of the reliable data transfer protocol in the transport layer, communicating over an unreliable channel in the network layer]
Sender, receiver do not know the “state” of each other, e.g., was a message received?
• unless communicated via a message
8. rdt1.0: reliable transfer over a reliable channel
underlying channel perfectly reliable
• no bit errors
• no loss of packets
separate FSMs for sender, receiver:
• sender sends data into underlying channel
• receiver reads data from underlying channel
sender FSM (single state: Wait for call from above)
• event rdt_send(data): packet = make_pkt(data), then udt_send(packet)
receiver FSM (single state: Wait for call from below)
• event rdt_rcv(packet): extract(packet, data), then deliver_data(data)
9. rdt2.0: channel with bit errors
underlying channel may flip bits in packet
• checksum (e.g., Internet checksum) to detect bit errors
the question: how to recover from errors?
How do humans recover from “errors” during conversation?
11. rdt2.0: channel with bit errors
underlying channel may flip bits in packet
• checksum to detect bit errors
the question: how to recover from errors?
• acknowledgements (ACKs): receiver explicitly tells sender that pkt
received OK
• negative acknowledgements (NAKs): receiver explicitly tells sender
that pkt had errors
• sender retransmits pkt on receipt of NAK
stop and wait
sender sends one packet, then waits for receiver response
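The checksum-plus-ACK/NAK idea above can be sketched in a few lines. This is a minimal illustration, not the textbook FSM: the function names `internet_checksum` and `rdt20_receive` are hypothetical, and the checksum is the usual 16-bit one's-complement sum.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def rdt20_receive(data: bytes, checksum: int) -> str:
    """Hypothetical rdt2.0 receiver reply: ACK if the checksum matches, else NAK."""
    return "ACK" if internet_checksum(data) == checksum else "NAK"


# Sender computes the checksum; one flipped byte on the channel triggers a NAK.
cs = internet_checksum(b"hello")
reply_ok = rdt20_receive(b"hello", cs)
reply_bad = rdt20_receive(b"hellp", cs)
```

On a NAK the stop-and-wait sender would retransmit the same packet and wait again.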
13. rdt2.0 has a fatal flaw!
what happens if ACK/NAK corrupted?
sender doesn’t know what happened at receiver!
can’t just retransmit: possible duplicate
Ack: 101
NAK: 010
Ack: 101→010 →000
NAK: 010 →101 →000
Solution to handle corrupted ACK/NAK
• Sender resends the current data packet when it receives a
garbled ACK or NAK
15. rdt2.0 has a fatal flaw!
what happens if ACK/NAK corrupted?
• sender doesn’t know what happened at receiver!
• can’t just retransmit: possible duplicate
handling duplicates:
• sender retransmits current pkt if ACK/NAK corrupted
• sender adds sequence number to each pkt
• receiver discards (doesn’t deliver up) duplicate pkt
17. rdt2.1: discussion
sender:
• seq # added to pkt
• two seq. #s (0,1) will suffice. Why?
• must check if received ACK/NAK corrupted
• twice as many states: state must “remember” whether “expected” pkt should have seq # of 0 or 1
receiver:
• must check if received packet is duplicate: state indicates whether 0 or 1 is expected pkt seq #
• note: receiver cannot know if its last ACK/NAK was received OK at sender
18. rdt2.2: a NAK-free protocol
same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
• receiver must explicitly include seq # of pkt being ACKed
duplicate ACK at sender results in same action as NAK:
retransmit current pkt
As we will see, TCP uses this approach to be NAK-free
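A NAK-free receiver can be sketched as a two-state machine keyed on the expected sequence bit. This is an illustrative sketch only; the class name `Rdt22Receiver` and the `corrupt` flag (standing in for a failed checksum) are assumptions, not part of the slides.

```python
class Rdt22Receiver:
    """Alternating-bit receiver that replies with ACKs only (no NAKs)."""

    def __init__(self):
        self.expected = 0      # sequence bit of the next new packet
        self.delivered = []    # data handed up to the application

    def on_packet(self, seq, data, corrupt=False):
        """Return the sequence number carried by the ACK we send back."""
        if not corrupt and seq == self.expected:
            self.delivered.append(data)
            self.expected ^= 1          # flip 0 <-> 1
            return seq                  # ACK the packet just received
        # Corrupt or duplicate: re-ACK the last packet received OK.
        return self.expected ^ 1


rcv = Rdt22Receiver()
acks = [
    rcv.on_packet(0, "a"),                 # new packet 0
    rcv.on_packet(0, "a"),                 # duplicate of packet 0
    rcv.on_packet(1, "b", corrupt=True),   # corrupted packet 1
    rcv.on_packet(1, "b"),                 # clean retransmission
]
```

A sender waiting for ACK 1 that sees ACK 0 again treats it like a NAK and retransmits packet 1.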
20. rdt3.0: channels with errors and loss
New channel assumption: underlying channel can also lose
packets (data, ACKs)
• checksum, sequence #s, ACKs, retransmissions will be of help …
but not quite enough
Q: How do humans handle lost sender-to-
receiver words in conversation?
21. rdt3.0: channels with errors and loss
Approach: sender waits “reasonable” amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost):
• retransmission will be duplicate, but seq #s already handle this!
• receiver must specify seq # of packet being ACKed
timeout
use countdown timer to interrupt after “reasonable” amount of
time
24. Performance of rdt3.0 (stop-and-wait)
example: 1 Gbps link, 15 ms prop. delay, 8000 bit packet
U sender: utilization: fraction of time sender busy sending
• time to transmit packet into channel:
$D_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^9\ \text{bits/sec}} = 8\ \text{microsecs}$
25. rdt3.0: stop-and-wait operation
[Figure: sender/receiver timing diagram spanning one RTT]
• first packet bit transmitted, t = 0
• first packet bit arrives
• last packet bit arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R
26. rdt3.0: stop-and-wait operation
[Figure: sender/receiver timing diagram, sender busy for L/R out of every RTT + L/R]
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027$
rdt 3.0 protocol performance stinks!
Protocol limits performance of underlying infrastructure (channel)
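The utilization arithmetic can be checked numerically. The helper below is a small sketch; the name `sender_utilization` and its `window` parameter (number of packets sent back-to-back per round trip) are assumptions introduced here for illustration.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    d_trans = packet_bits / rate_bps           # L / R, in seconds
    return min(1.0, window * d_trans / (rtt_s + d_trans))


# Slide example: 1 Gbps link, 8000-bit packets, RTT = 30 ms.
u_stop_and_wait = sender_utilization(8000, 1e9, 0.030)
u_three_in_flight = sender_utilization(8000, 1e9, 0.030, window=3)
print(f"stop-and-wait: {u_stop_and_wait:.5f}")
print(f"3 in flight:   {u_three_in_flight:.5f}")
```

With these numbers the sender is busy for well under a tenth of a percent of the time, which is the motivation for pipelining.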
28. rdt3.0: pipelined protocols operation
pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged
packets
• range of sequence numbers must be increased
• buffering at sender and/or receiver
Questions:
How to improve the utilization ratio?
What can you do while you wait for a reply to a message?
29. Pipelining via Sliding Window
▪ Allow multiple outstanding (un-ACKed) frames
▪ Upper bound on un-ACKed frames, called window
[Figure: sender/receiver timeline with several frames in flight at once]
30. Pipelining: increased utilization
[Figure: sender/receiver timing diagram with three pipelined packets]
• first packet bit transmitted, t = 0
• last bit transmitted, t = L / R
• first packet bit arrives
• last packet bit arrives, send ACK
• last bit of 2nd packet arrives, send ACK
• last bit of 3rd packet arrives, send ACK
• ACK arrives, send next packet, t = RTT + L / R
Busy time: 3*L/R
$U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{.024}{30.008} = 0.0008 = 0.08\%$
3-packet pipelining increases utilization by a factor of 3!
31. Pipelining: increased utilization
first packet bit transmitted, t = 0
sender receiver
RTT
last bit transmitted, t = L / R
first packet bit arrives
last packet bit arrives, send ACK
last bit of 2nd
packet arrives, send ACK
last bit of 3rd
packet arrives, send ACK
Questions:
1. Can we keep sending without stopping?
2. If the propagation delay is very small, do we need pipelining at all?
32. Pipelining: increased utilization
1 Gbps transmission rate link, 8000 bit packet, L/R = 0.008 ms, RTT = 2*D_prop
$U_{sender} = \frac{L/R}{2 D_{prop} + L/R} = \frac{.008}{2 D_{prop} + .008}$ (times in ms)
▪ Long distance:
▪ D_prop = 15 ms
▪ Usender = 0.027%
▪ Short distance:
▪ D_prop = 0.01 ms
▪ Usender = 28%
If the propagation delay is relatively small, stop-and-wait is good enough and reduces the complexity of the system.
33. Buffering on Sender and Receiver
▪ Sender needs to buffer data so that if data is lost, it can be resent
▪ Unsent: can be sent continuously
▪ Already sent: held until acknowledged
▪ Receiver needs to buffer data so that if data is received out of order, it can be held until all packets are received
[Figure: sender buffer holds sent but unACKed packets (possible resend); receiver buffer holds packets received while earlier ones are still missing; × marks a lost packet]
Buffer size:
▪ Stop-and-wait: 1
▪ Pipelined: >1
34. Sliding Window Protocols
[Figure: sender and receiver buffers; × marks a lost packet]
Sender buffer size (S), receiver buffer size (R):
▪ Stop-and-wait: S=1, R=1
▪ Go-Back-N (pipelined): S>1, R=1
▪ Selective repeat (pipelined): S>1, R>1
36. Sliding Window: Sender
▪ Assign sequence number to each packet (SeqNum)
▪ Maintain three state variables:
• send window size (SWS)
• last acknowledgment received (LAR)
• last frame sent (LFS)
▪ Maintain invariant: LFS - LAR <= SWS
▪ Advance LAR when ACK arrives
▪ Buffer up to SWS packets
[Figure: sequence number line with LAR and LFS marked; the span between them is ≤ SWS]
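The three state variables and the invariant LFS - LAR <= SWS can be sketched as follows. This is an assumption-laden illustration: the class name `SlidingWindowSender` is hypothetical, sequence numbers are unbounded integers, and ACKs are treated as cumulative.

```python
class SlidingWindowSender:
    """Tracks SWS, LAR and LFS and refuses to send when the window is full."""

    def __init__(self, sws):
        self.sws = sws      # send window size
        self.lar = -1       # last acknowledgment received
        self.lfs = -1       # last frame sent
        self.buffer = {}    # seq -> data, kept until acknowledged

    def send(self, data):
        """Return the sequence number used, or None if the window is full."""
        if self.lfs - self.lar >= self.sws:
            return None
        self.lfs += 1
        self.buffer[self.lfs] = data
        return self.lfs

    def on_ack(self, seq):
        """Advance LAR and free buffered frames up to and including seq."""
        if self.lar < seq <= self.lfs:
            for s in range(self.lar + 1, seq + 1):
                self.buffer.pop(s, None)
            self.lar = seq


snd = SlidingWindowSender(sws=3)
first_burst = [snd.send(x) for x in "abcd"]   # fourth send is refused
snd.on_ack(1)                                 # window slides by two
after_ack = snd.send("d")
```

Buffering at most SWS frames is what lets the sender retransmit anything still unacknowledged.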
37. Sliding Window
▪ Sliding window is the best-known algorithm in networking
▪ enables reliable delivery of packets
• Timeouts and acknowledgements
▪ enables in-order delivery of packets
• Receiver doesn’t pass data up to app until it has packets in order
▪ enables flow control
• Prevents sender from overflowing receiver’s buffer
40. Go-Back-N: sender
sender: “window” of up to N, consecutive transmitted but unACKed pkts
• k-bit seq # in pkt header
cumulative ACK: ACK(n): ACKs all packets up to, including seq # n
• on receiving ACK(n): move window forward to begin at n+1
timer for oldest in-flight packet
timeout(n): retransmit packet n and all higher seq # packets in window
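The Go-Back-N sender rules above (window of N, cumulative ACK, go back on timeout) can be sketched like this. The class name `GbnSender` is hypothetical, and the sketch uses unbounded integer sequence numbers instead of a k-bit field, so wraparound is not modeled.

```python
class GbnSender:
    """Go-Back-N sender: up to n transmitted but unACKed packets."""

    def __init__(self, n):
        self.n = n
        self.base = 0        # oldest unACKed sequence number
        self.next_seq = 0    # next sequence number to use
        self.sent = {}       # seq -> data for in-flight packets

    def send(self, data):
        """Return the sequence number sent, or None if the window is full."""
        if self.next_seq >= self.base + self.n:
            return None
        seq = self.next_seq
        self.sent[seq] = data
        self.next_seq += 1
        return seq

    def on_ack(self, n):
        """Cumulative ACK(n): everything up to and including n is ACKed."""
        if self.base <= n < self.next_seq:
            for s in range(self.base, n + 1):
                self.sent.pop(s, None)
            self.base = n + 1

    def on_timeout(self):
        """Return every in-flight packet, oldest first, for retransmission."""
        return [(s, self.sent[s]) for s in range(self.base, self.next_seq)]


gbn = GbnSender(n=4)
burst = [gbn.send(c) for c in "abcde"]   # fifth packet does not fit
gbn.on_ack(1)
resend = gbn.on_timeout()
```

A single timer for the oldest in-flight packet is enough because a timeout resends the whole window.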
41. Go-Back-N: receiver
ACK-only: always send ACK for correctly-received packet so far, with
highest in-order seq #
• may generate duplicate ACKs
• need only remember rcv_base
on receipt of out-of-order packet:
• can discard (don’t buffer) or buffer: an implementation decision
• re-ACK pkt with highest in-order seq #
[Figure: receiver view of sequence number space, with rcv_base marked. Legend: received and ACKed; out-of-order: received but not ACKed; not received]
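A Go-Back-N receiver that takes the "discard" option for out-of-order packets needs only `rcv_base`. The sketch below is illustrative; the class name `GbnReceiver` and the convention of returning -1 when nothing has been received in order yet are assumptions.

```python
class GbnReceiver:
    """Go-Back-N receiver that discards out-of-order packets."""

    def __init__(self):
        self.rcv_base = 0     # next in-order sequence number expected
        self.delivered = []

    def on_packet(self, seq, data):
        """Return the ACK number: the highest in-order seq # received so far."""
        if seq == self.rcv_base:
            self.delivered.append(data)
            self.rcv_base += 1
        # Out-of-order packets fall through and are dropped; we re-ACK.
        return self.rcv_base - 1


rx = GbnReceiver()
arrivals = [(0, "a"), (1, "b"), (3, "d"), (4, "e"), (2, "c")]
gbn_acks = [rx.on_packet(seq, data) for seq, data in arrivals]
```

Packets 3 and 4 arrive before 2, so each produces a duplicate ACK 1 and is thrown away; the sender must resend them after packet 2.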
43. Questions
▪ Can we further improve the efficiency of the Go-Back-N protocol?
▪ How?
44. Go-Back-N Simulation (5 Minutes)
https://www2.tkn.tu-berlin.de/teaching/rn/animations/gbn_sr/
Try adjusting the different parameters:
1. What happens if the end-to-end delay is too large?
2. What happens if packet loss or delay occurs?
3. If the end-to-end delay is too large, how should the timeout be adjusted?
4. How can the efficiency be improved?
The QR code points to the same link
https://media.pearsoncmg.com/aw/ecs_kurose_compnetwork_7/cw/content/interactiveanimations/go-back-n-protocol/index.html
49. Selective repeat: sender and receiver
sender
data from above:
• if next available seq # in window, send packet
timeout(n):
• resend packet n, restart timer
ACK(n) in [sendbase, sendbase+N-1]:
• mark packet n as received
• if n is smallest unACKed packet, advance window base to next unACKed seq #
50. Selective repeat: sender and receiver
receiver
packet n in [rcvbase, rcvbase+N-1]
• send ACK(n)
• out-of-order: buffer
• in-order: deliver (also deliver buffered, in-order packets), advance window to next not-yet-received packet
packet n in [rcvbase-N, rcvbase-1]
• ACK(n)
otherwise:
• ignore
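The three receiver cases above map directly onto three branches. This is a sketch under stated assumptions: the class name `SrReceiver` is hypothetical and sequence numbers are unbounded integers rather than a wrapped k-bit space.

```python
class SrReceiver:
    """Selective-repeat receiver with a window of n sequence numbers."""

    def __init__(self, n):
        self.n = n
        self.rcv_base = 0
        self.buffer = {}      # out-of-order packets waiting for a gap to fill
        self.delivered = []

    def on_packet(self, seq, data):
        """Return the ACK number to send, or None if the packet is ignored."""
        if self.rcv_base <= seq < self.rcv_base + self.n:
            self.buffer[seq] = data
            # Deliver this packet plus any buffered packets now in order.
            while self.rcv_base in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return seq
        if self.rcv_base - self.n <= seq < self.rcv_base:
            return seq        # already delivered: re-ACK in case the ACK was lost
        return None


sr = SrReceiver(n=4)
sr_acks = [sr.on_packet(0, "a"), sr.on_packet(2, "c")]
delivered_before_gap_fill = list(sr.delivered)
sr_acks.append(sr.on_packet(1, "b"))     # fills the gap, releases "c" too
sr_acks.append(sr.on_packet(0, "a"))     # old duplicate is re-ACKed
sr_acks.append(sr.on_packet(9, "x"))     # outside both ranges: ignored
```

Unlike Go-Back-N, each packet is acknowledged individually, so only packet 1 would ever need retransmission here.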
53. Summary of reliable data transfer mechanisms
Mechanism | Use, Comments
Checksum | Used to detect bit errors in a transmitted packet.
Timer | Used to timeout/retransmit a packet. Duplicate copies of a packet may be received by a receiver.
Sequence number | Gaps in the sequence numbers of received packets allow the receiver to detect a lost packet. Duplicate sequence numbers allow the receiver to detect duplicate copies of a packet.
Acknowledgment | Used by the receiver to tell the sender that a packet or set of packets has been received correctly.
Negative acknowledgment | Used by the receiver to tell the sender that a packet has not been received correctly.
Window, pipelining | The sender may be restricted to sending only packets with sequence numbers that fall within a given range. Sender utilization can be increased.
54. Reliable transfer protocols
Properties | Stop and Wait | Go Back N | Selective Repeat
Sender window size | 1 | N | N
Receiver window size | 1 | 1 | N
Minimum sequence number | 2 | N+1 | 2N
Efficiency | 1/(1+2*a) | N/(1+2*a) | N/(1+2*a)
Type of acknowledgement | Individual | Cumulative | Individual
Supported order at the receiving end | – | In-order delivery only | Out-of-order delivery as well
Number of retransmissions in case of packet drop | 1 | N | 1
Transmission type | Half duplex | Full duplex | Full duplex
Implementation difficulty | Low | Moderate | Complex
56. TCP: overview RFCs: 793,1122, 2018, 5681, 7323
point-to-point:
• one sender, one receiver
reliable, in-order byte stream:
• no “message boundaries”
full duplex data:
• bi-directional data flow in same connection
• MSS: maximum segment size
Why do we need MSS?
57. TCP: overview RFCs: 793,1122, 2018, 5681, 7323
MSS = MTU - IP header size - TCP header size
Each side announces its MSS in the MSS option during the initial three-way handshake of TCP, so that segments do not exceed the MTU of the network and IP fragmentation is avoided.
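As a worked instance of the formula, assume an Ethernet MTU of 1500 bytes and option-free 20-byte IP and TCP headers (these values are common defaults chosen here for illustration, not taken from the slides).

```python
def max_segment_size(mtu, ip_header=20, tcp_header=20):
    """MSS = MTU - IP header size - TCP header size (all in bytes)."""
    return mtu - ip_header - tcp_header


ethernet_mss = max_segment_size(1500)
print(ethernet_mss)
```

Header options on either layer would shrink the payload further, which is why the header sizes are parameters.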
58. TCP: overview RFCs: 793,1122, 2018, 5681, 7323
cumulative ACKs
pipelining:
• TCP congestion and flow control set window size
connection-oriented:
• handshaking (exchange of control messages) initializes sender,
receiver state before data exchange
flow controlled:
• sender will not overwhelm receiver
59. TCP segment structure : sequence number
[Figure: TCP segment header, 32 bits wide: source port #, dest port #; sequence number]
sequence number: segment seq #: counting bytes of data into bytestream (not segments!)
[Figure: application message divided into 64-byte blocks at byte offsets 0 64 128 192 256 320 384 448 512 576]
After segmentation, how do we guarantee that the message delivered to the receiving application is the same as the one that was sent?
60. TCP segment structure : sequence number
Sender side: segmentation (application layer to transport layer)
Application message: 64-byte blocks A B C D E F G H I J at byte offsets 0 64 128 192 256 320 384 448 512 576
Transport layer segments (64 bytes each):
• A: sequence number 0
• B: sequence number 64
• C: sequence number 128
• D: sequence number 192
61. TCP segment structure : sequence number
Receiver side: reassembly (transport layer to application layer)
Transport layer segments (64 bytes each):
• A: sequence number 0
• B: sequence number 64
• C: sequence number 128
• D: sequence number 192
Application message: 64-byte blocks A B C D E F G H I J put back in order at byte offsets 0 64 128 192 256 320 384 448 512 576
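Byte-offset sequence numbers are what make reassembly possible even when segments arrive out of order. The sketch below illustrates the idea with two hypothetical helpers, `segment` and `reassemble`; it ignores TCP's random initial sequence number and starts counting at 0, as the slides do.

```python
def segment(message: bytes, mss: int):
    """Split a message into (sequence number, data) pairs.

    The sequence number is the byte offset of the first data byte.
    """
    return [(off, message[off:off + mss]) for off in range(0, len(message), mss)]


def reassemble(segments):
    """Rebuild the byte stream by ordering segments on their byte offset."""
    return b"".join(data for _, data in sorted(segments, key=lambda s: s[0]))


message = bytes(i % 256 for i in range(640))   # ten 64-byte blocks
segments = segment(message, 64)
seq_numbers = [seq for seq, _ in segments]
rebuilt = reassemble(list(reversed(segments)))  # worst-case arrival order
```

The third segment carries sequence number 128, not 3, because the numbering counts bytes rather than segments.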
62. TCP segment structure
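[Figure: TCP segment header, 32 bits wide: source port #, dest port #; sequence number; acknowledgement number; not used, U A P flag bits; Urg data pointer]
• sequence number: segment seq #: counting bytes of data into bytestream (not segments!)
• acknowledgement number: ACK: seq # of next expected byte; A bit: this is an ACK
Example (Sender to Receiver): the receiver has received data up to byte 554 and expects byte 555 and later, so it sends ACK=555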
63. TCP sequence numbers, ACKs
Sequence numbers:
• byte stream “number” of first byte in segment’s data
Acknowledgements:
• seq # of next byte expected from other side
• cumulative ACK
Q: how does the receiver handle out-of-order segments?
• A: TCP spec doesn’t say; up to implementor
[Figure: outgoing segment from sender and outgoing segment from receiver, each showing source port #, dest port #, sequence number, acknowledgement number, A bit, rwnd, checksum, urg pointer]
[Figure: sender sequence number space with window size N: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable]
64. TCP sequence numbers, ACKs
simple telnet scenario (Host A, Host B)
1. User at Host A types ‘C’: A sends Seq=42, ACK=79, data = ‘C’
2. Host B ACKs receipt of ‘C’, echoes back ‘C’: B sends Seq=79, ACK=43, data = ‘C’
3. Host A ACKs receipt of echoed ‘C’: A sends Seq=43, ACK=80
65. TCP segment structure
[Figure: TCP segment, 32 bits wide]
• source port #, dest port #
• sequence number: segment seq #: counting bytes of data into bytestream (not segments!)
• acknowledgement number: ACK: seq # of next expected byte; A bit: this is an ACK
• head len: length (of TCP header)
• not used
• C, E: congestion notification
• U, A, P flag bits
• R, S, F (RST, SYN, FIN): connection management
• receive window: flow control: # bytes receiver willing to accept
• checksum: Internet checksum
• Urg data pointer
• options (variable length): TCP options
• application data (variable length): data sent by application into TCP socket
70. TCP segment structure
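[Figure: TCP segment header, 32 bits wide: source port #, dest port #; sequence number; acknowledgement number; not used, U A P flag bits; Urg data pointer]
• sequence number: segment seq #: counting bytes of data into bytestream (not segments!)
• acknowledgement number: ACK: seq # of next expected byte; A bit: this is an ACK
Example (Sender to Receiver): the receiver has received data up to byte 554 and expects byte 555 and later, so it sends ACK=555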
73. TCP segment structure
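[Figure: TCP segment header, 32 bits wide: source port #, dest port #; sequence number; acknowledgement number; head len, not used, U A P flag bits; Urg data pointer; options (variable length)]
• head len: length (of TCP header)
Where should we start the data field at receiver side?
• Length of header: because options are variable length, the header length field tells the receiver where the data field begins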
76. TCP round trip time, timeout
Q: how to set TCP timeout value?
longer than RTT, but RTT varies!
too short: premature timeout, unnecessary retransmissions
too long: slow reaction to segment loss
https://www2.tkn.tu-berlin.de/teaching/rn/animations/gbn_sr/
78. TCP round trip time, timeout
Q: how to estimate RTT?
SampleRTT: measured time from segment transmission until ACK
receipt
• ignore retransmissions
SampleRTT will vary, want estimated RTT “smoother”
• average several recent measurements, not just current SampleRTT
[Figure: SampleRTT and EstimatedRTT plotted as RTT (milliseconds, 100 to 350) versus time (seconds)]
79. TCP round trip time, timeout
EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT
exponential weighted moving average (EWMA)
influence of past sample decreases exponentially fast
typical value: α = 0.125
[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr. SampleRTT and EstimatedRTT plotted as RTT (milliseconds, 100 to 350) versus time (seconds)]
80. TCP round trip time, timeout
Timeout interval: EstimatedRTT + “safety margin”
• large variation in EstimatedRTT: want a larger safety margin
TimeoutInterval = EstimatedRTT + 4*DevRTT
(EstimatedRTT is the estimated RTT; 4*DevRTT is the “safety margin”)
DevRTT: EWMA of SampleRTT deviation from EstimatedRTT:
DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
(typically, β = 0.25)
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
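The two EWMA updates and the timeout formula fit in a small class. This is a sketch, not a TCP implementation: the class name `RttEstimator` is hypothetical, the initial values (EstimatedRTT set to the first sample, DevRTT to half of it) are an assumption borrowed from RFC 6298, and DevRTT is updated before EstimatedRTT so that it uses the previous estimate.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout interval (times in milliseconds)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumed initial deviation

    def update(self, sample_rtt):
        """Fold one new SampleRTT into DevRTT and EstimatedRTT."""
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self):
        """TimeoutInterval = EstimatedRTT + 4 * DevRTT."""
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)
est.update(120.0)
print(est.estimated_rtt, est.dev_rtt, est.timeout_interval())
```

Samples measured on retransmitted segments would be skipped before calling `update`, as the earlier slide notes.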
81. TCP Sender (simplified)
event: data received from application
• create segment with seq #
• seq # is byte-stream number of first data byte in segment
• start timer if not already running
  • think of timer as for oldest unACKed segment
  • expiration interval: TimeOutInterval
event: timeout
• retransmit segment that caused timeout
• restart timer
event: ACK received
• if ACK acknowledges previously unACKed segments
  • update what is known to be ACKed
  • start timer if there are still unACKed segments
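The three events above can be sketched as three methods. Everything here is a simplification under stated assumptions: the class name `SimpleTcpSender` is hypothetical, the timer is reduced to a boolean flag, and `wire` is just a list recording what would be handed to the network.

```python
class SimpleTcpSender:
    """Event-driven sketch of the simplified TCP sender."""

    def __init__(self, initial_seq=0):
        self.send_base = initial_seq    # oldest unACKed byte
        self.next_seq = initial_seq     # byte number for the next segment
        self.unacked = {}               # seq -> data
        self.timer_running = False
        self.wire = []                  # (seq, data) segments sent so far

    def on_app_data(self, data: bytes):
        seq = self.next_seq
        self.unacked[seq] = data
        self.wire.append((seq, data))
        self.next_seq += len(data)
        self.timer_running = True       # start timer if not already running

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)     # oldest unACKed segment
            self.wire.append((seq, self.unacked[seq]))
            self.timer_running = True   # restart timer

    def on_ack(self, ack: int):
        if ack > self.send_base:        # ACKs previously unACKed data
            self.send_base = ack
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= ack]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


tcp = SimpleTcpSender(initial_seq=92)
tcp.on_app_data(b"x" * 8)      # Seq=92, 8 bytes
tcp.on_app_data(b"y" * 20)     # Seq=100, 20 bytes
tcp.on_timeout()               # premature timeout: resend Seq=92 only
tcp.on_ack(100)
base_after_first_ack = tcp.send_base
tcp.on_ack(120)
```

Only the oldest segment is resent on timeout, which is where TCP departs from Go-Back-N.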
82. TCP Receiver: ACK generation [RFC 5681]
Event at receiver | TCP receiver action
arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed | delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
arrival of in-order segment with expected seq #. One other segment has ACK pending | immediately send single cumulative ACK, ACKing both in-order segments
arrival of out-of-order segment with higher-than-expected seq #. Gap detected | immediately send duplicate ACK, indicating seq. # of next expected byte
arrival of segment that partially or completely fills gap | immediately send ACK, provided that segment starts at lower end of gap
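The ACK numbers in the table can be reproduced with a receiver that tracks the next expected byte and remembers out-of-order segments. This sketch leaves out the 500 ms delayed-ACK timing entirely and always answers immediately; the class name `CumulativeAckReceiver` and the choice to buffer out-of-order segments are assumptions.

```python
class CumulativeAckReceiver:
    """Returns the cumulative ACK number (next expected byte) per segment."""

    def __init__(self, expected):
        self.expected = expected     # seq # of next expected byte
        self.out_of_order = {}       # seq -> length of buffered segments

    def on_segment(self, seq, length):
        if seq == self.expected:
            self.expected += length
            # Absorb buffered segments that are now contiguous.
            while self.expected in self.out_of_order:
                self.expected += self.out_of_order.pop(self.expected)
        elif seq > self.expected:
            self.out_of_order[seq] = length   # gap detected
        return self.expected


ack_rx = CumulativeAckReceiver(expected=40)
ack_numbers = [
    ack_rx.on_segment(40, 10),   # in order
    ack_rx.on_segment(70, 10),   # out of order: duplicate ACK
    ack_rx.on_segment(50, 10),   # partially fills the gap
    ack_rx.on_segment(60, 10),   # completely fills the gap
]
```

The jump in the last ACK shows how one segment at the lower end of a gap can acknowledge everything buffered behind it.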
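83. TCP Receiver: ACK generation
Sender segments: A (40-49), B (50-59), C (60-69)
Case 1: A (40-49) arrives in order, all earlier data already ACKed
• delayed ACK. Wait up to 500ms for next segment
• 500ms timeout with no next segment: send ACK 50
Case 2: [Figure: segments X (30-39) and A (40-49); ACK 50]
Delayed ACKs relieve the network burden and reduce interference with the sender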
85. Segments: 40-49, 50-59, 60-69
Case 1: segment 40-49 already received with ACK pending; waiting for 50
Case 2: segment 50-59 arrives: immediately send single cumulative ACK 60, ACKing both in-order segments
A single cumulative ACK relieves the network burden and reduces interference with the sender
87. Sender segments: A (40-49), B (50-59), C (60-69)
Receiver expects byte 40; B (50-59) arrives first
• arrival of out-of-order segment with higher-than-expected seq #. Gap detected
• immediately send duplicate ACK 40, indicating seq. #40 of next expected byte
89. Sender segments: A (40-49), B (50-59), C (60-69), D (70-79)
Receiver holds A (40-49) and D (70-79): gap 50-70
case 1: arrival of segment that partially fills gap
• B (50-59) arrives at the lower end of the gap: immediately send ACK 60
• remaining gap: 60-70
case 2: arrival of segment that completely fills gap
• C (60-69) arrives at the lower end (60) of the remaining gap: immediately send ACK 80
90. TCP: retransmission scenarios
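lost ACK scenario (Host A sends to Host B)
1. A sends Seq=92, 8 bytes of data
2. B sends ACK=100, which is lost (X)
3. timeout at A: A resends Seq=92, 8 bytes of data
4. B sends ACK=100 again
premature timeout (Host A sends to Host B)
1. SendBase=92: A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. B sends ACK=100, then ACK=120
3. timeout at A before ACK=100 arrives: A resends Seq=92, 8 bytes of data
4. ACK=100 arrives: SendBase=100; ACK=120 arrives: SendBase=120
5. B receives the retransmission and sends cumulative ACK for 120 (ACK=120); SendBase=120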
91. TCP: retransmission scenarios
cumulative ACK covers for earlier lost ACK (Host A sends to Host B)
1. A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
2. B sends ACK=100, which is lost (X)
3. B sends ACK=120, which arrives at A
4. A sends Seq=120, 15 bytes of data
92. TCP fast retransmit
time-out period often relatively long:
• long delay before resending lost packet
detect lost segments via duplicate ACKs.
• sender often sends many segments back-to-back
• if segment is lost, there will likely be many duplicate ACKs.
TCP fast retransmit: if sender receives 3 additional ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
• likely that unacked segment lost, so don’t wait for timeout
[Figure: segments 40-49, 50-59, 60-69, 70-79, 80-89 sent; segment 50-59 is lost; the receiver answers each arriving segment with ACK 50, four in total]
93. TCP fast retransmit
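[Figure: Host A sends to Host B]
1. A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data, which is lost (X), followed by further segments
2. B sends ACK=100 four times: once for the first segment, then three duplicates
3. on the third duplicate ACK=100, before the timeout expires, A resends Seq=100, 20 bytes of data
Receipt of three duplicate ACKs indicates 3 segments received after a missing segment; lost segment is likely. So retransmit!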
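Counting duplicate ACKs takes only two pieces of state. The sketch below is illustrative; the class name `FastRetransmitDetector` is hypothetical, and it reports the sequence number to resend exactly once, on the third duplicate.

```python
class FastRetransmitDetector:
    """Signals a fast retransmit on the third duplicate ACK."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the seq # to retransmit now, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == 3:
                return ack           # oldest unACKed segment starts here
        else:
            self.last_ack = ack      # new ACK: reset the duplicate counter
            self.dup_count = 0
        return None


detector = FastRetransmitDetector()
decisions = [detector.on_ack(a) for a in [100, 100, 100, 100]]
```

The first ACK=100 is the expected one; the next three are duplicates, so the fourth arrival triggers the resend of the segment starting at byte 100.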
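#4: This is the simple information-transfer scenario we want to consider.
We have a sending process and a receiving process, connected by a reliable channel.
Data flows in one direction only: note that the arrows through the reliable data transfer channel are one way, reliably sending from sender to receiver.
This is the abstraction we want to provide, much like a system interface; on the next slide we look at how to implement this interface.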
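#6:
So we have a sender side and a receiver side.
How much work they’ll have to do depends on the IMPAIRMENTS introduced by the channel.
For example, if the channel is unstable and can lose, corrupt, and reorder packets, the protocol must make sure that lost packets are retransmitted, that corruption is detected and corrected, and that out-of-order data is put back in the correct order.
The more impairments there are, the more issues must be considered, and the harder and more complex the protocol becomes.
Conversely, the fewer impairments there are, the fewer issues must be considered. For example, if reordering is the only problem, only reordering needs handling, not retransmission.
If loss is the only problem, only retransmission needs handling, not reordering.
If there are no errors at all, nothing needs handling.
Next slide.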
#7: Here’s a point of view to keep in mind: it’s easy for US to look at sender and receiver together and see what is happening. OH, that message sent was lost.
But think about it from, say, the sender’s POV. How does the sender know if its transmitted message over the unreliable channel got through? ONLY if the receiver somehow signals to the sender that it was received.
The key point here is that one side does NOT know what is going on at the other side; it’s as if there’s a curtain between them. Everything they know about the other can ONLY be learned by sending/receiving messages.
The sender process wants to make sure a segment got through. But it can’t just somehow magically look through the curtain to see if the receiver got it. It will be up to the receiver to let the sender KNOW that it (the receiver) has correctly received the segment.
How will the sender and receiver do that? That’s the PROTOCOL.
Before starting to develop a protocol, let’s look more closely at the interface (the API, if you will).
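#8: We’ll start with the simplest case possible: an unreliable channel that is, in fact, perfect; no segments are lost, corrupted, duplicated, or reordered. The sender just sends, and the data pops out the other side (perhaps after some delay) perfectly.
Click to advance the slide.
Let’s look at the separate FSMs for the sender and the receiver. Because this is a very simple scenario, neither side needs complicated operations: the sender just needs to send data into the underlying channel, and the receiver just needs to read data from the underlying channel.
Describe the sender and receiver state-transition diagrams below, reading out each function’s full name and purpose.
The sending application calls rdt_send to send data. The data is passed to the sender, which has been waiting for a call from above; once it has the data, it builds a packet and uses udt_send to send the packet into the network below.
Then the receiver’s state: the receiver waits for a call from below. It first receives the packet via rdt_rcv, then extracts the data from the packet, and then hands the data to the upper layer via deliver_data.
The sending and receiving here are no different from what we studied earlier; both are simple transport-layer send and receive operations. Next slide.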
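#10: Go through the reactions in the picture one by one, showing whether each listener understood.
In the seagull picture, one side at first thinks the other did not understand, but the other side double-confirms.
Some say “yes, I get your message, I understand”; some say “hmmm, I don’t understand”;
and some, after hearing the message again, reply “now I get it”.
Let’s continue and see how the protocol deals with errors. Next slide.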
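#11: To account for bit errors in the channel, let’s define rdt2.0 and see how it deals with errors.
First, the underlying channel may flip bits in a packet. How do we check for that? We use a checksum to detect bit errors.
The question then is how to recover from these errors. As we just saw, we use ACKs to confirm that a packet was received OK,
and NAKs to say that a received packet had errors,
so that the sender can retransmit.
This gives us the stop-and-wait protocol:
the sender sends one packet, then waits for the receiver’s response.
It’s like a person finishing a sentence and waiting for the listener to signal that they understood, or a teacher asking a question and waiting for a student to answer. Next slide.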
#24: Let’s see how rdt3.0 performs, starting with the transmission time.
If RTT = 30 msec, the sender delivers one 1 KB packet every 30 msec: 33 kB/sec throughput over a 1 Gbps link.
The network protocol limits use of physical resources.
Let’s develop a formula for utilization.
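#40: First, Go-Back-N: the sender maintains a window of length N. Walk through the figure in order: what each color represents and the window size.
Here the window size is 14; 8 packets have been sent but are not yet acknowledged,
and 6 sequence numbers are available for us to use.
GBN uses cumulative acknowledgements; TCP uses this kind of cumulative ACK as well.
Once ACK(n) confirms that all data up to n has been received, the window moves forward to begin at n+1.
The timer always tracks the oldest unacknowledged packet; on timeout, packet n and all later packets in the window are retransmitted.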
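#41: The receiver is simpler: it needs no buffer and only acknowledges the highest in-order sequence number received correctly.
The receiver may generate duplicate ACKs. If it receives an out-of-order packet, it can discard it or buffer it, depending on the implementation; either way, it re-ACKs the packet with the highest in-order sequence number.
Explain the colors in the figure below, walk through the process, and explain each case.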
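#42:
Window size of 4. At t=0, the sender sends packets 0, 1, 2, 3, and packet 2 will be lost. Walk through each detail.
At the receiver:
Packet 0 received, ACK0 generated
Packet 1 received, ACK1 generated
Packet 2 is lost, so when packet 3 is received, the receiver re-sends ACK1 to indicate that it still wants packet 2; this is the cumulative acknowledgement we discussed earlier.
The received packet 3 is discarded. Because packet 0 has been ACKed, the sender’s window slides forward and it sends packet 4, which the receiver also discards; likewise, when the first ACK1 arrives, the sender sends packet 5, which is also discarded.
Then the second ACK1 arrives and is ignored, because it is a duplicate ACK1.
Then the timer for packet 2 expires. At this point the sender’s window holds 2, 3, 4, 5, so it retransmits 2, 3, 4, 5, until they are acknowledged in turn.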
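#49:
This slide describes the sender and the receiver.
If the sender still has an available sequence number in its window, it keeps sending.
Each packet has its own timer: when a packet times out it is retransmitted and its timer restarted; when its ACK arrives the timer is stopped.
Each ACK also carries a sequence number identifying the packet being acknowledged, and it must fall within the sender’s window. ACKs outside this range are discarded; they may be delayed duplicate ACKs.
The left edge of the sender’s window tracks the smallest unacknowledged sequence number. When that ACK arrives, the window slides; otherwise, if the window is full, the sender stops sending.
Click to show the receiver.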
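#50: The receiver only accepts packets that fall within its window; packets beyond the buffer range are dropped.
After receiving packet n it sends an ACK for n; the buffer may hold packets out of order.
It delivers in-order packets to the upper layer and then slides the window.
If the packet is in order, its data will be delivered, as will any buffered data that can now be delivered in order.
If a packet falls in the range just before the window, then even though it has already been acknowledged, the receiver still sends an ACK to the sender, in case the earlier ACK was lost.
Packets in any other range are ignored.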
#60: Sequence numbers are byte offsets into the byte stream, not segment counts. For example, the third segment starts at byte 128 of the original message, so its sequence number is 128, not 3.
#64: The key thing to note here is that the ACK number (43) on the B-to-A segment is one more than the sequence number (42) on the A-to-B segment that triggered that ACK.
Similarly, the ACK number (80) on the last A-to-B segment is one more than the sequence number (79) on the B-to-A segment that triggered that ACK.
#79:
This is how TCP re-computes the estimated RTT each time a new SampleRTT is taken.
The process is known as an exponentially weighted moving average, shown by the equation here.
[say it]
Here alpha reflects the influence of the most recent measurement on the estimated RTT; a typical value of alpha used in implementations is .125.
The graph at the bottom shows measured RTTs between a host in Massachusetts and a host in France, as well as the estimated, “smoothed” RTT.
#80: To cope with this variation, we add a safety margin that covers more of the unexpected cases.
Given this value of the estimated RTT, TCP computes the timeout interval to be the estimated RTT plus a “safety margin”.
The intuition is that if we are seeing a large variation in SampleRTT (the RTT estimates are fluctuating a lot), then we’ll want a larger safety margin.
The deviation in the RTT is computed as the EWMA of the difference between the most recently measured SampleRTT and the EstimatedRTT.
So TCP computes the timeout interval to be the EstimatedRTT plus 4 times a measure of deviation in the RTT.
#81:
Given these details of TCP sequence numbers, ACKs, and timers, we can now describe the big-picture view of how the TCP sender and receiver operate.
You can check out the FSMs in the book; let’s just give an English text description here, and let’s start with the sender.
#82:
Rather than immediately acknowledging this segment, many TCP implementations will wait for half a second for another in-order segment to arrive, and then generate a single cumulative ACK for both segments, thus decreasing the amount of ACK traffic. The arrival of this second in-order segment and the cumulative ACK generation that covers both segments is the second row in this table.
#90:
To cement our understanding of TCP reliability, let’s look at a few retransmission scenarios.
In the first case a TCP segment is transmitted and the ACK is lost; the TCP timeout mechanism results in another copy being transmitted and then re-ACKed to the sender.
In the second example two segments are sent and acknowledged, but there is a premature timeout for the first segment, which is retransmitted. Note that when this retransmitted segment is received, the receiver has already received the first two segments, and so resends a cumulative ACK for both segments received so far, rather than an ACK for just this first segment.
#91: And in this last example, two segments are again transmitted. The first ACK is lost, but the second ACK, a cumulative ACK, arrives at the sender, which can then transmit a third segment, knowing that the first two have arrived even though the ACK for the first segment was lost.
#93: Take a look at this example on the right, where 5 segments are transmitted and the second segment is lost. In this case the TCP receiver sends an ACK 100 acknowledging the first received segment.
When the third segment arrives at the receiver, the TCP receiver sends another ACK 100, since the second segment has not arrived. The same happens when the 4th and 5th segments arrive.
Now what does the sender see? The sender receives the first ACK 100 it has been hoping for, but then three additional duplicate ACK 100s arrive. The sender knows that something’s wrong: it knows the first segment arrived at the receiver, and that three later segments (the ones that generated the three duplicate ACKs) were received correctly but were not in order. That is, there was a missing segment at the receiver when each of the three duplicate ACKs was generated.
With fast retransmit, the arrival of three duplicate ACKs causes the sender to retransmit its oldest unACKed segment, without waiting for a timeout event. This allows TCP to recover more quickly from what is very likely a loss event, specifically that the second segment has been lost, since three higher-numbered segments were received.