2. SYLLABUS – MODULE 1
•Principles of Network Applications: Network Application Architectures, Processes
Communicating, Transport Services Available to Applications, Transport Services Provided
by the Internet, Application-Layer Protocols.
•The Web and HTTP: Overview of HTTP, Non-persistent and Persistent Connections, HTTP
Message Format, User-Server Interaction: Cookies, Web Caching, The Conditional GET,
•File Transfer: FTP Commands & Replies,
•Electronic Mail in the Internet: SMTP, Comparison with HTTP, Mail Message Format, Mail
Access Protocols,
•DNS --The Internet's Directory Service: Services Provided by DNS, Overview of How DNS
Works, DNS Records and Messages,
•Peer-to-Peer Applications: P2P File Distribution, Distributed Hash Tables,
•Socket Programming: creating Network Applications: Socket Programming with UDP, Socket
Programming with TCP.
4. 2.1 Principles of Network Applications -- Syllabus
2.1.1 Network Application Architectures.
2.1.2 Processes Communicating.
2.1.3 Transport Services Available to Applications.
2.1.4 Transport Services Provided by the Internet.
2.1.5 Application-Layer Protocols.
5. Principles of Network Applications
•Network application development is writing
programs that run on different end systems and
communicate with each other over the network.
6. Principles of Network Applications
•Example:
•Web Application
•Peer-to-Peer File Sharing System
7. Principles of Network Applications
• Example:
• Web Application:
• In the Web application, there are two distinct programs that
communicate with each other:
• The browser program running in the user’s host (desktop,
laptop, tablet, smartphone, and so on); and
• The Web server program running in the Web server host.
8. Principles of Network Applications
•Example:
•Peer-to-Peer File Sharing System
•In a P2P File-Sharing System, there is a program in
each host that participates in the file-sharing
community.
•In this case, the programs in the various hosts may
be similar or identical.
9. IMPORTANT POINTS TO CONSIDER WHEN
DEVELOPING A NEW NETWORK APPLICATION
• When developing our new application, we need to write software that will run
on multiple end systems.
• This software could be written in C, Java, or Python.
• We do not need to write software that runs on network core devices, such as
routers or link-layer switches.
• Even if we wanted to write application software for these network-core
devices, we wouldn’t be able to do so.
• Network-core devices do not function at the application layer but instead
function at lower layers— specifically at the network layer and below.
• Communication for a network application takes place between end systems at
the application layer
10. Network Application Architectures
• An application’s architecture is distinctly different from the network
architecture.
• From the application developer’s perspective, the network architecture is fixed
and provides a specific set of services to applications.
• The application architecture is designed by the application developer and
dictates how the application is structured over the various end systems.
• An Application Developer will draw on one of the two predominant
architectural paradigms used in modern network applications:
• The client-server architecture
• The peer-to-peer (P2P) architecture
11. Client-Server Architecture
• There is an always-on host, called the server, which services requests
from many other hosts, called clients.
• A classic example is the Web application
• Here, the always-on host is called the Web server.
• Web Server services requests from browsers running on client hosts.
• When a Web server receives a request for an object from a client
host, it responds by sending the requested object to the client host.
12. Characteristics of Client Server Architecture
• With the client-server architecture, clients do not directly communicate with
each other; for example, in the Web application, two browsers do not
directly communicate.
• The server has a fixed, well-known address, called an IP address. Because the
server has a fixed, well-known address, and because the server is always on,
a client can always contact the server by sending a packet to the server’s IP
address.
• Some of the better-known applications with a client-server architecture
include the Web, FTP, Telnet, and e-mail.
• In popular applications, a single server host is often incapable of keeping up
with all the requests from clients. For this reason, a data center, housing a
large number of hosts, is often used to create a powerful virtual server.
13. Some Examples For Client Server Architecture
• The most popular Internet services employ one or more data centers:
• Search engines (e.g., Google and Bing),
• Internet commerce (e.g., Amazon and eBay),
• Web-based email (e.g., Gmail and Yahoo Mail),
• Social networking (e.g., Facebook and Twitter).
14. Disadvantage
•Infrastructure intensive: a data center can have
hundreds of thousands of servers, which must be
powered and maintained.
•Service providers must pay recurring
interconnection and bandwidth costs for sending
data to, and receiving data from, the Internet.
16. P2P Architecture
• There is minimal (or no) reliance on dedicated servers in data centers.
• The application exploits direct communication between pairs of intermittently
connected hosts, called peers.
• The peers are not owned by the service provider, but are instead desktops and
laptops controlled by users, with most of the peers residing in homes,
universities, and offices.
• Because the peers communicate without passing through a dedicated server,
the architecture is called peer-to-peer.
• Many of the most popular and traffic-intensive applications are based on P2P architectures.
17. P2P Applications include
• File Sharing (e.g., BitTorrent),
• Peer-Assisted Download Acceleration (e.g., Xunlei),
• Internet Telephony (e.g., Skype),
• IPTV (e.g., Kankan and PPstream),
• LimeWire (a file-sharing client for the Gnutella network)
19. Hybrid Architectures
•Combining both client-server and P2P elements.
•For example, for many instant messaging applications,
servers are used to track the IP addresses of users, but
user-to-user messages are sent directly between user hosts
(without passing through intermediate servers)
20. Features of P2P architectures
• P2P Architectures are self-scalable.
• For example, in a P2P file-sharing application, although
each peer generates workload by requesting files, each
peer also adds service capacity to the system by distributing
files to other peers.
• P2P Architectures are also cost effective.
• Since they normally don’t require significant server
infrastructure and server bandwidth (in contrast with
client-server designs with data centers), costs are lower.
21. P2P Applications Face Three Major Challenges
• ISP Friendly. Most residential ISPs (including DSL and cable ISPs) have been
dimensioned for “asymmetrical” bandwidth usage, that is, for much more
downstream than upstream traffic. But P2P video streaming and file
distribution applications shift upstream traffic from servers to residential ISPs,
thereby putting significant stress on the ISPs. Future P2P applications need to
be designed so that they are friendly to ISPs [Xie 2008].
• Security. Because of their highly distributed and open nature, P2P applications
can be a challenge to secure.
• Incentives. The success of future P2P applications also depends on convincing
users to volunteer bandwidth, storage, and computation resources to the
applications, which is the challenge of incentive design.
24. Processes Communicating
•How do processes running on the same host communicate?
•How do processes running on different hosts (with
potentially different operating systems)
communicate?
26. How Do Processes Running on the Same
Host Communicate?
•When processes are running on the same end system,
they can communicate with each other with
interprocess communication, using rules that are
governed by the end system’s operating system.
28. How Do Processes Running on Different Hosts
(with potentially different operating systems)
Communicate?
•Processes on two different end systems communicate with
each other by exchanging messages across the computer
network. A sending process creates and sends messages
into the network; a receiving process receives these
messages and possibly responds by sending messages back.
29. Client and Server Processes
•A network application consists of pairs of processes that send
messages to each other over a network.
•For example,
•In the Web application, a client browser process
exchanges messages with a Web server process.
•In a P2P file-sharing system, a file is transferred from a
process in one peer to a process in another peer.
30. Client and Server Processes – Contd…
• For each pair of communicating processes, we typically label one of
the two processes as the client and the other process as the server.
• With the Web, a browser is a client process and a Web server
is a server process.
• With P2P file sharing, the peer that is downloading the file is
labeled as the client, and the peer that is uploading the file is
labeled as the server.
• In some applications, such as in P2P file sharing, a process can be
both a client and a server. A process in a P2P file-sharing system can
both upload and download files.
31. Definition of Client and Server Process
• In the context of a communication session between a pair
of processes,
•The process that initiates the communication (that is,
initially contacts the other process at the beginning of
the session) is labeled as the client.
•The process that waits to be contacted to begin the
session is the server.
32. Example for Client and Servers
•In the Web, a browser process initiates contact with a
Web server process; hence the browser process is the client
and the Web server process is the server.
•In P2P file sharing, when Peer A asks Peer B to send a
specific file, Peer A is the client and Peer B is the server in
the context of this specific communication session. When
there’s no confusion, we’ll sometimes also use the
terminology “client side and server side of an application.”
33. The Interface Between the Process and the
Computer Network
•A Process sends messages into, and receives
messages from, the network through a software
interface called a socket.
35. Socket
•A Socket is the interface between the Application Layer
and the Transport Layer within a host.
•It is also referred to as the Application Programming
Interface (API) between the application and the network,
since the socket is the programming interface with which
network applications are built.
•The application developer has control of everything on
the application-layer side of the socket but has little
control of the transport-layer side of the socket.
36. Application Developer's Control on the
Transport-Layer Side
•The only control that the application developer has on the
transport-layer side is
•The choice of transport protocol and
•The ability to fix a few transport-layer parameters
such as maximum buffer and maximum segment sizes.
•Once the application developer chooses a transport
protocol, the application is built using the transport-layer
services provided by that protocol.
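The two choices above can be sketched with Python's standard socket module. This is a minimal illustration, not part of the original slides: the 64 KB buffer size is an arbitrary assumption, and the operating system is free to round or cap the value it actually grants.

```python
import socket

# Choice 1: the transport protocol, selected through the socket type.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # TCP
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # UDP

# Choice 2: a few transport-layer parameters, here the send-buffer size.
# The value is only a request; the OS may adjust it.
tcp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 64 * 1024)
granted = tcp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print("send buffer granted by the OS:", granted, "bytes")

# Remember what was chosen before releasing the sockets.
tcp_is_stream = tcp_sock.type == socket.SOCK_STREAM
udp_is_datagram = udp_sock.type == socket.SOCK_DGRAM

tcp_sock.close()
udp_sock.close()
```

Everything else on the transport-layer side (segmentation, retransmission, congestion control) stays inside the operating system and is not visible through the socket.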
37. Addressing Processes
•In order for a process running on one host to send packets
to a process running on another host, the receiving
process needs to have an address.
•To identify the receiving process, two pieces of
information need to be specified:
•The address of the host and
•An identifier that specifies the receiving process in
the destination host.
38. Addressing Processes – Contd…
•In the Internet, the host is identified by its IP address.
An IP address (in IPv4) is a 32-bit quantity that we can
think of as uniquely identifying the host.
•The sending process must also identify the receiving
process running in the host. This information is
needed because a host could be running many
network applications. A destination port number
serves this purpose.
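A small sketch of the two pieces of addressing information, using Python's ipaddress module. The address 192.0.2.10 is a hypothetical host taken from the range reserved for documentation; the port numbers are the well-known values for the Web and SMTP.

```python
import ipaddress

# Hypothetical destination host (documentation-only address range).
host = ipaddress.IPv4Address("192.0.2.10")

# The dotted form is just a readable way of writing one 32-bit integer.
as_integer = int(host)
print(as_integer, "fits in", host.max_prefixlen, "bits")

# Well-known destination ports identify the receiving process on that host.
WELL_KNOWN_PORTS = {"http": 80, "smtp": 25}

# A full destination is the pair (host address, port number).
destination = (str(host), WELL_KNOWN_PORTS["http"])
print(destination)
```

The pair in `destination` is the same shape that Python's socket calls such as `connect()` and `sendto()` expect for IPv4.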
39. Popular applications have been assigned
Specific Port Numbers.
•For example,
•A Web server is identified by port number 80.
•A mail server process (using the SMTP protocol) is
identified by port number 25.
•A list of well-known port numbers for all Internet
standard protocols can be found at
http://www.iana.org.
41. Transport Services Available to Applications
•Services that a transport-layer protocol can offer
to applications invoking it can be classified along
four dimensions:
1. Reliable data transfer
2. Throughput
3. Timing
4. Security
42. Reliable Data Transfer
• Packets can get lost within a computer network.
• A packet can overflow a buffer in a router, or can be discarded by
a host or router after having some of its bits corrupted.
• For many applications—such as electronic mail, file transfer,
remote host access, Web document transfers, and financial
applications—data loss can have devastating consequences.
• To support these applications, something must guarantee that the
data sent by one end of the application is delivered correctly
and completely to the other end. If a protocol
provides such a guaranteed data delivery service, it is said to
provide reliable data transfer.
43. Process-to-Process Reliable Data Transfer
•One important service that a transport-layer
protocol can potentially provide to an application is
process-to-process reliable data transfer.
•When a transport protocol provides this service, the
sending process can just pass its data into the socket
and know with complete confidence that the data
will arrive without errors at the receiving process.
44. Loss-Tolerant Applications
• When a transport-layer protocol doesn’t provide reliable
data transfer, some of the data sent by the sending process
may never arrive at the receiving process. This may be
acceptable for loss-tolerant applications.
• The most notable examples are multimedia applications such as
conversational audio/video, which can tolerate some amount of
data loss. In these applications, lost data might result in a
small glitch in the audio/video, which is not a crucial impairment.
45. Throughput
• In the context of a communication session between two
processes along a network path, Throughput is the rate at which
the sending process can deliver bits to the receiving process.
• Because other sessions will be sharing the bandwidth along the
network path, and because these other sessions will be coming
and going, the available throughput can fluctuate with time.
These observations lead to another natural service that a
transport-layer protocol could provide, namely, guaranteed
available throughput at some specified rate.
46. Throughput – Contd…
•The application could request a guaranteed
throughput of r bits/sec, and the transport protocol
would then ensure that the available throughput is
always at least r bits/sec. Such a guaranteed
throughput service would appeal to many applications.
47. For Example
•If an Internet telephony application encodes voice at 32 kbps,
it needs to send data into the network and have data
delivered to the receiving application at this rate.
•If the transport protocol cannot provide this throughput, the
application would need to encode at a lower rate or may have
to give up, since receiving half of the needed throughput is of
little or no use to this Internet telephony application.
48. Bandwidth-Sensitive Applications
•Applications that have throughput requirements
are said to be bandwidth-sensitive applications.
Many current multimedia applications are
bandwidth sensitive, although some multimedia
applications may use adaptive coding techniques
to encode digitized voice or video at a rate that
matches the currently available throughput.
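Adaptive coding can be sketched as picking the highest encoding rate that fits the throughput currently available. The list of supported rates below is a hypothetical example; only the 32 kbps figure comes from the Internet telephony example above.

```python
def pick_encoding_rate(available_kbps, supported_kbps=(8, 16, 32, 64)):
    """Return the highest supported rate that fits, or None if none fits."""
    usable = [rate for rate in supported_kbps if rate <= available_kbps]
    return max(usable) if usable else None


# With 40 kbps available, a 32 kbps encoding still fits.
print(pick_encoding_rate(40))
# With 20 kbps available, the application falls back to a lower rate.
print(pick_encoding_rate(20))
# With 5 kbps available, no supported rate fits and the call may have to end.
print(pick_encoding_rate(5))
```

An elastic application would skip this step entirely and simply use whatever throughput it gets.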
49. Elastic Applications
•While bandwidth-sensitive applications have
specific throughput requirements, elastic
applications can make use of as much, or as
little, throughput as happens to be available.
Electronic mail, file transfer, and Web transfers
are all elastic applications.
50. Timing
• A transport-layer protocol can also provide timing guarantees.
• As with throughput guarantees, timing guarantees can come in
many shapes and forms.
• An example guarantee might be that every bit that the sender
pumps into the socket arrives at the receiver’s socket no more
than 100 msec later.
• Such a service would be appealing to interactive real-time
applications, such as Internet telephony, virtual environments,
teleconferencing, and multiplayer games, all of which require
tight timing constraints on data delivery in order to be effective.
51. Timing – Contd…
•Long delays in Internet telephony, for example, tend
to result in unnatural pauses in the conversation;
•In a multiplayer game or virtual interactive
environment, a long delay between taking an
action and seeing the response from the
environment makes the application feel less
realistic.
52. Security
• A transport protocol can provide an application with one or more
security services.
• For example, in the sending host, a transport protocol can encrypt
all data transmitted by the sending process, and in the receiving
host, the transport-layer protocol can decrypt the data before
delivering the data to the receiving process.
• Such a service would provide confidentiality between the two
processes, even if the data is somehow observed between sending
and receiving processes.
• A transport protocol can also provide other security services in
addition to confidentiality, including data integrity and end-point
authentication
54. Transport Services Provided by the Internet
•The Internet makes two transport protocols
available to applications,
•UDP and
•TCP.
55. TCP Services
•The TCP service model includes
•A Connection-Oriented Service,
•A Reliable Data Transfer Service, and
•A Congestion-Control Mechanism.
•When an application invokes TCP as its transport
protocol, the application receives all of these
services from TCP.
56. Connection-Oriented Service
•TCP has the client and server exchange transport-layer
control information with each other before
the application-level messages begin to flow.
This exchange of control information is called
handshaking.
•This handshaking procedure alerts the client and
server, allowing them to prepare for the packets
that follow.
57. Connection-Oriented Service
•After the handshaking phase, a TCP connection is
said to exist between the sockets of the two
processes. The connection is a full-duplex
connection in that the two processes can send
messages to each other over the connection at the
same time.
•When the application finishes sending messages, it
must tear down the connection.
58. Reliable Data Transfer Service
•The communicating processes can rely on TCP to
deliver all data sent without error and in the
proper order.
•When one side of the application passes a stream
of bytes into a socket, it can count on TCP to
deliver the same stream of bytes to the receiving
socket, with no missing or duplicate bytes.
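The connection-oriented and reliable byte-stream services can be observed with a small loopback sketch in Python. It is an illustration under simple assumptions: both processes are threads on the same machine, the server is a hypothetical echo service, and the payload is kept small so that the client can finish sending before it starts reading.

```python
import socket
import threading


def echo_once(listener):
    """Accept one connection and echo bytes back until the client stops sending."""
    conn, _addr = listener.accept()       # server side of the handshake
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:                  # empty read: client closed its sending side
                break
            conn.sendall(data)


def tcp_round_trip(payload):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_once, args=(listener,))
    server.start()

    received = b""
    # create_connection() performs the handshake before any data flows.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)   # done sending, still able to receive
        while True:
            chunk = client.recv(1024)
            if not chunk:                 # server closed: connection torn down
                break
            received += chunk
    server.join()
    listener.close()
    return received


message = b"hello over TCP"
print(tcp_round_trip(message) == message)
```

The same connection carries bytes in both directions (full duplex), and the echoed stream comes back complete and in order without any retransmission logic in the application.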
59. Congestion-Control Mechanism Service
•It is a service for the general welfare of the
Internet rather than for the direct benefit of the
communicating processes.
•The TCP congestion-control mechanism throttles
a sending process (client or server) when the
network is congested between sender and
receiver.
60. UDP Services
•UDP is connectionless, so there is no handshaking
before the two processes start to communicate.
•UDP provides an unreliable data transfer service:
when a process sends a message into a UDP socket,
UDP provides no guarantee that the message will
ever reach the receiving process.
61. UDP Services
•Messages that do arrive at the receiving process
may arrive out of order.
•UDP does not include a congestion-control
mechanism, so the sending side of UDP can pump
data into the layer below (the network layer) at
any rate it pleases.
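A minimal loopback sketch of the UDP service in Python. The one-second timeout is an assumption added for the example: because UDP gives no delivery guarantee, the receiver must not wait forever, and any recovery is left to the application.

```python
import socket


def udp_send_and_receive(message, timeout_s=1.0):
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))       # port 0 lets the OS pick a free port
    receiver.settimeout(timeout_s)
    address = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(message, address)   # no connect(), no handshake
        try:
            data, _source = receiver.recvfrom(2048)
        except socket.timeout:
            data = None                   # datagram lost; UDP will not resend it
    finally:
        sender.close()
        receiver.close()
    return data


print(udp_send_and_receive(b"ping"))
```

On the loopback interface the datagram normally arrives; across a real network the `None` branch is the case the application has to plan for.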
64. Services Not Provided by Internet
Transport Protocols
•Throughput guarantees and timing guarantees are
not provided by today’s Internet transport
protocols.
•Today’s Internet can often provide satisfactory
service to time-sensitive applications, but it
cannot provide any timing or throughput
guarantees.
65. An Application-Layer Protocol Defines
•The types of messages exchanged, for example, request
messages and response messages.
•The syntax of the various message types, such as the
fields in the message and how the fields are
delineated.
•The semantics of the fields, that is, the meaning of the
information in the fields.
•Rules for determining when and how a process sends
messages and responds to messages.
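The four items above can be made concrete with a toy protocol. Everything in this sketch is hypothetical and invented for illustration (the GET/OK/ERR keywords, the single-space field separator, the `STORE` contents); it is not a real Internet protocol.

```python
# Message types: a request ("GET <key>") and a response ("OK <value>" or "ERR <reason>").
# Syntax: one ASCII line per message, fields separated by one space, ended by "\n".
# Semantics: <key> names a stored item; <value> is its content.
# Rules: every request gets exactly one response.

STORE = {"motd": "hello"}


def handle_request(line):
    """Apply the protocol rules to one request line and return the response line."""
    fields = line.rstrip("\n").split(" ", 1)
    if len(fields) != 2 or fields[0] != "GET":
        return "ERR bad-request\n"        # malformed requests get an error response
    key = fields[1]
    if key not in STORE:
        return "ERR not-found\n"
    return "OK " + STORE[key] + "\n"


print(handle_request("GET motd\n"), end="")
print(handle_request("PUT motd\n"), end="")
```

A client and a server written by different people can interoperate as long as both follow these same rules, which is the point of specifying the protocol separately from the application.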
66. Distinguish between Network Applications and
Application-Layer Protocols
•An Application-Layer Protocol is only one
piece of a Network Application.
•Examples:
•Web Application
•Internet E-Mail Application
67. Example 1 Web Application
•The Web is a client-server application that allows users
to obtain documents from Web servers on demand.
•The Web application consists of many
components, including a standard for document
formats (that is, HTML), Web Browsers (for
example, Firefox and Microsoft Internet Explorer),
Web servers (for example, Apache and Microsoft
servers), and an Application-Layer protocol.
68. Example 1 Web Application
•The Web’s Application-Layer Protocol, HTTP, defines
the format and sequence of messages exchanged
between Browser and Web Server.
•Thus, HTTP is only one piece of the Web Application.
69. Example 2: Internet E-Mail Application
•It has many components, including mail servers that
house user mailboxes; mail clients (such as Microsoft
Outlook) that allow users to read and create
messages; a standard for defining the structure of an
e-mail message; and application-layer protocols that
define how messages are passed between servers,
how messages are passed between servers and mail
clients, and how the contents of message headers are
to be interpreted.
70. Example 2: Internet E-Mail Application
•The principal Application-Layer Protocol for electronic
mail is SMTP (Simple Mail Transfer Protocol).
•Thus, e-mail’s Principal Application-Layer Protocol,
SMTP, is only one piece of the E-mail Application.
73. 2.2 The Web and HTTP -- Syllabus
2.2.1 Overview of HTTP
2.2.2 Non-persistent and Persistent Connections
2.2.3 HTTP Message Format
2.2.4 User-Server Interaction: Cookies
2.2.5 Web Caching
2.2.6 Conditional GET
74. Overview of HTTP
•The HyperText Transfer Protocol (HTTP), the
Web’s application-layer protocol, is at the
heart of the Web.
•HTTP is implemented in two programs:
•A Client Program and
•A Server Program.
75. Overview of HTTP
•The client program and server program, executing
on different end systems, talk to each other by
exchanging HTTP messages.
•HTTP defines the structure of these messages and
how the client and server exchange the messages.
•Web objects are addressed by a URL (Uniform
Resource Locator).
76. Some Web Terminologies
•A Web page (also called a document)
consists of objects.
•An object is simply a file—such as an HTML
file, a JPEG image, a Java applet, or a video
clip—that is addressable by a single URL.
•Most Web pages consist of a base HTML file
and several referenced objects.
77. Some Web Terminologies
•For Example, if a Web page contains HTML
text and five JPEG images, then the Web
page has six objects: the base HTML file plus
the five images.
78. Some Web Terminologies
•The base HTML file references the other
objects in the page with the objects’ URLs.
•Each URL has two components:
•The hostname of the server that houses
the object and
•The object’s path name.
79. For Example
•Consider the URL:
http://www.abc.edu/myStore/picture.gif
•It has www.abc.edu as its hostname and
•/myStore/picture.gif as its path name.
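The split into hostname and path name can be reproduced with Python's standard `urllib.parse` module; this is only a sketch of the decomposition, using the example URL from this slide.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.abc.edu/myStore/picture.gif")

print(parts.hostname)  # hostname of the server that houses the object
print(parts.path)      # path name of the object on that server
```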
80. Some Web Terminologies
•Web browsers (such as Internet Explorer and
Firefox) implement the client side of HTTP
•Web servers, which implement the server side of HTTP,
house Web objects, each addressable by a URL.
•Popular Web servers include Apache and
Microsoft Internet Information Server
81. •HTTP defines how Web clients request Web
pages from Web servers and how servers
transfer Web pages to clients.
•When a user requests a Web page (for example,
clicks on a hyperlink), the browser sends HTTP
request messages for the objects in the page to
the server. The server receives the requests and
responds with HTTP response messages that
contain the objects.
83. HTTP
•HTTP uses TCP as its underlying transport
protocol.
•The HTTP client first initiates a TCP connection
with the server.
•Once the connection is established, the browser
and the server processes access TCP through
their socket interfaces.
84. Client Side and Server Side Sockets
•On the Client Side, the socket interface is
the door between the client process and
the TCP connection;
•On the Server Side, it is the door between
the server process and the TCP connection.
85. Request Response Process
•The client sends HTTP request messages into
its socket interface and receives HTTP
response messages from its socket interface.
•Similarly, the HTTP server receives request
messages from its socket interface and sends
response messages into its socket interface.
86. HTTP is said to be a Stateless Protocol
•The server sends requested files to clients without
storing any state information about the client.
•If a particular client asks for the same object twice in a
period of a few seconds, the server does not respond
by saying that it just served the object to the client;
instead, the server resends the object, as if it had
completely forgotten what it did earlier.
•Because an HTTP server maintains no information
about the clients, HTTP is said to be a stateless
protocol.
88. Non-Persistent and Persistent Connections
•HTTP can use both non-persistent
connections and persistent connections.
•Although HTTP uses persistent connections
in its default mode, HTTP clients and servers
can be configured to use non-persistent
connections instead.
89. The steps of transferring a Web page from server
to client for the case of non-persistent connections
•Let’s suppose the page consists of a base HTML
file and 10 JPEG images, and that all 11 of these
objects reside on the same server.
•Suppose the URL for the base HTML file is:
http://www.abc.edu/myDepartment/home.index
90. Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•1. The HTTP client process initiates a TCP
connection to the server www.abc.edu on port
number 80, which is the default port number for
HTTP. Associated with the TCP connection, there
will be a socket at the client and a socket at the
server.
91. Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•2. The HTTP client sends an HTTP request
message to the server via its socket. The request
message includes the path name
/myDepartment/home.index.
92. Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•3. The HTTP server process receives the request
message via its socket, retrieves the object
/myDepartment/home.index from its storage
(RAM or disk), encapsulates the object in an
HTTP response message, and sends the
response message to the client via its socket.
93. Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•4. The HTTP server process tells TCP to close the
TCP connection. (But TCP doesn’t actually
terminate the connection until it knows for sure
that the client has received the response
message intact.)
94. Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•5. The HTTP client receives the response
message. The TCP connection terminates. The
message indicates that the encapsulated object
is an HTML file. The client extracts the file from
the response message, examines the HTML file,
and finds references to the 10 JPEG objects.
95. Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•6. The first four steps are then repeated for each
of the referenced JPEG objects.
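The six steps can be tried on a single machine with the sketch below. It is an illustration under stated assumptions, not the slides' own code: a tiny local server (Python's `http.server`, which speaks HTTP/1.0 by default and therefore closes the connection after each response) stands in for www.abc.edu, and the object paths and bodies are hypothetical.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class OneObjectHandler(BaseHTTPRequestHandler):
    # Default protocol is HTTP/1.0, so the server closes the TCP
    # connection after each response (non-persistent behavior).
    def do_GET(self):
        body = ("object at " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):         # keep the demo quiet
        pass


def fetch(port, path):
    """Steps 1-5 for one object: connect, request, read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as sock:       # step 1
        request = "GET " + path + " HTTP/1.0\r\nHost: localhost\r\n\r\n"
        sock.sendall(request.encode("ascii"))                          # step 2
        response = b""
        while True:                                                    # steps 3-5
            chunk = sock.recv(4096)
            if not chunk:                 # empty read: server closed the connection
                break
            response += chunk
    return response


def fetch_page(paths):
    server = HTTPServer(("127.0.0.1", 0), OneObjectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return [fetch(port, p) for p in paths]   # step 6: new connection per object
    finally:
        server.shutdown()
        server.server_close()


paths = ["/home.index"] + ["/img%d.jpg" % i for i in range(10)]
responses = fetch_page(paths)
print(len(responses), "objects fetched over", len(responses), "TCP connections")
```

Each call to `fetch()` opens and closes its own TCP connection, so the base file plus 10 images cost 11 connections in total.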
96. HTTP with Non-Persistent Connections
•The steps above illustrate the use of non-persistent
connections.
•Here, each TCP connection is closed after the server
sends the object—the connection does not persist
for other objects.
•Each TCP connection transports exactly one request
message and one response message.
•Thus, in this example, when a user requests the
Web page, 11 TCP connections are generated.
97. Round-Trip Time (RTT)
•It is the time it takes for a small packet to
travel from client to server and then back to
the client.
•The RTT includes packet-propagation delays,
packet queuing delays in intermediate routers
and switches, and packet-processing delays.
98. “three-way handshake”
•Client sends a small TCP segment to the server,
•The server acknowledges and responds with a
small TCP segment, and,
•The client acknowledges back to the server.
99. “three-way handshake”
•The first two parts of the three way
handshake take one RTT.
•After completing the first two parts of the
handshake, the client sends the HTTP request
message combined with the third part of the
three-way handshake (the acknowledgment)
into the TCP connection.
100. “three-way handshake”
•Once the request message arrives at the server,
the server sends the HTML file into the TCP
connection.
•This HTTP request/response eats up another RTT.
•Thus the total response time is two RTTs plus the
transmission time at the server of the HTML file.
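The response-time statement above can be written as a one-line calculation. The numbers in the example (50 ms RTT, a 100,000-bit file, a 1 Mbps link) are hypothetical values chosen only to show the arithmetic.

```python
def nonpersistent_response_time(rtt_s, file_bits, link_bps):
    """Two RTTs (handshake, then request/response) plus the file's transmission time."""
    return 2 * rtt_s + file_bits / link_bps


# Hypothetical numbers: RTT = 50 ms, file = 100,000 bits, link = 1 Mbps.
total = nonpersistent_response_time(0.05, 100_000, 1_000_000)
print(round(total, 3), "seconds")
```

With these values the two RTTs contribute 0.1 s and the transmission time contributes another 0.1 s.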
101. Disadvantages of Non Persistent Connections
1. A new connection must be established and
maintained for each requested object. For each of
these connections, TCP buffers must be allocated and
TCP variables must be kept in both the client and
server. This can place a significant burden on the Web
server, which may be serving requests from hundreds
of different clients simultaneously.
2. Each object suffers a delivery delay of two RTTs: one
RTT to establish the TCP connection and one RTT to
request and receive an object.
102. HTTP with Persistent Connections
•The server leaves the TCP connection open
after sending a response.
•Subsequent requests and responses between
the same client and server can be sent over
the same connection.
1/4
103. HTTP with Persistent Connections
•In particular, an entire Web page (in our
example, the base HTML file and the 10 images)
can be sent over a single persistent TCP
connection.
•Multiple Web pages residing on the same
server can be sent from the server to the
same client over a single persistent TCP
connection.
Contd…
2/4
104. HTTP with Persistent Connections
•These requests for objects can be made back-
to-back, without waiting for replies to
pending requests (pipelining).
•The HTTP server closes a connection when it
isn’t used for a certain time (a configurable
timeout interval).
Contd…
3/4
105. HTTP with Persistent Connections
•When the server receives the back-to-back
requests, it sends the objects back-to-back.
•In HTTP/1.1, persistent connections are the
default; pipelining may be used on top of them.
Contd…
4/4
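The delay figures for the two connection styles can be compared with a little arithmetic. The sketch below is only an illustration: the RTT and per-object transmission time are assumed values, the objects are fetched one after another, and the persistent case is modeled without pipelining.

```python
def nonpersistent_serial(num_objects, rtt, transmit):
    """Every object pays one RTT for the handshake plus one RTT for request/response."""
    return num_objects * (2 * rtt + transmit)


def persistent_no_pipelining(num_objects, rtt, transmit):
    """One handshake for the whole page, then one RTT per object."""
    return rtt + num_objects * (rtt + transmit)


RTT = 0.1        # seconds (assumed value)
TRANSMIT = 0.01  # seconds to transmit one object (assumed value)
OBJECTS = 11     # base HTML file plus 10 images, as in the example

print(f"non-persistent: {nonpersistent_serial(OBJECTS, RTT, TRANSMIT):.2f} s")
print(f"persistent:     {persistent_no_pipelining(OBJECTS, RTT, TRANSMIT):.2f} s")
```

With these assumed numbers the persistent connection saves one RTT per object after the first, which is the handshake cost that non-persistent connections repeat.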
106. 2.2 The Web and HTTP -- Syllabus
2.2.1 Overview of HTTP
2.2.2 Non-persistent and Persistent Connections
2.2.3 HTTP Message Format
2.2.4 User-Server Interaction: Cookies
2.2.5 Web Caching
2.2.6 Conditional GET
107. HTTP Message Format
•The HTTP specifications include the
definitions of the HTTP message formats.
•There are two types of HTTP messages,
•HTTP Request messages and
•HTTP Response messages
108. HTTP Request Message
•A typical HTTP request message:
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr
109. Characteristics of the Simple Request Message
•The message is written in ordinary ASCII text, so
that any computer-literate person can read it.
•The message consists of five lines, each followed
by a carriage return and a line feed. The last line
is followed by an additional carriage return and
line feed. Although this particular request
message has five lines, a request message can
have many more lines or as few as one line.
110. Characteristics of the Simple Request Message
•The first line of an HTTP request message is called
the request line; the subsequent lines are called
the header lines.
•The request line has three fields:
•The method field,
•the URL field, and
•the HTTP version field.
111. Characteristics of the Simple Request Message
•The method field can take on several different
values, including GET, POST, HEAD, PUT, and DELETE.
•The great majority of HTTP request messages use
the GET method.
•The GET method is used when the browser requests
an object, with the requested object identified in
the URL field.
•In this example, the browser is requesting the object
/somedir/page.html.
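The layout of the request message can be made concrete with a short sketch that builds the example message and splits it back into its parts. The helper names build_request and parse_request are made up for this illustration; they are not part of any HTTP library.

```python
CRLF = "\r\n"


def build_request(method, url, version, headers):
    """Return a request line plus header lines, each terminated by CRLF."""
    lines = [f"{method} {url} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The extra CRLF after the last header line marks the end of the headers.
    return CRLF.join(lines) + CRLF + CRLF


def parse_request(raw):
    """Split a raw request into method, URL, version, headers, and entity body."""
    head, _, body = raw.partition(CRLF + CRLF)
    request_line, *header_lines = head.split(CRLF)
    method, url, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, url, version, headers, body


raw = build_request("GET", "/somedir/page.html", "HTTP/1.1", {
    "Host": "www.someschool.edu",
    "Connection": "close",
    "User-agent": "Mozilla/5.0",
    "Accept-language": "fr",
})
method, url, version, headers, body = parse_request(raw)
print(method, url, version)
print(headers)
```

The entity body comes back empty here, as expected for a GET request.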
112. Consider The Header line
Host: www.someschool.edu
•It specifies the host on which the object resides.
This header line may seem unnecessary, as there is
already a TCP connection in place to the host,
but the information it provides is required by
Web proxy caches.
113. Consider The Header line
Connection: close
•The Browser is telling the server that it
doesn’t want to bother with persistent
connections; it wants the server to close the
connection after sending the requested
object.
114. Consider The Header line
User-agent: Mozilla/5.0
•It specifies the user agent, that is, the browser
type that is making the request to the server.
•Here the user agent is Mozilla/5.0, a Firefox
browser.
•This header line is useful because the server can
actually send different versions of the same object
to different types of user agents.
115. Consider The Header line
Accept-language: fr
•indicates that the user prefers to receive a
French version of the object, if such an object
exists on the server; otherwise, the server
should send its default version.
•The Accept-language: header is just one of
many content negotiation headers available in
HTTP.
117. General Format of a Request Message
•After the header lines (and the additional carriage
return and line feed) there is an “entity body.”
•The entity body is empty with the GET method,
but is used with the POST method.
• An HTTP client uses the POST method when the
user fills out a form.
•For example, when a user provides search words
to a search engine.
118. General Format of a Request Message
•With a POST message, the user is still requesting a
Web page from the server, but the specific
contents of the Web page depend on what the
user entered into the form fields.
•If the value of the method field is POST, then the
entity body contains what the user entered into
the form fields.
Contd…
119. General Format of a Request Message
•A request generated with a form does not
necessarily use the POST method. HTML forms
often use the GET method instead and include
the inputted data (in the form fields) in the
requested URL.
Contd…
120. General Format of a Request Message
•For example, if a form uses the GET method,
has two fields, and the inputs to the two fields
are dogs and cats, then the URL will have the
structure
www.abc.com/animalsearch?dogs&cats
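In practice, browsers usually encode form input as name=value pairs rather than bare values. The sketch below shows how such a URL could be built and taken apart with Python's standard urllib.parse module. The field names "first" and "second" are hypothetical, since the example above gives only the two input values.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical field names; only the values "dogs" and "cats" come from the example.
form_fields = {"first": "dogs", "second": "cats"}

# Build the URL a GET form submission might produce.
url = "http://www.abc.com/animalsearch?" + urlencode(form_fields)
print(url)

# On the server side, the query string can be split back into fields.
parts = urlsplit(url)
print(parts.path)
print(parse_qs(parts.query))
```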
Contd…
121. General Format of a Request Message
•The HEAD method is similar to the GET method.
•When a server receives a request with the HEAD
method, it responds with an HTTP message but it
leaves out the requested object.
•Application developers often use the HEAD
method for debugging.
Contd…
122. General Format of a Request Message
•The PUT method is often used in conjunction with
Web publishing tools.
•It allows a user to upload an object to a specific
path (directory) on a specific Web server.
•The PUT method is also used by applications that
need to upload objects to Web servers.
•The DELETE method allows a user, or an
application, to delete an object on a Web server.
Contd…
123. HTTP Response Message
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html
(data data data data data ...)
124. HTTP Response Message
•HTTP Response Message has three sections:
•An initial status line,
•Six header lines, and
•The entity body.
•The entity body is the meat of the message—it
contains the requested object itself
(represented by data data data data data ...).
Contd…
125. HTTP Response Message Status Line
•The status line has three fields:
•The protocol version field,
•A status code, and
•A corresponding status message.
•In this example, the status line indicates that the
server is using HTTP/1.1 and that everything is OK
(that is, the server has found, and is sending, the
requested object).
126. HTTP Response Message Header Lines
•First Header Line is
Connection: close
The server uses this header line to tell the client
that it is going to close the TCP connection after
sending the message.
128. HTTP Response Message Header Lines
•Second Header Line is
Date: Tue, 09 Aug 2011 15:44:04 GMT
This header line indicates the time and date when
the HTTP response was created and sent by the
server. Note that this is not the time when the
object was created or last modified; it is the time
when the server retrieves the object from its file
system, inserts the object into the response
message, and sends the response message.
129. HTTP Response Message Header Lines
•Third Header Line is
Server: Apache/2.2.3 (CentOS)
This header line indicates that the message was
generated by an Apache Web server; it is
analogous to the User-agent: header line in the
HTTP request message.
130. HTTP Response Message Header Lines
•Fourth Header Line is
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
This header line indicates the time and date when
the object was created or last modified
131. HTTP Response Message Header Lines
•Fifth Header Line is
Content-Length: 6821
This header line indicates the number of bytes in
the object being sent.
132. HTTP Response Message Header Lines
•Sixth Header Line is
Content-Type: text/html
This header line indicates that the object in the
entity body is HTML text.
(The object type is officially indicated by the
Content-Type: header and not by the file
extension).
133. The status code and associated phrase
•The status code and associated phrase
indicate the result of the request.
134. Some common status codes and
associated phrases include
•200 OK: Request succeeded and the information
is returned in the response.
•301 Moved Permanently: Requested object has
been permanently moved; the new URL is
specified in Location: header of the response
message. The client software will automatically
retrieve the new URL.
135. Some common status codes and
associated phrases include
•400 Bad Request: This is a generic error code
indicating that the request could not be
understood by the server.
•404 Not Found: The requested document does not
exist on this server.
•505 HTTP Version Not Supported: The requested
HTTP protocol version is not supported by the
server
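The status line can be split into its three fields with a few lines of code. This is a minimal sketch: parse_status_line and status_class are illustrative names, and the class labels follow the usual grouping of codes by first digit. HTTPStatus is the table of codes and phrases in Python's standard library.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split a status line into protocol version, numeric code, and phrase."""
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code):
    """Name the class of a status code from its first digit."""
    classes = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    return classes.get(code // 100, "other")


for line in (
    "HTTP/1.1 200 OK",
    "HTTP/1.1 301 Moved Permanently",
    "HTTP/1.1 400 Bad Request",
    "HTTP/1.1 404 Not Found",
    "HTTP/1.1 505 HTTP Version Not Supported",
):
    version, code, phrase = parse_status_line(line)
    # Compare the phrase on the wire with the standard-library phrase.
    print(code, status_class(code), HTTPStatus(code).phrase == phrase)
```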
Contd…
137. User-Server Interaction: Cookies
•Because HTTP is stateless, Web servers can be built to
handle thousands of simultaneous TCP connections.
•However, a Web site often needs to identify users, either
because the server wishes to restrict user access or because it
wants to serve content as a function of the user
identity. For these purposes, HTTP uses cookies.
•Cookies allow sites to keep track of users.
138. Cookie technology has four components
1. A cookie header line in the HTTP
response message;
2. A cookie header line in the HTTP request
message;
3. A cookie file kept on the user’s end system
and managed by the user’s browser; and
4. A back-end database at the Web site
140. An Example of how Cookies work
•Suppose Sushanth, who always accesses the
Web using Internet Explorer from his home PC,
contacts amazon.com for the first time.
•Let us suppose that in the past he has already
visited the eBay site.
141. An Example of how Cookies work
•When the request comes into the Amazon Web
server, the server creates a unique
identification number and creates an entry in
its back-end database that is indexed by the
identification number.
Contd…
142. An Example of how Cookies work
•The Amazon Web server then responds to
Sushanth’s browser, including in the HTTP
response a Set-cookie: header, which contains
the identification number.
•For example, the header line might be:
Set-cookie: 1678
Contd…
143. An Example of how Cookies work
•When Sushanth’s browser receives the HTTP
response message, it sees the Set-cookie: header.
The browser then appends a line to the special
cookie file that it manages.
•This line includes the hostname of the server and
the identification number in the Set-cookie: header.
•Note that the cookie file already has an entry for
eBay, since Sushanth has visited that site in the past.
Contd…
144. An Example of how Cookies work
•As Sushanth continues to browse the Amazon site,
each time he requests a Web page, his browser
consults his cookie file, extracts his identification
number for this site, and puts a cookie header line that
includes the identification number in the HTTP request.
•Specifically, each of his HTTP requests to the Amazon
server includes the header line:
Cookie: 1678
Contd…
145. Cookies can be used to identify a user
•The first time a user visits a site, the user can
provide a user identification (possibly his or her name).
•During the subsequent sessions, the browser
passes a cookie header to the server, thereby
identifying the user to the server.
•Cookies can thus be used to create a user session
layer on top of stateless HTTP.
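The four cookie components can be modeled in a few lines. The sketch below is a toy simulation, not real browser or server code: the Site and Browser classes are invented for illustration, headers are plain dictionaries, and the eBay identification number is made up.

```python
import itertools


class Site:
    """Toy Web site that assigns an ID on first contact and logs page visits."""

    def __init__(self, first_id):
        self._ids = itertools.count(first_id)
        self.database = {}  # back-end database indexed by identification number

    def handle(self, path, request_headers):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie is None or cookie not in self.database:
            # First visit: create an ID and tell the browser to store it.
            cookie = str(next(self._ids))
            self.database[cookie] = []
            response_headers["Set-cookie"] = cookie
        self.database[cookie].append(path)
        return response_headers


class Browser:
    """Toy browser that keeps a cookie file mapping hostname to ID."""

    def __init__(self):
        self.cookie_file = {}

    def get(self, hostname, site, path):
        headers = {}
        if hostname in self.cookie_file:
            headers["Cookie"] = self.cookie_file[hostname]
        response = site.handle(path, headers)
        if "Set-cookie" in response:
            self.cookie_file[hostname] = response["Set-cookie"]
        return headers  # the request headers that were sent


amazon = Site(first_id=1678)
browser = Browser()
browser.cookie_file["ebay.com"] = "8734"  # earlier visit; this ID is made up

first = browser.get("amazon.com", amazon, "/")
second = browser.get("amazon.com", amazon, "/books")
print(first, second)
print(amazon.database)
```

The first request carries no cookie header; every later request to the same site carries Cookie: 1678, which lets the site tie the requests together even though HTTP itself keeps no state.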
146. Cookies can be used to identify a user
•For Example, when a user logs in to a Web-
based e-mail application (such as Hotmail),
the browser sends cookie information to the
server, permitting the server to identify the
user throughout the user’s session with the
application.
Contd…
148. Web Caching (Or Proxy Server)
•A Web cache, also called a proxy server, is a
network entity that satisfies HTTP requests on
behalf of an origin Web server.
•The Web cache has its own disk storage and
keeps copies of recently requested objects in
this storage
149. Web Caching (Or Proxy Server)
•A user’s browser can be configured so that all
of the user’s HTTP requests are first directed
to the Web cache.
•Once a browser is configured, each browser
request for an object is first directed to the
Web cache
Contd…
152. Clients requesting objects through a Web cache
•Suppose a browser is requesting the object
http://www.someschool.edu/campus.gif.
•Here is what happens:
Contd…
Example
153. A browser is requesting the object
http://www.someschool.edu/campus.gif
• The browser establishes a TCP connection to the Web cache and sends an HTTP
request for the object to the Web cache.
• The Web cache checks to see if it has a copy of the object stored locally. If it
does, the Web cache returns the object within an HTTP response message to
the client browser.
• If the Web cache does not have the object, the Web cache opens a TCP
connection to the origin server, that is, to www.someschool.edu. The Web
cache then sends an HTTP request for the object into the cache-to-server TCP
connection. After receiving this request, the origin server sends the object
within an HTTP response to the Web cache.
• When the Web cache receives the object, it stores a copy in its local storage
and sends a copy, within an HTTP response message, to the client browser.
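The steps above can be sketched as a small in-memory cache. This is an illustration only: the WebCache class and the origin_server function are invented stand-ins, no real network traffic is involved, and freshness checks are left out.

```python
class WebCache:
    """Toy proxy cache: answer from local storage, else ask the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}  # local storage: URL -> object
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            # The cache has a copy: return it without contacting the origin.
            self.hits += 1
            return self._store[url]
        # No copy: fetch from the origin, keep a copy, then return it.
        self.misses += 1
        obj = self._fetch(url)
        self._store[url] = obj
        return obj


origin_requests = []


def origin_server(url):
    """Stand-in for the origin server; records every request it receives."""
    origin_requests.append(url)
    return f"<contents of {url}>".encode()


cache = WebCache(origin_server)
url = "http://www.someschool.edu/campus.gif"
first = cache.get(url)   # miss: fetched from the origin server
second = cache.get(url)  # hit: served from the cache
print(cache.hits, cache.misses, len(origin_requests))
```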
157. Note that a cache is both a server
and a client at the same time.
•When it receives requests from and sends
responses to a browser, it is a server.
•When it sends requests to and receives
responses from an origin server, it is a client.
158. Web caching has seen deployment in
the Internet for two reasons.
•First, a Web cache can substantially reduce the
response time for a client request, particularly if the
bottleneck bandwidth between the client and the origin
server is much less than the bottleneck bandwidth
between the client and the cache. If there is a high-
speed connection between the client and the cache and
if the cache has the requested object, then the cache
will be able to deliver the object rapidly to the client.
159. Web caching has seen deployment in
the Internet for two reasons.
•Second, as we will soon illustrate with an example, Web
caches can substantially reduce traffic on an institution’s
access link to the Internet. By reducing traffic, the
institution (for example, a company or a university) does
not have to upgrade bandwidth as quickly, thereby
reducing costs. Web caches can substantially reduce
Web traffic in the Internet as a whole, thereby
improving performance for all applications.
161. Explanation of the Diagram
•This figure shows two networks
•The institutional network and
•The rest of the public Internet.
162. The Institutional Network
•It is a high-speed LAN.
•A router in the institutional network and a router
in the Internet are connected by a 15 Mbps link.
•The origin servers are attached to the Internet
but are located all over the globe.
1/3
163. The Institutional Network
•Suppose that the average object size is 1 Mbits and
that the average request rate from the institution’s
browsers to the origin servers is 15 requests per
second.
•Suppose that the HTTP request messages are
negligibly small and thus create no traffic in the
networks or in the access link (from institutional
router to Internet router).
Contd…
2/3
164. The Institutional Network
•Suppose that the amount of time it takes from
when the router on the Internet side of the access
link forwards an HTTP request (within an IP
datagram) until it receives the response (within
many IP datagrams) is two seconds on average.
•Informally, we refer to this last delay as the
“Internet delay.”
Contd…
3/3
165. The Total Response Time
•It is the time from the browser’s request of an
object until its receipt of the object.
•It is the sum of the LAN delay, the access
delay (that is, the delay between the two
routers), and the Internet delay.
166. Calculation of these Delays
The traffic intensity on the LAN (taken to be a 100 Mbps LAN) is
(15 requests/sec) * (1 Mbits/request) / (100 Mbps) = 0.15
The traffic intensity on the access link (from the
Internet router to the institution router) is
(15 requests/sec) * (1 Mbits/request) / (15 Mbps) = 1
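The same arithmetic can be checked in code. The function name traffic_intensity is chosen for this sketch; the numbers are the ones used in the calculation above.

```python
def traffic_intensity(requests_per_sec, object_size_mbits, link_rate_mbps):
    """Average arrival rate of bits divided by the link transmission rate."""
    return requests_per_sec * object_size_mbits / link_rate_mbps


lan = traffic_intensity(15, 1, 100)    # institutional LAN
access = traffic_intensity(15, 1, 15)  # access link between the two routers

print(f"LAN intensity:         {lan:.2f}")
print(f"access link intensity: {access:.2f}")
```

An intensity of 0.15 means the LAN is lightly loaded, while an intensity close to 1 on the access link means queuing delay there becomes very large.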
168. The Conditional GET
•Although caching can reduce user-perceived
response times, it introduces a new problem—the
copy of an object residing in the cache may be stale.
•In other words, the object housed in the Web
server may have been modified since the copy was
cached at the client.
Contd…
1/3
169. The Conditional GET
•HTTP has a mechanism that allows a cache to verify
that its objects are up to date. This mechanism is
called the Conditional GET.
•An HTTP request message is a so-called conditional
GET message if (1) the request message uses the
GET method and (2) the request message includes
an If-Modified-Since: header line
Contd…
2/3
170. How the Conditional GET Operates?
• First, on behalf of a requesting browser, a proxy cache sends
a request message to a Web server:
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
• Second, The Web server sends a response message with the
requested object to the cache:
HTTP/1.1 200 OK
Date: Sat, 8 Oct 2011 15:39:29
Server: Apache/1.3.0 (Unix)
Last-Modified: Wed, 7 Sep 2011 09:23:24
Content-Type: image/gif
(data data data data data ...)
171. How the Conditional GET Operates?
• The cache forwards the object to the requesting browser but
also caches the object locally. Importantly, the cache also stores
the last-modified date along with the object.
• Third, one week later, another browser requests the same object
via the cache, and the object is still in the cache. Since this
object may have been modified at the Web server in the past
week, the cache performs an up-to-date check by issuing a
conditional GET.
• The cache sends
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
If-modified-since: Wed, 7 Sep 2011 09:23:24
Contd…
172. How the Conditional GET Operates?
• The value of the If-modified-since: header line is
exactly equal to the value of the Last-Modified: header line that
was sent by the server one week ago. This conditional GET is
telling the server to send the object only if the object has been
modified since the specified date. Suppose the object has not
been modified since 7 Sep 2011 09:23:24.
• Then, fourth, the Web server sends a response message to the
cache:
HTTP/1.1 304 Not Modified
Date: Sat, 15 Oct 2011 15:39:29
Server: Apache/1.3.0 (Unix)
(empty entity body)
Contd…
173. How the Conditional GET Operates?
•We see that in response to the conditional GET, the Web
server still sends a response message but does not include
the requested object in the response message.
•Including the requested object would only waste bandwidth
and increase user-perceived response time, particularly if the
object is large.
•The last response message has 304 Not Modified in
the status line, which tells the cache that it can go ahead and
forward its (the proxy cache’s) cached copy of the object to
the requesting browser.
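The exchange can be sketched as a single function that plays the origin server. This is a simplified model under stated assumptions: origin_server is an invented name, headers are plain dictionaries, and dates use the same layout as the example messages (no time zone).

```python
from datetime import datetime

HTTP_DATE = "%a, %d %b %Y %H:%M:%S"  # date layout used in the example messages


def origin_server(request_headers, last_modified, body):
    """Return (status, headers, body); skip the body if the copy is still fresh."""
    since = request_headers.get("If-modified-since")
    if since is not None and last_modified <= datetime.strptime(since, HTTP_DATE):
        return 304, {}, b""  # Not Modified: empty entity body
    return 200, {"Last-Modified": last_modified.strftime(HTTP_DATE)}, body


last_modified = datetime(2011, 9, 7, 9, 23, 24)
kiwi = b"GIF89a..."  # placeholder bytes standing in for the image

# First request: plain GET, the cache stores the object and its date.
status, headers, body = origin_server({}, last_modified, kiwi)
cached = {"body": body, "last_modified": headers["Last-Modified"]}

# One week later: conditional GET using the stored date.
conditional = {"If-modified-since": cached["last_modified"]}
status2, _, body2 = origin_server(conditional, last_modified, kiwi)

# If the object had changed in the meantime, the server would send it again.
status3, _, body3 = origin_server(conditional, datetime(2011, 10, 10, 12, 0, 0), b"new")
print(status, status2, status3)
```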
Contd…
177. File Transfer Protocol: FTP
•In a typical FTP session, the user is sitting in front of
one host (the local host) and wants to transfer files to
or from a remote host.
•In order for the user to access the remote account,
the user must provide a user identification and a
password.
•After providing this authorization information, the
user can transfer files from the local file system to the
remote file system and vice versa
179. FTP moves files between local and remote file system
•The user interacts with FTP through an FTP
user agent.
•The user first provides the hostname of the
remote host, causing the FTP client process in
the local host to establish a TCP connection
with the FTP server process in the remote
host.
Contd…
180. FTP moves files between local and remote file system
•The user then provides the user identification
and password, which are sent over the TCP
connection as part of FTP commands.
•Once the server has authorized the user, the
user copies one or more files stored in the
local file system into the remote file system
(or vice versa).
Contd…
182. TCP Connections in FTP
•FTP uses two parallel TCP connections to
transfer a file,
•A Control Connection and
•A Data Connection.
183. Control Connection and Data Connection
•The Control Connection is used for sending control
information between the two hosts—
•The Control information such as
•User identification,
•Password,
•Commands to change remote directory, and
•Commands to “put” and “get” files.
•The Data Connection is used to send the actual file.
184. Difference Between FTP and HTTP
First Difference is
•FTP uses a separate control connection, so FTP is said
to send its control information out-of-band.
•HTTP sends request and response header lines into the
same TCP connection that carries the transferred file
itself. For this reason, HTTP is said to send its control
information in-band.
185. Difference Between FTP and HTTP
Second Difference is
• The FTP server must maintain state about the user.
• The server must associate the control connection with a specific
user account, and the server must keep track of the user’s
current directory as the user wanders about the remote
directory tree.
• Keeping track of this state information for each ongoing user
session significantly constrains the total number of sessions that
FTP can maintain simultaneously
• HTTP is stateless—it does not have to keep track of any user state.
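The per-user state that an FTP server keeps can be pictured with a toy session object. This is a sketch, not a real server: FtpSession is an invented class, the account table is hypothetical, and CWD (change working directory) together with reply codes 230, 250, 502 and 530 come from the FTP specification rather than from these slides.

```python
import posixpath


class FtpSession:
    """State kept for one control connection: user, login status, directory."""

    def __init__(self, accounts):
        self._accounts = accounts  # hypothetical table: username -> password
        self.user = None
        self.logged_in = False
        self.cwd = "/"

    def handle(self, line):
        command, _, arg = line.strip().partition(" ")
        if command == "USER":
            self.user, self.logged_in = arg, False
            return "331 Username OK, password required."
        if command == "PASS":
            if self.user is not None and self._accounts.get(self.user) == arg:
                self.logged_in = True
                return "230 User logged in."
            return "530 Not logged in."
        if not self.logged_in:
            return "530 Not logged in."
        if command == "CWD":
            # The server must remember where the user is in the directory tree.
            self.cwd = posixpath.normpath(posixpath.join(self.cwd, arg))
            return "250 Directory changed."
        return "502 Command not implemented."


session = FtpSession({"alice": "secret"})
for line in ("USER alice", "PASS secret", "CWD pub", "CWD docs"):
    print(line, "->", session.handle(line))
print(session.cwd)
```

Each open session needs one such object on the server, which is the state burden the comparison with stateless HTTP refers to.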
187. Operation of FTP
•When a user starts an FTP session with a
remote host, the client side of FTP (user) first
initiates a control TCP connection with the
server side (remote host) on server port
number 21.
1/5
188. Operation of FTP
•The client side of FTP sends the user
identification and password over this control
connection.
•The client side of FTP also sends, over the
control connection, commands to change
the remote directory.
Contd…
2/5
189. Operation of FTP
•When the server side receives a command
for a file transfer over the control connection
(either to, or from, the remote host), the
server side initiates a TCP data connection to
the client side.
Contd…
3/5
190. Operation of FTP
•FTP sends exactly one file over the data
connection and then closes the data
connection.
•If, during the same session, the user wants to
transfer another file, FTP opens another data
connection.
Contd…
4/5
191. Operation of FTP
•Thus, with FTP, the control connection
remains open throughout the duration of the
user session, but a new data connection is
created for each file transferred within a
session (that is, the data connections are
non-persistent).
Contd…
5/5
192. FTP Commands and Replies
•The commands, from client to server, and replies,
from server to client, are sent across the control
connection in 7-bit ASCII format. Thus, like HTTP
commands, FTP commands are readable by
people.
•Each command consists of four uppercase ASCII
characters, some with optional arguments.
193. Some of the Commands are
•USER username: Used to send the user
identification to the server.
•PASS password: Used to send the user
password to the server.
1/4
194. Some of the Commands are
•LIST: Used to ask the server to send back a list
of all the files in the current remote directory.
The list of files is sent over a (new and non-
persistent) data connection rather than the
control TCP connection.
Contd…
2/4
195. Some of the Commands are
•RETR filename: Used to retrieve (that is, get)
a file from the current directory of the
remote host. This command causes the
remote host to initiate a data connection and
to send the requested file over the data
connection.
Contd…
3/4
196. Some of the Commands are
•STOR filename: Used to store (that is, put) a
file into the current directory of the remote
host.
Contd…
4/4
197. Contd…
•There is a one-to-one correspondence
between the command that the user issues
and the FTP command sent across the control
connection.
•Each command is followed by a reply, sent
from server to client. The replies are three-
digit numbers, with an optional message
following the number.
198. Some replies, along with their possible messages
331 Username OK, password required.
125 Data connection already open; transfer starting.
425 Can’t open data connection.
452 Error writing file.
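A client can split each reply into its three-digit code and optional message. The sketch below uses an invented helper, parse_reply; the class names for the first digit follow the grouping in the FTP specification and are not given in these slides.

```python
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative",
    "5": "permanent negative",
}


def parse_reply(line):
    """Split an FTP reply into (code, message, class of reply)."""
    code, _, message = line.strip().partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an FTP reply: {line!r}")
    return int(code), message, REPLY_CLASSES.get(code[0], "unknown")


for reply in (
    "331 Username OK, password required.",
    "125 Data connection already open; transfer starting.",
    "425 Can't open data connection.",
    "452 Error writing file.",
):
    print(parse_reply(reply))
```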
200. 2.4 Electronic Mail in the Internet
-- Syllabus
•SMTP,
•Comparison with HTTP,
•Mail Message Format,
•Mail Access Protocols
201. Introduction to E-Mail
•E-Mail is an asynchronous communication medium.
•People send and read messages when it is convenient for
them, without having to coordinate with other people’s
schedules.
•Electronic Mail is fast, easy to distribute, and inexpensive.
•Modern e-mail has many powerful features, including
messages with attachments, hyperlinks, HTML-formatted
text, and embedded photos.
203. High-Level View of the
Internet Mail System
•Internet Mail has three major components:
•User Agents,
•Mail Servers, and
•The Simple Mail Transfer Protocol (SMTP)
204. Example
•Alice, sending an e-mail message to a recipient,
Bob.
•User agents allow users to read, reply to,
forward, save, and compose messages.
•Microsoft Outlook and Apple Mail are examples
of user agents for e-mail.
205. Example -- Contd…
•When Alice is finished composing her
message, her user agent sends the message to
her mail server, where the message is placed
in the mail server’s outgoing message queue.
•When Bob wants to read a message, his user
agent retrieves the message from his mailbox
in his mail server.
206. Example
•Mail servers form the core of the e-mail
infrastructure.
•Each recipient, such as Bob, has a mailbox
located in one of the mail servers.
•Bob’s mailbox manages and maintains the
messages that have been sent to him.
Contd…
207. Example
•A typical message starts its journey in the
sender’s user agent, travels to the sender’s
mail server, and travels to the recipient’s
mail server, where it is deposited in the
recipient’s mailbox
Contd…
208. Example
•When Bob wants to access the messages in
his mailbox, the mail server containing his
mailbox authenticates Bob (with a username
and password).
•Alice’s mail server must also deal with failures
in Bob’s mail server.
Contd…
209. Example
•If Alice’s server cannot deliver mail to Bob’s
server, Alice’s server holds the message in a
message queue and attempts to transfer the
message later.
•Reattempts are often done every 30 minutes or
so; if there is no success after several days, the
server removes the message and notifies the
sender (Alice) with an e-mail message.
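The retry behavior can be sketched as a simple loop over time. This is an illustration under assumptions: deliver_with_retries is an invented function, time is counted in minutes, and "several days" is taken to be three days.

```python
def deliver_with_retries(server_back_at, retry_interval=30, give_up_after=3 * 24 * 60):
    """Simulate delivery attempts; all times are in minutes.

    server_back_at is the minute at which the receiving server becomes
    reachable, or None if it never does.
    Returns (delivered, minute of last attempt, number of attempts).
    """
    minute = 0
    attempts = 0
    while minute <= give_up_after:
        attempts += 1
        if server_back_at is not None and minute >= server_back_at:
            return True, minute, attempts
        minute += retry_interval  # message stays queued until the next attempt
    # Give up: the server would now remove the message and notify the sender.
    return False, give_up_after, attempts


print(deliver_with_retries(45))    # receiving server comes back after 45 minutes
print(deliver_with_retries(None))  # receiving server never comes back
```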
Contd…
211. SMTP
•SMTP is the principal application-layer
protocol for Internet electronic mail.
•It uses the reliable data transfer service of
TCP to transfer mail from the sender’s mail
server to the recipient’s mail server.
212. SMTP has two sides
•SMTP has two sides
•A Client Side, which executes on the
sender’s mail server, and
•A Server Side, which executes on the
recipient’s mail server.
•Both the client and server sides of SMTP run
on every mail server.
213. SMTP has two sides
•When a mail server sends mail to other mail
servers, it acts as an SMTP client.
•When a mail server receives mail from other
mail servers, it acts as an SMTP server
214. 2.4.1 SMTP
•SMTP is at the heart of Internet electronic mail.
•SMTP transfers messages from senders’ mail
servers to the recipients’ mail servers.
•SMTP is much older than HTTP.
•SMTP restricts the body (not just the headers)
of all mail messages to simple 7-bit ASCII.
215. The Basic Operation of SMTP
Suppose Alice wants to send Bob a simple ASCII message.
1. Alice invokes her user agent for e-mail, provides
Bob’s e-mail address (for example,
bob@someschool.edu), composes a message, and
instructs the user agent to send the message.
2. Alice’s user agent sends the message to her mail
server, where it is placed in a message queue
1/3
216. The Basic Operation of SMTP
Suppose Alice wants to send Bob a simple ASCII message.
3. The client side of SMTP, running on Alice’s mail
server, sees the message in the message queue. It
opens a TCP connection to an SMTP server,
running on Bob’s mail server.
4. After some initial SMTP handshaking, the SMTP
client sends Alice’s message into the TCP
connection.
Contd…
2/3
217. The Basic Operation of SMTP
Suppose Alice wants to send Bob a simple ASCII message.
5. At Bob’s mail server, the server side of SMTP
receives the message. Bob’s mail server then
places the message in Bob’s mailbox.
6. Bob invokes his user agent to read the
message at his convenience.
Contd…
3/3
219. •SMTP does not normally use
intermediate mail servers for sending
mail, even when the two mail servers
are located at opposite ends of the
world.
220. How SMTP transfers a message from a sending
mail server to a receiving mail server
•First, the client SMTP (running on the sending
mail server host) has TCP establish a
connection to port 25 at the server SMTP
(running on the receiving mail server host). If
the server is down, the client tries again later.
1/3
221. How SMTP transfers a message from a sending
mail server to a receiving mail server
•Once this connection is established, SMTP
client indicates the e-mail address of the sender
(the person who generated the message) and
the e-mail address of the recipient.
•The client sends the message.
Contd…
2/3
222. How SMTP transfers a message from a sending
mail server to a receiving mail server
•SMTP can count on the reliable data transfer
service of TCP to get the message to the server
without errors.
•The client then repeats this process over the
same TCP connection if it has other messages to
send to the server; otherwise, it instructs TCP to
close the connection.
Contd…
3/3
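The transfer steps above can be sketched as the sequence of lines an SMTP client writes into the TCP connection. The command names (HELO, MAIL FROM, RCPT TO, DATA, QUIT) come from the SMTP standard rather than from these slides, and the function name smtp_client_lines is hypothetical. The sketch only builds the client's side of the dialogue; it does not open a socket, read server replies, or escape body lines that begin with a period.

```python
def smtp_client_lines(client_host, messages):
    """Return the lines an SMTP client would send over one TCP connection.

    messages is a list of (sender, recipient, body_lines) tuples.
    """
    lines = [f"HELO {client_host}"]             # handshake after connecting to port 25
    for sender, recipient, body_lines in messages:
        lines.append(f"MAIL FROM:<{sender}>")   # e-mail address of the sender
        lines.append(f"RCPT TO:<{recipient}>")  # e-mail address of the recipient
        lines.append("DATA")                    # the message itself follows
        lines.extend(body_lines)
        lines.append(".")                       # a lone period ends the message
    lines.append("QUIT")                        # no more mail: close the connection
    return lines


for line in smtp_client_lines(
    "crepes.fr",
    [("alice@crepes.fr", "bob@hamburger.edu", ["Hello Bob"])],
):
    print("C:", line)
```

Passing several messages shows the persistent-connection behaviour: the handshake and QUIT appear once, while the MAIL FROM, RCPT TO, DATA block repeats per message.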
224. Comparison of SMTP with HTTP
•HTTP transfers files (also called objects) from a
Web server to a Web client (typically a browser);
SMTP transfers files (that is, e-mail messages)
from one mail server to another mail server.
•When transferring the files, both persistent HTTP
and SMTP use persistent connections. Thus, the
two protocols have common characteristics.
225. Differences between SMTP and HTTP
The first difference:
•HTTP is a pull protocol: someone loads information on
a Web server and users use HTTP to pull the
information from the server at their convenience. The
TCP connection is initiated by the machine that wants
to receive the file.
•SMTP is a push protocol: the sending mail server
pushes the file to the receiving mail server. The TCP
connection is initiated by the machine that wants to
send the file.
226. Differences between SMTP and HTTP
The second difference:
•SMTP requires each message, including the body of
each message, to be in 7-bit ASCII format. If the
message contains characters that are not 7-bit ASCII
(for example, French characters with accents) or
contains binary data (such as an image file), then the
message has to be encoded into 7-bit ASCII.
•HTTP does not impose this restriction.
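As a rough illustration of encoding data into 7-bit ASCII, the sketch below uses Base64, which is one common choice for mail; the slides do not say which encoding is used, so treat that choice, and the helper names to_7bit_ascii and from_7bit_ascii, as assumptions.

```python
import base64


def to_7bit_ascii(data):
    """Encode arbitrary bytes as text that uses only 7-bit ASCII characters."""
    return base64.b64encode(data).decode("ascii")


def from_7bit_ascii(text):
    """Recover the original bytes at the receiving side."""
    return base64.b64decode(text)


original = "crêpes".encode("utf-8")    # contains a byte above 127
encoded = to_7bit_ascii(original)
print(encoded)
print(from_7bit_ascii(encoded).decode("utf-8"))
```

The encoded form is longer than the original, which is the price paid for staying within 7-bit ASCII.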
227. Differences between SMTP and HTTP
The third difference:
•HTTP encapsulates each object in its own
HTTP response message.
•Internet mail (SMTP) places all of the
message’s objects into one message.
229. 2.4.3 Mail Message Formats
•When an e-mail message is sent from one
person to another, a header containing
peripheral information precedes the body of
the message itself.
•This peripheral information is contained in a
series of header lines.
Contd…
1/5
230. 2.4.3 Mail Message Formats
•The header lines and the body of the message are
separated by a blank line
•Each header line contains readable text, consisting
of a keyword followed by a colon followed by a
value.
•Some of the keywords are required and others are
optional.
Contd…
2/5
231. 2.4.3 Mail Message Formats
•Every header must have
•A From: header line, and
•A To: header line.
•A header may also include a Subject: header line
and other optional header lines.
•It is important to note that these header lines are
different from the SMTP commands.
Contd…
3/5
232. 2.4.3 Mail Message Formats
•A typical message header looks like this:
From: alice@crepes.fr
To: bob@hamburger.edu
Subject: Seeking Permission.
Contd…
4/5
233. 2.4.3 Mail Message Formats
•After the message header, a blank line follows;
then the message body (in ASCII) follows.
•As an exercise, use Telnet to send a mail server
a message that contains some header lines,
including the Subject: header line.
•To do this, issue telnet serverName 25, where
serverName is the name of a mail server.
Contd…
5/5
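The layout described above (header lines, a blank line, then the body) can be sketched with a small parser. This is a simplified illustration under stated assumptions: the function name parse_mail_message and the body text are made up, lines end with "\n" rather than the CRLF used on the wire, and multi-line (folded) header values are not handled.

```python
def parse_mail_message(raw):
    """Split a message into a dict of header lines and the body text."""
    head, _, body = raw.partition("\n\n")           # first blank line ends the header
    headers = {}
    for line in head.split("\n"):
        keyword, _, value = line.partition(":")     # keyword, colon, value
        headers[keyword.strip()] = value.strip()
    return headers, body


RAW = (
    "From: alice@crepes.fr\n"
    "To: bob@hamburger.edu\n"
    "Subject: Seeking Permission.\n"
    "\n"
    "Hello Bob, may I visit on Friday?\n"           # hypothetical body text
)

headers, body = parse_mail_message(RAW)
print(headers["From"], "->", headers["To"])
print(body)
```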
235. 2.4.4 Mail Access Protocols
•Mail access uses a client-server architecture—
the user reads e-mail with a client that
executes on the user’s end system.
•Once SMTP delivers the message from Alice’s
mail server to Bob’s mail server, the message
is placed in Bob’s mailbox.
1/7
236. 2.4.4 Mail Access Protocols
•Given that Bob (the recipient) executes his
user agent on his local PC, it is natural to
consider placing a mail server on his local PC.
•With this approach, Alice’s mail server would
dialogue directly with Bob’s PC.
Contd…
2/7
237. 2.4.4 Mail Access Protocols
•There is a problem with this approach.
•A mail server manages mailboxes and runs the
client and server sides of SMTP.
•If Bob’s mail server were to reside on his local
PC, then Bob’s PC would have to remain always
on, and connected to the Internet, in order to
receive new mail, which can arrive at any time.
• This is impractical for many Internet users.
Contd…
3/7
238. 2.4.4 Mail Access Protocols
•Instead, a typical user runs a user agent on
the local PC but accesses its mailbox stored
on an always-on shared mail server.
•This mail server is shared with other users
and is typically maintained by the user’s ISP
Contd…
4/7
239. 2.4.4 Mail Access Protocols
•SMTP has been designed for pushing e-mail
from one host to another.
•The sender’s user agent does not dialogue
directly with the recipient’s mail server.
Contd…
5/7
240. 2.4.4 Mail Access Protocols
•There are currently a number of popular mail
access protocols, including
•Post Office Protocol—Version 3 (POP3),
•Internet Message Access Protocol (IMAP), and
•HTTP.
Contd…
6/7
241. 2.4.4 Mail Access Protocols
•SMTP is used to transfer mail from the sender’s
mail server to the recipient’s mail server.
•SMTP is also used to transfer mail from the
sender’s user agent to the sender’s mail server.
• A mail access protocol, such as POP3, is used to
transfer mail from the recipient’s mail server to
the recipient’s user agent.
Contd…
7/7
242. POP3
•POP3 is an extremely simple mail access
protocol.
•It is short and quite readable.
•Because the protocol is so simple, its
functionality is rather limited.
243. Working of POP3
•POP3 begins when the user agent (the client)
opens a TCP connection to the mail server (the
server) on port 110.
•With the TCP connection established, POP3
progresses through three phases:
•Authorization,
•Transaction, and
•Update.
244. First Phase -- Authorization
•During the first phase, authorization, the
user agent sends a username and a
password to authenticate the user.
245. Second Phase -- Transaction
•During the second phase, transaction, the
user agent retrieves messages; also during
this phase, the user agent can mark messages
for deletion, remove deletion marks, and
obtain mail statistics.
246. Third Phase -- Update
•The third phase, update, occurs after the
client has issued the quit command, ending
the POP3 session.
•At this time, the mail server deletes the
messages that were marked for deletion.
•In a POP3 transaction, the user agent issues
commands, and the server responds to each
command with a reply.
247. Possible Responses of POP3 Transaction
•There are two possible responses:
•+OK (sometimes followed by server-to-
client data), used by the server to indicate
that the previous command was fine; and
•-ERR, used by the server to indicate that
something was wrong with the previous
command.
248. Consider the sample response message
telnet mailServer 110
+OK POP3 server ready
user bob
+OK
pass hungry
+OK user successfully logged on
If you misspell a command, the POP3 server will
reply with an -ERR message.
249. Two modes of User in the POP3 Transaction Phase
•A user agent using POP3 can be configured (by
the user) to “download and delete” or to
“download and keep”.
•The sequence of commands issued by a POP3
user agent depends on which of these two modes
the user agent is operating in.
•In the download-and-delete mode, the user agent
will issue the list, retr, and dele commands.
250. Transaction Message in the Download and Delete Mode
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: (blah blah ...
S: .................
S: ..........blah)
S: .
C: dele 1
C: retr 2
S: (blah blah ...
S: .................
S: ..........blah)
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off
251. Explanation of the Message
•The user agent first asks the mail server to list the size
of each of the stored messages.
•The user agent then retrieves and deletes each message
from the server. Note that after the authorization
phase, the user agent employed only four commands:
list, retr, dele, and quit.
•After processing the quit command, the POP3 server
enters the update phase and removes messages 1 and 2
from the mailbox.
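The transaction and update phases can be sketched with a toy in-memory model. It is only an illustration: the class name Pop3Session is hypothetical, authorization is assumed to have already succeeded, and the replies follow the simplified transcript on the slide rather than the full protocol.

```python
class Pop3Session:
    """Toy model of one POP3 session after the authorization phase."""

    def __init__(self, mailbox):
        self.mailbox = mailbox    # list of message strings held by the server
        self.marked = set()       # per-session state: numbers marked for deletion

    def handle(self, command):
        parts = command.split()
        if not parts:
            return ["-ERR"]
        verb = parts[0].lower()
        if verb == "list":        # message number and size in bytes
            sizes = [f"{i} {len(m)}" for i, m in enumerate(self.mailbox, 1)
                     if i not in self.marked]
            return sizes + ["."]
        if verb in ("retr", "dele") and len(parts) == 2 and parts[1].isdigit():
            number = int(parts[1])
            if 1 <= number <= len(self.mailbox) and number not in self.marked:
                if verb == "retr":
                    return [self.mailbox[number - 1], "."]
                self.marked.add(number)    # only marked; nothing is removed yet
                return ["+OK"]
            return ["-ERR"]
        if verb == "quit":        # update phase: now remove the marked messages
            self.mailbox[:] = [m for i, m in enumerate(self.mailbox, 1)
                               if i not in self.marked]
            return ["+OK POP3 server signing off"]
        return ["-ERR"]           # unknown or misspelled command
```

Note that dele changes only the session's own state; the mailbox itself is modified when quit triggers the update phase.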
252. Disadvantage of Download-and-Delete Mode
•The recipient may want to access his mail messages
from multiple machines (say, his office PC, his home
PC, and his portable computer). Such a user is
called a nomadic user.
•The download-and-delete mode partitions the
recipient’s mail messages over these three
machines; if he first reads a message on his office
PC, he will not be able to reread the message from
his portable at home later in the evening.
253. Download-and-Keep Mode
•In the download-and-keep mode, the user agent
leaves the messages on the mail server after
downloading them.
•In this case, the recipient can reread messages
from different machines;
•he can access a message from work and access
it again later in the week from home.
254. During a POP3 session between a
user agent and the mail server
•The POP3 server maintains some state information.
•In particular, the POP3 server keeps track of which of
the user’s messages have been marked deleted.
•The POP3 server does not carry state information
across POP3 sessions.
•This lack of state information across the sessions
simplifies the implementation of a POP3 server.
255. Problem with POP3 for Nomadic User
•With POP3 access, once Bob has downloaded his
messages to the local machine, he can create mail
folders and move the downloaded messages into
the folders.
•Bob can then delete messages, move messages
across folders, and search for messages (by
sender name or subject).
1/2
256. Problem with POP3 for Nomadic User
•But this paradigm—namely, folders and messages
in the local machine—poses a problem for the
nomadic user, who would prefer to maintain a
folder hierarchy on a remote server that can be
accessed from any computer. This is not possible
with POP3—the POP3 protocol does not provide
any means for a user to create remote folders
and assign messages to folders.
Contd…
2/2
257. Solution is IMAP Protocol
•IMAP is a mail access protocol.
•It has many more features than POP3.
•It is also significantly more complex.
•Thus the client and server side
implementations are more complex.
258. IMAP Server
• An IMAP server will associate each message with a folder;
• When a message first arrives at the server, it is associated with the
recipient’s INBOX folder.
• The recipient can then move the message into a new, user-created
folder, read the message, delete the message, and so on.
• The IMAP protocol provides commands to allow users to create
folders and move messages from one folder to another.
• IMAP provides commands that allow users to search remote folders
for messages matching specific criteria.
• IMAP server maintains user state information across IMAP sessions—
for example, the names of the folders and which messages are
associated with which folders.
259. Another important feature of IMAP
•It has commands that permit a user agent to obtain
components of messages.
•For example, a user agent can obtain just the message
header of a message or just one part of a multipart MIME
message.
•This feature is useful when there is a low-bandwidth
connection between the user agent and its mail server.
•With a low bandwidth connection, the user may not want to
download all of the messages in its mailbox, particularly
avoiding long messages that might contain, for example, an
audio or video clip.
260. Web-Based E-Mail
•With this service, the user agent is an ordinary Web browser,
and the user communicates with its remote mailbox via HTTP.
•When a recipient, such as Bob, wants to access a message in his
mailbox, the e-mail message is sent from Bob’s mail server to
Bob’s browser using the HTTP protocol rather than the
POP3 or IMAP protocol.
•When a sender wants to send an e-mail message, the e-mail
message is sent from the sender’s browser to the sender’s mail
server over HTTP rather than over SMTP.
•The sender’s mail server still sends messages to, and receives
messages from, other mail servers using SMTP.
261. End of the Chapter - 2.4
Electronic Mail in the Internet
263. 2.5 DNS --The Internet's Directory Service
Syllabus
2.5.1 Services Provided by DNS,
2.5.2 Overview of How DNS Works,
2.5.3 DNS Records and Messages
264. 2.5 DNS --The Internet's Directory Service
•Internet hosts can be identified in many ways.
•One identifier for a host is its hostname.
•cnn.com
•www.yahoo.com
•gaia.cs.umass.edu
•cis.poly.edu
1/4
265. 2.5 DNS --The Internet's Directory Service
•Hostnames provide little, if any, information
about the location within the Internet of the
host.
•A hostname such as www.eurecom.fr,
which ends with the country code .fr, tells
us that the host is probably in France.
Contd…
2/4
266. 2.5 DNS --The Internet's Directory Service
•Because hostnames can consist of variable-length
alphanumeric characters, they would be
difficult for routers to process. For these
reasons, hosts are also identified by so-called
IP addresses.
Contd…
3/4
267. 2.5 DNS --The Internet's Directory Service
•An IP address consists of four bytes and has a rigid
hierarchical structure.
•An IP address looks like 121.7.106.83
•Each period separates one of the bytes expressed in
decimal notation from 0 to 255.
•An IP address is hierarchical because as we scan the
address from left to right, we obtain more and more
specific information about where the host is located in
the Internet.
Contd…
4/4
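The dotted-decimal notation above can be sketched as a conversion into four bytes and a single 32-bit number. The function name parse_ipv4 is a hypothetical helper for illustration.

```python
def parse_ipv4(text):
    """Convert dotted-decimal text such as '121.7.106.83' into four bytes."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("an IP address of this form has exactly four bytes")
    values = [int(p) for p in parts]
    if any(v < 0 or v > 255 for v in values):
        raise ValueError("each byte must be between 0 and 255")
    return bytes(values)


addr = parse_ipv4("121.7.106.83")
as_int = int.from_bytes(addr, "big")   # leftmost byte is the most significant
print(list(addr), as_int)
```

Reading the leftmost byte as most significant matches the left-to-right hierarchy described above.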
268. Two Ways to identify a Host
•By a hostname and
•By an IP address.
•People prefer the more mnemonic
hostname identifier.
•Routers prefer fixed-length, hierarchically
structured IP addresses.
269. Importance of DNS
•In order to reconcile these preferences, we
need a directory service that translates
hostnames to IP addresses.
•This is the main task of the Internet’s Domain
Name System (DNS).
270. Characteristics of DNS
•The DNS is
•A distributed database implemented in
a hierarchy of DNS servers, and
•An application-layer protocol that allows
hosts to query the distributed database.
271. DNS Servers and Protocol
•The DNS servers are often UNIX machines running
the Berkeley Internet Name Domain (BIND)
software.
•The DNS protocol runs over UDP and uses
port 53.
Contd…
1/2
272. DNS Servers and Protocol
•DNS is commonly employed by other
application-layer protocols—including HTTP,
SMTP, and FTP—to translate user-supplied
hostnames to IP addresses.
Contd…
2/2
273. What happens when a browser (i.e., an HTTP client),
running on some user’s host, requests the URL
www.someschool.edu/index.html.
•In order for the user’s host to be able to
send an HTTP request message to the Web
server www.someschool.edu, the user’s
host must first obtain the IP address of
www.someschool.edu.
•This is done as follows.
274. Steps when a browser (i.e., an HTTP client),
running on some user’s host, requests the URL
www.someschool.edu/index.html.
1. The user machine runs the client side of the
DNS application.
2. The browser extracts the hostname,
www.someschool.edu, from the URL
and passes the hostname to the client side
of the DNS application.
1/3
275. Steps when a browser (i.e., an HTTP client),
running on some user’s host, requests the URL
www.someschool.edu/index.html.
3. The DNS client sends a query containing the
hostname to a DNS server.
4. The DNS client eventually receives a reply,
which includes the IP address for the
hostname.
Contd…
2/3
276. Steps when a browser (i.e., an HTTP client), running
on some user’s host, requests the URL
www.someschool.edu/index.html.
5. Once the browser receives the IP address
from DNS, it can initiate a TCP connection
to the HTTP server process located at port
80 at that IP address.
Contd…
3/3
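The steps above can be sketched as follows. The helper names extract_hostname and locate_web_server are hypothetical; the resolver is passed in as a parameter so the flow can be tried without network access, with Python's socket.gethostbyname as the default.

```python
import socket
from urllib.parse import urlsplit

HTTP_PORT = 80


def extract_hostname(url):
    """Step 2: pull the hostname out of the URL."""
    if "://" not in url:
        url = "http://" + url     # urlsplit needs a scheme to locate the host part
    return urlsplit(url).hostname


def locate_web_server(url, resolve=socket.gethostbyname):
    """Steps 3 to 5: resolve the hostname, then return where TCP should connect."""
    hostname = extract_hostname(url)
    ip_address = resolve(hostname)    # DNS query and reply happen inside resolve
    return ip_address, HTTP_PORT


# A stand-in resolver with a made-up address, so no real DNS query is sent.
print(locate_web_server("www.someschool.edu/index.html",
                        resolve=lambda name: "192.0.2.1"))
```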
277. 2.5.1 Services Provided by DNS
•DNS caching (covered later in this section) helps to reduce
DNS network traffic as well as the average DNS delay.
•DNS provides a few other important services in
addition to translating hostnames to IP addresses:
•Host aliasing.
•Mail server aliasing.
•Load distribution.
278. Host Aliasing
•A host with a complicated hostname can have
one or more alias names.
•For example, a hostname such as
relay1.westcoast.enterprise.com
could have two aliases such as
• enterprise.com and
•www.enterprise.com
1/2
279. Host Aliasing
•relay1.westcoast.enterprise.com
hostname is said to be a canonical hostname.
•Alias hostnames are more mnemonic than
canonical hostnames.
•DNS can be invoked by an application to
obtain the canonical hostname for a supplied
alias hostname, as well as the IP address of the host.
Contd…
2/2
280. Mail Server Aliasing
•E-mail addresses are mnemonic.
•For example, if Bob has an account with
Hotmail, Bob’s e-mail address might be as
simple as bob@hotmail.com.
1/4
281. Mail Server Aliasing
•The hostname of the Hotmail mail server is
more complicated and much less mnemonic
than simply hotmail.com
•For example, the canonical hostname might
be something like
relay1.west-coast.hotmail.com
Contd…
2/4
282. Mail Server Aliasing
•DNS can be invoked by a mail application to
obtain the canonical hostname for a supplied
alias hostname as well as the IP address of
the host.
Contd…
3/4
283. Mail Server Aliasing
•The MX record permits a company’s mail
server and Web server to have identical
(aliased) hostnames.
•For example, a company’s Web server and
mail server can both be called
enterprise.com.
Contd…
4/4
284. Load Distribution.
•DNS is also used to perform load distribution
among replicated servers, such as replicated
Web servers.
•Busy sites, such as cnn.com, are replicated over
multiple servers, with each server running on a
different end system and each having a
different IP address.
1/4
285. Load Distribution.
•For replicated Web servers, a set of IP addresses
is thus associated with one canonical hostname.
•The DNS database contains this set of IP
addresses.
•Clients make a DNS query for a name mapped to
a set of addresses.
Contd…
2/4
286. Load Distribution.
•The server responds with the entire set of IP
addresses, but rotates the ordering of the
addresses within each reply.
•Because a client sends its HTTP request message to
the IP address that is listed first in the set, DNS
rotation distributes the traffic among the replicated
servers.
Contd…
3/4
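DNS rotation can be sketched in a few lines. The class name RotatingAnswer and the addresses are hypothetical; a real DNS server may order its replies differently.

```python
from collections import deque


class RotatingAnswer:
    """Return the full address set, rotated by one position per reply."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def reply(self):
        answer = list(self._addresses)
        self._addresses.rotate(-1)    # the next reply starts with the following address
        return answer


server = RotatingAnswer(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(4):
    # A client would send its HTTP request to the first address in each reply.
    print(server.reply()[0])
```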
287. Load Distribution.
•DNS rotation is also used for e-mail so that
multiple mail servers can have the same alias
name.
•Content distribution companies such as
Akamai have used DNS in more sophisticated
ways to provide Web content distribution.
Contd…
4/4
288. 2.5.2 Overview of How DNS Works
hostname-to-IP-address translation service
•Suppose that some application (such as a Web
browser or a mail reader) running in a user’s host
needs to translate a hostname to an IP address.
•The application will invoke the client side of DNS,
specifying the hostname that needs to be
translated.
1/4
289. 2.5.2 Overview of How DNS Works
hostname-to-IP-address translation service
•On many UNIX-based machines, gethostbyname()
is the function call that an application calls in
order to perform the translation.
•DNS in the user’s host then takes over, sending a
query message into the network.
Contd…
2/4
290. 2.5.2 Overview of How DNS Works
hostname-to-IP-address translation service
•All DNS query and reply messages are sent within
UDP datagrams to port 53.
•After a delay, ranging from milliseconds to
seconds, DNS in the user’s host receives a DNS
reply message that provides the desired mapping
Contd…
3/4
291. 2.5.2 Overview of How DNS Works
hostname-to-IP-address translation service
•This mapping is then passed to the invoking
application.
•Thus, from the perspective of the invoking
application in the user’s host, DNS is a black box
providing a simple, straightforward translation
service.
Contd…
4/4
292. A Simple Design for DNS
•A simple design for DNS would have one DNS server
that contains all the mappings.
•In this centralized design,
•Clients simply direct all queries to the single
DNS server, and
•The DNS server responds directly to the
querying clients.
•Although the simplicity of this design is attractive, it is
inappropriate for today’s Internet, with its vast (and
growing) number of hosts.
293. The Problems with a Centralized Design
•A single point of failure. If the DNS server crashes,
so does the entire Internet.
•Traffic volume. A single DNS server would have to
handle all DNS queries (for all the HTTP requests
and e-mail messages generated from hundreds of
millions of hosts).
1/3
294. The Problems with a Centralized Design
•Distant centralized database. A single DNS server
cannot be “close to” all the querying clients. If we
put the single DNS server in New York City, then
all queries from Australia must travel to the other
side of the globe, perhaps over slow and
congested links. This can lead to significant delays.
Contd…
2/3
295. The Problems with a Centralized Design
•Maintenance. The single DNS server would have
to keep records for all Internet hosts. Not only
would this centralized database be huge, but it
would have to be updated frequently to account
for every new host.
Contd…
3/3
296. A Distributed, Hierarchical Database
•The DNS uses a large number of servers,
organized in a hierarchical fashion and
distributed around the world.
•No single DNS server has all of the mappings
for all of the hosts in the Internet. Instead, the
mappings are distributed across the DNS
servers.
297. Three Classes of DNS Servers
•There are Three Classes of DNS Servers:
•Root DNS Servers,
•Top-Level Domain (TLD) DNS Servers, and
•Authoritative DNS Servers
•All these are organized in a hierarchy.
299. How these three classes of servers interact
Suppose a DNS client wants to determine the IP address for the
hostname www.amazon.com.
•To a first approximation, the following events will take place.
•The client first contacts one of the root servers, which
returns IP addresses for TLD servers for the top-level
domain com.
•The client then contacts one of these TLD servers, which
returns the IP address of an authoritative server for
amazon.com.
•Finally, the client contacts one of the authoritative
servers for amazon.com, which returns the IP address
for the hostname www.amazon.com.
300. Root DNS Servers
•In the Internet, there are 13 root DNS servers,
most of which are located in North America.
•The next slide lists these 13 root DNS servers
as of 2012, in (organization, location) format.
1/3
301. Root DNS Servers
1. Verisign, Los Angeles, CA (5 other sites)
2. USC-ISI, Marina del Rey, CA
3. Cogent, Herndon, VA (5 other sites)
4. U Maryland, College Park, MD
5. NASA, Mt View, CA
6. Internet Software C, Palo Alto, CA (and 48 other sites)
7. US DoD, Columbus, OH (5 other sites)
8. ARL, Aberdeen, MD
9. Netnod, Stockholm (37 other sites)
10. Verisign, Dulles, VA (69 other sites )
11. RIPE, London (17 other sites)
12. ICANN, Los Angeles, CA (41 other sites)
13. WIDE, Tokyo (5 other sites)
2/3 Contd…
302. Root DNS Servers
•Each “server” is actually a network of
replicated servers, for both security and
reliability purposes.
• Altogether, there were 247 root servers as of
fall 2011.
Contd…
3/3
303. Top-Level Domain (TLD) Servers
•These servers are responsible for top-level domains
such as com, org, net, edu, and gov, and all of the
country top-level domains such as uk, fr, ca, and jp.
•The company Verisign Global Registry Services
maintains the TLD Servers for the com top-level domain.
•The company Educause maintains the TLD servers for
the edu top-level domain.
304. Authoritative DNS Servers
•Every organization with publicly accessible
hosts (such as Web servers and mail servers)
on the Internet must provide publicly
accessible DNS records that map the names
of those hosts to IP addresses.
1/3
305. Authoritative DNS Servers
•An organization’s authoritative DNS server
houses these DNS records.
•An organization can choose to implement its
own authoritative DNS server to hold these
records.
Contd…
2/3
306. Authoritative DNS Servers
•The organization can pay to have these records
stored in an authoritative DNS server of some
service provider.
•Most universities and large companies
implement and maintain their own primary and
secondary (backup) authoritative DNS server.
Contd…
3/3
307. Local DNS Server
•A local DNS server does not strictly belong to
the hierarchy of servers but is nevertheless
central to the DNS architecture.
•Each ISP—such as a university, an academic
department, an employee’s company, or a
residential ISP—has a local DNS server (also
called a default name server).
Contd…
1/4
308. Local DNS Server
•When a host connects to an ISP, the ISP
provides the host with the IP addresses
of one or more of its local DNS servers.
Contd…
2/4
309. Local DNS Server
•A host’s local DNS server is “close to” the
host.
•For an institutional ISP, local DNS server
may be on the same LAN as the host;
•For a residential ISP, it is separated from
the host by no more than a few routers.
Contd…
3/4
310. Local DNS Server
•When a host makes a DNS query, the query is
sent to the local DNS server, which acts as a
proxy, forwarding the query into the DNS
server hierarchy.
Contd…
4/4
312. Interaction of the Various DNS Servers
•Suppose the host cis.poly.edu desires the IP
address of gaia.cs.umass.edu.
•Suppose that Polytechnic’s local DNS server is
called dns.poly.edu and that an authoritative
DNS server for gaia.cs.umass.edu is called
dns.umass.edu.
Contd…
2/6
313. Interaction of the Various DNS Servers
•The host cis.poly.edu first sends a DNS query
message to its local DNS server, dns.poly.edu.
•The query message contains the hostname to
be translated, namely, gaia.cs.umass.edu.
•The local DNS server forwards the query
message to a root DNS server.
Contd…
3/6
314. Interaction of the Various DNS Servers
•The root DNS server takes note of the edu
suffix and returns to the local DNS server a
list of IP addresses for TLD servers responsible
for edu.
•The local DNS server then resends the query
message to one of these TLD servers.
Contd…
4/6
315. Interaction of the Various DNS Servers
•The TLD server takes note of the umass.edu
suffix and responds with the IP address of the
authoritative DNS server for the University of
Massachusetts, namely, dns.umass.edu.
Contd…
5/6
316. Interaction of the Various DNS Servers
•Finally, the local DNS server resends the query
message directly to dns.umass.edu, which
responds with the IP address of
gaia.cs.umass.edu.
•In order to obtain the mapping for one
hostname, eight DNS messages were sent: four
query messages and four reply messages.
Contd…
6/6
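The walk-through above can be sketched as a small simulation that also counts messages. Everything here is a toy model: the server tables, the label "tld-edu", and the address given for gaia.cs.umass.edu are made up for illustration, and real DNS servers exchange binary messages rather than Python tuples.

```python
# Each server maps a name suffix to a referral or to a final answer.
ROOT = {"edu": ("referral", "tld-edu")}                        # "tld-edu" is a made-up label
TLD_EDU = {"umass.edu": ("referral", "dns.umass.edu")}
DNS_UMASS = {"gaia.cs.umass.edu": ("answer", "192.0.2.10")}    # hypothetical address
SERVERS = {"root": ROOT, "tld-edu": TLD_EDU, "dns.umass.edu": DNS_UMASS}


def ask(server_name, hostname):
    """One query/reply exchange: the longest matching suffix wins."""
    table = SERVERS[server_name]
    labels = hostname.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in table:
            return table[suffix]
    raise LookupError(hostname)


def resolve_via_local_server(hostname):
    """Play the role of the local DNS server; return (ip_address, message_count)."""
    messages = 1                  # query: requesting host -> local DNS server
    server = "root"
    while True:
        kind, value = ask(server, hostname)
        messages += 2             # one query and one reply for this server
        if kind == "answer":
            return value, messages + 1   # reply: local DNS server -> requesting host
        server = value            # follow the referral to the next server


print(resolve_via_local_server("gaia.cs.umass.edu"))
```

The count of eight comes from four query/reply pairs: host to local server, then local server to root, TLD, and authoritative servers.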
318. Recursive Queries and Iterative Queries
•The query sent from cis.poly.edu to dns.poly.edu is a
recursive query, since the query asks dns.poly.edu to
obtain the mapping on its behalf.
•The subsequent three queries are iterative, since all
of the replies are directly returned to dns.poly.edu.
•In theory, any DNS query can be iterative or recursive.
•In practice, the query from the requesting host to the local
DNS server is typically recursive, and the remaining queries
are iterative.
319. DNS Caching
•In a query chain, when a DNS server receives a
DNS reply (containing a mapping from a
hostname to an IP address), it can cache the
mapping in its local memory.
•For example, each time the local DNS server
dns.poly.edu receives a reply from some DNS
server, it can cache any of the information
contained in the reply.
1/6
320. DNS Caching
•If a hostname/IP address pair is cached in a
DNS server and another query arrives at the
DNS server for the same hostname, the DNS
server can provide the desired IP address,
even if it is not authoritative for the
hostname.
Contd…
2/6
321. DNS Caching
•Because hosts and mappings between
hostnames and IP addresses are by no
means permanent, DNS servers discard
cached information after a period of time.
•Suppose that a host apricot.poly.edu queries
dns.poly.edu for the IP address for the
hostname cnn.com.
Contd…
3/6
322. DNS Caching
•Suppose that a few hours later, another
Polytechnic University host, say, kiwi.poly.edu,
also queries dns.poly.edu with the same
hostname.
Contd…
4/6
323. DNS Caching
•Because of caching, the local DNS server will
be able to immediately return the IP address
of cnn.com to this second requesting host
without having to query any other DNS
servers.
Contd…
5/6
324. DNS Caching
•A local DNS server can also cache the IP
addresses of TLD servers, thereby allowing
the local DNS server to bypass the root DNS
servers in a query chain.
Contd…
6/6
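The caching behaviour above can be sketched as a dictionary whose entries expire. The class name DnsCache, the one-hour lifetime, and the address are assumptions for illustration; the clock is injectable so that expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Cache hostname-to-IP mappings and discard them after a set period."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}        # hostname -> (ip_address, expiry_time)

    def put(self, hostname, ip_address, lifetime_seconds):
        self._entries[hostname] = (ip_address, self._clock() + lifetime_seconds)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None           # not cached: other DNS servers must be queried
        ip_address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]   # stale mapping is discarded
            return None
        return ip_address


now = [0.0]                                   # fake clock, in seconds
cache = DnsCache(clock=lambda: now[0])
cache.put("cnn.com", "192.0.2.20", lifetime_seconds=3600)   # hypothetical values
print(cache.get("cnn.com"))                   # served from the cache
now[0] = 3600.0
print(cache.get("cnn.com"))                   # expired, so a fresh query is needed
```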
325. 2.5.3 DNS Records and Messages
•The DNS servers that together implement the
DNS distributed database store Resource
Records (RRs), including RRs that provide
hostname-to-IP address mappings.
•Each DNS reply message carries one or more
resource records.
326. A Resource Record is a four-tuple
that contains the following fields
(Name, Value, Type, TTL)
•TTL is the time to live of the resource record.
•It determines when a resource record should be removed
from a cache.
•The meaning of Name and Value depends on Type.
327. Four common values of the Type field
•TYPE = A
•TYPE = NS
•TYPE = CNAME
•TYPE = MX
328. When TYPE = A
•If Type=A, then Name is a hostname and
Value is the IP address for the hostname.
•Thus, a Type A record provides the standard
hostname-to-IP address mapping.
•As an example, (relay1.bar.foo.com,
145.37.93.126, A) is a Type A record.
329. When TYPE = NS
•If Type=NS, then Name is a domain (such as
foo.com) and Value is the hostname of an
authoritative DNS server that knows how to obtain
the IP addresses for hosts in the domain.
•This record is used to route DNS queries
further along in the query chain.
•As an example, (foo.com, dns.foo.com, NS) is
a Type NS record.
330. When TYPE = CNAME
•If Type=CNAME, then Value is a canonical
hostname for the alias hostname Name.
•This record can provide querying hosts the
canonical name for a hostname.
•As an example, (foo.com, relay1.bar.foo.com,
CNAME) is a CNAME record.
331. When TYPE = MX
•If Type = MX, then Value is the canonical
name of a mail server that has an alias
hostname Name.
•As an example, (foo.com, mail.bar.foo.com, MX)
is an MX record.
•MX records allow the hostnames of mail servers
to have simple aliases.
1/2
332. When TYPE = MX
•Note that by using the MX record, a company can
have the same aliased name for its mail server
and for one of its other servers.
•To obtain the canonical name for the mail server, a
DNS client would query for an MX record; to
obtain the canonical name for the other server,
the DNS client would query for the CNAME record.
Contd…
2/2
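The way one alias can map to different canonical names depending on the queried Type can be illustrated with a small lookup over the example records from the preceding slides. This is a sketch only: RECORDS and lookup are hypothetical names, and a real DNS server does far more than filter a list.

```python
# Example (Name, Value, Type) records taken from the slides above.
RECORDS = [
    ("relay1.bar.foo.com", "145.37.93.126", "A"),
    ("foo.com", "dns.foo.com", "NS"),
    ("foo.com", "relay1.bar.foo.com", "CNAME"),
    ("foo.com", "mail.bar.foo.com", "MX"),
]


def lookup(name, rtype, records=RECORDS):
    """Return the Value of every record matching both Name and Type."""
    return [value for (n, value, t) in records if n == name and t == rtype]


# The same alias, foo.com, yields a different canonical name per Type.
mail_server = lookup("foo.com", "MX")
other_server = lookup("foo.com", "CNAME")
```

Here lookup("foo.com", "MX") gives the mail server's canonical name, while lookup("foo.com", "CNAME") gives the canonical name of the other server that shares the alias.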
333. DNS Server may be authoritative for a
particular hostname
•If a DNS server is authoritative for a particular
hostname, then the DNS server will contain a
Type A record for the hostname. (Even if the DNS
server is not authoritative, it may contain a Type
A record in its cache).
1/3
334. DNS Server may be authoritative for a
particular hostname
•If a server is not authoritative for a hostname,
then the server will contain a Type NS record for
the domain that includes the hostname; it will
also contain a Type A record that provides the IP
address of the DNS server in the Value field of the
NS record.
Contd…
2/3
335. DNS Server may be authoritative for a
particular hostname
• As an example, suppose an edu TLD server is not
authoritative for the host gaia.cs.umass.edu. Then this server
will contain a Type NS record for a domain that includes the host
gaia.cs.umass.edu, for example, (umass.edu, dns.umass.edu, NS).
• The edu TLD server would also contain a Type A record,
which maps the DNS server dns.umass.edu to an IP address,
for example, (dns.umass.edu, 128.119.40.111, A).
Contd…
3/3
336. DNS Messages
•There are two kinds of DNS messages:
•the DNS query message, and
•the DNS reply message.
•Both query and reply messages have the same
format.
337. DNS Message Format
•Header section (12 bytes): Identification, Flags, Number of questions,
Number of answer RRs, Number of authority RRs, Number of additional RRs.
•Questions (variable number of questions): Name, type fields for a query.
•Answers (variable number of resource records): RRs in response to query.
•Authority (variable number of resource records): Records for authoritative servers.
•Additional information: Additional “helpful” info.
338. Sections in the DNS Message Format
•Header Sections (The first 12 bytes)
•Data Sections
339. DNS Message Format
Header Section
•Identifier field.
•Flags field.
•Question Count field.
•Answer Count field.
•Authority Count field.
•Additional Information Count field.
Data Sections
•Question Section
•Answer Section or Reply Section
•Authority Section
•Additional Information Section
340. Identifier Field
•It is the first field in the Header Section.
•It is a 16-bit number that identifies the query.
•This identifier is copied into the reply
message to a query, allowing the client to
match received replies with sent queries.
•There are a number of flags in the flag field.
341. Flags in the Flag field
•A 1-bit query/reply flag indicates whether
the message is a query (0) or a reply (1).
•A 1-bit authoritative flag is set in a reply
message when a DNS server is an
authoritative server for a queried name.
1/2
342. Flags in the Flag field
•A 1-bit recursion-desired flag is set when
a client (host or DNS server) desires that the
DNS server perform recursion when it
doesn’t have the record.
•A 1-bit recursion available flag is set in a
reply if the DNS server supports recursion.
Contd…
2/2
343. Other Header Sections in the
DNS Message Format
•The header section also contains four
"number-of" fields.
•These fields indicate the number of
occurrences of the four types of data
sections that follow the header.
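The 12-byte header described above (identifier, flags, and four count fields, each 16 bits) can be packed and unpacked with Python's struct module. The flag bit positions below follow RFC 1035; the function names pack_header and unpack_header are hypothetical, and the sketch ignores the other flag bits (opcode, truncation, response code).

```python
import struct

# Flag bit masks within the 16-bit flags field (per RFC 1035).
QR_REPLY = 0x8000  # 0 = query, 1 = reply
AA = 0x0400        # authoritative answer
RD = 0x0100        # recursion desired
RA = 0x0080        # recursion available


def pack_header(ident, flags, questions=0, answers=0, authority=0, additional=0):
    """Build the 12-byte header: six unsigned 16-bit fields, network byte order."""
    return struct.pack("!6H", ident, flags, questions, answers, authority, additional)


def unpack_header(data):
    """Decode the first 12 bytes of a DNS message into a dictionary."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AA),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
        "counts": (qd, an, ns, ar),
    }


# A query carrying one question, with recursion desired.
query_header = pack_header(0x1234, RD, questions=1)
```

Because the reply copies the identifier, a client can compare unpack_header(reply)["id"] with the identifier it sent to match replies to queries.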
344. The Four Data Sections of the
DNS Message Format
•Question Section
•Answer Section (or Reply Section or Response Section)
•Authority Section
•Additional Information Section
345. Question Section
• This Section contains information about the query that is
being made.
• This section includes
•A name field that contains the name that is
being queried, and
•A type field that indicates the type of question
being asked about the name.
• For Example, a host address associated with a name (Type A)
or the mail server for a name (Type MX).
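As a rough illustration of the name and type fields, the sketch below encodes one question entry in the wire format of RFC 1035: each label of the name is prefixed with its length, the name ends with a zero byte, and the type is followed by a 16-bit class field (1 for Internet), which the slides do not discuss. The function name encode_question is hypothetical.

```python
import struct

# Numeric codes for the record types covered in these slides (RFC 1035).
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "MX": 15}
QCLASS_IN = 1  # Internet class


def encode_question(name, rtype):
    """Encode one question: length-prefixed labels, zero byte, type, class."""
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return qname + b"\x00" + struct.pack("!2H", QTYPE[rtype], QCLASS_IN)


# Question asking for the mail server of foo.com.
mx_question = encode_question("foo.com", "MX")
```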
346. Answer Section Or Reply Section Or
Response Section
•In a reply from a DNS server, the answer section
contains the resource records for the name that
was originally queried.
•In each resource record there is the Type (for example,
A, NS, CNAME, or MX), the Value, and the TTL.
•A reply can return multiple RRs in the answer,
since a hostname can have multiple IP addresses.
348. Additional Section
•The additional section contains other helpful
records.
•For example, the answer field in a reply to an MX
query contains a resource record providing the
canonical hostname of a mail server.
•The additional section contains a Type A record
providing the IP address for the canonical hostname
of the mail server.
349. How would you like to send a DNS query
message directly from the host you’re
working on to some DNS server?
•This can easily be done with the nslookup
program, which is available on most
Windows and UNIX platforms.
350. nslookup in the Windows Host
•Open the Command Prompt.
•Invoke the nslookup program by simply typing
“nslookup”.
•Send a DNS query to any DNS server (root, TLD, or
authoritative).
•nslookup receives the reply message from the DNS server.
•nslookup then displays the records included in the
reply (in a human-readable format).
351. End of 2.5 DNS --
The Internet's
Directory Service.
354. 2.6.1 P2P File Distribution
•In Client-Server File Distribution, the server must
send a copy of the file to each of the peers—
placing an enormous burden on the server and
consuming a large amount of server bandwidth.
•In P2P File Distribution, each peer can
redistribute any portion of the file it has received
to any other peers, thereby assisting the server in
the distribution process.
355. Most Popular P2P File Distribution Protocol
•As of 2012, the most popular P2P file distribution
protocol is BitTorrent.
•It was originally developed by Bram Cohen.
•Now, there are many different independent
BitTorrent clients conforming to the BitTorrent
protocol, just as there are a number of Web
browser clients that conform to the HTTP protocol.
356. Scalability of P2P Architectures
•Suppose the server and the peers are connected
to the Internet with access links.
•Let
• us be the upload rate of the server’s access link.
• ui be the upload rate of the i-th peer’s access link.
• di be the download rate of the i-th peer’s access link.
• F be the size of the file to be distributed (in bits).
• N be the number of peers that want to obtain a copy of the file.
357. The Distribution Time
•The distribution time is the time it takes to get
a copy of the file to all N peers.
•Assume that the server and clients are not
participating in any other network applications, so
that all of their upload and download access
bandwidth can be fully devoted to distributing this
file.
359. Let’s first determine the distribution time for the
Client-Server Architecture
•Let DCS be the distribution time for the
client-server architecture.
•We make two observations.
363. Let’s first determine the distribution time for the
Client-Server Architecture
•Observation 1: The server must transmit one copy of the file to each of
the N peers. Thus the server must transmit NF bits.
•The server’s upload rate is us,
•So, the time to distribute the file must be at least NF/us.
368. Let’s first determine the distribution time for the
Client-Server Architecture
•Observation 2: Let dmin denote the download rate of the peer with the
lowest download rate, that is, dmin = min{d1,d2,...,dN}.
•The peer with the lowest download rate cannot obtain
all F bits of the file in less than F/dmin seconds.
•Thus the minimum distribution time is at least F/dmin.
370. Let’s first determine the distribution time for the
Client-Server Architecture
•Putting these two observations together, we obtain
DCS ≥ max{ NF/us , F/dmin }
This provides a lower bound on the minimum
distribution time for the client-server architecture.
For N large enough, the term NF/us dominates.
Thus, the distribution time increases linearly with the
number of peers N.
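Combining the two bounds, NF/us and F/dmin, gives the client-server lower bound max{NF/us, F/dmin}. A minimal Python sketch of that formula follows; the function name client_server_time and the example rates are assumptions chosen for illustration only.

```python
def client_server_time(file_bits, server_upload, download_rates):
    """Lower bound on client-server distribution time, in seconds.

    file_bits      -- F, file size in bits
    server_upload  -- us, server upload rate in bits per second
    download_rates -- d1..dN, one download rate per peer (bits per second)
    """
    n = len(download_rates)
    d_min = min(download_rates)
    return max(n * file_bits / server_upload, file_bits / d_min)


# Hypothetical numbers: an 8 Gbit file, a 100 Mbps server, ten 20 Mbps peers.
t10 = client_server_time(8e9, 1e8, [2e7] * 10)
# Doubling the number of peers doubles the time once NF/us dominates.
t20 = client_server_time(8e9, 1e8, [2e7] * 20)
```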
371. Now Let’s determine the
distribution time for the
Peer-To-Peer Architecture
372. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture)
•In P2P architecture, each peer can assist the server in
distributing the file. In particular, when a peer receives some
file data, it can use its own upload capacity to redistribute
the data to other peers.
•Calculating the distribution time for the P2P architecture is
more complicated than for the client-server architecture,
since the distribution time depends on how each peer
distributes portions of the file to the other peers.
376. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture)
We first make the following observations:
•Observation 1: At the beginning of the distribution, only the server has the
file. To get this file into the community of peers, the server
must send each bit of the file at least once into its access
link.
•If us is the upload rate of the server, then the time required to
upload all F bits once is F/us.
•Thus, the minimum distribution time is at least F/us.
(Unlike the client-server scheme, a bit sent once by the
server may not have to be sent by the server again, as the
peers may redistribute the bit among themselves.)
379. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture)
•Observation 2: As in the client-server architecture, let dmin be the
download rate of the peer with the lowest download rate.
•The peer with the lowest download rate cannot obtain all F
bits of the file in less than F/dmin seconds. Thus the minimum
distribution time is at least F/dmin.
381. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture)
•Observation 3: Finally, observe that the total upload capacity of the system
as a whole is equal to the upload rate of the server plus the
upload rates of each of the individual peers, that is,
utotal = us + u1 + … + uN. The system must deliver (upload) F
bits to each of the N peers, thus delivering a total of NF bits.
This cannot be done at a rate faster than utotal.
•Thus, the minimum distribution time is also at least
NF/(us + u1 + … + uN).
382. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture)
Putting these three observations together, we obtain the
minimum distribution time for P2P, denoted by DP2P:
DP2P ≥ max{ F/us , F/dmin , NF/(us + u1 + … + uN) }
This provides a lower bound for the minimum distribution time for the
P2P architecture.
If we imagine that each peer can redistribute a bit as soon as it
receives the bit, then there is a redistribution scheme that actually
achieves this lower bound.
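The three bounds, F/us, F/dmin and NF/(us + u1 + … + uN), combine into the P2P lower bound as their maximum. The sketch below evaluates it in Python; the function name p2p_time and the example rates are assumptions made for illustration.

```python
def p2p_time(file_bits, server_upload, upload_rates, download_rates):
    """Lower bound on P2P distribution time, in seconds.

    file_bits      -- F, file size in bits
    server_upload  -- us, server upload rate (bits per second)
    upload_rates   -- u1..uN, one upload rate per peer
    download_rates -- d1..dN, one download rate per peer
    """
    n = len(download_rates)
    total_upload = server_upload + sum(upload_rates)
    return max(
        file_bits / server_upload,        # server must send every bit once
        file_bits / min(download_rates),  # slowest peer must receive F bits
        n * file_bits / total_upload,     # NF bits over the total upload capacity
    )


# Hypothetical numbers: 8 Gbit file, 100 Mbps server,
# ten peers uploading at 10 Mbps and downloading at 40 Mbps.
t_p2p = p2p_time(8e9, 1e8, [1e7] * 10, [4e7] * 10)
```

With these numbers the third term dominates: the system has 200 Mbps of total upload capacity and must deliver 80 Gbit in total.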
384. Assumptions in the graph
•All peers are assumed to have the same upload rate u.
•We have set F/u = 1 hour, us = 10u, and dmin ≥ us.
•Thus, a peer can transmit the entire file in one
hour, the server transmission rate is 10 times the
peer upload rate, and (for simplicity) the peer
download rates are set large enough so as not to
have an effect.
385. Comparison of Client Server Architecture
with P2P Architecture using the Graph
•For the client-server architecture, the distribution time
increases linearly and without bound as the number of
peers increases.
•For the P2P architecture, the minimal distribution time is
not only always less than the distribution time of the
client-server architecture; it is also less than one hour for any
number of peers N. Thus, applications with the P2P
architecture can be self-scaling.
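The two curves can be reproduced numerically under the stated assumptions (F/u = 1 hour, us = 10u, download rates large enough not to matter). With these settings the client-server bound reduces to N/10 hours and the P2P bound to max{1/10, N/(10 + N)} hours. The function name graph_times is hypothetical, and the sketch only evaluates the bounds rather than simulating a transfer.

```python
def graph_times(n):
    """Return (client-server, P2P) minimum distribution times in hours for n peers."""
    f_over_u = 1.0              # hours for one peer to upload the whole file
    f_over_us = f_over_u / 10   # the server uploads ten times faster than a peer
    d_cs = n * f_over_us                              # NF/us
    d_p2p = max(f_over_us, n * f_over_u / (10 + n))   # max{F/us, NF/(us + N*u)}
    return d_cs, d_p2p


# Tabulate a few points of the comparison.
for n in (1, 5, 10, 20, 30):
    d_cs, d_p2p = graph_times(n)
    print(f"N={n:2d}  client-server={d_cs:.2f} h  P2P={d_p2p:.2f} h")
```

The client-server value keeps growing with N, while the P2P value approaches but never reaches one hour.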
386. BitTorrent
•BitTorrent is a popular P2P protocol for file distribution.
•In BitTorrent lingo, the collection of all peers
participating in the distribution of a particular file is
called a torrent.
•Peers in a torrent download equal-size chunks of the
file from one another, with a typical chunk size of 256
KBytes.
•When a peer first joins a torrent, it has no chunks. Over
time it accumulates more and more chunks.
1/2
387. BitTorrent
•While it downloads chunks it also uploads chunks to
other peers.
•Once a peer has acquired the entire file, it may leave
the torrent, or remain in the torrent and continue to
upload chunks to other peers.
•Any peer may leave the torrent at any time with only a
subset of chunks, and later rejoin the torrent.
2/2 Contd…
388. Operation of BitTorrent Protocol
•Each torrent has an infrastructure node called a
tracker.
•When a peer joins a torrent, it registers itself with the
tracker and periodically informs the tracker that it is
still in the torrent. Thus the tracker keeps track of the
peers that are participating in the torrent.
•A given torrent may have fewer than ten or more than
a thousand peers participating at any instant of time.
390. Working of BitTorrent Protocol
•When a new peer, Alice, joins the torrent, the tracker
randomly selects a subset of peers (for concreteness, say 50)
from the set of participating peers, and sends the IP
addresses of these 50 peers to Alice.
•Possessing this list of peers, Alice attempts to establish
concurrent TCP connections with all the peers on this
list.
1/5
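The tracker's role when a peer joins can be sketched as a registry that hands back a random subset of the peers already present. This is a toy model, not the BitTorrent tracker protocol: the class name Tracker, the method join, and the example addresses are hypothetical, and the periodic "still in the torrent" messages are left out.

```python
import random


class Tracker:
    """Toy tracker: remembers peers and returns a random subset on join."""

    def __init__(self, sample_size=50):
        self.sample_size = sample_size
        self.peers = set()

    def join(self, address, rng=random):
        # Candidates are the peers already registered, excluding the newcomer.
        others = sorted(self.peers - {address})
        self.peers.add(address)
        k = min(self.sample_size, len(others))
        return rng.sample(others, k)


tracker = Tracker(sample_size=3)
for i in range(10):
    tracker.join(f"10.0.0.{i}")

# Alice joins and receives the addresses of three existing peers.
alice_list = tracker.join("10.0.0.99")
```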
391. Working of BitTorrent Protocol
•Let’s call all the peers with which Alice succeeds in
establishing a TCP connection “neighboring peers.”
•As time evolves, some of these peers may leave
and other peers (outside the initial 50) may
attempt to establish TCP connections with Alice.
•So a peer’s neighboring peers will fluctuate over
time.
2/5 Contd…
392. Working of BitTorrent Protocol
•At any given time, each peer will have a subset of
chunks from the file, with different peers having
different subsets.
•Periodically, Alice will ask each of her neighboring peers
for the list of the chunks they have.
•If Alice has L different neighbors, she will obtain L lists
of chunks. With this knowledge, Alice will issue
requests for chunks she currently does not have.
3/5 Contd…
393. Working of BitTorrent Protocol
•So at any given instant of time, Alice will have a subset of
chunks and will know which chunks her neighbors have.
•With this information, Alice will have two important
decisions to make.
•First, which chunks should she request first from her
neighbors?
•Second, to which of her neighbors should she send
requested chunks?
4/5 Contd…
394. Working of BitTorrent Protocol
•In deciding which chunks to request, Alice uses a technique
called rarest first.
•The idea is to determine, from among the chunks she does
not have, the chunks that are the rarest among her neighbors
(that is, the chunks that have the fewest repeated copies among her neighbors)
and then request those rarest chunks first.
•In this manner, the rarest chunks get more quickly
redistributed, aiming to (roughly) equalize the numbers of
copies of each chunk in the torrent.
5/5 Contd…
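Rarest first can be sketched as counting, for every chunk Alice is missing, how many neighbors hold it, and then ordering the requests from the lowest count upward. The function name rarest_first, the neighbor names, and the tie-break by chunk number are assumptions made for this illustration; real clients differ in such details.

```python
from collections import Counter


def rarest_first(have, neighbor_chunks):
    """Order the missing chunks so the rarest among the neighbors come first.

    have            -- set of chunk numbers Alice already holds
    neighbor_chunks -- dict mapping each neighbor to the set of chunks it holds
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        # Only chunks Alice does not have yet are worth requesting.
        counts.update(set(chunks) - have)
    # Fewest copies first; ties are broken by chunk number (an arbitrary choice).
    return sorted(counts, key=lambda chunk: (counts[chunk], chunk))


neighbors = {"bob": {0, 1, 2}, "carol": {1, 2, 3}, "dave": {2}}
request_order = rarest_first({0}, neighbors)
```

Chunk 3 is held by only one neighbor, so it is requested first; chunk 2, held by all three, comes last.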
395. Clever Trading Algorithm
•To determine which requests Alice responds to,
BitTorrent uses a Clever Trading Algorithm.
•The basic idea is that Alice gives priority to the
neighbors that are currently supplying her data at the
highest rate.
1/6
396. Clever Trading Algorithm
•Specifically, for each of her neighbors, Alice
continually measures the rate at which she receives
bits and determines the four peers that are feeding
her bits at the highest rate.
•She then reciprocates by sending chunks to these
same four peers. Every 10 seconds, she recalculates
the rates and possibly modifies the set of four peers.
2/6 Contd…
397. Clever Trading Algorithm
•In BitTorrent lingo, these four peers are said to be
unchoked.
•Every 30 seconds, she also picks one additional
neighbor at random and sends it chunks. Let’s call the
randomly chosen peer Bob.
•In BitTorrent lingo, Bob is said to be optimistically
unchoked.
3/6 Contd…
398. Clever Trading Algorithm
•Because Alice is sending data to Bob, she may
become one of Bob’s top four uploaders, in which
case Bob would start to send data to Alice.
•If the rate at which Bob sends data to Alice is high
enough, Bob could then become one of Alice’s top
four uploaders.
4/6 Contd…
399. Clever Trading Algorithm
•In other words, every 30 seconds, Alice will randomly choose
a new trading partner and initiate trading with that partner.
•If the two peers are satisfied with the trading, they will put
each other in their top four lists and continue trading with
each other until one of the peers finds a better partner.
•The effect is that peers capable of uploading at compatible
rates tend to find each other.
5/6 Contd…
400. Clever Trading Algorithm
•The random neighbor selection also allows new peers to get
chunks, so that they can have something to trade.
•All other neighboring peers besides these five peers (four
“top” peers and one probing peer) are “choked,” that is, they
do not receive any chunks from Alice.
•BitTorrent has a number of interesting mechanisms including
pieces (mini-chunks), pipelining, random first selection,
endgame mode, and anti-snubbing.
6/6 Contd…
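One round of the trading decision can be sketched as follows: rank the neighbors by the rate at which they are currently sending data to Alice, unchoke the top four, and pick one of the remaining neighbors at random as the optimistically unchoked peer. The function name choose_unchoked and the rates are hypothetical, and the 10-second and 30-second timers are not modelled.

```python
import random


def choose_unchoked(receive_rates, rng=random, top_n=4):
    """Return (top peers, optimistically unchoked peer) for one round.

    receive_rates -- dict mapping each neighbor to the rate at which
                     it is currently sending data to Alice
    """
    ranked = sorted(receive_rates, key=receive_rates.get, reverse=True)
    top = ranked[:top_n]    # the neighbors feeding Alice at the highest rates
    rest = ranked[top_n:]   # every other neighbor is choked for now
    optimistic = rng.choice(rest) if rest else None
    return top, optimistic


rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 5}
top_four, probe = choose_unchoked(rates)
```

All neighbors outside top_four and probe stay choked until a later round changes the ranking or the random pick.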