Module 1
Application Layer
VENKATESH BHAT
Senior Associate Professor
Department of CSE
AIET, Moodbidri
SYLLABUS – MODULE 1
•Principles of Network Applications: Network Application Architectures, Processes
Communicating, Transport Services Available to Applications, Transport Services Provided
by the Internet, Application-Layer Protocols.
•The Web and HTTP: Overview of HTTP, Non-persistent and Persistent Connections, HTTP
Message Format, User-Server Interaction: Cookies, Web Caching, The Conditional GET,
•File Transfer: FTP Commands & Replies,
•Electronic Mail in the Internet: SMTP, Comparison with HTTP, Mail Message Format, Mail
Access Protocols,
•DNS --The Internet's Directory Service: Services Provided by DNS, Overview of How DNS
Works, DNS Records and Messages,
•Peer-to-Peer Applications: P2P File Distribution, Distributed Hash Tables,
•Socket Programming: creating Network Applications: Socket Programming with UDP, Socket
Programming with TCP.
CHAPTERS FROM THE TEXT BOOK 1
2.1 To 2.7
2.1 Principles of Network Applications -- Syllabus
2.1.1 Network Application Architectures.
2.1.2 Processes Communicating.
2.1.3 Transport Services Available to Applications.
2.1.4 Transport Services Provided by the Internet.
2.1.5 Application-Layer Protocols.
Principles of Network Applications
•Network application development is writing
programs that run on different end systems and
communicate with each other over the network.
Principles of Network Applications
•Example:
•Web Application
•Peer-to-Peer File Sharing System
Principles of Network Applications
• Example:
• Web Application:
• In the Web application, there are two distinct programs that
communicate with each other:
• The browser program running in the user’s host (desktop,
laptop, tablet, smartphone, and so on); and
• The Web server program running in the Web server host.
Principles of Network Applications
•Example:
•Peer-to-Peer File Sharing System
•In a P2P File-Sharing System, there is a program in
each host that participates in the file-sharing
community.
•In this case, the programs in the various hosts may
be similar or identical.
IMPORTANT POINTS TO CONSIDER WHEN WE
DEVELOP A NEW NETWORK APPLICATION
• When developing our new application, we need to write software that will run
on multiple end systems.
• This software could be written in C, Java, or Python.
• We do not need to write software that runs on network core devices, such as
routers or link-layer switches.
• Even if we wanted to write application software for these network-core
devices, we wouldn’t be able to do so.
• Network-core devices do not function at the application layer but instead
function at lower layers— specifically at the network layer and below.
• Communication for a network application takes place between end systems at
the application layer
Network Application Architectures
• An application’s architecture is distinctly different from the network
architecture.
• From the application developer’s perspective, the network architecture is fixed
and provides a specific set of services to applications.
• The application architecture is designed by the application developer and
dictates how the application is structured over the various end systems.
• An Application Developer will draw on one of the two predominant
architectural paradigms used in modern network applications:
• The client-server architecture
• The peer-to-peer (P2P) architecture
Client-Server Architecture
• There is an always-on host, called the server, which services requests
from many other hosts, called clients.
• A classic example is the Web application
• Here, the always-on host is called the Web server.
• Web Server services requests from browsers running on client hosts.
• When a Web server receives a request for an object from a client
host, it responds by sending the requested object to the client host.
Characteristics of Client Server Architecture
• With the client-server architecture, clients do not directly communicate with
each other; for example, in the Web application, two browsers do not
directly communicate.
• The server has a fixed, well-known address, called an IP address. Because the
server has a fixed, well-known address, and because the server is always on,
a client can always contact the server by sending a packet to the server’s IP
address.
• Some of the better-known applications with a client-server architecture
include the Web, FTP, Telnet, and e-mail.
• Often, a single server host is incapable of keeping up with all the requests
from clients. For this reason, a data center, housing a large number of hosts, is
often used to create a powerful virtual server.
Some Examples For Client Server Architecture
• Search engines (e.g., Google and Bing),
• Internet commerce (e.g., Amazon and eBay),
• Web-based email (e.g., Gmail and Yahoo Mail),
• Social networking (e.g., Facebook and Twitter).
• Popular Internet services such as these employ one or more data centers.
Disadvantage
•Infrastructure Intensive: A data center can have
hundreds of thousands of servers, which must be
powered and maintained.
•Service providers must pay recurring
interconnection and bandwidth costs for sending
data to and receiving data from the Internet.
A Sample Client-Server Architecture
[Figure: clients and a server]
P2P Architecture
• There is minimal (or no) reliance on dedicated servers in data centers.
• The application exploits direct communication between pairs of intermittently
connected hosts, called peers.
• The peers are not owned by the service provider, but are instead desktops and
laptops controlled by users, with most of the peers residing in homes,
universities, and offices.
• Because the peers communicate without passing through a dedicated server,
the architecture is called peer-to-peer.
• Many popular and traffic-intensive applications are based on P2P architectures.
P2P Applications include
• File Sharing (e.g., BitTorrent),
• Peer-Assisted Download Acceleration (e.g., Xunlei),
• Internet Telephony (e.g., Skype),
• IPTV (e.g., Kankan and PPstream),
• File Sharing (e.g., LimeWire, originally a Gnutella-based file-sharing client)
A Sample P2P Architecture
Hybrid Architectures
•Combining both client-server and P2P elements.
•For example, for many instant messaging applications,
servers are used to track the IP addresses of users, but
user-to-user messages are sent directly between user hosts
(without passing through intermediate servers)
Features of P2P architectures
• P2P architectures are self-scalable.
• For example, in a P2P file-sharing application, although
each peer generates workload by requesting files, each
peer also adds service capacity to the system by distributing
files to other peers.
• P2P Architectures are also cost effective
• Since they normally don’t require significant server
infrastructure and server bandwidth (in contrast with
client-server designs with data centers), their cost is lower.
P2P Applications Face Three Major Challenges
• ISP Friendly. Most residential ISPs (including DSL and cable ISPs) have been
dimensioned for “asymmetrical” bandwidth usage, that is, for much more
downstream than upstream traffic. But P2P video streaming and file
distribution applications shift upstream traffic from servers to residential ISPs,
thereby putting significant stress on the ISPs. Future P2P applications need to
be designed so that they are friendly to ISPs [Xie 2008].
• Security. Because of their highly distributed and open nature, P2P applications
can be a challenge to secure.
• Incentives. The success of future P2P applications also depends on convincing
users to volunteer bandwidth, storage, and computation resources to the
applications, which is the challenge of incentive design.
Process
•What is a process?
•A process can be thought of as a program that is
running within an end system.
Processes Communicating
•How do processes running on the same host communicate?
•How do processes running on different hosts (with
potentially different operating systems)
communicate?
How Do Processes Running on the Same
Host Communicate?
•When processes are running on the same end system,
they can communicate with each other with
interprocess communication, using rules that are
governed by the end system’s operating system.
How Do Processes Running on Different Hosts
(with potentially different operating systems)
Communicate?
•Processes on two different end systems communicate with
each other by exchanging messages across the computer
network. A sending process creates and sends messages
into the network; a receiving process receives these
messages and responds by sending messages back
Client and Server Processes
•A network application consists of pairs of processes that send
messages to each other over a network.
•For example,
•In the Web application, a client browser process
exchanges messages with a Web server process.
•In a P2P file-sharing system, a file is transferred from a
process in one peer to a process in another peer.
Client and Server Processes – Contd…
• For each pair of communicating processes, we typically label one of
the two processes as the client and the other process as the server.
• With the Web, a browser is a client process and a Web server
is a server process.
• With P2P file sharing, the peer that is downloading the file is
labeled as the client, and the peer that is uploading the file is
labeled as the server.
• In some applications, such as in P2P file sharing, a process can be
both a client and a server. A process in a P2P file-sharing system can
both upload and download files.
Definition of Client and Server Process
• In the context of a communication session between a pair
of processes,
•The process that initiates the communication (that is,
initially contacts the other process at the beginning of
the session) is labeled as the client.
•The process that waits to be contacted to begin the
session is the server.
Example for Client and Servers
•In the Web, a browser process initiates contact with a
Web server process; hence the browser process is the client
and the Web server process is the server.
•In P2P file sharing, when Peer A asks Peer B to send a
specific file, Peer A is the client and Peer B is the server in
the context of this specific communication session. When
there’s no confusion, we’ll sometimes also use the
terminology “client side and server side of an application.”
The Interface Between the Process and the
Computer Network
•A Process sends messages into, and receives
messages from, the network through a software
interface called a socket.
Socket Communication Between Two
Processes that communicate over the Internet.
Socket
•A Socket is the interface between the Application Layer
and the Transport Layer within a host.
•It is also referred to as the Application Programming
Interface (API) between the application and the network,
since the socket is the programming interface with which
network applications are built.
•The application developer has control of everything on
the application-layer side of the socket but has little
control of the transport-layer side of the socket.
The only control that the Application
Developer has on the Transport-Layer side is
•The only control that the Application Developer has on the
Transport-Layer side is
•The choice of transport protocol and
•The ability to fix a few transport-layer parameters
such as maximum buffer and maximum segment sizes.
•Once the application developer chooses a transport
protocol, the application is built using the transport-layer
services provided by that protocol.
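The two controls listed above can be sketched with Python's standard socket module. This is a minimal illustration, not a complete application: the helper name make_socket and the buffer size of 65536 bytes are assumptions made for the example, and the operating system is free to round or cap the requested buffer size.

```python
import socket
from typing import Optional


def make_socket(transport: str, recv_buffer_bytes: Optional[int] = None) -> socket.socket:
    """Create an IPv4 socket for the chosen transport protocol ("tcp" or "udp")."""
    kinds = {"tcp": socket.SOCK_STREAM, "udp": socket.SOCK_DGRAM}
    sock = socket.socket(socket.AF_INET, kinds[transport])
    if recv_buffer_bytes is not None:
        # Request a receive-buffer size; the OS may adjust the value it actually uses.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_buffer_bytes)
    return sock


# Control 1: choice of transport protocol.
tcp_sock = make_socket("tcp", recv_buffer_bytes=65536)  # Control 2: a buffer parameter
udp_sock = make_socket("udp")

tcp_kind = tcp_sock.type
udp_kind = udp_sock.type
print(tcp_kind == socket.SOCK_STREAM, udp_kind == socket.SOCK_DGRAM)

tcp_sock.close()
udp_sock.close()
```

Everything else on the transport-layer side of the socket is handled by the operating system.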
Addressing Processes
•In order for a process running on one host to send packets
to a process running on another host, the receiving
process needs to have an address.
•To identify the receiving process, two pieces of
information need to be specified:
•The address of the host and
•An identifier that specifies the receiving process in
the destination host.
Addressing Processes – Contd…
•In the Internet, the host is identified by its IP address.
An IP address (in IPv4) is a 32-bit quantity that we can think of
as uniquely identifying the host.
•The sending process must also identify the receiving
process running in the host. This information is
needed because a host could be running many
network applications. A destination port number
serves this purpose.
Popular applications have been assigned
Specific Port Numbers.
•For example,
•A Web server is identified by port number 80.
•A mail server process (using the SMTP protocol) is
identified by port number 25.
•A list of well-known port numbers for all Internet
standard protocols can be found at
http://www.iana.org.
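The addressing idea above (host address plus port number) can be sketched in Python. The names ProcessAddress, WELL_KNOWN_PORTS and address_for are hypothetical helpers for this illustration, and 192.0.2.10 is an address from a range reserved for documentation, not a real server.

```python
import ipaddress
from typing import NamedTuple


class ProcessAddress(NamedTuple):
    host_ip: str  # identifies the host
    port: int     # identifies the receiving process on that host


# The two port numbers named in the slides above.
WELL_KNOWN_PORTS = {"http": 80, "smtp": 25}


def address_for(host_ip: str, service: str) -> ProcessAddress:
    """Build the (IP address, port) pair that identifies a receiving process."""
    ipaddress.IPv4Address(host_ip)  # raises ValueError if this is not a valid IPv4 address
    return ProcessAddress(host_ip, WELL_KNOWN_PORTS[service])


web = address_for("192.0.2.10", "http")
mail = address_for("192.0.2.10", "smtp")

# An IPv4 address is a 32-bit quantity, so its integer form is below 2**32.
as_int = int(ipaddress.IPv4Address(web.host_ip))
print(web, mail, as_int < 2**32)
```

The same host address appears in both pairs; only the port number distinguishes the Web server process from the mail server process.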
Transport Services Available to Applications
•Services that a transport-layer protocol can offer
to applications invoking it can be classified along
four dimensions:
1. Reliable data transfer
2. Throughput
3. Timing
4. Security
Reliable Data Transfer
• Packets can get lost within a computer network.
• A packet can overflow a buffer in a router, or can be discarded by
a host or router after having some of its bits corrupted.
• For many applications—such as electronic mail, file transfer,
remote host access, Web document transfers, and financial
applications—data loss can have devastating consequences.
• These applications need a guarantee that the data sent by one end of
the application is delivered correctly and completely to the other end.
If a protocol provides such a guaranteed data delivery service, it is
said to provide reliable data transfer.
One important service that a transport-layer
protocol can potentially provide to an application
•One important service that a transport-layer
protocol can potentially provide to an application is
Process-To-Process Reliable Data Transfer.
•When a transport protocol provides this service, the
sending process can just pass its data into the socket
and know with complete confidence that the data
will arrive without errors at the receiving process.
Loss-Tolerant Applications
• When a transport-layer protocol doesn’t provide reliable
data transfer, some of the data sent by the sending process
may never arrive at the receiving process. This may be
acceptable for loss-tolerant applications.
• The most notable examples are multimedia applications such as
conversational audio/video, which can tolerate some amount of
data loss. In these applications, lost data might result in a
small glitch in the audio/video, which is not a crucial impairment.
Throughput
• In the context of a communication session between two
processes along a network path, Throughput is the rate at which
the sending process can deliver bits to the receiving process.
• Because other sessions will be sharing the bandwidth along the
network path, and because these other sessions will be coming
and going, the available throughput can fluctuate with time.
These observations lead to another natural service that a
transport-layer protocol could provide, namely, guaranteed
available throughput at some specified rate.
Throughput – Contd…
•The application could request a guaranteed
throughput of r bits/sec, and the transport protocol
would then ensure that the available throughput is
always at least r bits/sec. Such a guaranteed
throughput service would appeal to many applications.
For Example
•If an Internet telephony application encodes voice at 32 kbps,
it needs to send data into the network and have data
delivered to the receiving application at this rate.
•If the transport protocol cannot provide this throughput, the
application would need to encode at a lower rate or may have
to give up, since receiving half of the needed throughput is of
little or no use to this Internet telephony application.
Bandwidth-Sensitive Applications
•Applications that have throughput requirements
are said to be bandwidth-sensitive applications.
Many current multimedia applications are
bandwidth sensitive, although some multimedia
applications may use adaptive coding techniques
to encode digitized voice or video at a rate that
matches the currently available throughput.
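The adaptive coding idea can be sketched as a simple rate selection. This is only an illustration under stated assumptions: the function pick_encoding_rate and the list of candidate rates are hypothetical, and real codecs use more elaborate adaptation logic.

```python
from typing import Optional, Sequence


def pick_encoding_rate(available_bps: float, rates_bps: Sequence[int]) -> Optional[int]:
    """Return the highest encoding rate that fits the available throughput, or None."""
    usable = [rate for rate in rates_bps if rate <= available_bps]
    return max(usable) if usable else None


# Hypothetical encoding rates in bits per second.
RATES = [8_000, 16_000, 32_000, 64_000]

print(pick_encoding_rate(40_000, RATES))  # 32000
print(pick_encoding_rate(5_000, RATES))   # None: no rate fits, the application may give up
```

An application without adaptive coding corresponds to a list with a single rate: it either gets that rate or cannot run.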
Elastic Applications
•While bandwidth-sensitive applications have
specific throughput requirements, elastic
applications can make use of as much, or as
little, throughput as happens to be available.
Electronic mail, file transfer, and Web transfers
are all elastic applications.
Timing
• A transport-layer protocol can also provide timing guarantees.
• As with throughput guarantees, timing guarantees can come in
many shapes and forms.
• An example guarantee might be that every bit that the sender
pumps into the socket arrives at the receiver’s socket no more
than 100 msec later.
• Such a service would be appealing to interactive real-time
applications, such as Internet telephony, virtual environments,
teleconferencing, and multiplayer games, all of which require
tight timing constraints on data delivery in order to be effective.
Timing – Contd…
•Long delays in Internet telephony, for example, tend
to result in unnatural pauses in the conversation;
•In a multiplayer game or virtual interactive
environment, a long delay between taking an
action and seeing the response from the
environment makes the application feel less
realistic.
Security
• A transport protocol can provide an application with one or more
security services.
• For example, in the sending host, a transport protocol can encrypt
all data transmitted by the sending process, and in the receiving
host, the transport-layer protocol can decrypt the data before
delivering the data to the receiving process.
• Such a service would provide confidentiality between the two
processes, even if the data is somehow observed between sending
and receiving processes.
• A transport protocol can also provide other security services in
addition to confidentiality, including data integrity and end-point
authentication
Transport Services Provided by the Internet
•The Internet makes two transport protocols
available to applications,
•UDP and
•TCP.
TCP Services
•The TCP service model includes
•A Connection-Oriented Service,
•A Reliable Data Transfer Service, and
•A Congestion-Control Mechanism.
•When an application invokes TCP as its transport
protocol, the application receives all of these
services from TCP.
Connection-Oriented Service
•TCP has the client and server exchange
transport-layer control information with each other
before the application-level messages begin to flow.
This exchange of control information is called
handshaking.
•This handshaking procedure alerts the client and
server, allowing them to prepare for the packets
that follow.
Connection-Oriented Service
•After the handshaking phase, a TCP connection is
said to exist between the sockets of the two
processes. The connection is a full-duplex
connection in that the two processes can send
messages to each other over the connection at the
same time.
•When the application finishes sending messages, it
must tear down the connection.
Reliable Data Transfer Service
•The communicating processes can rely on TCP to
deliver all data sent without error and in the
proper order.
•When one side of the application passes a stream
of bytes into a socket, it can count on TCP to
deliver the same stream of bytes to the receiving
socket, with no missing or duplicate bytes.
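The connection-oriented, reliable byte-stream behavior can be observed on a single machine with a short Python sketch. It is an illustration only: the loopback address 127.0.0.1, the helper names and the upper-casing "service" are assumptions for the demo.

```python
import socket
import threading


def recv_until_eof(sock: socket.socket) -> bytes:
    """Read from a TCP socket until the peer signals it has finished sending."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def serve_one_client(server_sock: socket.socket) -> None:
    conn, _addr = server_sock.accept()   # connection exists after the handshake
    with conn:
        data = recv_until_eof(conn)
        conn.sendall(data.upper())       # reply travels on the same full-duplex connection


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=serve_one_client, args=(server,))
worker.start()

with socket.create_connection(("127.0.0.1", port)) as client:  # TCP handshake happens here
    client.sendall(b"hello over tcp")
    client.shutdown(socket.SHUT_WR)      # finished sending; still able to receive
    reply = recv_until_eof(client)

worker.join()
server.close()
print(reply)
```

Note that recv may return the stream in pieces of any size, which is why the helper loops: TCP preserves the byte order, not message boundaries.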
Congestion-Control Mechanism Service
•It is a service for the general welfare of the
Internet rather than for the direct benefit of the
communicating processes.
•The TCP congestion-control mechanism throttles
a sending process (client or server) when the
network is congested between sender and
receiver.
UDP Services
•UDP is connectionless, so there is no handshaking
before the two processes start to communicate.
•UDP provides an unreliable data transfer service
•that is, when a process sends a message
into a UDP socket, UDP provides no
guarantee that the message will ever
reach the receiving process.
UDP Services
•Messages that do arrive at the receiving process
may arrive out of order.
•UDP does not include a congestion-control
mechanism, so the sending side of UDP can pump
data into the layer below (the network layer) at
any rate it pleases.
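For comparison with TCP, a UDP exchange can be sketched in Python as follows. This is a minimal loopback demo under stated assumptions: the address 127.0.0.1 and the 2-second timeout are choices made for the example. On a real network the datagram could be lost, so the receiver must be prepared for nothing to arrive.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
receiver.settimeout(2.0)                 # do not wait forever for a datagram
receiver_addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# No handshake: the first packet sent already carries application data.
sender.sendto(b"hello over udp", receiver_addr)

try:
    message, _sender_addr = receiver.recvfrom(2048)
except socket.timeout:
    message = None                       # UDP gives no guarantee of delivery

sender.close()
receiver.close()
print(message)
```

There is no connect, accept or teardown step; each sendto call stands alone.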
Services Not Provided by Internet
Transport Protocols
•Throughput guarantees and timing guarantees are
services not provided by today’s Internet
transport protocols.
•Today’s Internet can often provide satisfactory
service to time-sensitive applications, but it
cannot provide any timing or throughput
guarantees.
Application-Layer Protocols
•An Application-Layer Protocol defines how an
application’s processes, running on different
end systems, pass messages to each other.
In particular, an Application-Layer Protocol defines
•The types of messages exchanged, for example request
messages and response messages.
•The syntax of the various message types, such as the
fields in the message and how the fields are
delineated.
•The semantics of the fields, that is, the meaning of the
information in the fields.
•Rules for determining when and how a process sends
messages and responds to messages.
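The four elements above can be made concrete with a toy protocol written in Python. Everything here is hypothetical and invented for illustration (the "GET key" request line, the OK and ERR status words, and all function names); it is not a real Internet protocol.

```python
from dataclasses import dataclass
from typing import Dict


# Message types: one request type and one response type.
@dataclass
class Request:
    key: str


@dataclass
class Response:
    status: str      # semantics: "OK" means the key was found, "ERR" means it was not
    value: str = ""


# Syntax: ASCII text, fields separated by one space, message ended by a newline.
def encode_request(req: Request) -> bytes:
    return f"GET {req.key}\n".encode("ascii")


def decode_request(raw: bytes) -> Request:
    verb, key = raw.decode("ascii").rstrip("\n").split(" ", 1)
    if verb != "GET":
        raise ValueError(f"unknown message type: {verb}")
    return Request(key)


def encode_response(resp: Response) -> bytes:
    return f"{resp.status} {resp.value}\n".encode("ascii")


# Rules: the server sends exactly one response for each request it receives.
def handle(req: Request, store: Dict[str, str]) -> Response:
    if req.key in store:
        return Response("OK", store[req.key])
    return Response("ERR", "not-found")


store = {"motd": "hi"}
wire_request = encode_request(Request("motd"))
wire_response = encode_response(handle(decode_request(wire_request), store))
print(wire_request, wire_response)
```

The encode and decode functions capture the syntax, the comments on status capture the semantics, and handle captures the rules for responding.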
Distinguish between Network Applications and
Application-Layer Protocols
•An Application-Layer Protocol is only one
piece of a Network Application.
•Examples:
•Web Application
•Internet E-Mail Application
Example 1: Web Application
•The Web is a client-server application that allows users
to obtain documents from Web servers on demand.
•The Web application consists of many
components, including a standard for document
formats (that is, HTML), Web Browsers (for
example, Firefox and Microsoft Internet Explorer),
Web servers (for example, Apache and Microsoft
servers), and an Application-Layer protocol.
Example 1: Web Application
•The Web’s Application-Layer Protocol, HTTP, defines
the format and sequence of messages exchanged
between Browser and Web Server.
•Thus, HTTP is only one piece of the Web Application.
Example 2: Internet E-Mail Application
•It has many components, including mail servers that
house user mailboxes; mail clients (such as Microsoft
Outlook) that allow users to read and create
messages; a standard for defining the structure of an
e-mail message; and application-layer protocols that
define how messages are passed between servers,
how messages are passed between servers and mail
clients, and how the contents of message headers are
to be interpreted.
Example 2: Internet E-Mail Application
•The principal Application-Layer Protocol for electronic
mail is SMTP (Simple Mail Transfer Protocol).
•Thus, e-mail’s Principal Application-Layer Protocol,
SMTP, is only one piece of the E-mail Application.
End of 2.1
2.2 The Web and HTTP -- Syllabus
2.2.1 Overview of HTTP
2.2.2 Non-persistent and Persistent Connections
2.2.3 HTTP Message Format
2.2.4 User-Server Interaction: Cookies
2.2.5 Web Caching
2.2.6 Conditional GET
Overview of HTTP
•The HyperText Transfer Protocol (HTTP), the
Web’s application-layer protocol, is at the
heart of the Web.
•HTTP is implemented in two programs:
•A Client Program and
•A Server Program.
Overview of HTTP
•The client program and server program, executing
on different end systems, talk to each other by
exchanging HTTP messages.
•HTTP defines the structure of these messages and
how the client and server exchange the messages.
•URL: Uniform Resource Locator
Some Web Terminologies
•A Web page (also called a document)
consists of objects.
•An object is simply a file—such as an HTML
file, a JPEG image, a Java applet, or a video
clip—that is addressable by a single URL.
•Most Web pages consist of a base HTML file
and several referenced objects.
Some Web Terminologies
•For Example, if a Web page contains HTML
text and five JPEG images, then the Web
page has six objects: the base HTML file plus
the five images.
Some Web Terminologies
•The base HTML file references the other
objects in the page with the objects’ URLs.
•Each URL has two components:
•The hostname of the server that houses
the object and
•The object’s path name.
For Example
•Consider the URL:
http://www.abc.edu/myStore/picture.gif
•It has www.abc.edu as the hostname and
•/myStore/picture.gif as the path name.
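The same split into hostname and path name can be reproduced with Python's standard urllib.parse module. This is a small sketch using the example URL from the slide; the variable name parts is arbitrary.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.abc.edu/myStore/picture.gif")

print(parts.hostname)  # www.abc.edu
print(parts.path)      # /myStore/picture.gif
```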
Some Web Terminologies
•Web browsers (such as Internet Explorer and
Firefox) implement the client side of HTTP
•Web servers implement the server side of HTTP and
house Web objects, each addressable by a URL.
•Popular Web servers include Apache and
Microsoft Internet Information Server
•HTTP defines how Web clients request Web
pages from Web servers and how servers
transfer Web pages to clients.
•When a user requests a Web page (for example,
clicks on a hyperlink), the browser sends HTTP
request messages for the objects in the page to
the server. The server receives the requests and
responds with HTTP response messages that
contain the objects.
HTTP Request-Response Behavior
HTTP
•HTTP uses TCP as its underlying transport
protocol.
•The HTTP client first initiates a TCP connection
with the server.
•Once the connection is established, the browser
and the server processes access TCP through
their socket interfaces.
Client Side and Server Side Sockets
•On the Client Side, the socket interface is
the door between the client process and
the TCP connection;
•On the Server Side, it is the door between
the server process and the TCP connection.
Request Response Process
•The client sends HTTP request messages into
its socket interface and receives HTTP
response messages from its socket interface.
•Similarly, the HTTP server receives request
messages from its socket interface and sends
response messages into its socket interface.
HTTP is said to be a Stateless Protocol
•The server sends requested files to clients without
storing any state information about the client.
•If a particular client asks for the same object twice in a
period of a few seconds, the server does not respond
by saying that it just served the object to the client;
instead, the server resends the object, as if it has
completely forgotten what it did earlier.
•Because an HTTP server maintains no information
about the clients, HTTP is said to be a stateless
protocol.
Non-Persistent and Persistent Connections
•HTTP can use both non-persistent
connections and persistent connections.
•Although HTTP uses persistent connections
in its default mode, HTTP clients and servers
can be configured to use non-persistent
connections instead.
The steps of transferring a Web page from server
to client for the case of non-persistent connections
•Let’s suppose the page consists of a base HTML
file and 10 JPEG images, and that all 11 of these
objects reside on the same server.
•Suppose the URL for the base HTML file is:
http://www.abc.edu/myDepartment/home.index
Steps involved in the request
http://www.abc.edu/myDepartment/home.index
•1. The HTTP client process initiates a TCP
connection to the server www.abc.edu on port
number 80, which is the default port number for
HTTP. Associated with the TCP connection, there
will be a socket at the client and a socket at the
server.
•2. The HTTP client sends an HTTP request
message to the server via its socket. The request
message includes the path name
/myDepartment/home.index.
•3. The HTTP server process receives the request
message via its socket, retrieves the object
/myDepartment/home.index from its storage
(RAM or disk), encapsulates the object in an
HTTP response message, and sends the
response message to the client via its socket.
•4. The HTTP server process tells TCP to close the
TCP connection. (But TCP doesn’t actually
terminate the connection until it knows for sure
that the client has received the response
message intact.)
•5. The HTTP client receives the response
message. The TCP connection terminates. The
message indicates that the encapsulated object
is an HTML file. The client extracts the file from
the response message, examines the HTML file,
and finds references to the 10 JPEG objects.
•6. The first four steps are then repeated for each
of the referenced JPEG objects.
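The steps above can be imitated on one machine with Python's standard http.server and http.client modules. This is a sketch under stated assumptions, not how a real browser or Web server is built: the in-memory objects, the image paths such as /img0.jpg, the handler name and the loopback address are all invented for the demo. The server answers with HTTP/1.0 so that it closes each connection after one response.

```python
import http.client
import re
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory site: one base HTML file that references 10 images.
IMAGES = {f"/img{i}.jpg": b"fake-jpeg-bytes" for i in range(10)}
BASE_HTML = "".join(f'<img src="{path}">' for path in IMAGES).encode("ascii")
OBJECTS = {"/myDepartment/home.index": BASE_HTML, **IMAGES}

connections_seen = 0
count_lock = threading.Lock()


class OneObjectHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.0"  # the server closes the connection after each response

    def setup(self):
        # setup() runs once for every accepted TCP connection.
        global connections_seen
        with count_lock:
            connections_seen += 1
        super().setup()

    def do_GET(self):
        body = OBJECTS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(port: int, path: str) -> bytes:
    conn = http.client.HTTPConnection("127.0.0.1", port)         # step 1: new TCP connection
    conn.request("GET", path, headers={"Connection": "close"})   # step 2: request message
    body = conn.getresponse().read()                             # steps 3 to 5: response
    conn.close()
    return body


server = HTTPServer(("127.0.0.1", 0), OneObjectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

base = fetch(port, "/myDepartment/home.index")
image_paths = re.findall(r'src="([^"]+)"', base.decode("ascii"))
images = [fetch(port, path) for path in image_paths]             # step 6: repeat per image

server.shutdown()
server.server_close()
print(len(images), connections_seen)
```

Each call to fetch opens its own connection, so fetching the base file and its 10 images uses 11 connections in total.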
HTTP with Non-Persistent Connections
•The steps above illustrate the use of non-persistent
connections.
•Here, each TCP connection is closed after the server
sends the object—the connection does not persist
for other objects.
•Each TCP connection transports exactly one request
message and one response message.
•Thus, in this example, when a user requests the
Web page, 11 TCP connections are generated.
Round-Trip Time (RTT)
•It is the time it takes for a small packet to
travel from client to server and then back to
the client.
•The RTT includes packet-propagation delays,
packet queuing delays in intermediate routers
and switches, and packet-processing delays.
“three-way handshake”
•Client sends a small TCP segment to the server,
•The server acknowledges and responds with a
small TCP segment, and,
•The client acknowledges back to the server.
“three-way handshake”
•The first two parts of the three way
handshake take one RTT.
•After completing the first two parts of the
handshake, the client sends the HTTP request
message combined with the third part of the
three-way handshake (the acknowledgment)
into the TCP connection.
“three-way handshake”
•Once the request message arrives at the server,
the server sends the HTML file into the TCP
connection.
•This HTTP request/response eats up another RTT.
•Thus the total response time is two RTTs plus the
transmission time at the server of the HTML file.
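The response-time estimate above can be written as a small Python calculation. It is a back-of-the-envelope sketch: the function name, the example numbers, and the modeling of transmission time as file size divided by the server's link rate are assumptions for illustration.

```python
def single_object_response_time(rtt_s: float, file_bits: float, link_bps: float) -> float:
    """Estimate: one RTT for the handshake, one RTT for request/response, plus transmission."""
    transmission_time_s = file_bits / link_bps
    return 2 * rtt_s + transmission_time_s


# Hypothetical numbers: 100 ms RTT, 100,000-bit HTML file, 1 Mbps link at the server.
estimate = single_object_response_time(rtt_s=0.1, file_bits=100_000, link_bps=1_000_000)
print(round(estimate, 3))  # about 0.3 seconds
```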
Disadvantages of Non-Persistent Connections
1. A new connection must be established and
maintained for each requested object. For each of
these connections, TCP buffers must be allocated and
TCP variables must be kept in both the client and
server. This can place a significant burden on the Web
server, which may be serving requests from hundreds
of different clients simultaneously.
2. Each object suffers a delivery delay of two RTTs— one
RTT to establish the TCP connection and one RTT to
request and receive an object.
HTTP with Persistent Connections
•The server leaves the TCP connection open
after sending a response.
•Subsequent requests and responses between
the same client and server can be sent over
the same connection.
1/4
HTTP with Persistent Connections
•In particular, an entire Web page (in our
example, the base HTML file and the 10 images)
can be sent over a single persistent TCP
connection.
•Multiple Web pages residing on the same
server can be sent from the server to the
same client over a single persistent TCP
connection.
Contd…
2/4
HTTP with Persistent Connections
•These requests for objects can be made back-
to-back, without waiting for replies to
pending requests (pipelining).
•The HTTP server closes a connection when it
isn’t used for a certain time (a configurable
timeout interval).
Contd…
3/4
HTTP with Persistent Connections
•When the server receives the back-to-back
requests, it sends the objects back-to-back.
•By default, HTTP/1.1 uses persistent
connections; pipelining is an optional feature
on top of them.
Contd…
4/4
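The timing claims for non-persistent and persistent connections can be compared with a small idealized model. The sketch below is only an illustration: it assumes a fixed RTT, the same transmission time for every object, serial (non-parallel) connections, and no TCP slow start. The function names and the sample numbers are assumptions, not values from the text book.

```python
def non_persistent_time(n_objects, rtt, tx):
    """Serial non-persistent HTTP: every object costs one RTT for the
    handshake, one RTT for request/response, and its transmission time."""
    return n_objects * (2 * rtt + tx)


def persistent_time(n_objects, rtt, tx):
    """Persistent HTTP without pipelining: one handshake RTT in total,
    then one request/response RTT plus transmission time per object."""
    return rtt + n_objects * (rtt + tx)


def pipelined_time(n_objects, rtt, tx):
    """Persistent HTTP with pipelining: one handshake RTT, one RTT for the
    base HTML file, and one more RTT for all referenced objects, which are
    requested back-to-back."""
    if n_objects == 1:
        return 2 * rtt + tx
    return 3 * rtt + n_objects * tx


# Assumed sample values, in milliseconds: 11 objects (base HTML file plus
# 10 JPEG images), RTT = 100 ms, transmission time = 10 ms per object.
n, rtt, tx = 11, 100, 10
print("non-persistent:", non_persistent_time(n, rtt, tx), "ms")
print("persistent:    ", persistent_time(n, rtt, tx), "ms")
print("pipelined:     ", pipelined_time(n, rtt, tx), "ms")
```

Under these assumptions the handshake RTT is paid 11 times with non-persistent connections and only once with a persistent connection, which is where most of the saving comes from.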
2.2 The Web and HTTP -- Syllabus
2.2.1 Overview of HTTP
2.2.2 Non-persistent and Persistent Connections
2.2.3 HTTP Message Format
2.2.4 User-Server Interaction: Cookies
2.2.5 Web Caching
2.2.6 Conditional GET
HTTP Message Format
•The HTTP specifications include the
definitions of the HTTP message formats.
•There are two types of HTTP messages,
•HTTP Request messages and
•HTTP Response messages
HTTP Request Message
•A typical HTTP request message:
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr
Characteristics of the Simple Request Message
•The message is written in ordinary ASCII text, so
that an ordinary computer-literate human being can
read it.
•The message consists of five lines, each followed
by a carriage return and a line feed. The last line
is followed by an additional carriage return and
line feed. Although this particular request
message has five lines, a request message can
have many more lines or as few as one line.
Characteristics of the Simple Request Message
•The first line of an HTTP request message is called
the request line; the subsequent lines are called
the header lines.
•The request line has three fields:
•The method field,
•the URL field, and
•the HTTP version field.
Characteristics of the Simple Request Message
•The method field can take on several different
values, including GET, POST, HEAD, PUT, and DELETE.
•The great majority of HTTP request messages use
the GET method.
•The GET method is used when the browser requests
an object, with the requested object identified in
the URL field.
•In this example, the browser is requesting the object
/somedir/page.html.
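The structure described above (a request line with three fields, followed by header lines, each ending in a carriage return and line feed) can be checked with a short parser. This is a minimal sketch for illustration only; `parse_request` is a hypothetical helper, not part of any HTTP library, and it does no error handling.

```python
def parse_request(raw):
    """Split an HTTP request message into its request-line fields,
    header lines, and entity body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method field, URL field, HTTP version field.
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


raw = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.someschool.edu\r\n"
    "Connection: close\r\n"
    "User-agent: Mozilla/5.0\r\n"
    "Accept-language: fr\r\n"
    "\r\n"
)
method, url, version, headers, body = parse_request(raw)
print(method, url, version)
print(headers)
```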
Consider The Header line
Host: www.someschool.edu
•It specifies the host on which the object resides.
This header line may seem unnecessary, as there is
already a TCP connection in place to the host,
but the information provided by the host
header line is required by Web proxy caches.
Consider The Header line
Connection: close
•The Browser is telling the server that it
doesn’t want to bother with persistent
connections; it wants the server to close the
connection after sending the requested
object.
Consider The Header line
User-agent: Mozilla/5.0
•It specifies the user agent, that is, the browser
type that is making the request to the server.
•Here the user agent is Mozilla/5.0, a Firefox
browser.
•This header line is useful because the server can
actually send different versions of the same object
to different types of user agents.
Consider The Header line
Accept-language: fr
•indicates that the user prefers to receive a
French version of the object, if such an object
exists on the server; otherwise, the server
should send its default version.
•The Accept-language: header is just one of
many content negotiation headers available in
HTTP.
General Format of a Request Message
•After the header lines (and the additional carriage
return and line feed) there is an “entity body.”
•The entity body is empty with the GET method,
but is used with the POST method.
• An HTTP client uses the POST method when the
user fills out a form.
•For example, when a user provides search words
to a search engine.
General Format of a Request Message
•With a POST message, the user is still requesting a
Web page from the server, but the specific
contents of the Web page depend on what the
user entered into the form fields.
•If the value of the method field is POST, then the
entity body contains what the user entered into
the form fields.
Contd…
General Format of a Request Message
•A request generated with a form does not
necessarily use the POST method. Instead,
HTML forms often use the GET method and include
the inputted data (in the form fields) in the
requested URL.
Contd…
General Format of a Request Message
•For example, if a form uses the GET method,
has two fields, and the inputs to the two fields
are dogs and cats, then the URL will have the
structure
www.abc.com/animalsearch?dogs&cats
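The URL above shows only the two inputs after the "?". In practice a form submitted with GET usually sends name=value pairs. The sketch below builds such a URL with Python's standard `urllib.parse` module; the field names `first` and `second` are hypothetical, since the slide does not name the form fields.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical form field names; the inputs are the ones from the example.
fields = {"first": "dogs", "second": "cats"}

# With GET, the form data travels in the URL after the "?".
query = urlencode(fields)
url = "http://www.abc.com/animalsearch?" + query
print(url)

# With POST, the same encoded data would travel in the entity body instead.
post_body = query.encode("ascii")
print(post_body)

# The receiving side can recover the inputs from the URL.
recovered = parse_qs(urlsplit(url).query)
print(recovered)
```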
Contd…
General Format of a Request Message
•The HEAD method is similar to the GET method.
•When a server receives a request with the HEAD
method, it responds with an HTTP message but it
leaves out the requested object.
•Application developers often use the HEAD
method for debugging.
Contd…
General Format of a Request Message
•The PUT method is often used in conjunction with
Web publishing tools.
•It allows a user to upload an object to a specific
path (directory) on a specific Web server.
•The PUT method is also used by applications that
need to upload objects to Web servers.
•The DELETE method allows a user, or an
application, to delete an object on a Web server.
Contd…
HTTP Response Message
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html
(data data data data data ...)
HTTP Response Message
•HTTP Response Message has three sections:
•An initial status line,
•Six header lines, and
•The entity body.
•The entity body is the meat of the message—it
contains the requested object itself
(represented by data data data data data ...).
Contd…
HTTP Response Message  Status Line
•The status line has three fields:
•The protocol version field,
•A status code, and
•A corresponding status message.
•In this example, the status line indicates that the
server is using HTTP/1.1 and that everything is OK
(that is, the server has found, and is sending, the
requested object).
HTTP Response Message  Header Lines
•First Header Line is
Connection: close
The server uses this header line to tell the client
that it is going to close the TCP connection after
sending the message.
HTTP Response Message  Header Lines
•Second Header Line is
Date: Tue, 09 Aug 2011 15:44:04 GMT
This header line indicates the time and date when
the HTTP response was created and sent by the
server. Note that this is not the time when the
object was created or last modified; it is the time
when the server retrieves the object from its file
system, inserts the object into the response
message, and sends the response message.
HTTP Response Message  Header Lines
•Third Header Line is
Server: Apache/2.2.3 (CentOS)
This header line indicates that the message was
generated by an Apache Web server; it is
analogous to the User-agent: header line in the
HTTP request message.
HTTP Response Message  Header Lines
•Fourth Header Line is
Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT
This header line indicates the time and date when
the object was created or last modified
HTTP Response Message  Header Lines
•Fifth Header Line is
Content-Length: 6821
This header line indicates the number of bytes in
the object being sent.
HTTP Response Message  Header Lines
•Sixth Header Line is
Content-Type: text/html
This header line indicates that the object in the
entity body is HTML text.
(The object type is officially indicated by the
Content-Type: header and not by the file
extension).
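The three sections of the response message (status line, header lines, entity body) can be separated with a few lines of code. The sketch below is illustrative only; `parse_response` is a hypothetical helper and the entity body is the placeholder text from the slide, not a real object.

```python
def parse_response(raw):
    """Split an HTTP response message into status-line fields,
    header lines, and entity body."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line: protocol version, status code, status message.
    version, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; date values contain colons too.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Connection: close\r\n"
    "Date: Tue, 09 Aug 2011 15:44:04 GMT\r\n"
    "Server: Apache/2.2.3 (CentOS)\r\n"
    "Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT\r\n"
    "Content-Length: 6821\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "(data data data data data ...)"
)
version, code, phrase, headers, body = parse_response(raw)
print(version, code, phrase)
print(len(headers), "header lines")
print(headers["content-type"])
```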
The status code and associated phrase
•The status code and associated phrase
indicate the result of the request.
Some common status codes and
associated phrases include
•200 OK: Request succeeded and the information
is returned in the response.
•301 Moved Permanently: Requested object has
been permanently moved; the new URL is
specified in Location: header of the response
message. The client software will automatically
retrieve the new URL.
Some common status codes and
associated phrases include
•400 Bad Request: This is a generic error code
indicating that the request could not be
understood by the server.
•404 Not Found: The requested document does not
exist on this server.
•505 HTTP Version Not Supported: The requested
HTTP protocol version is not supported by the
server
Contd…
User-Server Interaction: Cookies
•HTTP servers are stateless, which simplifies server
design and lets Web servers handle thousands of
simultaneous TCP connections.
•However, a Web site often has to identify users, either
because the server wishes to restrict user access or
because it wants to serve content as a function of the
user identity. For these purposes, HTTP uses cookies.
•Cookies allow sites to keep track of users.
Cookie technology has four components
1. A cookie header line in the HTTP
response message;
2. A cookie header line in the HTTP request
message;
3. A cookie file kept on the user’s end system
and managed by the user’s browser; and
4. A back-end database at the Web site
An Example of how Cookies work
•Suppose Sushanth, who always accesses the
Web using Internet Explorer from his home PC,
contacts amazon.com for the first time.
•Let us suppose that in the past he has already
visited the eBay site.
An Example of how Cookies work
•When the request comes into the Amazon Web
server, the server creates a unique
identification number and creates an entry in
its back-end database that is indexed by the
identification number.
Contd…
An Example of how Cookies work
•The Amazon Web server then responds to
Sushanth’s browser, including in the HTTP
response a Set-cookie: header, which contains
the identification number.
•For example, the header line might be:
Set-cookie: 1678
Contd…
An Example of how Cookies work
•When Sushanth’s browser receives the HTTP
response message, it sees the Set-cookie: header.
The browser then appends a line to the special
cookie file that it manages.
•This line includes the hostname of the server and
the identification number in the Set-cookie: header.
•Note that the cookie file already has an entry for
eBay, since Sushanth has visited that site in the past.
Contd…
An Example of how Cookies work
•As Sushanth continues to browse the Amazon site,
each time he requests a Web page, his browser
consults his cookie file, extracts his identification
number for this site, and puts a cookie header line that
includes the identification number in the HTTP request.
•Specifically, each of his HTTP requests to the Amazon
server includes the header line:
Cookie: 1678
Contd…
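The cookie exchange in this example can be simulated with two small classes. This is a toy sketch, not how a real browser or Web server is implemented: the class names, the in-memory dictionaries, and the eBay identification number 8734 are all assumptions made for illustration. Only the Amazon number 1678 comes from the example.

```python
import itertools


class Site:
    """Toy Web site with a back-end database indexed by identification number."""

    def __init__(self, hostname, first_id):
        self.hostname = hostname
        self.database = {}
        self._ids = itertools.count(first_id)

    def handle(self, request_headers, page):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie is None:
            # First visit: create a unique identification number and
            # return it in a Set-cookie: header.
            cookie = str(next(self._ids))
            response_headers["Set-cookie"] = cookie
        # Record the visited page under this identification number.
        self.database.setdefault(cookie, []).append(page)
        return response_headers


class Browser:
    """Toy browser that manages a cookie file (hostname -> identification number)."""

    def __init__(self):
        self.cookie_file = {}

    def get(self, site, page):
        headers = {}
        if site.hostname in self.cookie_file:
            headers["Cookie"] = self.cookie_file[site.hostname]
        response_headers = site.handle(headers, page)
        if "Set-cookie" in response_headers:
            self.cookie_file[site.hostname] = response_headers["Set-cookie"]


amazon = Site("amazon.com", first_id=1678)
browser = Browser()
browser.cookie_file["ebay.com"] = "8734"  # earlier eBay visit; number is assumed
browser.get(amazon, "/")        # first request: no Cookie: header is sent
browser.get(amazon, "/books")   # later request: carries Cookie: 1678
print(browser.cookie_file)
print(amazon.database)
```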
Cookies can be used to identify a user
•The first time a user visits a site, the user can
provide a user identification (possibly his or her name).
•During the subsequent sessions, the browser
passes a cookie header to the server, thereby
identifying the user to the server.
•Cookies can thus be used to create a user session
layer on top of stateless HTTP.
Cookies can be used to identify a user
•For Example, when a user logs in to a Web-
based e-mail application (such as Hotmail),
the browser sends cookie information to the
server, permitting the server to identify the
user throughout the user’s session with the
application.
Contd…
Web Caching (Or Proxy Server)
•A Web cache, also called a proxy server, is a
network entity that satisfies HTTP requests on
behalf of an origin Web server.
•The Web cache has its own disk storage and
keeps copies of recently requested objects in
this storage
Web Caching (Or Proxy Server)
•A user’s browser can be configured so that all
of the user’s HTTP requests are first directed
to the Web cache.
•Once a browser is configured, each browser
request for an object is first directed to the
Web cache
Contd…
Clients requesting objects through a Web cache
•Suppose a browser is requesting the object
http://www.someschool.edu/campus.gif.
•Here is what happens:
Contd…
Example
A browser is requesting the object
http://www.someschool.edu/campus.gif
• The browser establishes a TCP connection to the Web cache and sends an HTTP
request for the object to the Web cache.
• The Web cache checks to see if it has a copy of the object stored locally. If it
does, the Web cache returns the object within an HTTP response message to
the client browser.
• If the Web cache does not have the object, the Web cache opens a TCP
connection to the origin server, that is, to www.someschool.edu. The Web
cache then sends an HTTP request for the object into the cache-to-server TCP
connection. After receiving this request, the origin server sends the object
within an HTTP response to the Web cache.
• When the Web cache receives the object, it stores a copy in its local storage
and sends a copy, within an HTTP response message, to the client browser.
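The steps above amount to a lookup in local storage with a fallback to the origin server. The sketch below models that logic in memory; it is an illustration under simplifying assumptions (no TCP connections, no expiry of cached objects), and `WebCache` and `fake_origin` are hypothetical names.

```python
class WebCache:
    """Toy Web cache: answers from local storage when it can,
    otherwise fetches from the origin server and keeps a copy."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable: url -> bytes
        self.store = {}
        self.origin_requests = 0

    def get(self, url):
        if url in self.store:
            # The cache has a copy: return it to the client browser.
            return self.store[url]
        # No copy: request the object from the origin server,
        # store a copy locally, and return it to the client browser.
        self.origin_requests += 1
        obj = self.fetch_from_origin(url)
        self.store[url] = obj
        return obj


def fake_origin(url):
    """Stand-in for the origin server; returns placeholder bytes."""
    return b"object for " + url.encode("ascii")


cache = WebCache(fake_origin)
url = "http://www.someschool.edu/campus.gif"
first = cache.get(url)    # miss: fetched from the origin server
second = cache.get(url)   # hit: served from the cache's local storage
print(cache.origin_requests)
```

The counter makes the point of the slide visible: two client requests reach the cache, but only one reaches the origin server.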
Note that a cache is both a server
and a client at the same time.
•When it receives requests from and sends
responses to a browser, it is a server.
•When it sends requests to and receives
responses from an origin server, it is a client.
Web caching has seen deployment in
the Internet for two reasons.
•First, a Web cache can substantially reduce the
response time for a client request, particularly if the
bottleneck bandwidth between the client and the origin
server is much less than the bottleneck bandwidth
between the client and the cache. If there is a high-
speed connection between the client and the cache and
if the cache has the requested object, then the cache
will be able to deliver the object rapidly to the client.
Web caching has seen deployment in
the Internet for two reasons.
•Second, as we will soon illustrate with an example, Web
caches can substantially reduce traffic on an institution’s
access link to the Internet. By reducing traffic, the
institution (for example, a company or a university) does
not have to upgrade bandwidth as quickly, thereby
reducing costs. Web caches can substantially reduce
Web traffic in the Internet as a whole, thereby
improving performance for all applications.
Bottleneck between an institutional network and the Internet
Explanation of the Diagram
•This figure shows two networks
•The institutional network and
•The rest of the public Internet.
The Institutional Network
•It is a high-speed LAN (taken to be 100 Mbps in the delay calculation).
•A router in the institutional network and a router
in the Internet are connected by a 15 Mbps link.
•The origin servers are attached to the Internet
but are located all over the globe.
1/3
The Institutional Network
•Suppose that the average object size is 1 Mbits and
that the average request rate from the institution’s
browsers to the origin servers is 15 requests per
second.
•Suppose that the HTTP request messages are
negligibly small and thus create no traffic in the
networks or in the access link (from institutional
router to Internet router).
Contd…
2/3
The Institutional Network
•Suppose that the amount of time it takes from
when the router on the Internet side of the access
link forwards an HTTP request (within an IP
datagram) until it receives the response (within
many IP datagrams) is two seconds on average.
•Informally, we refer to this last delay as the
“Internet delay.”
Contd…
3/3
The Total Response Time
•It is the time from the browser’s request of an
object until its receipt of the object.
•It is the sum of the LAN delay, the access
delay (that is, the delay between the two
routers), and the Internet delay.
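Calculation of these Delays
•The traffic intensity on the LAN is
(15 requests/sec) * (1 Mbits/request) / (100 Mbps) = 0.15
•The traffic intensity on the access link (from the
Internet router to institution router) is
(15 requests/sec) * (1 Mbits/request) / (15 Mbps) = 1
•A traffic intensity of 0.15 on the LAN causes only a
small delay, but as the traffic intensity on the access
link approaches 1, the queuing delay there becomes very
large, so the access link dominates the response time.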
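The two traffic intensities can be recomputed with a short function. The sketch below also accepts a cache hit rate to show how an institutional Web cache would lower the load on the access link. The hit rate of 0.4 is an assumed sample value, and the function name and parameters are hypothetical.

```python
MBIT = 1_000_000  # bits in one megabit


def traffic_intensity(requests_per_sec, object_bits, link_bps, hit_rate=0.0):
    """Traffic intensity = arrival rate of bits / link bandwidth.
    Only requests that miss the cache (1 - hit_rate) load the link."""
    return requests_per_sec * (1.0 - hit_rate) * object_bits / link_bps


lan = traffic_intensity(15, 1 * MBIT, 100 * MBIT)
access = traffic_intensity(15, 1 * MBIT, 15 * MBIT)
# Assumed: an institutional cache satisfies 40% of the requests locally.
access_with_cache = traffic_intensity(15, 1 * MBIT, 15 * MBIT, hit_rate=0.4)

print("LAN:", lan)
print("access link:", access)
print("access link with cache:", access_with_cache)
```

With the assumed hit rate, the intensity on the access link falls well below 1 without upgrading the 15 Mbps link.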
The Conditional GET
•Although caching can reduce user-perceived
response times, it introduces a new problem—the
copy of an object residing in the cache may be stale.
•In other words, the object housed in the Web
server may have been modified since the copy was
cached at the client.
Contd…
1/3
The Conditional GET
•HTTP has a mechanism that allows a cache to verify
that its objects are up to date. This mechanism is
called the Conditional GET.
•An HTTP request message is a so-called conditional
GET message if (1) the request message uses the
GET method and (2) the request message includes
an If-Modified-Since: header line
Contd…
2/3
How the Conditional GET Operates?
• First, On the behalf of a requesting browser, a proxy cache sends
a request message to a Web server:
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
• Second, The Web server sends a response message with the
requested object to the cache:
HTTP/1.1 200 OK
Date: Sat, 8 Oct 2011 15:39:29
Server: Apache/1.3.0 (Unix)
Last-Modified: Wed, 7 Sep 2011 09:23:24
Content-Type: image/gif
(data data data data data ...)
How the Conditional GET Operates?
• The cache forwards the object to the requesting browser but
also caches the object locally. Importantly, the cache also stores
the last-modified date along with the object.
• Third, one week later, another browser requests the same object
via the cache, and the object is still in the cache. Since this
object may have been modified at the Web server in the past
week, the cache performs an up-to-date check by issuing a
conditional GET.
• The cache sends
GET /fruit/kiwi.gif HTTP/1.1
Host: www.exotiquecuisine.com
If-modified-since: Wed, 7 Sep 2011 09:23:24
Contd…
How the Conditional GET Operates?
• The value of the If-modified-since: header line is
exactly equal to the value of the Last-Modified: header line that
was sent by the server one week ago. This conditional GET is
telling the server to send the object only if the object has been
modified since the specified date. Suppose the object has not
been modified since 7 Sep 2011 09:23:24.
• Then, fourth, the Web server sends a response message to the
cache:
HTTP/1.1 304 Not Modified
Date: Sat, 15 Oct 2011 15:39:29
Server: Apache/1.3.0 (Unix)
(empty entity body)
Contd…
How the Conditional GET Operates?
•We see that in response to the conditional GET, the Web
server still sends a response message but does not include
the requested object in the response message.
•Including the requested object would only waste bandwidth
and increase user-perceived response time, particularly if the
object is large.
•The last response message has 304 Not Modified in
the status line, which tells the cache that it can go ahead and
forward its (the proxy cache’s) cached copy of the object to
the requesting browser.
Contd…
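The server-side decision behind the conditional GET can be sketched as a single comparison of dates. This is a simplified illustration: the function `respond` is hypothetical, dates are passed as `datetime` objects instead of being parsed from header text, and the object bytes are a placeholder.

```python
from datetime import datetime


def respond(last_modified, obj, if_modified_since=None):
    """Return (status line, headers, entity body) for a GET request.
    If the request carries an If-modified-since date and the object has
    not changed after it, answer 304 with an empty entity body."""
    if if_modified_since is not None and last_modified <= if_modified_since:
        return "HTTP/1.1 304 Not Modified", {}, b""
    return "HTTP/1.1 200 OK", {"Last-Modified": last_modified}, obj


last_modified = datetime(2011, 9, 7, 9, 23, 24)
kiwi = b"(placeholder bytes for kiwi.gif)"

# First request: a plain GET, so the object is returned and the cache
# stores the Last-Modified value with it.
status1, headers1, body1 = respond(last_modified, kiwi)
cached_date = headers1["Last-Modified"]

# One week later: conditional GET carrying the stored date.
status2, headers2, body2 = respond(last_modified, kiwi, if_modified_since=cached_date)

print(status1, len(body1))
print(status2, len(body2))
```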
End of 2.2
Web and HTTP
File Transfer
•FTP Commands & Replies
File Transfer Protocol: FTP
•In a typical FTP session, the user is sitting in front of
one host (the local host) and wants to transfer files to
or from a remote host.
•In order for the user to access the remote account,
the user must provide a user identification and a
password.
•After providing this authorization information, the
user can transfer files from the local file system to the
remote file system and vice versa
FTP moves files between local and remote file system
•The user interacts with FTP through an FTP
user agent.
•The user first provides the hostname of the
remote host, causing the FTP client process in
the local host to establish a TCP connection
with the FTP server process in the remote
host.
Contd…
FTP moves files between local and remote file system
•The user then provides the user identification
and password, which are sent over the TCP
connection as part of FTP commands.
•Once the server has authorized the user, the
user copies one or more files stored in the
local file system into the remote file system
(or vice versa).
Contd…
Control Connection and Data Connection
TCP Connections in FTP
•FTP uses two parallel TCP connections to
transfer a file,
•A Control Connection and
•A Data Connection.
Control Connection and Data Connection
•The Control Connection is used for sending control
information between the two hosts, such as
•User identification,
•Password,
•Commands to change remote directory, and
•Commands to “put” and “get” files.
•The Data Connection is used to send the actual file.
Difference Between FTP and HTTP
First Difference is
•FTP uses a separate control connection, so FTP is said
to send its control information out-of-band.
•HTTP sends request and response header lines into the
same TCP connection that carries the transferred file
itself. For this reason, HTTP is said to send its control
information in-band.
Difference Between FTP and HTTP
Second Difference is
• The FTP server must maintain state about the user.
• The server must associate the control connection with a specific
user account, and the server must keep track of the user’s
current directory as the user wanders about the remote
directory tree.
• Keeping track of this state information for each ongoing user
session significantly constrains the total number of sessions that
FTP can maintain simultaneously
• HTTP is stateless—it does not have to keep track of any user state.
Control Connection and Data Connection
Operation of FTP
•When a user starts an FTP session with a
remote host, the client side of FTP (user) first
initiates a control TCP connection with the
server side (remote host) on server port
number 21.
1/5
Operation of FTP
•The client side of FTP sends the user
identification and password over this control
connection.
•The client side of FTP also sends, over the
control connection, commands to change
the remote directory.
Contd…
2/5
Operation of FTP
•When the server side receives a command
for a file transfer over the control connection
(either to, or from, the remote host), the
server side initiates a TCP data connection to
the client side.
Contd…
3/5
Operation of FTP
•FTP sends exactly one file over the data
connection and then closes the data
connection.
•If, during the same session, the user wants to
transfer another file, FTP opens another data
connection.
Contd…
4/5
Operation of FTP
•Thus, with FTP, the control connection
remains open throughout the duration of the
user session, but a new data connection is
created for each file transferred within a
session (that is, the data connections are
non-persistent).
Contd…
5/5
FTP Commands and Replies
•The commands, from client to server, and replies,
from server to client, are sent across the control
connection in 7-bit ASCII format. Thus, like HTTP
commands, FTP commands are readable by
people.
•Each command consists of four uppercase ASCII
characters, some with optional arguments.
Some of the Commands are
•USER username: Used to send the user
identification to the server.
•PASS password: Used to send the user
password to the server.
1/4
Some of the Commands are
•LIST: Used to ask the server to send back a list
of all the files in the current remote directory.
The list of files is sent over a (new and non-
persistent) data connection rather than the
control TCP connection.
Contd…
2/4
Some of the Commands are
•RETR filename: Used to retrieve (that is, get)
a file from the current directory of the
remote host. This command causes the
remote host to initiate a data connection and
to send the requested file over the data
connection.
Contd…
3/4
Some of the Commands are
•STOR filename: Used to store (that is, put) a
file into the current directory of the remote
host.
Contd…
4/4
Contd…
•There is a one-to-one correspondence
between the command that the user issues
and the FTP command sent across the control
connection.
•Each command is followed by a reply, sent
from server to client. The replies are three-
digit numbers, with an optional message
following the number.
Some replies, along with their possible messages
331 Username OK, password required.
125 Data connection already open; transfer starting.
425 Can’t open data connection.
452 Error writing file.
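The command and reply formats above are simple enough to build and parse by hand. The sketch below formats a command as it would be sent on the control connection (terminated by a carriage return and line feed) and splits a reply into its three-digit number and message. The helper names and the username `alice` are hypothetical; the grouping of replies by first digit follows the FTP specification and is not shown on the slides.

```python
# Meaning of the first digit of a reply code, per the FTP specification.
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def format_command(name, argument=None):
    """Build one FTP command line as bytes for the control connection."""
    line = name.upper() if argument is None else f"{name.upper()} {argument}"
    return (line + "\r\n").encode("ascii")


def parse_reply(line):
    """Split a reply into (three-digit code, reply class, message)."""
    code, _, message = line.strip().partition(" ")
    return int(code), REPLY_CLASSES[code[0]], message


print(format_command("USER", "alice"))
print(format_command("LIST"))
for reply in (
    "331 Username OK, password required.",
    "125 Data connection already open; transfer starting.",
    "425 Can't open data connection.",
    "452 Error writing file.",
):
    print(parse_reply(reply))
```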
2.4 Electronic Mail in the Internet
-- Syllabus
•SMTP,
•Comparison with HTTP,
•Mail Message Format,
•Mail Access Protocols
Introduction to E-Mail
•E-Mail is an asynchronous communication medium.
•People send and read messages when it is convenient for
them, without having to coordinate with other people’s
schedules.
•Electronic Mail is fast, easy to distribute, and inexpensive.
•Modern e-mail has many powerful features, including
messages with attachments, hyperlinks, HTML-formatted
text, and embedded photos.
High-Level View of the Internet Mail System
High-Level View of the
Internet Mail System
•Internet Mail has three major components:
•User Agents,
•Mail Servers, and
•The Simple Mail Transfer Protocol (SMTP)
Example
•Consider Alice sending an e-mail message to a
recipient, Bob.
•User agents allow users to read, reply to,
forward, save, and compose messages.
•Microsoft Outlook and Apple Mail are examples
of user agents for e-mail.
Example -- Contd…
•When Alice is finished composing her
message, her user agent sends the message to
her mail server, where the message is placed
in the mail server’s outgoing message queue.
•When Bob wants to read a message, his user
agent retrieves the message from his mailbox
in his mail server.
Example
•Mail servers form the core of the e-mail
infrastructure.
•Each recipient, such as Bob, has a mailbox
located in one of the mail servers.
•Bob’s mailbox manages and maintains the
messages that have been sent to him.
Contd…
Example
•A typical message starts its journey in the
sender’s user agent, travels to the sender’s
mail server, and travels to the recipient’s
mail server, where it is deposited in the
recipient’s mailbox
Contd…
Example
•When Bob wants to access the messages in
his mailbox, the mail server containing his
mailbox authenticates Bob (with a username
and password).
•Alice’s mail server must also deal with failures
in Bob’s mail server.
Contd…
Example
•If Alice’s server cannot deliver mail to Bob’s
server, Alice’s server holds the message in a
message queue and attempts to transfer the
message later.
•Reattempts are often done every 30 minutes or
so; if there is no success after several days, the
server removes the message and notifies the
sender (Alice) with an e-mail message.
Contd…
SMTP
•SMTP is the principal application-layer
protocol for Internet electronic mail.
•It uses the reliable data transfer service of
TCP to transfer mail from the sender’s mail
server to the recipient’s mail server.
SMTP has two sides
•SMTP has two sides
•A Client Side, which executes on the
sender’s mail server, and
•A Server Side, which executes on the
recipient’s mail server.
•Both the client and server sides of SMTP run
on every mail server.
SMTP has two sides
•When a mail server sends mail to other mail
servers, it acts as an SMTP client.
•When a mail server receives mail from other
mail servers, it acts as an SMTP server
2.4.1 SMTP
•SMTP is at the heart of Internet electronic mail.
•SMTP transfers messages from senders’ mail
servers to the recipients’ mail servers.
•SMTP is much older than HTTP.
•SMTP restricts the body (not just the headers)
of all mail messages to simple 7-bit ASCII.
The Basic Operation of SMTP
Suppose Alice wants to send Bob a simple ASCII message.
1. Alice invokes her user agent for e-mail, provides
Bob’s e-mail address (for example,
bob@someschool.edu), composes a message, and
instructs the user agent to send the message.
2. Alice’s user agent sends the message to her mail
server, where it is placed in a message queue
1/3
The Basic Operation of SMTP
Suppose Alice wants to send Bob a simple ASCII message.
3. The client side of SMTP, running on Alice’s mail
server, sees the message in the message queue. It
opens a TCP connection to an SMTP server,
running on Bob’s mail server.
4. After some initial SMTP handshaking, the SMTP
client sends Alice’s message into the TCP
connection.
Contd…
2/3
The Basic Operation of SMTP
Suppose Alice wants to send Bob a simple ASCII message.
5. At Bob’s mail server, the server side of SMTP
receives the message. Bob’s mail server then
places the message in Bob’s mailbox.
6. Bob invokes his user agent to read the
message at his convenience.
Contd…
3/3
Alice sends a message to Bob
•SMTP does not normally use
intermediate mail servers for sending
mail, even when the two mail servers
are located at opposite ends of the
world.
How SMTP transfers a message from a sending
mail server to a receiving mail server
•First, the client SMTP (running on the sending
mail server host) has TCP establish a
connection to port 25 at the server SMTP
(running on the receiving mail server host). If
the server is down, the client tries again later.
1/3
How SMTP transfers a message from a sending
mail server to a receiving mail server
•Once this connection is established, SMTP
client indicates the e-mail address of the sender
(the person who generated the message) and
the e-mail address of the recipient.
•The client sends the message.
Contd…
2/3
How SMTP transfers a message from a sending
mail server to a receiving mail server
•SMTP can count on the reliable data transfer
service of TCP to get the message to the server
without errors.
•The client then repeats this process over the
same TCP connection if it has other messages to
send to the server; otherwise, it instructs TCP to
close the connection.
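A rough sketch of the client side of this exchange is given below. The slides do not name the individual commands; HELO, MAIL FROM, RCPT TO, DATA and QUIT are the standard SMTP commands. The function name smtp_session is hypothetical, server replies are not modelled, and nothing is sent over a real TCP connection.

```python
def smtp_session(client_host, messages):
    """Return the lines an SMTP client would send over ONE TCP connection.

    messages is a list of (sender, recipient, body_lines) tuples.
    Simplification: server replies and dot-stuffing are not modelled.
    """
    lines = [f"HELO {client_host}"]              # initial handshaking
    for sender, recipient, body_lines in messages:
        lines.append(f"MAIL FROM:<{sender}>")    # e-mail address of the sender
        lines.append(f"RCPT TO:<{recipient}>")   # e-mail address of the recipient
        lines.append("DATA")                     # the message itself follows
        lines.extend(body_lines)
        lines.append(".")                        # a lone period ends the message
    lines.append("QUIT")                         # no more messages: close the connection
    return lines
```

Passing two messages produces two MAIL FROM / RCPT TO / DATA rounds but a single HELO and a single QUIT, which mirrors the reuse of the same TCP connection.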
Contd…
3/3
Comparison of SMTP with HTTP
•HTTP transfers files (also called objects) from a
Web server to a Web client (typically a browser);
SMTP transfers files (that is, e-mail messages)
from one mail server to another mail server.
•When transferring the files, both persistent HTTP
and SMTP use persistent connections. Thus, the
two protocols have common characteristics.
The First Difference between SMTP and HTTP
•HTTP is a pull protocol: someone loads information on
a Web server and users use HTTP to pull the
information from the server at their convenience. The
TCP connection is initiated by the machine that wants
to receive the file.
•SMTP is a push protocol: the sending mail server
pushes the file to the receiving mail server. The TCP
connection is initiated by the machine that wants to
send the file.
The Second Difference between SMTP and HTTP
•SMTP requires each message, including the body of
each message, to be in 7-bit ASCII format. If the
message contains characters that are not 7-bit ASCII
(for example, French characters with accents) or
contains binary data (such as an image file), then the
message has to be encoded into 7-bit ASCII.
•HTTP does not impose this restriction on its data.
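The 7-bit restriction can be illustrated with a short sketch. The slides do not say which encoding is used; quoted-printable and base64 are the two standard MIME transfer encodings, and both are available in Python's standard library. The sample bytes are made up.

```python
import base64
import quopri


def is_7bit_ascii(data: bytes) -> bool:
    """True if every byte fits in 7 bits, as SMTP requires."""
    return all(b < 128 for b in data)


text = "crêpes".encode("utf-8")                       # accented letter -> bytes above 127
image = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])   # made-up binary sample

qp = quopri.encodestring(text)    # quoted-printable: stays readable for mostly-ASCII text
b64 = base64.b64encode(image)     # base64: suited to arbitrary binary data

# Both encoded forms are pure 7-bit ASCII and decode back to the original bytes.
print(is_7bit_ascii(text), is_7bit_ascii(qp), is_7bit_ascii(b64))
```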
The Third Difference between SMTP and HTTP
•HTTP encapsulates each object in its own
HTTP response message.
•Internet mail (SMTP) places all of the
message’s objects into one message.
2.4.3 Mail Message Formats
•When an e-mail message is sent from one
person to another, a header containing
peripheral information precedes the body of
the message itself.
•This peripheral information is contained in a
series of header lines.
Contd…
1/5
2.4.3 Mail Message Formats
•The header lines and the body of the message are
separated by a blank line
•Each header line contains readable text, consisting
of a keyword followed by a colon followed by a
value.
•Some of the keywords are required and others are
optional.
Contd…
2/5
2.4.3 Mail Message Formats
•Every header must have
•A From: header line, and
•A To: header line.
•A header may include a Subject: header line
as well as other optional header lines.
•It is important to note that these header lines are
different from the SMTP commands
Contd…
3/5
2.4.3 Mail Message Formats
•A typical message header looks like this:
From: alice@crepes.fr
To: bob@hamburger.edu
Subject: Seeking Permission.
Contd…
4/5
2.4.3 Mail Message Formats
•After the message header, a blank line follows;
then the message body (in ASCII) follows.
•You should use Telnet to send a message to a
mail server that contains some header lines,
including the Subject: header line.
•To do this, issue telnet serverName 25
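The header/body layout described on these slides can be sketched with a tiny parser. This is a simplified illustration, not a full mail parser: it assumes plain "\n" line endings and one-line headers, whereas real messages use CRLF and may fold long header lines. The body text is made up; the header values come from the example slide.

```python
def parse_mail_message(raw: str):
    """Split a message into a dict of header lines and the body text."""
    head, _, body = raw.partition("\n\n")        # the first blank line ends the header
    headers = {}
    for line in head.split("\n"):
        keyword, _, value = line.partition(":")  # keyword, colon, value
        headers[keyword.strip()] = value.strip()
    return headers, body


raw = (
    "From: alice@crepes.fr\n"
    "To: bob@hamburger.edu\n"
    "Subject: Seeking Permission.\n"
    "\n"
    "May I visit on Friday?\n"
)
headers, body = parse_mail_message(raw)
print(headers["From"], "->", headers["To"])
```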
Contd…
5/5
2.4.4 Mail Access Protocols
•Mail access uses a client-server architecture—
the user reads e-mail with a client that
executes on the user’s end system.
•Once SMTP delivers the message from Alice’s
mail server to Bob’s mail server, the message
is placed in Bob’s mailbox.
1/7
2.4.4 Mail Access Protocols
•Given that Bob (the recipient) executes his
user agent on his local PC, it is natural to
consider placing a mail server on his local PC.
•With this approach, Alice’s mail server would
dialogue directly with Bob’s PC.
Contd…
2/7
2.4.4 Mail Access Protocols
•There is a problem with this approach.
•A mail server manages mailboxes and runs the
client and server sides of SMTP.
•If Bob’s mail server were to reside on his local
PC, then Bob’s PC would have to remain always
on, and connected to the Internet, in order to
receive new mail, which can arrive at any time.
• This is impractical for many Internet users.
Contd…
3/7
2.4.4 Mail Access Protocols
•Instead, a typical user runs a user agent on
the local PC but accesses its mailbox stored
on an always-on shared mail server.
•This mail server is shared with other users
and is typically maintained by the user’s ISP
Contd…
4/7
2.4.4 Mail Access Protocols
•SMTP has been designed for pushing e-mail
from one host to another.
•The sender’s user agent does not dialogue
directly with the recipient’s mail server.
Contd…
5/7
2.4.4 Mail Access Protocols
•There are currently a number of popular mail
access protocols, including
•Post Office Protocol—Version 3 (POP3),
•Internet Message Access Protocol (IMAP), and
•HTTP.
Contd…
6/7
2.4.4 Mail Access Protocols
•SMTP is used to transfer mail from the sender’s
mail server to the recipient’s mail server.
•SMTP is also used to transfer mail from the
sender’s user agent to the sender’s mail server.
• A mail access protocol, such as POP3, is used to
transfer mail from the recipient’s mail server to
the recipient’s user agent.
Contd…
7/7
POP3
•POP3 is an extremely simple mail access
protocol.
•It is short and quite readable.
•Because the protocol is so simple, its
functionality is rather limited.
Working of POP3
•POP3 begins when the user agent (the client)
opens a TCP connection to the mail server (the
server) on port 110.
•With the TCP connection established, POP3
progresses through three phases:
•Authorization,
•Transaction, and
•Update.
First Phase -- Authorization
•During the first phase, authorization, the
user agent sends a username and a
password to authenticate the user.
Second Phase -- Transaction
•During the second phase, transaction, the
user agent retrieves messages; also during
this phase, the user agent can mark messages
for deletion, remove deletion marks, and
obtain mail statistics.
Third Phase -- Update
•The third phase, update, occurs after the
client has issued the quit command, ending
the POP3 session.
•At this time, the mail server deletes the
messages that were marked for deletion.
•In a POP3 transaction, the user agent issues
commands, and the server responds to each
command with a reply.
Possible Responses of POP3 Transaction
•There are two possible responses:
•+OK (sometimes followed by server-to-
client data), used by the server to indicate
that the previous command was fine; and
•-ERR, used by the server to indicate that
something was wrong with the previous
command.
Consider the sample response message
telnet mailServer 110
+OK POP3 server ready
user bob
+OK
pass hungry
+OK user successfully logged on
If you misspell a command, the POP3 server will
reply with an -ERR message.
Two Modes of a User Agent in the POP3 Transaction Phase
•A user agent using POP3 can be configured (by
the user) to “download and delete” or to
“download and keep”.
•The sequence of commands issued by a POP3
user agent depends on which of these two modes
the user agent is operating in.
•In the download-and-delete mode, the user agent
will issue the list, retr, and dele commands.
Transaction Message in the Download and Delete Mode
C: list
S: 1 498
S: 2 912
S: .
C: retr 1
S: (blah blah ...
S: .................
S: ..........blah)
S: .
C: dele 1
C: retr 2
S: (blah blah ...
S: .................
S: ..........blah)
S: .
C: dele 2
C: quit
S: +OK POP3 server signing off
Explanation of the Message
•The user agent first asks the mail server to list the size
of each of the stored messages.
•The user agent then retrieves and deletes each message
from the server. Note that after the authorization
phase, the user agent employed only four commands:
list, retr, dele, and quit.
•After processing the quit command, the POP3 server
enters the update phase and removes messages 1 and 2
from the mailbox.
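The three phases can be modelled with a small in-memory state machine. This is a sketch for study purposes only, not a network server: the class name TinyPop3Server is hypothetical, and the replies follow the simplified transcript on the slides (a real POP3 server also prefixes its list and retr replies with a +OK line).

```python
class TinyPop3Server:
    """In-memory model of the authorization, transaction and update phases."""

    def __init__(self, users, mailbox):
        self.users = users            # {username: password}
        self.mailbox = dict(mailbox)  # {message number: message text}
        self.phase = "authorization"
        self.user = None
        self.marked = set()           # message numbers marked for deletion

    def handle(self, line):
        cmd, _, arg = line.partition(" ")
        cmd = cmd.lower()
        if self.phase == "authorization":
            if cmd == "user" and arg in self.users:
                self.user = arg
                return "+OK"
            if cmd == "pass" and self.user and self.users[self.user] == arg:
                self.phase = "transaction"
                return "+OK user successfully logged on"
            return "-ERR"
        if self.phase == "transaction":
            visible = {n: m for n, m in self.mailbox.items() if n not in self.marked}
            if cmd == "list":
                rows = [f"{n} {len(m)}" for n, m in sorted(visible.items())]
                return "\n".join(rows + ["."])
            if cmd in ("retr", "dele") and arg.isdigit() and int(arg) in visible:
                if cmd == "retr":
                    return visible[int(arg)] + "\n."
                self.marked.add(int(arg))   # only marked here, not yet removed
                return "+OK"
            if cmd == "quit":
                self.phase = "update"       # deletions take effect only now
                for n in self.marked:
                    del self.mailbox[n]
                return "+OK POP3 server signing off"
            return "-ERR"                   # misspelled or unknown command
        return "-ERR"                       # session already ended
```

Feeding it user bob, pass hungry, list, dele 1 and quit reproduces the flow above: message 1 stays in the mailbox until quit moves the session into the update phase.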
Disadvantage of Download-and-Delete Mode
•The recipient may want to access his mail messages
from multiple machines (say, his office PC, his home
PC, and his portable computer). Such a user is
called a nomadic user.
•The download-and-delete mode partitions the
recipient’s mail messages over these three
machines; if he first reads a message on his office
PC, he will not be able to reread the message from
his portable at home later in the evening.
Download-and-Keep Mode
•In the download-and-keep mode, the user agent
leaves the messages on the mail server after
downloading them.
•In this case, the recipient can reread messages
from different machines;
•he can access a message from work and access
it again later in the week from home.
During a POP3 session between a
user agent and the mail server
•The POP3 server maintains some state information.
•The POP3 Server keeps track of which user
messages have been marked deleted.
•The POP3 server does not carry state information
across POP3 sessions.
•This lack of state information across the sessions
simplifies the implementation of a POP3 server.
Problem with POP3 for Nomadic User
•With POP3 access, once Bob has downloaded his
messages to the local machine, he can create mail
folders and move the downloaded messages into
the folders.
•Bob can then delete messages, move messages
across folders, and search for messages (by
sender name or subject).
1/2
Problem with POP3 for Nomadic User
•But this paradigm—namely, folders and messages
in the local machine—poses a problem for the
nomadic user, who would prefer to maintain a
folder hierarchy on a remote server that can be
accessed from any computer. This is not possible
with POP3—the POP3 protocol does not provide
any means for a user to create remote folders
and assign messages to folders.
Contd…
2/2
Solution is IMAP Protocol
•IMAP is a mail access protocol.
•It has many more features than POP3.
•It is also significantly more complex.
•Thus the client and server side
implementations are more complex.
IMAP Server
• An IMAP server will associate each message with a folder;
• When a message first arrives at the server, it is associated with the
recipient’s INBOX folder.
• The recipient can then move the message into a new, user-created
folder, read the message, delete the message, and so on.
• The IMAP protocol provides commands to allow users to create
folders and move messages from one folder to another.
• IMAP provides commands that allow users to search remote folders
for messages matching specific criteria.
• An IMAP server maintains user state information across IMAP sessions,
for example, the names of the folders and which messages are
associated with which folders.
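The server-side folder state described above can be pictured with a toy model. The class and method names are illustrative only and are not IMAP wire syntax; messages are represented as plain dictionaries, which is an assumption made for the sketch.

```python
class ImapMailbox:
    """Toy model of folder state that the server keeps across sessions."""

    def __init__(self):
        self.folders = {"INBOX": []}          # every new message starts in INBOX

    def deliver(self, message):
        self.folders["INBOX"].append(message)

    def create(self, folder):
        self.folders.setdefault(folder, [])   # user-created remote folder

    def move(self, message, src, dst):
        self.folders[src].remove(message)
        self.folders[dst].append(message)

    def search(self, folder, predicate):
        """Return the messages in a remote folder that match a criterion."""
        return [m for m in self.folders[folder] if predicate(m)]
```

Because the folders live in the server object rather than in the user agent, any client that connects later sees the same folder hierarchy, which is exactly what the nomadic user needs.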
Another important feature of IMAP
•It has commands that permit a user agent to obtain
components of messages.
•For example, a user agent can obtain just the message
header of a message or just one part of a multipart MIME
message.
•This feature is useful when there is a low-bandwidth
connection between the user agent and its mail server.
•With a low bandwidth connection, the user may not want to
download all of the messages in its mailbox, particularly
avoiding long messages that might contain, for example, an
audio or video clip.
Web-Based E-Mail
•With this service, the user agent is an ordinary Web browser,
and the user communicates with its remote mailbox via HTTP.
•When a recipient, such as Bob, wants to access a message in his mailbox,
the e-mail message is sent from Bob’s mail server to
Bob’s browser using the HTTP protocol rather than the
POP3 or IMAP protocol.
•When a sender, such as Alice, wants to send an e-mail message, the e-mail
message is sent from her browser to her mail server
over HTTP rather than over SMTP.
•Alice’s mail server still sends messages to, and receives
messages from, other mail servers using SMTP.
End of the Chapter - 2.4
Electronic Mail in the Internet
2.5 DNS --The Internet's Directory Service
Syllabus
2.5.1 Services Provided by DNS,
2.5.2 Overview of How DNS Works,
2.5.3 DNS Records and Messages
2.5 DNS --The Internet's Directory Service
•Internet hosts can be identified in many ways.
•One identifier for a host is its hostname.
•cnn.com
•www.yahoo.com
•gaia.cs.umass.edu
•cis.poly.edu
1/4
2.5 DNS --The Internet's Directory Service
•Hostnames provide little, if any, information
about the location within the Internet of the
host.
•A hostname such as www.eurecom.fr,
which ends with the country code .fr, tells
us that the host is probably in France, but
does not say much more.
Contd…
2/4
2.5 DNS --The Internet's Directory Service
•Because hostnames can consist of variable-length
alphanumeric characters, they would be
difficult for routers to process. For these
reasons, hosts are also identified by so-called
IP addresses.
Contd…
3/4
2.5 DNS --The Internet's Directory Service
•An IP address consists of four bytes and has a rigid
hierarchical structure.
•An IP address looks like 121.7.106.83
•Each period separates one of the bytes expressed in
decimal notation from 0 to 255.
•An IP address is hierarchical because as we scan the
address from left to right, we obtain more and more
specific information about where the host is located in
the Internet.
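The four-byte structure can be checked with a few lines of code. This is a minimal sketch for IPv4 addresses only; the helper name dotted_to_bytes is hypothetical, and the standard-library ipaddress module is used as a cross-check.

```python
import ipaddress


def dotted_to_bytes(text: str) -> bytes:
    """Convert dotted-decimal notation into the four bytes it stands for."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted-decimal IPv4 address: {text!r}")
    return bytes(parts)


addr = dotted_to_bytes("121.7.106.83")
as_int = int.from_bytes(addr, "big")               # the same address as one 32-bit number
same = int(ipaddress.IPv4Address("121.7.106.83"))  # standard-library cross-check
print(list(addr), as_int == same)
```

The leftmost byte is the most significant one, which matches the idea that reading from left to right gives increasingly specific information.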
Contd…
4/4
Two Ways to identify a Host
•By a hostname and
•By an IP address.
•People prefer the more mnemonic
hostname identifier.
•Routers prefer fixed-length, hierarchically
structured IP addresses.
Importance of DNS
•In order to reconcile these preferences, we
need a directory service that translates
hostnames to IP addresses.
•This is the main task of the Internet’s Domain
Name System (DNS).
Characteristics of DNS
•The DNS is
•A distributed database implemented in
a hierarchy of DNS servers, and
•An application-layer protocol that allows
hosts to query the distributed database.
DNS Servers and Protocol
•The DNS servers are often UNIX machines running
the Berkeley Internet Name Domain (BIND)
software.
•The DNS protocol runs over UDP and uses
port 53.
Contd…
1/2
DNS Servers and Protocol
•DNS is commonly employed by other
application-layer protocols—including HTTP,
SMTP, and FTP—to translate user-supplied
hostnames to IP addresses.
Contd…
2/2
What happens when a browser (i.e., an HTTP client),
running on some user’s host, requests the URL
www.someschool.edu/index.html.
•In order for the user’s host to be able to
send an HTTP request message to the Web
server www.someschool.edu, the user’s
host must first obtain the IP address of
www.someschool.edu.
•This is done as follows.
Steps when a browser (i.e., an HTTP client),
running on some user’s host, requests the URL
www.someschool.edu/index.html.
1. The user machine runs the client side of the
DNS application.
2. The browser extracts the hostname,
www.someschool.edu, from the URL
and passes the hostname to the client side
of the DNS application.
1/3
Steps when a browser (i.e., an HTTP client),
running on some user’s host, requests the URL
www.someschool.edu/index.html.
3. The DNS client sends a query containing the
hostname to a DNS server.
4. The DNS client eventually receives a reply,
which includes the IP address for the
hostname.
Contd…
2/3
Steps when a browser (i.e., an HTTP client), running
on some user’s host, requests the URL
www.someschool.edu/index.html.
5. Once the browser receives the IP address
from DNS, it can initiate a TCP connection
to the HTTP server process located at port
80 at that IP address.
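Steps 2 to 5 can be sketched with Python's standard library. This is a hedged illustration: extract_hostname and resolve are hypothetical helper names, resolve needs network access, and www.someschool.edu is the slides' placeholder name, so it may not resolve to a real address.

```python
import socket
from urllib.parse import urlsplit


def extract_hostname(url: str) -> str:
    """Step 2: pull the hostname out of the URL."""
    # urlsplit only recognises the host when a scheme (or "//") is present.
    if "//" not in url:
        url = "http://" + url
    return urlsplit(url).hostname


def resolve(hostname: str, port: int = 80):
    """Steps 3 and 4: ask the system's DNS client for the IP address(es)."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = extract_hostname("www.someschool.edu/index.html")
    print(host)   # the hostname handed to the client side of DNS
    # Step 5 would open a TCP connection to port 80 at one of resolve(host).
```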
Contd…
3/3
2.5.1 Services Provided by DNS
•DNS caching helps to reduce DNS network traffic as well
as the average DNS delay.
•DNS provides a few other important services in
addition to translating hostnames to IP addresses:
•Host aliasing.
•Mail server aliasing.
•Load distribution.
Host Aliasing
•A host with a complicated hostname can have
one or more alias names.
•For example, a hostname such as
relay1.westcoast.enterprise.com
could have two aliases such as
• enterprise.com and
•www.enterprise.com
1/2
Host Aliasing
•relay1.westcoast.enterprise.com
hostname is said to be a canonical hostname.
•Alias hostnames are more mnemonic than
canonical hostnames.
•DNS can be invoked by an application to
obtain the canonical hostname for a supplied
alias hostname as well as the IP address of the host.
Contd…
2/2
Mail Server Aliasing
•E-mail addresses are mnemonic.
•For example, if Bob has an account with
Hotmail, Bob’s e-mail address might be as
simple as bob@hotmail.com.
1/4
Mail Server Aliasing
•The hostname of the Hotmail mail server is
more complicated and much less mnemonic
than simply hotmail.com
•For example, the canonical hostname might
be something like
relay1.west-coast.hotmail.com
Contd…
2/4
Mail Server Aliasing
•DNS can be invoked by a mail application to
obtain the canonical hostname for a supplied
alias hostname as well as the IP address of
the host.
Contd…
3/4
Mail Server Aliasing
•The MX record permits a company’s mail
server and Web server to have identical
(aliased) hostnames.
•For example, a company’s Web server and
mail server can both be called
enterprise.com.
Contd…
4/4
Load Distribution.
•DNS is also used to perform load distribution
among replicated servers, such as replicated
Web servers.
•Busy sites, such as cnn.com, are replicated over
multiple servers, with each server running on a
different end system and each having a
different IP address.
1/4
Load Distribution.
•For replicated Web servers, a set of IP addresses
is thus associated with one canonical hostname.
•The DNS database contains this set of IP
addresses.
•Clients make a DNS query for a name mapped to
a set of addresses.
Contd…
2/4
Load Distribution.
•The server responds with the entire set of IP
addresses, but rotates the ordering of the
addresses within each reply.
•Because a client sends its HTTP request message to
the IP address that is listed first in the set, DNS
rotation distributes the traffic among the replicated
servers.
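DNS rotation can be sketched in a few lines. The class name RotatingRecordSet is hypothetical and the addresses passed in are up to the caller; the point is only that every reply carries the whole set while the first entry changes from reply to reply.

```python
from collections import deque


class RotatingRecordSet:
    """Return the full address set, rotated by one position per reply."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        reply = list(self._addresses)   # the entire set goes into the reply
        self._addresses.rotate(-1)      # the next reply starts with the next address
        return reply
```

A client that always picks the first address in the reply therefore lands on a different replicated server on each successive query.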
Contd…
3/4
Load Distribution.
•DNS rotation is also used for e-mail so that
multiple mail servers can have the same alias
name.
•Content distribution companies such as
Akamai have used DNS in more sophisticated
ways to provide Web content distribution.
Contd…
4/4
2.5.2 Overview of How DNS Works
Hostname-to-IP-address translation service
•Suppose that some application (such as a Web
browser or a mail reader) running in a user’s host
needs to translate a hostname to an IP address.
•The application will invoke the client side of DNS,
specifying the hostname that needs to be
translated.
1/4
2.5.2 Overview of How DNS Works
Hostname-to-IP-address translation service
•On many UNIX-based machines, gethostbyname()
is the function call that an application calls in
order to perform the translation.
•DNS in the user’s host then takes over, sending a
query message into the network.
Contd…
2/4
2.5.2 Overview of How DNS Works
Hostname-to-IP-address translation service
•All DNS query and reply messages are sent within
UDP datagrams to port 53.
•After a delay, ranging from milliseconds to
seconds, DNS in the user’s host receives a DNS
reply message that provides the desired mapping
Contd…
3/4
2.5.2 Overview of How DNS Works
Hostname-to-IP-address translation service
•This mapping is then passed to the invoking
application.
•Thus, from the perspective of the invoking
application in the user’s host, DNS is a black box
providing a simple, straightforward translation
service.
Contd…
4/4
A Simple Design for DNS
•A simple design for DNS would have one DNS server
that contains all the mappings.
•In this centralized design,
•Clients simply direct all queries to the single
DNS server, and
•The DNS server responds directly to the
querying clients.
•Although the simplicity of this design is attractive, it is
inappropriate for today’s Internet, with its vast (and
growing) number of hosts.
The Problems with a Centralized Design
•A single point of failure. If the DNS server crashes,
so does the entire Internet.
•Traffic volume. A single DNS server would have to
handle all DNS queries (for all the HTTP requests
and e-mail messages generated from hundreds of
millions of hosts).
1/3
The Problems with a Centralized Design
•Distant centralized database. A single DNS server
cannot be “close to” all the querying clients. If we
put the single DNS server in New York City, then
all queries from Australia must travel to the other
side of the globe, perhaps over slow and
congested links. This can lead to significant delays.
Contd…
2/3
The Problems with a Centralized Design
•Maintenance. The single DNS server would have
to keep records for all Internet hosts. Not only
would this centralized database be huge, but it
would have to be updated frequently to account
for every new host.
Contd…
3/3
A Distributed, Hierarchical Database
•The DNS uses a large number of servers,
organized in a hierarchical fashion and
distributed around the world.
•No single DNS server has all of the mappings
for all of the hosts in the Internet. Instead, the
mappings are distributed across the DNS
servers.
Three Classes of DNS Servers
•There are Three Classes of DNS Servers:
•Root DNS Servers,
•Top-Level Domain (TLD) DNS Servers, and
•Authoritative DNS Servers
•All these are organized in a hierarchy.
Figure: Portion of the hierarchy of DNS servers
How these three classes of servers interact
Suppose a DNS client wants to determine the IP address for the
hostname www.amazon.com.
•To a first approximation, the following events will take place.
•The client first contacts one of the root servers, which
returns IP addresses for TLD servers for the top-level
domain com.
•The client then contacts one of these TLD servers, which
returns the IP address of an authoritative server for
amazon.com.
•Finally, the client contacts one of the authoritative
servers for amazon.com, which returns the IP address
for the hostname www.amazon.com.
Root DNS Servers
•In the Internet, there are 13 root DNS servers
most of which are located in North America.
•The 13 root DNS servers, as of 2012, are listed on the
next slide in (organization, location) format.
1/3
Root DNS Servers
1. Verisign, Los Angeles, CA (5 other sites)
2. USC-ISI, Marina del Rey, CA
3. Cogent, Herndon, VA (5 other sites)
4. U Maryland, College Park, MD
5. NASA, Mt View, CA
6. Internet Software C, Palo Alto, CA (and 48 other sites)
7. US DoD, Columbus, OH (5 other sites)
8. ARL, Aberdeen, MD
9. Netnod, Stockholm (37 other sites)
10. Verisign, Dulles, VA (69 other sites )
11. RIPE, London (17 other sites)
12. ICANN, Los Angeles, CA (41 other sites)
13. WIDE, Tokyo (5 other sites)
2/3 Contd…
Root DNS Servers
•Each “server” is actually a network of
replicated servers, for both security and
reliability purposes.
• All together, there are 247 root servers as of
fall 2011
Contd…
3/3
Top-Level Domain (TLD) Servers
•These servers are responsible for top-level domains
such as com, org, net, edu, and gov, and all of the
country top-level domains such as uk, fr, ca, and jp.
•The company Verisign Global Registry Services
maintains the TLD Servers for the com top-level domain.
•The company Educause maintains the TLD servers for
the edu top-level domain.
Authoritative DNS Servers
•Every organization with publicly accessible
hosts (such as Web servers and mail servers)
on the Internet must provide publicly
accessible DNS records that map the names
of those hosts to IP addresses.
1/3
Authoritative DNS Servers
•An organization’s authoritative DNS server
houses these DNS records.
•An organization can choose to implement its
own authoritative DNS server to hold these
records.
Contd…
2/3
Authoritative DNS Servers
•The organization can pay to have these records
stored in an authoritative DNS server of some
service provider.
•Most universities and large companies
implement and maintain their own primary and
secondary (backup) authoritative DNS server.
Contd…
3/3
Local DNS Server
•A local DNS server does not strictly belong to
the hierarchy of servers but is nevertheless
central to the DNS architecture.
•Each ISP—such as a university, an academic
department, an employee’s company, or a
residential ISP—has a local DNS server (also
called a default name server).
Contd…
1/4
Local DNS Server
•When a host connects to an ISP, the ISP
provides the host with the IP addresses
of one or more of its local DNS servers.
Contd…
2/4
Local DNS Server
•A host’s local DNS server is “close to” the
host.
•For an institutional ISP, local DNS server
may be on the same LAN as the host;
•For a residential ISP, it is separated from
the host by no more than a few routers.
Contd…
3/4
Local DNS Server
•When a host makes a DNS query, the query is
sent to the local DNS server, which acts as a
proxy, forwarding the query into the DNS
server hierarchy.
Contd…
4/4
Interaction of the
various DNS server
1/6
Interaction of the various DNS Server
•Suppose the host cis.poly.edu desires the IP
address of gaia.cs.umass.edu.
•Suppose that Polytechnic’s local DNS server is
called dns.poly.edu and that an authoritative
DNS server for gaia.cs.umass.edu is called
dns.umass.edu.
Contd…
2/6
Interaction of the various DNS Server
•The host cis.poly.edu first sends a DNS query
message to its local DNS server, dns.poly.edu.
•The query message contains the hostname to
be translated, namely, gaia.cs.umass.edu.
•The local DNS server forwards the query
message to a root DNS server.
Contd…
3/6
Interaction of the various DNS Server
•The root DNS server takes note of the edu
suffix and returns to the local DNS server a
list of IP addresses for TLD servers responsible
for edu.
•The local DNS server then resends the query
message to one of these TLD servers.
Contd…
4/6
Interaction of the various DNS Server
•The TLD server takes note of the umass.edu
suffix and responds with the IP address of the
authoritative DNS server for the University of
Massachusetts, namely, dns.umass.edu.
Contd…
5/6
Interaction of the various DNS Server
•Finally, the local DNS server resends the query
message directly to dns.umass.edu, which
responds with the IP address of
gaia.cs.umass.edu.
•In order to obtain the mapping for one
hostname, eight DNS messages were sent: four
query messages and four reply messages.
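The message count can be reproduced with a toy simulation of this walk. Everything here is a stand-in: the three dictionaries play the roles of the root, TLD and authoritative servers, the name tld-edu is hypothetical, and the address returned for gaia.cs.umass.edu is a made-up placeholder.

```python
# Toy zone data; the server name "tld-edu" and the IP address are made up.
ROOT = {"edu": "tld-edu"}                               # suffix -> TLD server
TLD = {"tld-edu": {"umass.edu": "dns.umass.edu"}}       # domain -> authoritative server
AUTH = {"dns.umass.edu": {"gaia.cs.umass.edu": "192.0.2.7"}}


def local_dns_resolve(hostname):
    """Return (ip, message_count), counting every query and every reply."""
    messages = 2   # host -> local DNS server query, plus the final reply to the host
    tld_server = ROOT[hostname.rsplit(".", 1)[-1]]
    messages += 2  # local DNS server <-> root server
    domain = ".".join(hostname.split(".")[-2:])
    auth_server = TLD[tld_server][domain]
    messages += 2  # local DNS server <-> TLD server
    ip = AUTH[auth_server][hostname]
    messages += 2  # local DNS server <-> authoritative server
    return ip, messages


print(local_dns_resolve("gaia.cs.umass.edu"))
```

Each of the four exchanges contributes one query and one reply, which gives the eight messages counted on the slide.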
Contd…
6/6
Interaction of the
various DNS server
Recursive Queries and Iterative Queries
•The query sent from cis.poly.edu to dns.poly.edu is a
recursive query, since the query asks dns.poly.edu to
obtain the mapping on its behalf.
•But the subsequent three queries are iterative since all
of the replies are directly returned to dns.poly.edu.
•In theory, any DNS query can be iterative or recursive.
•In practice, the query from the requesting host to the local DNS
server is typically recursive, and the remaining queries are
iterative.
DNS Caching
•In a query chain, when a DNS server receives a
DNS reply (containing a mapping from a
hostname to an IP address), it can cache the
mapping in its local memory.
•For example, each time the local DNS server
dns.poly.edu receives a reply from some DNS
server, it can cache any of the information
contained in the reply.
1/6
DNS Caching
•If a hostname/IP address pair is cached in a
DNS server and another query arrives to the
DNS server for the same hostname, the DNS
server can provide the desired IP address,
even if it is not authoritative for the
hostname.
Contd…
2/6
DNS Caching
•Because hosts and mappings between
hostnames and IP addresses are by no
means permanent, DNS servers discard
cached information after a period of time.
•Suppose that a host apricot.poly.edu queries
dns.poly.edu for the IP address for the
hostname cnn.com.
Contd…
3/6
DNS Caching
•Suppose that a few hours later, another
Polytechnic University host, say, kiwi.poly.edu,
also queries dns.poly.edu with the same
hostname.
Contd…
4/6
DNS Caching
•Because of caching, the local DNS server will
be able to immediately return the IP address
of cnn.com to this second requesting host
without having to query any other DNS
servers.
Contd…
5/6
DNS Caching
•A local DNS server can also cache the IP
addresses of TLD servers, thereby allowing
the local DNS server to bypass the root DNS
servers in a query chain.
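Caching with expiry can be sketched as follows. This is a minimal illustration, not how any particular DNS server implements its cache: the class name DnsCache is hypothetical, the lifetime is supplied by the caller, and the clock is injectable so that expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Hostname -> IP cache whose entries are discarded after a lifetime."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # hostname -> (ip, expiry time)

    def put(self, hostname, ip, ttl_seconds):
        self._entries[hostname] = (ip, self._clock() + ttl_seconds)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None                      # never cached: a query is needed
        ip, expires = entry
        if self._clock() >= expires:
            del self._entries[hostname]      # stale: discard and force a fresh lookup
            return None
        return ip
```

A hit lets the local DNS server answer immediately, as in the apricot and kiwi example; a miss or an expired entry sends it back into the query chain.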
Contd…
6/6
2.5.3 DNS Records and Messages
•The DNS servers that together implement the
DNS distributed database store Resource
Records (RRs), including RRs that provide
hostname-to-IP address mappings.
•Each DNS reply message carries one or more
resource records.
A Resource Record is a four-tuple
that contains the following fields
(Name, Value, Type, TTL)
•TTL is the time to live of the resource record.
•It determines when a resource record should be removed
from a cache.
•The meaning of Name and Value depend on Type.
Four common values of the Type field
•TYPE = A
•TYPE = NS
•TYPE = CNAME
•TYPE = MX
When TYPE = A
•If Type=A, then Name is a hostname and
Value is the IP address for the hostname.
•Thus, a Type A record provides the standard
hostname-to-IP address mapping.
•As an example, (relay1.bar.foo.com,
145.37.93.126, A) is a Type A record.
When TYPE = NS
•If Type=NS, then Name is a domain (such as
foo.com) and Value is the hostname of an
authoritative DNS server that knows how to obtain
the IP addresses for hosts in the domain.
•This record is used to route DNS queries
further along in the query chain.
•As an example, (foo.com, dns.foo.com, NS) is
a Type NS record.
When TYPE = CNAME
•If Type=CNAME, then Value is a canonical
hostname for the alias hostname Name.
•This record can provide querying hosts the
canonical name for a hostname.
•As an example, (foo.com, relay1.bar.foo.com,
CNAME) is a CNAME record.
When TYPE = MX
•If Type = MX, then Value is the canonical
name of a mail server that has an alias
hostname Name.
•As an example, (foo.com, mail.bar.foo.com, MX)
is an MX record.
•MX records allow the hostnames of mail servers
to have simple aliases.
•Note that by using the MX record, a company can
have the same aliased name for its mail server
and for one of its other servers.
•To obtain the canonical name for the mail server, a
DNS client would query for an MX record; to
obtain the canonical name for the other server,
the DNS client would query for the CNAME record.
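The four record types above can be sketched as a small in-memory table. This is only an illustrative Python sketch, not how a real DNS server stores its records. The names ResourceRecord and lookup are hypothetical, and the TTL values are assumptions made for the example; the names and values of the records themselves are the examples used in the slides.

```python
from typing import List, NamedTuple


class ResourceRecord(NamedTuple):
    """A resource record as the four-tuple (Name, Value, Type, TTL)."""
    name: str
    value: str
    type: str
    ttl: int  # seconds; the TTL values below are assumed for illustration


# Example records from the slides (TTLs are made up).
RECORDS = [
    ResourceRecord("relay1.bar.foo.com", "145.37.93.126", "A", 3600),
    ResourceRecord("foo.com", "dns.foo.com", "NS", 86400),
    ResourceRecord("foo.com", "relay1.bar.foo.com", "CNAME", 3600),
    ResourceRecord("foo.com", "mail.bar.foo.com", "MX", 3600),
]


def lookup(name: str, rtype: str) -> List[str]:
    """Return the Value of every record matching the name and type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]


# The same alias yields a different canonical name depending on the type asked.
print(lookup("foo.com", "MX"))     # canonical name of the mail server
print(lookup("foo.com", "CNAME"))  # canonical name of the other server
```

Querying the alias foo.com with Type MX and with Type CNAME returns two different canonical names, which is the point made in the last two bullets.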
DNS Server may be authoritative for a
particular hostname
•If a DNS server is authoritative for a particular
hostname, then the DNS server will contain a
Type A record for the hostname. (Even if the DNS
server is not authoritative, it may contain a Type
A record in its cache).
•If a server is not authoritative for a hostname,
then the server will contain a Type NS record for
the domain that includes the hostname; it will
also contain a Type A record that provides the IP
address of the DNS server in the Value field of the
NS record.
• As an example, suppose an edu TLD server is not
authoritative for the host gaia.cs.umass.edu. Then this server
will contain a Type NS record for a domain that includes the host
gaia.cs.umass.edu, for example, (umass.edu, dns.umass.edu, NS).
• The edu TLD server would also contain a Type A record,
which maps the DNS server dns.umass.edu to an IP address,
for example, (dns.umass.edu, 128.119.40.111, A).
DNS Messages
•Two kinds of DNS messages:
•DNS query message, and
•DNS reply message.
•Both query and reply messages have the same
format.
DNS Message Format
Header section (12 bytes):
•Identification | Flags
•Number of questions | Number of answer RRs
•Number of authority RRs | Number of additional RRs
Data sections:
•Questions (variable number of questions): name and type fields for a query
•Answers (variable number of resource records): RRs in response to the query
•Authority (variable number of resource records): records for authoritative servers
•Additional information (variable number of resource records): additional “helpful” information
Sections in the DNS Message Format
•Header Sections (The first 12 bytes)
•Data Sections
DNS Message Format
Header Section
• Identifier field.
• Flags field.
• Question Count field.
• Answer Count field.
• Authority Count field.
• Additional Information Count field.
Data Sections
•Question Section
•Answer Section or Reply Section
•Authority Section
•Additional Information Section
Identifier Field
•It is the first field in the Header Section.
•It is a 16-bit number that identifies the query.
•This identifier is copied into the reply
message to a query, allowing the client to
match received replies with sent queries.
•There are a number of flags in the flag field.
Flags in the Flag field
•A 1-bit query/reply flag indicates whether
the message is a query (0) or a reply (1).
•A 1-bit authoritative flag is set in a reply
message when a DNS server is an
authoritative server for a queried name.
•A 1-bit recursion-desired flag is set when
a client (host or DNS server) desires that the
DNS server perform recursion when it
doesn’t have the record.
•A 1-bit recursion-available flag is set in a
reply if the DNS server supports recursion.
Other Header Sections in the
DNS Message Format
•In the header section, there are also four
“number-of” fields.
•These fields indicate the number of
occurrences of the four types of data
sections that follow the header.
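The 12-byte header can be illustrated with a short Python sketch that packs and unpacks the six 16-bit fields. The function names pack_header and unpack_header and the identifier value are hypothetical. The bit positions of the four flags are taken from the DNS wire format in RFC 1035 and are not stated in the slides; the remaining flag bits are ignored here.

```python
import struct

# Bit positions of the four flags described above within the 16-bit flags
# field (layout assumed from RFC 1035; other bits are left at zero).
QR = 1 << 15  # query (0) / reply (1)
AA = 1 << 10  # authoritative answer
RD = 1 << 8   # recursion desired
RA = 1 << 7   # recursion available


def pack_header(ident, flags, questions, answers=0, authority=0, additional=0):
    """Pack the six 16-bit header fields in network byte order (12 bytes)."""
    return struct.pack("!6H", ident, flags, questions, answers, authority, additional)


def unpack_header(data):
    """Decode the first 12 bytes of a DNS message into a dictionary."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR),
        "authoritative": bool(flags & AA),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
        "questions": qd,
        "answer_rrs": an,
        "authority_rrs": ns,
        "additional_rrs": ar,
    }


# A query with one question that asks the server to perform recursion.
query = pack_header(ident=0x1A2B, flags=RD, questions=1)
print(len(query))
print(unpack_header(query))
```

A reply would reuse the same identifier, set the query/reply bit, and fill in the answer, authority, and additional counts for the sections that follow.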
Four Data Sections of the
DNS Message Format
•Question Section
•Answer Section (or Reply Section or Response Section)
•Authority Section
•Additional Information Section
Question Section
• This Section contains information about the query that is
being made.
• This section includes
•A name field that contains the name that is
being queried, and
•A type field that indicates the type of question
being asked about the name.
• For example, the question may ask for a host address associated
with a name (Type A) or for the mail server for a name (Type MX).
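As a rough illustration of the name and type fields, the Python sketch below encodes one question entry the way it travels on the wire. The function name encode_question is hypothetical. The numeric type codes, the length-prefixed label encoding, and the trailing class field are taken from RFC 1035; the class field is not mentioned in the slides and is included only because the wire format requires it.

```python
import struct

# Numeric codes for the four record types covered here (from RFC 1035).
TYPE_CODES = {"A": 1, "NS": 2, "CNAME": 5, "MX": 15}
CLASS_IN = 1  # the Internet class (an addition beyond what the slides cover)


def encode_question(name, rtype):
    """Encode a question entry: length-prefixed labels, then type and class."""
    encoded = b""
    for label in name.rstrip(".").split("."):
        encoded += bytes([len(label)]) + label.encode("ascii")
    encoded += b"\x00"  # a zero-length label terminates the name
    return encoded + struct.pack("!HH", TYPE_CODES[rtype], CLASS_IN)


# Ask for the mail server of foo.com (Type MX).
print(encode_question("foo.com", "MX"))
```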
Answer Section Or Reply Section Or
Response Section
•In a reply from a DNS server, the answer section
contains the resource records for the name that
was originally queried.
•In each resource record there is the Type (A, NS,
CNAME, and MX), the Value, and the TTL.
•A reply can return multiple RRs in the answer,
since a hostname can have multiple IP addresses.
Authority Section
•The Authority Section contains records
of other authoritative servers.
Additional Section
•The additional section contains other helpful
records.
•For example, the answer field in a reply to an MX
query contains a resource record providing the
canonical hostname of a mail server.
•The additional section contains a Type A record
providing the IP address for the canonical hostname
of the mail server.
How would you like to send a DNS query
message directly from the host you’re
working on to some DNS server?
•This can easily be done with the nslookup
program, which is available from most
Windows and UNIX platforms.
nslookup in the Windows Host
•Open the Command Prompt.
•Invoke the nslookup program by simply typing
“nslookup.”
•Send a DNS query to any DNS server (root, TLD, or
authoritative).
•Receive the reply message from the DNS server.
•Now nslookup will display the records included in the
reply (in a human-readable format).
End of 2.5 DNS --
The Internet's
Directory Service.
Peer-to-Peer Applications
•P2P File Distribution
•Distributed Hash Tables
2.6.1 P2P File Distribution
•In Client-Server File Distribution, the server must
send a copy of the file to each of the peers—
placing an enormous burden on the server and
consuming a large amount of server bandwidth.
•In P2P File Distribution, each peer can
redistribute any portion of the file it has received
to any other peers, thereby assisting the server in
the distribution process.
Most Popular P2P File Distribution Protocol
•As of 2012, the most popular P2P file distribution
protocol is BitTorrent.
•It was originally developed by Bram Cohen.
•Now, there are many different independent
BitTorrent clients conforming to the BitTorrent
protocol, just as there are a number of Web
browser clients that conform to the HTTP protocol.
Scalability of P2P Architectures
•Suppose the server and the peers are connected
to the Internet with access links.
•Let
• u_s be the upload rate of the server’s access link.
• u_i be the upload rate of the i-th peer’s access link.
• d_i be the download rate of the i-th peer’s access link.
• F be the size of the file to be distributed (in bits).
• N be the number of peers that want to obtain a copy of the file.
The Distribution Time
•The distribution time is the time it takes to get
a copy of the file to all N peers.
•Assume that the server and clients are not
participating in any other network applications, so
that all of their upload and download access
bandwidth can be fully devoted to distributing this
file.
File Distribution
Let’s first determine the distribution time for the
Client-Server Architecture
•Let D_cs be the distribution time for the
client-server architecture.
•We make two observations:
•Observation 1: The server must transmit one copy of the file to each of
the N peers. Thus the server must transmit NF bits. Since the server’s
upload rate is u_s, the time to distribute the file must be at least NF/u_s.
•Observation 2: Let d_min denote the download rate of the peer with the
lowest download rate, that is, d_min = min{d_1, d_2, ..., d_N}. The peer
with the lowest download rate cannot obtain all F bits of the file in less
than F/d_min seconds. Thus the minimum distribution time is at least F/d_min.
•Putting these two observations together, we obtain
$$D_{cs} \ge \max\left\{ \frac{NF}{u_s},\ \frac{F}{d_{min}} \right\}$$
This provides a lower bound on the minimum
distribution time for the client-server architecture.
For N large enough, the bound is dominated by the NF/u_s term.
Thus, the distribution time increases linearly with the
number of peers N.
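The client-server bound, D_cs ≥ max{NF/u_s, F/d_min}, can be evaluated with a few lines of Python. This is a minimal sketch: the function name client_server_time and all the numbers in the example (file size, link rates) are assumptions chosen only to show the linear growth in N.

```python
def client_server_time(file_bits, num_peers, server_upload, download_rates):
    """Lower bound D_cs >= max(N*F/u_s, F/d_min), in seconds."""
    d_min = min(download_rates)
    return max(num_peers * file_bits / server_upload, file_bits / d_min)


# Hypothetical numbers: a 1 Gbit file, a 100 Mbps server upload link,
# and peers that each download at 10 Mbps.
F = 1e9
for n in (1, 10, 100):
    print(n, client_server_time(F, n, 100e6, [10e6] * n))
```

For small N the slowest download link sets the bound; once N is large enough, the NF/u_s term takes over and the time grows in proportion to N.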
Now Let’s determine the
distribution time for the
Peer-To-Peer Architecture
Determine the distribution time for the Peer-To-
Peer Architecture (P2P Architecture)
•In P2P architecture, each peer can assist the server in
distributing the file. In particular, when a peer receives some
file data, it can use its own upload capacity to redistribute
the data to other peers.
•Calculating the distribution time for the P2P architecture is
more complicated than for the client-server architecture,
since the distribution time depends on how each peer
distributes portions of the file to the other peers.
We first make the following observations:
•Observation 1: At the beginning of the distribution, only the server has
the file. To get this file into the community of peers, the server
must send each bit of the file at least once into its access
link. Since u_s is the upload rate of the server, sending all F bits
once takes at least F/u_s seconds. Thus, the minimum distribution time
is at least F/u_s. (Unlike the client-server scheme, a bit sent once by
the server may not have to be sent by the server again, as the
peers may redistribute the bit among themselves.)
•Observation 2: As with the client-server architecture, the peer with the
lowest download rate, d_min = min{d_1, d_2, ..., d_N}, cannot obtain all F
bits of the file in less than F/d_min seconds. Thus the minimum
distribution time is at least F/d_min.
•Observation 3: Finally, observe that the total upload capacity of the
system as a whole is equal to the upload rate of the server plus the
upload rates of each of the individual peers, that is,
u_total = u_s + u_1 + … + u_N. The system must deliver (upload) F
bits to each of the N peers, thus delivering a total of NF bits.
This cannot be done at a rate faster than u_total.
Thus, the minimum distribution time is also at least
NF/(u_s + u_1 + … + u_N).
Putting these three observations together, we obtain the
minimum distribution time for P2P, denoted by D_P2P:
$$D_{P2P} \ge \max\left\{ \frac{F}{u_s},\ \frac{F}{d_{min}},\ \frac{NF}{u_s + \sum_{i=1}^{N} u_i} \right\}$$
This provides a lower bound for the minimum distribution time for the
P2P architecture.
If we imagine that each peer can redistribute a bit as soon as it
receives the bit, then there is a redistribution scheme that actually
achieves this lower bound.
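The P2P bound, D_P2P ≥ max{F/u_s, F/d_min, NF/(u_s + u_1 + … + u_N)}, can be evaluated in the same way. This is a minimal sketch under stated assumptions: the function name p2p_time and the rates in the example are hypothetical, and the number of peers N is taken as the length of the peer upload list.

```python
def p2p_time(file_bits, server_upload, peer_uploads, download_rates):
    """Lower bound D_P2P >= max(F/u_s, F/d_min, N*F/(u_s + sum of u_i))."""
    num_peers = len(peer_uploads)
    total_upload = server_upload + sum(peer_uploads)
    return max(
        file_bits / server_upload,
        file_bits / min(download_rates),
        num_peers * file_bits / total_upload,
    )


# Hypothetical numbers: a 1 Gbit file, a 100 Mbps server upload link,
# and peers that each upload at 10 Mbps and download at 100 Mbps.
F = 1e9
for n in (1, 10, 100):
    print(n, p2p_time(F, 100e6, [10e6] * n, [100e6] * n))
```

Every peer that joins adds its own upload rate to the denominator of the third term, so the bound grows far more slowly with N than NF/u_s does.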
Distribution time for P2P and client-server architecture
Assumptions in the graph
•We have set F/u = 1 hour, u_s = 10u, and d_min ≥ u_s, where u is the
upload rate shared by all peers.
•Thus, a peer can transmit the entire file in one
hour, the server transmission rate is 10 times the
peer upload rate, and (for simplicity) the peer
download rates are set large enough so as not to
have an effect.
Comparison of Client Server Architecture
with P2P Architecture using the Graph
•For the client-server architecture, the distribution time
increases linearly and without bound as the number of
peers increases.
•For the P2P architecture, the minimal distribution time is
not only always less than the distribution time of the client-
server architecture; it is also less than one hour for any
number of peers N. Thus, applications with the P2P
architecture can be self-scaling.
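The two claims above can be checked by substituting the graph's assumptions into the two lower bounds. The short derivation below is a worked sketch: it assumes every peer has the same upload rate u, with F/u = 1 hour, u_s = 10u, and d_min ≥ u_s, and it takes each lower bound as the actual distribution time. The value N = 30 is chosen only as an example.

```latex
\begin{align*}
\frac{F}{u_s} &= \frac{F}{10u} = 0.1 \text{ hour}, \qquad
\frac{F}{d_{min}} \le \frac{F}{u_s} = 0.1 \text{ hour} \\
D_{cs} &= \max\left\{ \frac{NF}{u_s},\ \frac{F}{d_{min}} \right\}
        = \frac{NF}{10u} = \frac{N}{10} \text{ hours} \\
D_{P2P} &= \max\left\{ \frac{F}{u_s},\ \frac{F}{d_{min}},\ \frac{NF}{u_s + Nu} \right\}
         = \max\left\{ 0.1,\ \frac{N}{10 + N} \right\} \text{ hours} < 1 \text{ hour} \\
N = 30:\quad D_{cs} &= 3 \text{ hours}, \qquad D_{P2P} = \frac{30}{40} = 0.75 \text{ hour}
\end{align*}
```

Since N/(10 + N) is below 1 for every N, the P2P time never reaches one hour, while N/10 grows without bound.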
BitTorrent
•BitTorrent is a popular P2P protocol for file distribution.
•In BitTorrent lingo, the collection of all peers
participating in the distribution of a particular file is
called a torrent.
•Peers in a torrent download equal-size chunks of the
file from one another, with a typical chunk size of 256
KBytes.
•When a peer first joins a torrent, it has no chunks. Over
time it accumulates more and more chunks.
•While it downloads chunks it also uploads chunks to
other peers.
•Once a peer has acquired the entire file, it may leave
the torrent, or remain in the torrent and continue to
upload chunks to other peers.
•Any peer may leave the torrent at any time with only a
subset of chunks, and later rejoin the torrent.
Operation of BitTorrent Protocol
•Each torrent has an infrastructure node called a
tracker.
•When a peer joins a torrent, it registers itself with the
tracker and periodically informs the tracker that it is
still in the torrent. Thus the tracker keeps track of the
peers that are participating in the torrent.
•A given torrent may have fewer than ten or more than
a thousand peers participating at any instant of time.
File distribution with BitTorrent
Working of BitTorrent Protocol
•When a new peer, Alice, joins the torrent, the tracker
randomly selects a subset of peers (for concreteness, say 50)
from the set of participating peers, and sends the IP
addresses of these 50 peers to Alice.
•Possessing this list of peers, Alice attempts to establish
concurrent TCP connections with all the peers on this
list.
•Let’s call all the peers with which Alice succeeds in
establishing a TCP connection “neighboring peers.”
•As time evolves, some of these peers may leave
and other peers (outside the initial 50) may
attempt to establish TCP connections with Alice.
•So a peer’s neighboring peers will fluctuate over
time.
•At any given time, each peer will have a subset of
chunks from the file, with different peers having
different subsets.
•Periodically, Alice will ask each of her neighboring peers
for the list of the chunks they have.
•If Alice has L different neighbors, she will obtain L lists
of chunks. With this knowledge, Alice will issue
requests for chunks she currently does not have.
•So at any given instant of time, Alice will have a subset of
chunks and will know which chunks her neighbors have.
•With this information, Alice will have two important
decisions to make.
•First, which chunks should she request first from her
neighbors?
•Second, to which of her neighbors should she send
requested chunks?
•In deciding which chunks to request, Alice uses a technique
called rarest first.
•The idea is to determine, from among the chunks she does
not have, the chunks that are the rarest among her neighbors
(that is, the chunks that have the fewest repeated copies among her neighbors)
and then request those rarest chunks first.
•In this manner, the rarest chunks get more quickly
redistributed, aiming to (roughly) equalize the numbers of
copies of each chunk in the torrent.
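The rarest-first idea can be sketched in Python as follows. This is an illustrative sketch, not the logic of any particular BitTorrent client: the function name rarest_first, the peer names, and the chunk numbers are all hypothetical, and ties are broken by chunk index only to keep the result deterministic.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks Alice lacks by how few neighbors hold them."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks - my_chunks)  # only count chunks Alice is missing
    # Fewest copies first; ties broken by chunk index for a stable result.
    return sorted(counts, key=lambda chunk: (counts[chunk], chunk))


# Hypothetical torrent state: Alice holds chunk 0; three neighbors report lists.
alice = {0}
neighbors = {
    "peer_a": {0, 1, 2},
    "peer_b": {1, 2, 3},
    "peer_c": {1, 4},
}
print(rarest_first(alice, neighbors))
```

Chunks 3 and 4 are each held by a single neighbor, so they come first in the request order; chunk 1, held by all three neighbors, comes last.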
Clever Trading Algorithm
•To determine which requests Alice responds to,
BitTorrent uses a Clever Trading Algorithm.
•The basic idea is that Alice gives priority to the
neighbors that are currently supplying her data at the
highest rate.
•Specifically, for each of her neighbors, Alice
continually measures the rate at which she receives
bits and determines the four peers that are feeding
her bits at the highest rate.
•She then reciprocates by sending chunks to these
same four peers. Every 10 seconds, she recalculates
the rates and possibly modifies the set of four peers.
•In BitTorrent lingo, these four peers are said to be
unchoked.
•Every 30 seconds, she also picks one additional
neighbor at random and sends it chunks. Let’s call the
randomly chosen peer Bob.
•In BitTorrent lingo, Bob is said to be optimistically
unchoked.
•Because Alice is sending data to Bob, she may
become one of Bob’s top four uploaders, in which
case Bob would start to send data to Alice.
•If the rate at which Bob sends data to Alice is high
enough, Bob could then become one of Alice’s top
four uploaders.
•In other words, every 30 seconds, Alice will randomly choose
a new trading partner and initiate trading with that partner.
•If the two peers are satisfied with the trading, they will put
each other in their top four lists and continue trading with
each other until one of the peers finds a better partner.
•The effect is that peers capable of uploading at compatible
rates tend to find each other.
•The random neighbor selection also allows new peers to get
chunks, so that they can have something to trade.
•All other neighboring peers besides these five peers (four
“top” peers and one probing peer) are “choked,” that is, they
do not receive any chunks from Alice.
•BitTorrent has a number of interesting mechanisms including
pieces (mini-chunks), pipelining, random first selection,
endgame mode, and anti-snubbing.
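One round of the trading algorithm described above can be sketched in Python. This is a simplified illustration under stated assumptions: the function name choose_unchoked, the peer names, and the rates are hypothetical, and the sketch shows a single selection round without the 10-second and 30-second timers.

```python
import random


def choose_unchoked(rates, top_n=4, rng=random):
    """Return (top uploaders, optimistic pick) from measured receive rates."""
    # Neighbors currently feeding Alice bits at the highest rates.
    ranked = sorted(rates, key=rates.get, reverse=True)
    top = ranked[:top_n]
    # One additional neighbor chosen at random from everyone else.
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None
    return top, optimistic


# Hypothetical receive rates (in kbps) that Alice measured for her neighbors.
rates = {"p1": 50, "p2": 10, "p3": 80, "p4": 0, "p5": 30, "p6": 65, "bob": 0}
top, optimistic = choose_unchoked(rates)
print(top, optimistic)
```

Every neighbor that is neither in the top four nor the randomly chosen one is choked and receives no chunks from Alice in that round.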
2.6.2 Distributed Hash Tables (DHTs)

VTU V SEM CNS Module 1 PPT 2018 Batch students

  • 1. Module 1 Application Layer VENKATESH BHAT Senior Associate Professor Department of CSE AIET, Moodbidri
  • 2. SYLLABUS – MODULE 1 •Principles of Network Applications: Network Application Architectures, Processes Communicating, Transport Services Available to Applications, Transport Services Provided by the Internet, Application-Layer Protocols. •The Web and HTTP: Overview of HTTP, Non-persistent and Persistent Connections, HTTP Message Format, User-Server Interaction: Cookies, Web Caching, The Conditional GET, •File Transfer: FTP Commands & Replies, •Electronic Mail in the Internet: SMTP, Comparison with HTTP, Mail Message Format, Mail Access Protocols, •DNS --The Internet's Directory Service: Services Provided by DNS, Overview of How DNS Works, DNS Records and Messages, •Peer-to-Peer Applications: P2P File Distribution, Distributed Hash Tables, •Socket Programming: creating Network Applications: Socket Programming with UDP, Socket Programming with TCP.
  • 3. CHAPTERS FROM THE TEXT BOOK 1 2.1 To 2.7
  • 4. 2.1 Principles of Network Applications -- Syllabus 2.1.1 Network Application Architectures. 2.1.2 Processes Communicating. 2.1.3 Transport Services Available to Applications. 2.1.4 Transport Services Provided by the Internet. 2.1.5 Application-Layer Protocols.
  • 5. Principles of Network Applications •Network application development is writing programs that run on different end systems and communicate with each other over the network.
  • 6. Principles of Network Applications •Example: •Web Application •Peer-to-Peer File Sharing System
  • 7. Principles of Network Applications • Example: • Web Application: • In the Web application, there are two distinct programs that communicate with each other: • The browser program running in the user’s host (desktop, laptop, tablet, smartphone, and so on); and • The Web server program running in the Web server host.
  • 8. Principles of Network Applications •Example: •Peer-to-Peer File Sharing System •In a P2P File-Sharing System, there is a program in each host that participates in the file-sharing community. •In this case, the programs in the various hosts may be similar or identical.
  • 9. IMPORTANT POINTS TO BE CONSIDER WHEN WE DEVELOP A NEW NETWORK APPLICATION • When developing our new application, we need to write software that will run on multiple end systems. • This software could be written in C, Java, or Python. • We do not need to write software that runs on network core devices, such as routers or link-layer switches. • Even if we wanted to write application software for these network-core devices, we wouldn’t be able to do so. • Network-core devices do not function at the application layer but instead function at lower layers— specifically at the network layer and below. • Communication for a network application takes place between end systems at the application layer
  • 10. Network Application Architectures • An application’s architecture is distinctly different from the network architecture. • From the application developer’s perspective, the network architecture is fixed and provides a specific set of services to applications. • The application architecture is designed by the application developer and dictates how the application is structured over the various end systems. • An Application Developer will draw on one of the two predominant architectural paradigms used in modern network applications: • The client-server architecture • The peer-to-peer (P2P) architecture
  • 11. Client-Server Architecture • There is an always-on host, called the server, which services requests from many other hosts, called clients. • A classic example is the Web application. • Here, the always-on host is the Web server. • The Web server services requests from browsers running on client hosts. • When a Web server receives a request for an object from a client host, it responds by sending the requested object to the client host.
  • 12. Characteristics of Client-Server Architecture • With the client-server architecture, clients do not directly communicate with each other; for example, in the Web application, two browsers do not directly communicate. • The server has a fixed, well-known address, called an IP address. Because the server has a fixed, well-known address, and because the server is always on, a client can always contact the server by sending a packet to the server’s IP address. • Some of the better-known applications with a client-server architecture include the Web, FTP, Telnet, and e-mail. • Often, a single server host is incapable of keeping up with all the requests from its clients. For this reason, a data center, housing a large number of hosts, is often used to create a powerful virtual server.
  • 13. Some Examples of Client-Server Architecture • The most popular Internet services employ one or more data centers: • Search engines (e.g., Google and Bing), • Internet commerce (e.g., Amazon and eBay), • Web-based email (e.g., Gmail and Yahoo Mail), • Social networking (e.g., Facebook and Twitter).
  • 14. Disadvantages •Infrastructure intensive: a data center can have hundreds of thousands of servers, which must be powered and maintained. •Service providers must pay recurring interconnection and bandwidth costs for sending data to, and receiving data from, the Internet.
  • 15. A Sample Client-Server Architecture [Figure: clients connected to a server]
  • 16. P2P Architecture • There is minimal (or no) reliance on dedicated servers in data centers. • The application exploits direct communication between pairs of intermittently connected hosts, called peers. • The peers are not owned by the service provider, but are instead desktops and laptops controlled by users, with most of the peers residing in homes, universities, and offices. • Because the peers communicate without passing through a dedicated server, the architecture is called peer-to-peer. • Many popular, traffic-intensive applications are based on P2P architectures.
  • 17. P2P Applications include • File Sharing (e.g., BitTorrent and LimeWire), • Peer-Assisted Download Acceleration (e.g., Xunlei), • Internet Telephony (e.g., Skype), • IPTV (e.g., Kankan and PPstream).
  • 18. A Sample P2P Architecture
  • 19. Hybrid Architectures •Combining both client-server and P2P elements. •For example, for many instant messaging applications, servers are used to track the IP addresses of users, but user-to-user messages are sent directly between user hosts (without passing through intermediate servers)
  • 20. Features of P2P Architectures • P2P architectures are self-scalable. • For example, in a P2P file-sharing application, although each peer generates workload by requesting files, each peer also adds service capacity to the system by distributing files to other peers. • P2P architectures are also cost-effective. • Because they normally don’t require significant server infrastructure and server bandwidth (in contrast with client-server designs with data centers), their cost is lower.
  • 21. P2P Applications Face Three Major Challenges • ISP Friendly. Most residential ISPs (including DSL and cable ISPs) have been dimensioned for “asymmetrical” bandwidth usage, that is, for much more downstream than upstream traffic. But P2P video streaming and file distribution applications shift upstream traffic from servers to residential ISPs, thereby putting significant stress on the ISPs. Future P2P applications need to be designed so that they are friendly to ISPs [Xie 2008]. • Security. Because of their highly distributed and open nature, P2P applications can be a challenge to secure. • Incentives. The success of future P2P applications also depends on convincing users to volunteer bandwidth, storage, and computation resources to the applications, which is the challenge of incentive design.
  • 23. Process •What is a process? •A process can be thought of as a program that is running within an end system.
  • 24. Processes Communicating •How do processes running on the same host communicate? •How do processes running on different hosts (with potentially different operating systems) communicate?
  • 26. How Do Processes Running on the Same Host Communicate? •When processes are running on the same end system, they can communicate with each other with interprocess communication, using rules that are governed by the end system’s operating system.
  • 28. How Do Processes Running on Different Hosts (with Potentially Different Operating Systems) Communicate? •Processes on two different end systems communicate with each other by exchanging messages across the computer network. A sending process creates and sends messages into the network; a receiving process receives these messages and possibly responds by sending messages back.
  • 29. Client and Server Processes •A network application consists of pairs of processes that send messages to each other over a network. •For example, •In the Web application, a client browser process exchanges messages with a Web server process. •In a P2P file-sharing system, a file is transferred from a process in one peer to a process in another peer.
  • 30. Client and Server Processes – Contd… • For each pair of communicating processes, we typically label one of the two processes as the client and the other process as the server. • With the Web, a browser is a client process and a Web server is a server process. • With P2P file sharing, the peer that is downloading the file is labeled as the client, and the peer that is uploading the file is labeled as the server. • In some applications, such as in P2P file sharing, a process can be both a client and a server. A process in a P2P file-sharing system can both upload and download files.
  • 31. Definition of Client and Server Process • In the context of a communication session between a pair of processes, •The process that initiates the communication (that is, initially contacts the other process at the beginning of the session) is labeled as the client. •The process that waits to be contacted to begin the session is the server.
  • 32. Examples of Clients and Servers •In the Web, a browser process initiates contact with a Web server process; hence the browser process is the client and the Web server process is the server. •In P2P file sharing, when Peer A asks Peer B to send a specific file, Peer A is the client and Peer B is the server in the context of this specific communication session. When there’s no confusion, we’ll sometimes also use the terminology “client side and server side of an application.”
  • 33. The Interface Between the Process and the Computer Network •A Process sends messages into, and receives messages from, the network through a software interface called a socket.
  • 34. Socket Communication Between Two Processes that communicate over the Internet.
  • 35. Socket •A Socket is the interface between the Application Layer and the Transport Layer within a host. •It is also referred to as the Application Programming Interface (API) between the application and the network, since the socket is the programming interface with which network applications are built. •The application developer has control of everything on the application-layer side of the socket but has little control of the transport-layer side of the socket.
  • 36. The Application Developer’s Control on the Transport-Layer Side •The only control that the application developer has on the transport-layer side is: •The choice of transport protocol, and •The ability to fix a few transport-layer parameters, such as maximum buffer and maximum segment sizes. •Once the application developer chooses a transport protocol, the application is built using the transport-layer services provided by that protocol.
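The two controls named above can be sketched in Python as follows. This is a minimal illustration, not a recommended configuration: the 64 KiB buffer size is an arbitrary value chosen for the example, and the operating system may round or cap whatever is requested.

```python
import socket

# Control 1: the choice of transport protocol.
# SOCK_STREAM selects TCP; SOCK_DGRAM selects UDP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tcp_kind = tcp_sock.type
udp_kind = udp_sock.type

# Control 2: a few transport-layer parameters, here the receive buffer size.
# 64 KiB is an arbitrary example value; the OS decides what it actually grants.
tcp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024)
granted = tcp_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("receive buffer granted by the OS:", granted, "bytes")

tcp_sock.close()
udp_sock.close()
```

Everything else on the transport-layer side (segmentation, retransmission, and so on) is handled by the operating system and is not visible through the socket.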
  • 37. Addressing Processes •In order for a process running on one host to send packets to a process running on another host, the receiving process needs to have an address. •To identify the receiving process, two pieces of information need to be specified: •The address of the host and •An identifier that specifies the receiving process in the destination host.
  • 38. Addressing Processes – Contd… •In the Internet, the host is identified by its IP address. An IPv4 address is a 32-bit quantity that we can think of as uniquely identifying the host. •The sending process must also identify the receiving process running in the host. This information is needed because a host could be running many network applications. A destination port number serves this purpose.
  • 39. Popular applications have been assigned Specific Port Numbers. •For example, •A Web server is identified by port number 80. •A mail server process (using the SMTP protocol) is identified by port number 25. •A list of well-known port numbers for all Internet standard protocols can be found at http://www.iana.org.
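As a small sketch of this addressing scheme in Python: a receiving process is named by the pair (IP address, port number). The address 192.0.2.10 is a documentation-only placeholder, not a real server, and the dictionary simply restates the two port numbers given above.

```python
import socket

# Port numbers named on the slide above.
WELL_KNOWN_PORTS = {"http": 80, "smtp": 25}

# 192.0.2.10 is a placeholder from the address range reserved for documentation.
host_ip = "192.0.2.10"
packed = socket.inet_aton(host_ip)   # the IPv4 address as raw bytes
address_bits = len(packed) * 8       # 4 bytes, that is, 32 bits

# A process is addressed by the pair (IP address, port number).
web_server_address = (host_ip, WELL_KNOWN_PORTS["http"])
print(address_bits, web_server_address)
```

This (host, port) tuple is the form that Python's socket calls such as `connect()` and `sendto()` expect for IPv4 destinations.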
  • 40. SYLLABUS – MODULE 1 •2.1 Principles of Network Applications: •Network Application Architectures, •Processes Communicating, •Transport Services Available to Applications, •Transport Services Provided by the Internet, •Application-Layer Protocols.
  • 41. Transport Services Available to Applications •Services that a transport-layer protocol can offer to applications invoking it can be classified along four dimensions: 1. Reliable data transfer 2. Throughput 3. Timing 4. Security
  • 42. Reliable Data Transfer • Packets can get lost within a computer network. • A packet can overflow a buffer in a router, or can be discarded by a host or router after having some of its bits corrupted. • For many applications (such as electronic mail, file transfer, remote host access, Web document transfers, and financial applications), data loss can have devastating consequences. • To support these applications, something must guarantee that the data sent by one end of the application is delivered correctly and completely to the other end. If a protocol provides such a guaranteed data delivery service, it is said to provide reliable data transfer.
  • 43. One important service that a transport-layer protocol can potentially provide to an application •One important service that a transport-layer protocol can potentially provide to an application is Process-To-Process Reliable Data Transfer. •When a transport protocol provides this service, the sending process can just pass its data into the socket and know with complete confidence that the data will arrive without errors at the receiving process.
  • 44. Loss-Tolerant Applications • When a transport-layer protocol doesn’t provide reliable data transfer, some of the data sent by the sending process may never arrive at the receiving process. • This may be acceptable for loss-tolerant applications, most notably multimedia applications such as conversational audio/video, which can tolerate some amount of data loss. • In these multimedia applications, lost data might result in a small glitch in the audio/video, which is not a crucial impairment.
  • 45. Throughput • In the context of a communication session between two processes along a network path, Throughput is the rate at which the sending process can deliver bits to the receiving process. • Because other sessions will be sharing the bandwidth along the network path, and because these other sessions will be coming and going, the available throughput can fluctuate with time. These observations lead to another natural service that a transport-layer protocol could provide, namely, guaranteed available throughput at some specified rate.
  • 46. Throughput – Contd… •The application could request a guaranteed throughput of r bits/sec, and the transport protocol would then ensure that the available throughput is always at least r bits/sec. Such a guaranteed throughput service would appeal to many applications.
  • 47. For Example •If an Internet telephony application encodes voice at 32 kbps, it needs to send data into the network and have data delivered to the receiving application at this rate. •If the transport protocol cannot provide this throughput, the application would need to encode at a lower rate or may have to give up, since receiving half of the needed throughput is of little or no use to this Internet telephony application.
  • 48. Bandwidth-Sensitive Applications •Applications that have throughput requirements are said to be bandwidth-sensitive applications. Many current multimedia applications are bandwidth sensitive, although some multimedia applications may use adaptive coding techniques to encode digitized voice or video at a rate that matches the currently available throughput.
  • 49. Elastic Applications •While bandwidth-sensitive applications have specific throughput requirements, elastic applications can make use of as much, or as little, throughput as happens to be available. Electronic mail, file transfer, and Web transfers are all elastic applications.
  • 50. Timing • A transport-layer protocol can also provide timing guarantees. • As with throughput guarantees, timing guarantees can come in many shapes and forms. • An example guarantee might be that every bit that the sender pumps into the socket arrives at the receiver’s socket no more than 100 msec later. • Such a service would be appealing to interactive real-time applications, such as Internet telephony, virtual environments, teleconferencing, and multiplayer games, all of which require tight timing constraints on data delivery in order to be effective.
  • 51. Timing – Contd… •Long delays in Internet telephony, for example, tend to result in unnatural pauses in the conversation; •In a multiplayer game or virtual interactive environment, a long delay between taking an action and seeing the response from the environment makes the application feel less realistic.
  • 52. Security • A transport protocol can provide an application with one or more security services. • For example, in the sending host, a transport protocol can encrypt all data transmitted by the sending process, and in the receiving host, the transport-layer protocol can decrypt the data before delivering the data to the receiving process. • Such a service would provide confidentiality between the two processes, even if the data is somehow observed between sending and receiving processes. • A transport protocol can also provide other security services in addition to confidentiality, including data integrity and end-point authentication
  • 54. Transport Services Provided by the Internet •The Internet makes two transport protocols available to applications, •UDP and •TCP.
  • 55. TCP Services •The TCP service model includes •A connection-oriented service, •A reliable data transfer service, and •A congestion-control mechanism. •When an application invokes TCP as its transport protocol, the application receives all of these services from TCP.
  • 56. Connection-Oriented Service •TCP has the client and server exchange transport-layer control information with each other before the application-level messages begin to flow. This exchange of control information is called handshaking. •The handshaking procedure alerts the client and server, allowing them to prepare for the packets that follow.
  • 57. Connection-Oriented Service •After the handshaking phase, a TCP connection is said to exist between the sockets of the two processes. The connection is a full-duplex connection in that the two processes can send messages to each other over the connection at the same time. •When the application finishes sending messages, it must tear down the connection. Contd…
  • 58. Reliable Data Transfer Service •The communicating processes can rely on TCP to deliver all data sent without error and in the proper order. •When one side of the application passes a stream of bytes into a socket, it can count on TCP to deliver the same stream of bytes to the receiving socket, with no missing or duplicate bytes.
  • 59. Congestion-Control Mechanism Service •It is a service for the general welfare of the Internet rather than for the direct benefit of the communicating processes. •The TCP congestion-control mechanism throttles a sending process (client or server) when the network is congested between sender and receiver.
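The connection-oriented and reliable byte-stream services described above can be observed with a small loopback experiment in Python. This is only a sketch under simple assumptions: both ends run in one program on 127.0.0.1, the server echoes whatever it receives, and the helper name `serve_once` is made up for this example.

```python
import socket
import threading

def serve_once(listener):
    """Accept one connection and echo back whatever bytes arrive."""
    conn, _ = listener.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:             # empty read: the client finished sending
                break
            conn.sendall(chunk)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
server = threading.Thread(target=serve_once, args=(listener,))
server.start()

# create_connection() returns only after the TCP handshake has completed.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"hello, ")            # two separate writes...
client.sendall(b"world")
client.shutdown(socket.SHUT_WR)       # done sending, but still able to receive

received = b""
while True:
    chunk = client.recv(1024)
    if not chunk:                     # the server closed the connection
        break
    received += chunk

client.close()
server.join()
listener.close()
print(received)                       # ...arrive as one ordered byte stream
```

Note that TCP preserves the order and content of the bytes, not the boundaries of the individual writes: the receiver sees a stream, not two messages.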
  • 60. UDP Services •UDP is connectionless, so there is no handshaking before the two processes start to communicate. •UDP provides an unreliable data transfer service •that is, when a process sends a message into a UDP socket, UDP provides no guarantee that the message will ever reach the receiving process.
  • 61. UDP Services •Messages that do arrive at the receiving process may arrive out of order. •UDP does not include a congestion-control mechanism, so the sending side of UDP can pump data into the layer below (the network layer) at any rate it pleases. Contd…
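For contrast, a UDP exchange needs no connection setup at all. The following Python sketch sends one datagram over the loopback interface; the timeout is there because UDP itself gives no delivery guarantee, so a receiver must be prepared for the message never arriving (on loopback it normally does).

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
receiver.settimeout(2.0)               # do not wait forever for a lost datagram
port = receiver.getsockname()[1]

# No handshake: the sender simply addresses each datagram individually.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"datagram 1", ("127.0.0.1", port))

try:
    data, source = receiver.recvfrom(2048)
except socket.timeout:
    data = None                        # UDP made no promise; the message was lost

sender.close()
receiver.close()
print(data)
```

Any retransmission, ordering, or rate control that the application needs on top of this has to be implemented by the application itself.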
  • 63. Application-Layer Protocols •An Application-Layer Protocol defines how an application’s processes, running on different end systems, pass messages to each other.
  • 64. Services Not Provided by Internet Transport Protocols •Throughput guarantee or Timing guarantee— services not provided by today’s Internet transport protocols. •Today’s Internet can often provide satisfactory service to time-sensitive applications, but it cannot provide any timing or throughput guarantees.
  • 65. In particular, an Application-Layer Protocol defines •The types of messages exchanged, for example, request messages and response messages. •The syntax of the various message types, such as the fields in the message and how the fields are delineated. •The semantics of the fields, that is, the meaning of the information in the fields. •Rules for determining when and how a process sends messages and responds to messages.
  • 66. Distinguish between Network Applications and Application-Layer Protocols •An Application-Layer Protocol is only one piece of a Network Application. •Examples: •Web Application •Internet E-Mail Application
  • 67. Example 1: Web Application •The Web is a client-server application that allows users to obtain documents from Web servers on demand. •The Web application consists of many components, including a standard for document formats (that is, HTML), Web browsers (for example, Firefox and Microsoft Internet Explorer), Web servers (for example, Apache and Microsoft servers), and an application-layer protocol.
  • 68. Example 1: Web Application (contd.) •The Web’s application-layer protocol, HTTP, defines the format and sequence of messages exchanged between browser and Web server. •Thus, HTTP is only one piece of the Web application.
  • 69. Example 2: Internet E-Mail Application •It has many components, including mail servers that house user mailboxes; mail clients (such as Microsoft Outlook) that allow users to read and create messages; a standard for defining the structure of an e-mail message; and application-layer protocols that define how messages are passed between servers, how messages are passed between servers and mail clients, and how the contents of message headers are to be interpreted.
  • 70. Example 2: Internet E-Mail Application •The principal Application-Layer Protocol for electronic mail is SMTP (Simple Mail Transfer Protocol). •Thus, e-mail’s Principal Application-Layer Protocol, SMTP, is only one piece of the E-mail Application. Contd…
  • 73. 2.2 The Web and HTTP -- Syllabus 2.2.1 Overview of HTTP 2.2.2 Non-persistent and Persistent Connections 2.2.3 HTTP Message Format 2.2.4 User-Server Interaction: Cookies 2.2.5 Web Caching 2.2.6 Conditional GET
  • 74. Overview of HTTP •The HyperText Transfer Protocol (HTTP), the Web’s application-layer protocol, is at the heart of the Web. •HTTP is implemented in two programs: •A Client Program and •A Server Program.
  • 75. Overview of HTTP (contd.) •The client program and the server program execute on different end systems. •They talk to each other by exchanging HTTP messages. •HTTP defines the structure of these messages and how the client and server exchange the messages. •URL: Uniform Resource Locator.
  • 76. Some Web Terminologies •A Web page (also called a document) consists of objects. •An object is simply a file—such as an HTML file, a JPEG image, a Java applet, or a video clip—that is addressable by a single URL. •Most Web pages consist of a base HTML file and several referenced objects.
  • 77. Some Web Terminologies •For Example, if a Web page contains HTML text and five JPEG images, then the Web page has six objects: the base HTML file plus the five images. Contd…
  • 78. Some Web Terminologies •The base HTML file references the other objects in the page with the objects’ URLs. •Each URL has two components: •The hostname of the server that houses the object and •The object’s path name. Contd…
  • 79. For Example •Consider the URL: http://www.abc.edu/myStore/picture.gif •www.abc.edu for a hostname and •/myStore/picture.gif for a path name
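The same split can be checked with Python's standard `urllib.parse` module. This is a minimal sketch using the hostname and path name given in the example above; the variable names are chosen only for this illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.abc.edu/myStore/picture.gif")

hostname = parts.hostname    # the server that houses the object
path_name = parts.path       # the object's path name on that server
print(hostname, path_name)
```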
  • 80. Some Web Terminologies •Web browsers (such as Internet Explorer and Firefox) implement the client side of HTTP. •Web servers implement the server side of HTTP and house Web objects, each addressable by a URL. •Popular Web servers include Apache and Microsoft Internet Information Server.
  • 81. •HTTP defines how Web clients request Web pages from Web servers and how servers transfer Web pages to clients. •When a user requests a Web page (for example, clicks on a hyperlink), the browser sends HTTP request messages for the objects in the page to the server. The server receives the requests and responds with HTTP response messages that contain the objects.
  • 83. HTTP •HTTP uses TCP as its underlying transport protocol. •The HTTP client first initiates a TCP connection with the server. •Once the connection is established, the browser and the server processes access TCP through their socket interfaces.
  • 84. Client Side and Server Side Sockets •On the Client Side, the socket interface is the door between the client process and the TCP connection; •On the Server Side, it is the door between the server process and the TCP connection.
  • 85. Request Response Process •The client sends HTTP request messages into its socket interface and receives HTTP response messages from its socket interface. •Similarly, the HTTP server receives request messages from its socket interface and sends response messages into its socket interface.
  • 86. HTTP is said to be a Stateless Protocol •The server sends requested files to clients without storing any state information about the client. •If a particular client asks for the same object twice in a period of a few seconds, the server does not respond by saying that it just served the object to the client; instead, the server resends the object, as it has completely forgotten. •Because an HTTP server maintains no information about the clients, HTTP is said to be a stateless protocol.
  • 88. Non-Persistent and Persistent Connections •HTTP can use both non-persistent connections and persistent connections. •Although HTTP uses persistent connections in its default mode, HTTP clients and servers can be configured to use non-persistent connections instead.
  • 89. The steps of transferring a Web page from server to client for the case of non-persistent connections •Let’s suppose the page consists of a base HTML file and 10 JPEG images, and that all 11 of these objects reside on the same server. •Suppose the URL for the base HTML file is: http://www.abc.edu/myDepartment/home.index
  • 90. Steps involved in the request http://www.abc.edu/myDepartment/home.index •1. The HTTP client process initiates a TCP connection to the server www.abc.edu on port number 80, which is the default port number for HTTP. Associated with the TCP connection, there will be a socket at the client and a socket at the server.
  • 91. Steps involved in the request http://www.abc.edu/myDepartment/home.index •2. The HTTP client sends an HTTP request message to the server via its socket. The request message includes the path name /myDepartment/home.index.
  • 92. Steps involved in the request http://www.abc.edu/myDepartment/home.index •3. The HTTP server process receives the request message via its socket, retrieves the object /myDepartment/home.index from its storage (RAM or disk), encapsulates the object in an HTTP response message, and sends the response message to the client via its socket.
  • 93. Steps involved in the request http://www.abc.edu/myDepartment/home.index •4. The HTTP server process tells TCP to close the TCP connection. (But TCP doesn’t actually terminate the connection until it knows for sure that the client has received the response message intact.)
  • 94. Steps involved in the request http://www.abc.edu/myDepartment/home.index •5. The HTTP client receives the response message. The TCP connection terminates. The message indicates that the encapsulated object is an HTML file. The client extracts the file from the response message, examines the HTML file, and finds references to the 10 JPEG objects.
  • 95. Steps involved in the request http://www.abc.edu/myDepartment/home.index •6. The first four steps are then repeated for each of the referenced JPEG objects.
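Steps 1 to 5 can be reproduced on a single machine with a short Python sketch. Several things here are assumptions made for the demonstration and are not part of the slides: the server is Python's built-in `http.server` running on the loopback address rather than www.abc.edu, the page content `PAGE` is a stand-in, and the handler name `OneObjectHandler` is invented.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>hello</body></html>"   # stand-in for the base HTML file

class OneObjectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Step 3: encapsulate the object in a response message and send it.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(PAGE)
        # Step 4: once the handler returns, the server closes the connection.

    def log_message(self, *args):
        pass                                 # keep the demonstration quiet

server = HTTPServer(("127.0.0.1", 0), OneObjectHandler)   # port 0: any free port
port = server.server_address[1]
worker = threading.Thread(target=server.handle_request)   # serve one request only
worker.start()

# Step 1: the client initiates a TCP connection to the server.
client = socket.create_connection(("127.0.0.1", port))
# Step 2: the client sends the request message, including the path name.
client.sendall(
    b"GET /myDepartment/home.index HTTP/1.1\r\n"
    b"Host: 127.0.0.1\r\n"
    b"Connection: close\r\n"
    b"\r\n"
)
# Step 5: the client reads the response until the connection terminates.
response = b""
while True:
    chunk = client.recv(4096)
    if not chunk:
        break
    response += chunk
client.close()
worker.join()
server.server_close()

header_block, _, body = response.partition(b"\r\n\r\n")
status_line = header_block.split(b"\r\n")[0].decode("ascii")
print(status_line)
print(body)
```

For step 6, a real browser would parse the returned HTML and repeat the same sequence, with a new TCP connection each time, for every referenced object.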
  • 96. HTTP with Non-Persistent Connections •The steps above illustrate the use of non-persistent connections. •Here, each TCP connection is closed after the server sends the object—the connection does not persist for other objects. •Each TCP connection transports exactly one request message and one response message. •Thus, in this example, when a user requests the Web page, 11 TCP connections are generated.
  • 97. Round-Trip Time (RTT) •It is the time it takes for a small packet to travel from client to server and then back to the client. •The RTT includes packet-propagation delays, packet queuing delays in intermediate routers and switches, and packet-processing delays.
  • 98. “three-way handshake” •Client sends a small TCP segment to the server, •The server acknowledges and responds with a small TCP segment, and, •The client acknowledges back to the server.
  • 99. “three-way handshake” •The first two parts of the three way handshake take one RTT. •After completing the first two parts of the handshake, the client sends the HTTP request message combined with the third part of the three-way handshake (the acknowledgment) into the TCP connection. Contd…
  • 100. “three-way handshake” •Once the request message arrives at the server, the server sends the HTML file into the TCP connection. •This HTTP request/response eats up another RTT. •Thus the total response time is two RTTs plus the transmission time at the server of the HTML file. Contd…
  • 101. Disadvantages of Non Persistent Connections 1. A new connection must be established and maintained for each requested object. For each of these connections, TCP buffers must be allocated and TCP variables must be kept in both the client and server. This can place a significant burden on the Web server, which may be serving requests from hundreds of different clients simultaneously. 2. Each object suffers a delivery delay of two RTTs— one RTT to establish the TCP connection and one RTT to request and receive an object.
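The second disadvantage adds up quickly for the example page. The following back-of-the-envelope calculation in Python assumes that the 11 objects are fetched one after another, that transmission time is ignored, and that one RTT is 100 ms; the RTT value and the function name are illustrative assumptions only.

```python
def nonpersistent_delay_rtts(num_objects, rtts_per_object=2):
    """RTTs needed when each object is fetched serially over its own TCP connection."""
    return num_objects * rtts_per_object

total_rtts = nonpersistent_delay_rtts(11)   # base HTML file + 10 JPEG images

rtt_ms = 100                                # assumed round-trip time, for illustration
total_delay_ms = total_rtts * rtt_ms
print(total_rtts, "RTTs, about", total_delay_ms, "ms")
```

Browsers can reduce this delay by opening several connections in parallel, but that does not remove the per-connection setup cost on the server.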
  • 102. HTTP with Persistent Connections •The server leaves the TCP connection open after sending a response. •Subsequent requests and responses between the same client and server can be sent over the same connection.
  • 103. HTTP with Persistent Connections •In particular, an entire Web page (in our example, the base HTML file and the 10 images) can be sent over a single persistent TCP connection. •Multiple Web pages residing on the same server can be sent from the server to the same client over a single persistent TCP connection.
  • 104. HTTP with Persistent Connections •These requests for objects can be made back-to-back, without waiting for replies to pending requests (pipelining). •The HTTP server closes a connection when it isn’t used for a certain time (a configurable timeout interval).
  • 105. HTTP with Persistent Connections •When the server receives the back-to-back requests, it sends the objects back-to-back. •The default mode of HTTP uses persistent connections with pipelining.
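A rough count of round trips shows what a persistent connection saves for the example page (a base HTML file plus 10 images). The sketch below ignores transmission time and assumes that all pipelined requests are answered within about one RTT; the function name and these simplifications are assumptions for illustration.

```python
def persistent_rtts(num_referenced, pipelined):
    """RTTs to fetch a base file and its referenced objects over one persistent connection."""
    setup = 1                    # TCP handshake, paid once
    base_file = 1                # request/response for the base HTML file
    if pipelined:
        # All referenced objects are requested back-to-back in one round trip.
        referenced = min(1, num_referenced)
    else:
        # One round trip per referenced object, one after another.
        referenced = num_referenced
    return setup + base_file + referenced

without_pipelining = persistent_rtts(10, pipelined=False)
with_pipelining = persistent_rtts(10, pipelined=True)
print(without_pipelining, with_pipelining)
```

Under the same simplifications, fetching the 11 objects serially over non-persistent connections costs 2 RTTs each, so either persistent variant is cheaper.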
  • 107. HTTP Message Format •The HTTP specifications include the definitions of the HTTP message formats. •There are two types of HTTP messages, •HTTP Request messages and •HTTP Response messages
  • 108. HTTP Request Message •A typical HTTP request message:
      GET /somedir/page.html HTTP/1.1
      Host: www.someschool.edu
      Connection: close
      User-agent: Mozilla/5.0
      Accept-language: fr
  • 109. Characteristics of the Simple Request Message •The message is written in ordinary ASCII text, so that an ordinary computer-literate human being can read it. •The message consists of five lines, each followed by a carriage return and a line feed. The last line is followed by an additional carriage return and line feed. Although this particular request message has five lines, a request message can have many more lines or as few as one line.
  • 110. Characteristics of the Simple Request Message •The first line of an HTTP request message is called the request line; the subsequent lines are called the header lines. •The request line has three fields: •The method field, •the URL field, and •the HTTP version field.
  • 111. Characteristics of the Simple Request Message •The method field can take on several different values, including GET, POST, HEAD, PUT, and DELETE. •The great majority of HTTP request messages use the GET method. •The GET method is used when the browser requests an object, with the requested object identified in the URL field. •In this example, the browser is requesting the object /somedir/page.html.
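The structure described on these slides can be made concrete with a few lines of Python that build the example request and then take it apart again. This is a sketch of the text format only, not a complete HTTP parser; the variable names are chosen for this illustration.

```python
CRLF = "\r\n"   # carriage return + line feed

request_lines = [
    "GET /somedir/page.html HTTP/1.1",   # request line
    "Host: www.someschool.edu",          # header lines
    "Connection: close",
    "User-agent: Mozilla/5.0",
    "Accept-language: fr",
]
# Each line ends with CRLF; one additional CRLF follows the last header line.
request_text = CRLF.join(request_lines) + CRLF + CRLF

# Parsing: everything before the blank line is the request line plus headers.
head = request_text.split(CRLF + CRLF, 1)[0]
request_line, *header_lines = head.split(CRLF)

# The request line has three fields: method, URL, and HTTP version.
method, url, version = request_line.split(" ")
headers = dict(line.split(": ", 1) for line in header_lines)
print(method, url, version)
print(headers)
```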
  • 112. Consider The Header line Host: www.someschool.edu •It specifies the host on which the object resides. •This header line might seem unnecessary, as there is already a TCP connection in place to the host. However, the information provided by the Host: header line is required by Web proxy caches.
  • 113. Consider The Header line Connection: close •The Browser is telling the server that it doesn’t want to bother with persistent connections; it wants the server to close the connection after sending the requested object.
  • 114. Consider The Header line User-agent: Mozilla/5.0 •It specifies the user agent, that is, the browser type that is making the request to the server. •Here the user agent is Mozilla/5.0, a Firefox browser. •This header line is useful because the server can actually send different versions of the same object to different types of user agents.
  • 115. Consider The Header line Accept-language: fr •indicates that the user prefers to receive a French version of the object, if such an object exists on the server; otherwise, the server should send its default version. •The Accept-language: header is just one of many content negotiation headers available in HTTP.
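The request-message structure described above (a request line with three fields, followed by header lines and a blank line) can be sketched in a few lines of Python. This is only an illustrative parser for the example message on the slides; the function name `parse_request` is an assumption made for this sketch and is not part of any HTTP library.

```python
# Illustrative sketch: split the example HTTP request into its parts.
RAW_REQUEST = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.someschool.edu\r\n"
    "Connection: close\r\n"
    "User-agent: Mozilla/5.0\r\n"
    "Accept-language: fr\r\n"
    "\r\n"  # the additional carriage return and line feed
)


def parse_request(raw):
    """Return (method, url, version, headers, body) for a raw request string."""
    # The blank line separates the header section from the entity body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method field, URL field, HTTP version field.
    method, url, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


method, url, version, headers, body = parse_request(RAW_REQUEST)
print(method, url, version)
print("Host:", headers["host"])
print("Entity body is empty:", body == "")
```

Header names are lower-cased here only to make lookups simple; the entity body is empty because the example uses the GET method.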
  • 116. General Format of a Request Message
  • 117. General Format of a Request Message •After the header lines (and the additional carriage return and line feed) there is an “entity body.” •The entity body is empty with the GET method, but is used with the POST method. • An HTTP client uses the POST method when the user fills out a form. •For example, when a user provides search words to a search engine.
  • 118. General Format of a Request Message •With a POST message, the user is still requesting a Web page from the server, but the specific contents of the Web page depend on what the user entered into the form fields. •If the value of the method field is POST, then the entity body contains what the user entered into the form fields. Contd…
  • 119. General Format of a Request Message •A request generated with a form does not necessarily use the POST method. Instead, HTML forms often use the GET method and include the inputted data (in the form fields) in the requested URL. Contd…
  • 120. General Format of a Request Message •For example, if a form uses the GET method, has two fields, and the inputs to the two fields are dogs and cats, then the URL will have the structure www.abc.com/animalsearch?dogs&cats Contd…
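As a rough illustration of how form input ends up in the URL of a GET request, the sketch below builds a query string with Python's standard `urllib.parse` module. The field names `first` and `second` are hypothetical; real browsers send `name=value` pairs, whereas the slide shows a simplified form of the URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def form_get_url(base, fields):
    """Append the form fields to the base URL as a query string."""
    return base + "?" + urlencode(fields)


# Hypothetical field names; the values are the inputs from the slide.
url = form_get_url("http://www.abc.com/animalsearch",
                   {"first": "dogs", "second": "cats"})
print(url)

# The server side can recover the inputs from the URL it receives.
inputs = parse_qs(urlsplit(url).query)
print(inputs)
```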
  • 121. General Format of a Request Message •The HEAD method is similar to the GET method. •When a server receives a request with the HEAD method, it responds with an HTTP message but it leaves out the requested object. •Application developers often use the HEAD method for debugging. Contd…
  • 122. General Format of a Request Message •The PUT method is often used in conjunction with Web publishing tools. •It allows a user to upload an object to a specific path (directory) on a specific Web server. •The PUT method is also used by applications that need to upload objects to Web servers. •The DELETE method allows a user, or an application, to delete an object on a Web server. Contd…
  • 123. HTTP Response Message HTTP/1.1 200 OK Connection: close Date: Tue, 09 Aug 2011 15:44:04 GMT Server: Apache/2.2.3 (CentOS) Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT Content-Length: 6821 Content-Type: text/html (data data data data data ...)
  • 124. HTTP Response Message •HTTP Response Message has three sections: •An initial status line, •Six header lines, and •The entity body. •The entity body is the meat of the message—it contains the requested object itself (represented by data data data data data ...). Contd…
  • 125. HTTP Response Message: Status Line •The status line has three fields: •The protocol version field, •A status code, and •A corresponding status message. •In this example, the status line indicates that the server is using HTTP/1.1 and that everything is OK (that is, the server has found, and is sending, the requested object).
  • 126. HTTP Response Message: Header Lines •First Header Line is Connection: close •The server uses this header line to tell the client that it is going to close the TCP connection after sending the message.
  • 128. HTTP Response Message: Header Lines •Second Header Line is Date: Tue, 09 Aug 2011 15:44:04 GMT •This header line indicates the time and date when the HTTP response was created and sent by the server. Note that this is not the time when the object was created or last modified; it is the time when the server retrieves the object from its file system, inserts the object into the response message, and sends the response message.
  • 129. HTTP Response Message: Header Lines •Third Header Line is Server: Apache/2.2.3 (CentOS) •This header line indicates that the message was generated by an Apache Web server; it is analogous to the User-agent: header line in the HTTP request message.
  • 130. HTTP Response Message: Header Lines •Fourth Header Line is Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT •This header line indicates the time and date when the object was created or last modified.
  • 131. HTTP Response Message: Header Lines •Fifth Header Line is Content-Length: 6821 •This header line indicates the number of bytes in the object being sent.
  • 132. HTTP Response Message: Header Lines •Sixth Header Line is Content-Type: text/html •This header line indicates that the object in the entity body is HTML text. (The object type is officially indicated by the Content-Type: header and not by the file extension).
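The three sections of the response message (status line, header lines, entity body) can be pulled apart with a small sketch like the one below. The body used here is a short stand-in chosen for this example, so the Content-Length value differs from the 6821 bytes on the slide; `parse_response` is a name assumed for this sketch.

```python
# Illustrative sketch: split an HTTP response into status line, headers, body.
BODY = b"<html>hello</html>"  # stand-in for the real object (assumption)

HEADER_LINES = [
    b"HTTP/1.1 200 OK",
    b"Connection: close",
    b"Date: Tue, 09 Aug 2011 15:44:04 GMT",
    b"Server: Apache/2.2.3 (CentOS)",
    b"Last-Modified: Tue, 09 Aug 2011 15:11:03 GMT",
    b"Content-Length: " + str(len(BODY)).encode("ascii"),
    b"Content-Type: text/html",
]
RAW_RESPONSE = b"\r\n".join(HEADER_LINES) + b"\r\n\r\n" + BODY


def parse_response(raw):
    """Return (version, status_code, phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # Status line: protocol version, status code, status message.
    version, code, phrase = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only, since dates contain colons too.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


version, code, phrase, headers, body = parse_response(RAW_RESPONSE)
print(version, code, phrase)
print("Content-Length matches body:", int(headers["content-length"]) == len(body))
```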
  • 133. The status code and associated phrase •The status code and associated phrase indicate the result of the request.
  • 134. Some common status codes and associated phrases include •200 OK: Request succeeded and the information is returned in the response. •301 Moved Permanently: Requested object has been permanently moved; the new URL is specified in Location: header of the response message. The client software will automatically retrieve the new URL.
  • 135. Some common status codes and associated phrases include •400 Bad Request: This is a generic error code indicating that the request could not be understood by the server. •404 Not Found: The requested document does not exist on this server. •505 HTTP Version Not Supported: The requested HTTP protocol version is not supported by the server Contd…
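The status codes listed above can be looked up with Python's standard `http.HTTPStatus` enumeration. The sketch below also groups a code by its first digit; the `describe` helper and its category labels are written for this example only.

```python
from http import HTTPStatus


def describe(status_line):
    """Split a status line and classify the code by its first digit."""
    version, code, phrase = status_line.split(" ", 2)
    code = int(code)
    kinds = {2: "success", 3: "redirection",
             4: "client error", 5: "server error"}
    return code, phrase, kinds.get(code // 100, "other")


# Print the standard phrase for each code mentioned on the slides.
for code in (200, 301, 400, 404, 505):
    print(code, HTTPStatus(code).phrase)

print(describe("HTTP/1.1 404 Not Found"))
```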
  • 137. User-Server Interaction: Cookies •An HTTP server is stateless. This simplifies server design and makes it possible to build Web servers that can handle thousands of simultaneous TCP connections. •However, a Web site often has to identify users, either because the server wishes to restrict user access or because it wants to serve content as a function of the user identity. For these purposes, HTTP uses cookies. •Cookies allow sites to keep track of users.
  • 138. Cookie technology has four components 1. A cookie header line in the HTTP response message; 2. A cookie header line in the HTTP request message; 3. A cookie file kept on the user’s end system and managed by the user’s browser; and 4. A back-end database at the Web site
  • 140. An Example of how Cookies work •Suppose Sushanth, who always accesses the Web using Internet Explorer from his home PC, contacts amazon.com for the first time. •Let us suppose that in the past he has already visited the eBay site.
  • 141. An Example of how Cookies work •When the request comes into the Amazon Web server, the server creates a unique identification number and creates an entry in its back-end database that is indexed by the identification number. Contd…
  • 142. An Example of how Cookies work •The Amazon Web server then responds to Sushanth’s browser, including in the HTTP response a Set-cookie: header, which contains the identification number. •For example, the header line might be: Set-cookie: 1678 Contd…
  • 143. An Example of how Cookies work •When Sushanth’s browser receives the HTTP response message, it sees the Set-cookie: header. The browser then appends a line to the special cookie file that it manages. •This line includes the hostname of the server and the identification number in the Set-cookie: header. •Note that the cookie file already has an entry for eBay, since Sushanth has visited that site in the past. Contd…
  • 144. An Example of how Cookies work •As Sushanth continues to browse the Amazon site, each time he requests a Web page, his browser consults his cookie file, extracts his identification number for this site, and puts a cookie header line that includes the identification number in the HTTP request. •Specifically, each of his HTTP requests to the Amazon server includes the header line: Cookie: 1678 Contd…
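The cookie exchange in this example can be simulated with a toy model of the four components: the Set-cookie: response header, the Cookie: request header, the browser's cookie file, and the site's back-end database. The classes `Site` and `Browser`, and the eBay identification number 8734, are assumptions made for this sketch; no real network traffic is involved.

```python
import itertools


class Site:
    """Toy Web site that assigns identification numbers to new visitors."""

    def __init__(self, hostname, first_id=1678):
        self.hostname = hostname
        self._ids = itertools.count(first_id)
        self.database = {}  # back-end database: id -> pages requested

    def handle(self, path, cookie=None):
        """Process one request and return the response header lines."""
        response_headers = {}
        if cookie is None or cookie not in self.database:
            # First visit: create an id and a database entry indexed by it.
            cookie = str(next(self._ids))
            self.database[cookie] = []
            response_headers["Set-cookie"] = cookie
        self.database[cookie].append(path)
        return response_headers


class Browser:
    """Toy browser that manages a cookie file (hostname -> id)."""

    def __init__(self):
        self.cookie_file = {}

    def get(self, site, path):
        """Send a request and return the request header lines that were used."""
        request_headers = {}
        if site.hostname in self.cookie_file:
            request_headers["Cookie"] = self.cookie_file[site.hostname]
        response_headers = site.handle(path, request_headers.get("Cookie"))
        if "Set-cookie" in response_headers:
            self.cookie_file[site.hostname] = response_headers["Set-cookie"]
        return request_headers


amazon = Site("amazon.com")
browser = Browser()
browser.cookie_file["ebay.com"] = "8734"  # earlier visit; value is assumed

first = browser.get(amazon, "/")        # no Cookie: header yet
second = browser.get(amazon, "/books")  # now carries Cookie: 1678
print(first, second)
print(browser.cookie_file)
print(amazon.database)
```

The second request carries the identification number, which is how the site can track the pages requested under that number even though HTTP itself is stateless.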
  • 145. Cookies can be used to identify a user •The first time a user visits a site, the user can provide a user identification (possibly his or her name). •During the subsequent sessions, the browser passes a cookie header to the server, thereby identifying the user to the server. •Cookies can thus be used to create a user session layer on top of stateless HTTP.
  • 146. Cookies can be used to identify a user •For example, when a user logs in to a Web-based e-mail application (such as Hotmail), the browser sends cookie information to the server, permitting the server to identify the user throughout the user’s session with the application. Contd…
  • 148. Web Caching (Or Proxy Server) •A Web cache, also called a proxy server, is a network entity that satisfies HTTP requests on behalf of an origin Web server. •The Web cache has its own disk storage and keeps copies of recently requested objects in this storage.
  • 149. Web Caching (Or Proxy Server) •A user’s browser can be configured so that all of the user’s HTTP requests are first directed to the Web cache. •Once a browser is configured, each browser request for an object is first directed to the Web cache Contd…
  • 150. Clients requesting objects through a Web cache
  • 152. Clients requesting objects through a Web cache: Example •Suppose a browser is requesting the object http://www.someschool.edu/campus.gif. •Here is what happens: Contd…
  • 153. A browser is requesting the object http://www.someschool.edu/campus.gif • The browser establishes a TCP connection to the Web cache and sends an HTTP request for the object to the Web cache. • The Web cache checks to see if it has a copy of the object stored locally. If it does, the Web cache returns the object within an HTTP response message to the client browser. • If the Web cache does not have the object, the Web cache opens a TCP connection to the origin server, that is, to www.someschool.edu. The Web cache then sends an HTTP request for the object into the cache-to-server TCP connection. After receiving this request, the origin server sends the object within an HTTP response to the Web cache. • When the Web cache receives the object, it stores a copy in its local storage and sends a copy, within an HTTP response message, to the client browser.
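The four steps above amount to a lookup-then-fetch routine. The sketch below models them with an in-memory dictionary in place of disk storage and a stand-in function in place of the origin server; `WebCache` and `fake_origin` are names invented for this example.

```python
class WebCache:
    """Toy Web cache: serve local copies, otherwise fetch from the origin."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.store = {}           # stands in for the cache's disk storage
        self.origin_requests = 0  # how often the origin server was contacted

    def get(self, url):
        # Check whether a copy of the object is stored locally.
        if url in self.store:
            return self.store[url]
        # Otherwise act as a client toward the origin server.
        self.origin_requests += 1
        obj = self.fetch_from_origin(url)
        # Store a copy locally, then return the object to the browser.
        self.store[url] = obj
        return obj


def fake_origin(url):
    """Stand-in for the origin server; returns placeholder bytes."""
    return ("contents of " + url).encode("ascii")


cache = WebCache(fake_origin)
URL = "http://www.someschool.edu/campus.gif"
first_copy = cache.get(URL)   # miss: fetched from the origin server
second_copy = cache.get(URL)  # hit: served from local storage
print(cache.origin_requests)
```

Only the first request reaches the origin server; the second is answered from the cache's own storage.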
  • 157. Note that a cache is both a server and a client at the same time. •When it receives requests from and sends responses to a browser, it is a server. •When it sends requests to and receives responses from an origin server, it is a client.
  • 158. Web caching has seen deployment in the Internet for two reasons. •First, a Web cache can substantially reduce the response time for a client request, particularly if the bottleneck bandwidth between the client and the origin server is much less than the bottleneck bandwidth between the client and the cache. If there is a high-speed connection between the client and the cache and if the cache has the requested object, then the cache will be able to deliver the object rapidly to the client.
  • 159. Web caching has seen deployment in the Internet for two reasons. •Second, as we will soon illustrate with an example, Web caches can substantially reduce traffic on an institution’s access link to the Internet. By reducing traffic, the institution (for example, a company or a university) does not have to upgrade bandwidth as quickly, thereby reducing costs. •Furthermore, Web caches can substantially reduce Web traffic in the Internet as a whole, thereby improving performance for all applications.
  • 160. Bottleneck between an institutional network and the Internet
  • 161. Explanation of the Diagram •This figure shows two networks •The institutional network and •The rest of the public Internet.
  • 162. The Institutional Network •It is a high-speed LAN. •A router in the institutional network and a router in the Internet are connected by a 15 Mbps link. •The origin servers are attached to the Internet but are located all over the globe. 1/3
  • 163. The Institutional Network •Suppose that the average object size is 1 Mbits and that the average request rate from the institution’s browsers to the origin servers is 15 requests per second. •Suppose that the HTTP request messages are negligibly small and thus create no traffic in the networks or in the access link (from institutional router to Internet router). Contd… 2/3
  • 164. The Institutional Network •Suppose that the amount of time it takes from when the router on the Internet side of the access link forwards an HTTP request (within an IP datagram) until it receives the response (within many IP datagrams) is two seconds on average. •Informally, we refer to this last delay as the “Internet delay.” Contd… 3/3
  • 165. The Total Response Time •It is the time from the browser’s request of an object until its receipt of the object. •It is the sum of the LAN delay, the access delay (that is, the delay between the two routers), and the Internet delay.
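  • 166. Calculation of these Delays •The traffic intensity on the LAN (taken here to be a 100 Mbps LAN) is (15 requests/sec) * (1 Mbits/request) / (100 Mbps) = 0.15 •The traffic intensity on the access link (from the Internet router to the institution router) is (15 requests/sec) * (1 Mbits/request) / (15 Mbps) = 1 •A traffic intensity of 0.15 causes only a small queuing delay, but as the traffic intensity approaches 1 the delay on a link becomes very large and grows without bound.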
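The two traffic-intensity figures can be reproduced with a one-line formula: request rate times object size, divided by link rate. The sketch below uses the numbers from the slides; the helper name `traffic_intensity` is an assumption for this example.

```python
def traffic_intensity(requests_per_sec, bits_per_object, link_bps):
    """Traffic intensity = (arrival rate * object size) / link rate."""
    return requests_per_sec * bits_per_object / link_bps


MBIT = 1_000_000  # bits in one megabit

# 15 requests/sec, 1 Mbit per object, 100 Mbps LAN, 15 Mbps access link.
lan = traffic_intensity(15, 1 * MBIT, 100 * MBIT)
access = traffic_intensity(15, 1 * MBIT, 15 * MBIT)

print(f"LAN traffic intensity: {lan:.2f}")
print(f"Access link traffic intensity: {access:.2f}")
```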
  • 168. The Conditional GET •Although caching can reduce user-perceived response times, it introduces a new problem—the copy of an object residing in the cache may be stale. •In other words, the object housed in the Web server may have been modified since the copy was cached at the client. Contd… 1/3
  • 169. The Conditional GET •HTTP has a mechanism that allows a cache to verify that its objects are up to date. This mechanism is called the Conditional GET. •An HTTP request message is a so-called conditional GET message if (1) the request message uses the GET method and (2) the request message includes an If-Modified-Since: header line Contd… 2/3
  • 170. How the Conditional GET Operates? • First, On the behalf of a requesting browser, a proxy cache sends a request message to a Web server: GET /fruit/kiwi.gif HTTP/1.1 Host: www.exotiquecuisine.com • Second, The Web server sends a response message with the requested object to the cache: HTTP/1.1 200 OK Date: Sat, 8 Oct 2011 15:39:29 Server: Apache/1.3.0 (Unix) Last-Modified: Wed, 7 Sep 2011 09:23:24 Content-Type: image/gif (data data data data data ...)
  • 171. How the Conditional GET Operates? • The cache forwards the object to the requesting browser but also caches the object locally. Importantly, the cache also stores the last-modified date along with the object. • Third, one week later, another browser requests the same object via the cache, and the object is still in the cache. Since this object may have been modified at the Web server in the past week, the cache performs an up-to-date check by issuing a conditional GET. • The cache sends GET /fruit/kiwi.gif HTTP/1.1 Host: www.exotiquecuisine.com If-modified-since: Wed, 7 Sep 2011 09:23:24 Contd…
  • 172. How the Conditional GET Operates? • The value of the If-modified-since: header line is exactly equal to the value of the Last-Modified: header line that was sent by the server one week ago. This conditional GET is telling the server to send the object only if the object has been modified since the specified date. Suppose the object has not been modified since 7 Sep 2011 09:23:24. • Then, fourth, the Web server sends a response message to the cache: HTTP/1.1 304 Not Modified Date: Sat, 15 Oct 2011 15:39:29 Server: Apache/1.3.0 (Unix) (empty entity body) Contd…
  • 173. How the Conditional GET Operates? •We see that in response to the conditional GET, the Web server still sends a response message but does not include the requested object in the response message. •Including the requested object would only waste bandwidth and increase user-perceived response time, particularly if the object is large. •The last response message has 304 Not Modified in the status line, which tells the cache that it can go ahead and forward its (the proxy cache’s) cached copy of the object to the requesting browser. Contd…
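The conditional GET exchange can be sketched as a small simulation in which the origin server compares the If-modified-since: value with the object's last-modified time. The function `origin_server`, the stand-in object bytes, and the later modification date are assumptions for this example; the dates follow the layout used on the slides, which omits the time zone.

```python
from datetime import datetime

HTTP_DATE = "%a, %d %b %Y %H:%M:%S"  # date layout as shown on the slides


def origin_server(request_headers, last_modified, body):
    """Return (status_code, response_headers, entity_body) for a GET."""
    since = request_headers.get("If-modified-since")
    if since is not None and last_modified <= datetime.strptime(since, HTTP_DATE):
        # Not modified since the given date: send no object.
        return 304, {}, b""
    return 200, {"Last-Modified": last_modified.strftime(HTTP_DATE)}, body


last_modified = datetime(2011, 9, 7, 9, 23, 24)
kiwi = b"GIF-bytes"  # stand-in for /fruit/kiwi.gif

# Ordinary GET: the cache stores the object and its last-modified date.
status1, headers1, body1 = origin_server({}, last_modified, kiwi)
cached = {"body": body1, "last_modified": headers1["Last-Modified"]}

# One week later: conditional GET, object unchanged on the server.
conditional = {"If-modified-since": cached["last_modified"]}
status2, _, body2 = origin_server(conditional, last_modified, kiwi)
served = cached["body"] if status2 == 304 else body2

# If the object had changed after the cached date, the server sends it again.
status3, _, body3 = origin_server(conditional,
                                  datetime(2011, 10, 10, 12, 0, 0), b"new-bytes")
print(status1, status2, status3)
```

With a 304 reply the cache forwards its own copy, so the object crosses the cache-to-server connection only when it has actually changed.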
  • 174. End of 2.2 Web and HTTP
  • 177. File Transfer Protocol: FTP •In a typical FTP session, the user is sitting in front of one host (the local host) and wants to transfer files to or from a remote host. •In order for the user to access the remote account, the user must provide a user identification and a password. •After providing this authorization information, the user can transfer files from the local file system to the remote file system and vice versa
  • 178. FTP moves files between local and remote file system
  • 179. FTP moves files between local and remote file system •The user interacts with FTP through an FTP user agent. •The user first provides the hostname of the remote host, causing the FTP client process in the local host to establish a TCP connection with the FTP server process in the remote host. Contd…
  • 180. FTP moves files between local and remote file system •The user then provides the user identification and password, which are sent over the TCP connection as part of FTP commands. •Once the server has authorized the user, the user copies one or more files stored in the local file system into the remote file system (or vice versa). Contd…
  • 181. Control Connection and Data Connection
  • 182. TCP Connections in FTP •FTP uses two parallel TCP connections to transfer a file, •A Control Connection and •A Data Connection.
  • 183. Control Connection and Data Connection •The Control Connection is used for sending control information between the two hosts, such as •User identification, •Password, •Commands to change remote directory, and •Commands to “put” and “get” files. •The Data Connection is used to send the actual file.
  • 184. Difference Between FTP and HTTP First Difference is •FTP uses a separate Control connection, So FTP is said to send its Control information out-of-band. •HTTP sends request and response header lines into the same TCP connection that carries the transferred file itself. For this reason, HTTP is said to send its Control information in-band
  • 185. Difference Between FTP and HTTP Second Difference is • The FTP server must maintain state about the user. • The server must associate the control connection with a specific user account, and the server must keep track of the user’s current directory as the user wanders about the remote directory tree. • Keeping track of this state information for each ongoing user session significantly constrains the total number of sessions that FTP can maintain simultaneously • HTTP is stateless—it does not have to keep track of any user state.
  • 186. Control Connection and Data Connection
  • 187. Operation of FTP •When a user starts an FTP session with a remote host, the client side of FTP (user) first initiates a control TCP connection with the server side (remote host) on server port number 21. 1/5
  • 188. Operation of FTP •The client side of FTP sends the user identification and password over this control connection. •The client side of FTP also sends, over the control connection, commands to change the remote directory. Contd… 2/5
  • 189. Operation of FTP •When the server side receives a command for a file transfer over the control connection (either to, or from, the remote host), the server side initiates a TCP data connection to the client side. Contd… 3/5
  • 190. Operation of FTP •FTP sends exactly one file over the data connection and then closes the data connection. •If, during the same session, the user wants to transfer another file, FTP opens another data connection. Contd… 4/5
  • 191. Operation of FTP •Thus, with FTP, the control connection remains open throughout the duration of the user session, but a new data connection is created for each file transferred within a session (that is, the data connections are non-persistent). Contd… 5/5
  • 192. FTP Commands and Replies •The commands, from client to server, and replies, from server to client, are sent across the control connection in 7-bit ASCII format. Thus, like HTTP commands, FTP commands are readable by people. •Each command consists of four uppercase ASCII characters, some with optional arguments.
  • 193. Some of the Commands are •USER username: Used to send the user identification to the server. •PASS password: Used to send the user password to the server. 1/4
  • 194. Some of the Commands are •LIST: Used to ask the server to send back a list of all the files in the current remote directory. The list of files is sent over a (new and non-persistent) data connection rather than the control TCP connection. Contd… 2/4
  • 195. Some of the Commands are •RETR filename: Used to retrieve (that is, get) a file from the current directory of the remote host. This command causes the remote host to initiate a data connection and to send the requested file over the data connection. Contd… 3/4
  • 196. Some of the Commands are •STOR filename: Used to store (that is, put) a file into the current directory of the remote host. Contd… 4/4
  • 197. Contd… •There is a one-to-one correspondence between the command that the user issues and the FTP command sent across the control connection. •Each command is followed by a reply, sent from server to client. The replies are three-digit numbers, with an optional message following the number.
  • 198. Some replies, along with their possible messages •331 Username OK, password required. •125 Data connection already open; transfer starting. •425 Can’t open data connection. •452 Error writing file.
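Because FTP commands and replies are plain ASCII lines on the control connection, they are easy to build and parse. The sketch below formats a few of the commands listed above and splits a reply into its three-digit number and message; the helper names and the sample user name, password, and file name are assumptions, and no connection is opened.

```python
def ftp_command(verb, argument=None):
    """Format one FTP command as the ASCII line sent on the control connection."""
    line = verb if argument is None else verb + " " + argument
    return (line + "\r\n").encode("ascii")  # each command line ends with CRLF


def parse_reply(line):
    """Split a reply into its three-digit number and the optional message."""
    code, _, message = line.strip().partition(" ")
    return int(code), message


# Hypothetical session values, used only to show the line format.
commands = [
    ftp_command("USER", "alice"),
    ftp_command("PASS", "secret"),
    ftp_command("LIST"),
    ftp_command("RETR", "notes.txt"),
]
for command in commands:
    print(command)

print(parse_reply("331 Username OK, password required."))
print(parse_reply("125 Data connection already open; transfer starting."))
```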
  • 200. 2.4 Electronic Mail in the Internet -- Syllabus •SMTP, •Comparison with HTTP, •Mail Message Format, •Mail Access Protocols
  • 201. Introduction to E-Mail •E-Mail is an asynchronous communication medium. •People send and read messages when it is convenient for them, without having to coordinate with other people’s schedules. •Electronic Mail is fast, easy to distribute, and inexpensive. •Modern e-mail has many powerful features, including messages with attachments, hyperlinks, HTML-formatted text, and embedded photos.
  • 202. High-Level View of the Internet Mail System
  • 203. High-Level View of the Internet Mail System •Internet Mail has three major components: •User Agents, •Mail Servers, and •The Simple Mail Transfer Protocol (SMTP)
  • 204. Example •Consider Alice sending an e-mail message to a recipient, Bob. •User agents allow users to read, reply to, forward, save, and compose messages. •Microsoft Outlook and Apple Mail are examples of user agents for e-mail.
  • 205. Example -- Contd… •When Alice is finished composing her message, her user agent sends the message to her mail server, where the message is placed in the mail server’s outgoing message queue. •When Bob wants to read a message, his user agent retrieves the message from his mailbox in his mail server.
  • 206. Example •Mail servers form the core of the e-mail infrastructure. •Each recipient, such as Bob, has a mailbox located in one of the mail servers. •Bob’s mailbox manages and maintains the messages that have been sent to him. Contd…
  • 207. Example •A typical message starts its journey in the sender’s user agent, travels to the sender’s mail server, and travels to the recipient’s mail server, where it is deposited in the recipient’s mailbox Contd…
  • 208. Example •When Bob wants to access the messages in his mailbox, the mail server containing his mailbox authenticates Bob (with usernames and passwords). •Alice’s mail server must also deal with failures in Bob’s mail server. Contd…
  • 209. Example •If Alice’s server cannot deliver mail to Bob’s server, Alice’s server holds the message in a message queue and attempts to transfer the message later. •Reattempts are often done every 30 minutes or so; if there is no success after several days, the server removes the message and notifies the sender (Alice) with an e-mail message. Contd…
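The hold-and-retry behaviour of Alice's mail server can be sketched as a simple loop over retry times. The slides say only that retries happen every 30 minutes or so and stop after several days; the four-day limit below, like the function name `delivery_outcome`, is an assumption made for this example.

```python
from datetime import timedelta


def delivery_outcome(server_up_after,
                     retry_every=timedelta(minutes=30),
                     give_up_after=timedelta(days=4)):
    """Simulate retries; return ('delivered' or 'bounced', time elapsed).

    server_up_after is when the recipient's server becomes reachable,
    or None if it never does.
    """
    elapsed = timedelta(0)
    while elapsed <= give_up_after:
        if server_up_after is not None and elapsed >= server_up_after:
            return "delivered", elapsed
        # Delivery failed: keep the message queued and try again later.
        elapsed += retry_every
    # Too many failures: remove the message and notify the sender.
    return "bounced", elapsed


print(delivery_outcome(timedelta(minutes=45)))
print(delivery_outcome(None)[0])
```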
  • 211. SMTP •SMTP is the principal application-layer protocol for Internet electronic mail. •It uses the reliable data transfer service of TCP to transfer mail from the sender’s mail server to the recipient’s mail server.
  • 212. SMTP has two sides •SMTP has two sides •A Client Side, which executes on the sender’s mail server, and •A Server Side, which executes on the recipient’s mail server. •Both the client and server sides of SMTP run on every mail server.
  • 213. SMTP has two sides •When a mail server sends mail to other mail servers, it acts as an SMTP client. •When a mail server receives mail from other mail servers, it acts as an SMTP server
  • 214. 2.4.1 SMTP •SMTP is at the heart of Internet electronic mail. •SMTP transfers messages from senders’ mail servers to the recipients’ mail servers. •SMTP is much older than HTTP. •SMTP restricts the body (not just the headers) of all mail messages to simple 7-bit ASCII.
  • 215. The Basic Operation of SMTP Suppose Alice wants to send Bob a simple ASCII message. 1. Alice invokes her user agent for e-mail, provides Bob’s e-mail address (for example, bob@someschool.edu), composes a message, and instructs the user agent to send the message. 2. Alice’s user agent sends the message to her mail server, where it is placed in a message queue 1/3
  • 216. The Basic Operation of SMTP Suppose Alice wants to send Bob a simple ASCII message. 3. The client side of SMTP, running on Alice’s mail server, sees the message in the message queue. It opens a TCP connection to an SMTP server, running on Bob’s mail server. 4. After some initial SMTP handshaking, the SMTP client sends Alice’s message into the TCP connection. Contd… 2/3
  • 217. The Basic Operation of SMTP Suppose Alice wants to send Bob a simple ASCII message. 5. At Bob’s mail server, the server side of SMTP receives the message. Bob’s mail server then places the message in Bob’s mailbox. 6. Bob invokes his user agent to read the message at his convenience. Contd… 3/3
  • 218. Alice sends a message to Bob
  • 219. •SMTP does not normally use intermediate mail servers for sending mail, even when the two mail servers are located at opposite ends of the world.
  • 220. How SMTP transfers a message from a sending mail server to a receiving mail server •First, the client SMTP (running on the sending mail server host) has TCP establish a connection to port 25 at the server SMTP (running on the receiving mail server host). If the server is down, the client tries again later. 1/3
  • 221. How SMTP transfers a message from a sending mail server to a receiving mail server •Once this connection is established, SMTP client indicates the e-mail address of the sender (the person who generated the message) and the e-mail address of the recipient. •The client sends the message. Contd… 2/3
  • 222. How SMTP transfers a message from a sending mail server to a receiving mail server •SMTP can count on the reliable data transfer service of TCP to get the message to the server without errors. •The client then repeats this process over the same TCP connection if it has other messages to send to the server; otherwise, it instructs TCP to close the connection. Contd… 3/3
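The slides describe the transfer in words; as a rough sketch, the lines an SMTP client sends for one message look like the output of the function below. HELO, MAIL FROM, RCPT TO, DATA and QUIT are standard SMTP commands, but the slides do not list them, so treat this as supplementary illustration. The client host name and Alice's address are hypothetical, and server replies are not modelled.

```python
def smtp_client_lines(client_host, sender, recipient, message_lines):
    """Return the lines an SMTP client would send for a single message."""
    lines = [
        "HELO " + client_host,            # handshake: client identifies itself
        "MAIL FROM:<" + sender + ">",     # e-mail address of the sender
        "RCPT TO:<" + recipient + ">",    # e-mail address of the recipient
        "DATA",                           # the message itself follows
    ]
    for line in message_lines:
        # A message line starting with "." gets an extra "." so it is not
        # mistaken for the end-of-message marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")     # a line with a single period ends the message
    lines.append("QUIT")  # no more messages: close the connection
    return [line + "\r\n" for line in lines]


dialogue = smtp_client_lines(
    "mail.example.com",        # hypothetical sending mail server
    "alice@example.com",       # hypothetical sender address
    "bob@someschool.edu",
    ["Do you like ketchup?", "How about pickles?"],
)
for line in dialogue:
    print(line, end="")
```

To send further messages over the same TCP connection, a client would repeat the MAIL FROM through end-of-message steps before issuing QUIT.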
  • 224. Comparison of SMTP with HTTP •HTTP transfers files (also called objects) from a Web server to a Web client (typically a browser); SMTP transfers files (that is, e-mail messages) from one mail server to another mail server. •When transferring the files, both persistent HTTP and SMTP use persistent connections. Thus, the two protocols have common characteristics.
  • 225. Differences between SMTP and HTTP: The First Difference •HTTP is a pull protocol: someone loads information on a Web server and users use HTTP to pull the information from the server at their convenience. The TCP connection is initiated by the machine that wants to receive the file. •SMTP is a push protocol: the sending mail server pushes the file to the receiving mail server. The TCP connection is initiated by the machine that wants to send the file.
  • 226. Differences between SMTP and HTTP: The Second Difference •SMTP requires each message, including the body of each message, to be in 7-bit ASCII format. If the message contains characters that are not 7-bit ASCII (for example, French characters with accents) or contains binary data (such as an image file), then the message has to be encoded into 7-bit ASCII. •HTTP does not impose this restriction on its data.
  • 227. Differences between SMTP and HTTP: The Third Difference •HTTP encapsulates each object in its own HTTP response message. •Internet mail (SMTP) places all of the message’s objects into one message.
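The second difference (the 7-bit ASCII restriction) can be illustrated with a short sketch. The slides do not name an encoding, so Base64 is used here as an assumption; it is one common way to turn arbitrary bytes into 7-bit ASCII. The sample text is made up for the example.

```python
import base64

text = "crêpes à emporter"            # contains accented (non-ASCII) characters
raw = text.encode("utf-8")            # some of these bytes have values >= 128

encoded = base64.b64encode(raw)       # Base64 output uses only 7-bit ASCII characters
decoded = base64.b64decode(encoded).decode("utf-8")

print(max(raw) > 127)                 # True: the raw bytes are not 7-bit clean
print(all(b < 128 for b in encoded))  # True: the encoded form is 7-bit ASCII
print(decoded == text)                # True: the receiver recovers the original
```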
  • 229. 2.4.3 Mail Message Formats •When an e-mail message is sent from one person to another, a header containing peripheral information precedes the body of the message itself. •This peripheral information is contained in a series of header lines. Contd… 1/5
  • 230. 2.4.3 Mail Message Formats •The header lines and the body of the message are separated by a blank line •Each header line contains readable text, consisting of a keyword followed by a colon followed by a value. •Some of the keywords are required and others are optional. Contd… 2/5
  • 231. 2.4.3 Mail Message Formats •Every header must have •A From: header line, and •A To: header line. •A header may also include a Subject: header line and other optional header lines. •It is important to note that these header lines are different from the SMTP commands. Contd… 3/5
  • 232. 2.4.3 Mail Message Formats •A typical message header looks like this: From: alice@crepes.fr To: bob@hamburger.edu Subject: Seeking Permission. Contd… 4/5
  • 233. 2.4.3 Mail Message Formats •After the message header, a blank line follows; then the message body (in ASCII) follows. •You should use Telnet to send a message to a mail server that contains some header lines, including the Subject: header line. •To do this, issue telnet serverName 25 Contd… 5/5
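The message layout described above (header lines, a blank line, then the body) can be checked with Python's standard `email` package. This is a minimal sketch: the header lines are the ones from the slide, while the body text is a made-up assumption.

```python
from email import message_from_string

raw_message = (
    "From: alice@crepes.fr\n"
    "To: bob@hamburger.edu\n"
    "Subject: Seeking Permission.\n"
    "\n"                                   # blank line separates header and body
    "Hello Bob, may I borrow your notes?\n"
)

msg = message_from_string(raw_message)

# Each header line is "keyword: value"; the parser exposes them by keyword.
print(msg["From"])
print(msg["To"])
print(msg["Subject"])
print(msg.get_payload())                   # everything after the blank line
```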
  • 235. 2.4.4 Mail Access Protocols •Mail access uses a client-server architecture— the user reads e-mail with a client that executes on the user’s end system. •Once SMTP delivers the message from Alice’s mail server to Bob’s mail server, the message is placed in Bob’s mailbox. 1/7
  • 236. 2.4.4 Mail Access Protocols •Given that Bob (the recipient) executes his user agent on his local PC, it is natural to consider placing a mail server on his local PC. •With this approach, Alice’s mail server would dialogue directly with Bob’s PC. Contd… 2/7
  • 237. 2.4.4 Mail Access Protocols •There is a problem with this approach. •A mail server manages mailboxes and runs the client and server sides of SMTP. •If Bob’s mail server were to reside on his local PC, then Bob’s PC would have to remain always on, and connected to the Internet, in order to receive new mail, which can arrive at any time. • This is impractical for many Internet users. Contd… 3/7
  • 238. 2.4.4 Mail Access Protocols •Instead, a typical user runs a user agent on the local PC but accesses its mailbox stored on an always-on shared mail server. •This mail server is shared with other users and is typically maintained by the user’s ISP Contd… 4/7
  • 239. 2.4.4 Mail Access Protocols •SMTP has been designed for pushing e-mail from one host to another. •The sender’s user agent does not dialogue directly with the recipient’s mail server. Contd… 5/7
  • 240. 2.4.4 Mail Access Protocols •There are currently a number of popular mail access protocols, including •Post Office Protocol—Version 3 (POP3), •Internet Mail Access Protocol (IMAP), and •HTTP. Contd… 6/7
  • 241. 2.4.4 Mail Access Protocols •SMTP is used to transfer mail from the sender’s mail server to the recipient’s mail server. •SMTP is also used to transfer mail from the sender’s user agent to the sender’s mail server. • A mail access protocol, such as POP3, is used to transfer mail from the recipient’s mail server to the recipient’s user agent. Contd… 7/7
  • 242. POP3 •POP3 is an extremely simple mail access protocol. •It is short and quite readable. •Because the protocol is so simple, its functionality is rather limited.
  • 243. Working of POP3 •POP3 begins when the user agent (the client) opens a TCP connection to the mail server (the server) on port 110. •With the TCP connection established, POP3 progresses through three phases: •Authorization, •Transaction, and •Update.
  • 244. First Phase -- Authorization •During the first phase, authorization, the user agent sends a username and a password to authenticate the user.
  • 245. Second Phase -- Transaction •During the second phase, transaction, the user agent retrieves messages; also during this phase, the user agent can mark messages for deletion, remove deletion marks, and obtain mail statistics.
  • 246. Third Phase -- Update •The third phase, update, occurs after the client has issued the quit command, ending the POP3 session. •At this time, the mail server deletes the messages that were marked for deletion. •In a POP3 transaction, the user agent issues commands, and the server responds to each command with a reply.
  • 247. Possible Responses of POP3 Transaction •There are two possible responses: •+OK (sometimes followed by server-to- client data), used by the server to indicate that the previous command was fine; and •-ERR, used by the server to indicate that something was wrong with the previous command.
  • 248. Consider the sample authorization exchange:
      telnet mailServer 110
      +OK POP3 server ready
      user bob
      +OK
      pass hungry
      +OK user successfully logged on
    If you misspell a command, the POP3 server will reply with an -ERR message.
  • 249. Two Modes of the User Agent in the POP3 Transaction Phase •A user agent using POP3 can be configured (by the user) to “download and delete” or to “download and keep”. •The sequence of commands issued by a POP3 user agent depends on which of these two modes the user agent is operating in. •In the download-and-delete mode, the user agent will issue the list, retr, and dele commands.
  • 250. Transaction Message in the Download and Delete Mode
      C: list
      S: 1 498
      S: 2 912
      S: .
      C: retr 1
      S: (blah blah ...
      S: .................
      S: ..........blah)
      S: .
      C: dele 1
      C: retr 2
      S: (blah blah ...
      S: .................
      S: ..........blah)
      S: .
      C: dele 2
      C: quit
      S: +OK POP3 server signing off
  • 251. Explanation of the Message •The user agent first asks the mail server to list the size of each of the stored messages. •The user agent then retrieves and deletes each message from the server. Note that after the authorization phase, the user agent employed only four commands: list, retr, dele, and quit. •After processing the quit command, the POP3 server enters the update phase and removes messages 1 and 2 from the mailbox.
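The transaction and update phases explained above can be sketched with a toy in-memory mailbox. This is a minimal sketch under stated assumptions: the class `Pop3Mailbox` and the function `download_and_delete` are hypothetical names, message size is approximated by character count, and no real POP3 server or network connection is involved.

```python
class Pop3Mailbox:
    """Toy in-memory model of a POP3 server's transaction and update phases."""

    def __init__(self, messages):
        self.messages = dict(enumerate(messages, start=1))  # number -> text
        self.marked = set()           # deletion marks, kept only for this session

    def list_sizes(self):
        # One "number size" entry per message that is not marked for deletion.
        return [f"{n} {len(m)}" for n, m in self.messages.items()
                if n not in self.marked]

    def retr(self, n):
        if n not in self.messages or n in self.marked:
            return "-ERR no such message"
        return self.messages[n]

    def dele(self, n):
        if n not in self.messages or n in self.marked:
            return "-ERR no such message"
        self.marked.add(n)            # only a mark; nothing is removed yet
        return "+OK"

    def quit(self):
        # Update phase: the marked messages are actually removed now.
        for n in self.marked:
            del self.messages[n]
        self.marked.clear()
        return "+OK POP3 server signing off"


def download_and_delete(mailbox):
    """Issue list, then retr and dele for each message, then quit."""
    downloaded = []
    for entry in mailbox.list_sizes():
        number = int(entry.split()[0])
        downloaded.append(mailbox.retr(number))
        mailbox.dele(number)
    mailbox.quit()
    return downloaded


box = Pop3Mailbox(["first message", "second"])
print(download_and_delete(box))
print(box.messages)                   # empty after the update phase
```

In download-and-keep mode the user agent would simply skip the `dele` calls, so the messages would remain on the server after `quit`.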
  • 252. Disadvantage of Download-and-Delete Mode •The recipient may want to access his mail messages from multiple machines (say, his office PC, his home PC, and his portable computer). Such a user is called a nomadic user. •The download-and-delete mode partitions the recipient’s mail messages over these three machines; if he first reads a message on his office PC, he will not be able to reread the message from his portable at home later in the evening.
  • 253. Download-and-Keep Mode •In the download-and-keep mode, the user agent leaves the messages on the mail server after downloading them. •In this case, the recipient can reread messages from different machines; •he can access a message from work and access it again later in the week from home.
  • 254. During a POP3 session between a user agent and the mail server •The POP3 server maintains some state information. •The POP3 Server keeps track of which user messages have been marked deleted. •The POP3 server does not carry state information across POP3 sessions. •This lack of state information across the sessions simplifies the implementation of a POP3 server.
  • 255. Problem with POP3 for Nomadic User •With POP3 access, once Bob has downloaded his messages to the local machine, he can create mail folders and move the downloaded messages into the folders. •Bob can then delete messages, move messages across folders, and search for messages (by sender name or subject). 1/2
  • 256. Problem with POP3 for Nomadic User •But this paradigm—namely, folders and messages in the local machine—poses a problem for the nomadic user, who would prefer to maintain a folder hierarchy on a remote server that can be accessed from any computer. This is not possible with POP3—the POP3 protocol does not provide any means for a user to create remote folders and assign messages to folders. Contd… 2/2
  • 257. Solution is IMAP Protocol •IMAP is a mail access protocol. •It has many more features than POP3. •It is also significantly more complex. •Thus the client and server side implementations are more complex.
  • 258. IMAP Server • An IMAP server will associate each message with a folder; • When a message first arrives at the server, it is associated with the recipient’s INBOX folder. • The recipient can then move the message into a new, user-created folder, read the message, delete the message, and so on. • The IMAP protocol provides commands to allow users to create folders and move messages from one folder to another. • IMAP provides commands that allow users to search remote folders for messages matching specific criteria. • IMAP server maintains user state information across IMAP sessions— for example, the names of the folders and which messages are associated with which folders.
  • 259. Another important feature of IMAP •It has commands that permit a user agent to obtain components of messages. •For example, a user agent can obtain just the message header of a message or just one part of a multipart MIME message. •This feature is useful when there is a low-bandwidth connection between the user agent and its mail server. •With a low bandwidth connection, the user may not want to download all of the messages in its mailbox, particularly avoiding long messages that might contain, for example, an audio or video clip.
  • 260. Web-Based E-Mail •With this service, the user agent is an ordinary Web browser, and the user communicates with its remote mailbox via HTTP. •When a recipient wants to access a message in his mailbox, the e-mail message is sent from the recipient’s mail server to the recipient’s browser using the HTTP protocol rather than the POP3 or IMAP protocol. •When a sender wants to send an e-mail message, the e-mail message is sent from the sender’s browser to the sender’s mail server over HTTP rather than over SMTP. •The sender’s mail server still sends messages to, and receives messages from, other mail servers using SMTP.
  • 261. End of the Chapter - 2.4 Electronic Mail in the Internet
  • 263. 2.5 DNS --The Internet's Directory Service Syllabus 2.5.1 Services Provided by DNS, 2.5.2 Overview of How DNS Works, 2.5.3 DNS Records and Messages
  • 264. 2.5 DNS --The Internet's Directory Service •Internet hosts can be identified in many ways. •One identifier for a host is its hostname. •cnn.com •www.yahoo.com •gaia.cs.umass.edu •cis.poly.edu 1/4
  • 265. 2.5 DNS --The Internet's Directory Service •Hostnames provide little, if any, information about the location within the Internet of the host. •A hostname such as www.eurecom.fr, which ends with the country code .fr, tells us that the host is probably in France. Contd… 2/4
  • 266. 2.5 DNS --The Internet's Directory Service •Because hostnames can consist of variable-length alphanumeric characters, they would be difficult for routers to process. •For these reasons, hosts are also identified by so-called IP addresses. Contd… 3/4
  • 267. 2.5 DNS --The Internet's Directory Service •An IP address consists of four bytes and has a rigid hierarchical structure. •An IP address looks like 121.7.106.83 •Each period separates one of the bytes expressed in decimal notation from 0 to 255. •An IP address is hierarchical because as we scan the address from left to right, we obtain more and more specific information about where the host is located in the Internet. Contd… 4/4
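The four-byte, dotted-decimal structure described above can be inspected with Python's standard `ipaddress` module. This is a minimal sketch; the /8 and /16 prefix lengths are assumptions chosen only to illustrate the left-to-right hierarchy and are not taken from the slides.

```python
import ipaddress

addr = ipaddress.IPv4Address("121.7.106.83")

print(len(addr.packed))       # 4: the address is four bytes long
print(list(addr.packed))      # [121, 7, 106, 83]: each byte is in 0..255

# Reading left to right narrows down the location: a longer prefix
# (more leading bits fixed) names a smaller part of the network.
wide = ipaddress.ip_network("121.0.0.0/8")
narrow = ipaddress.ip_network("121.7.0.0/16")

print(addr in wide, addr in narrow)   # True True
print(narrow.subnet_of(wide))         # True
```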
  • 268. Two Ways to identify a Host •By a hostname and •By an IP address. •People prefer the more mnemonic hostname identifier. •Routers prefer fixed-length, hierarchically structured IP addresses.
  • 269. Importance of DNS •In order to reconcile these preferences, we need a directory service that translates hostnames to IP addresses. •This is the main task of the Internet’s Domain Name System (DNS).
  • 270. Characteristics of DNS •The DNS is •A distributed database implemented in a hierarchy of DNS servers, and •An application-layer protocol that allows hosts to query the distributed database.
  • 271. DNS Servers and Protocol •The DNS servers are often UNIX machines running the Berkeley Internet Name Domain (BIND) software. •The DNS protocol runs over UDP and uses port 53. Contd… 1/2
  • 272. DNS Servers and Protocol •DNS is commonly employed by other application-layer protocols—including HTTP, SMTP, and FTP—to translate user-supplied hostnames to IP addresses. Contd… 2/2
  • 273. What happens when a browser (i.e., an HTTP client), running on some user’s host, requests the URL www.someschool.edu/index.html. •In order for the user’s host to be able to send an HTTP request message to the Web server www.someschool.edu, the user’s host must first obtain the IP address of www.someschool.edu. •This is done as follows.
  • 274. Steps when a browser (i.e., an HTTP client), running on some user’s host, requests the URL www.someschool.edu/index.html. 1. The user machine runs the client side of the DNS application. 2. The browser extracts the hostname, www.someschool.edu, from the URL and passes the hostname to the client side of the DNS application. 1/3
  • 275. Steps when a browser (i.e., an HTTP client), running on some user’s host, requests the URL www.someschool.edu/index.html. 3. The DNS client sends a query containing the hostname to a DNS server. 4. The DNS client eventually receives a reply, which includes the IP address for the hostname. Contd… 2/3
  • 276. Steps when a browser (i.e., an HTTP client), running on some user’s host, requests the URL www.someschool.edu/index.html. 5. Once the browser receives the IP address from DNS, it can initiate a TCP connection to the HTTP server process located at port 80 at that IP address. Contd… 3/3
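The five steps above can be sketched in a few lines. This is a minimal sketch, not how any particular browser is implemented: the function name `locate_web_server` is hypothetical, the `http://` scheme is added because `urlsplit` needs it to find the hostname, and the address 192.0.2.10 is a made-up documentation address standing in for the real DNS answer.

```python
import socket
from urllib.parse import urlsplit


def locate_web_server(url, resolve=socket.gethostbyname):
    """Return the (ip_address, port) a browser would open a TCP connection to."""
    hostname = urlsplit(url).hostname   # step 2: extract the hostname from the URL
    ip_address = resolve(hostname)      # steps 3-4: DNS query, then the reply
    return ip_address, 80               # step 5: HTTP server process at port 80


# A dictionary stands in for DNS so the example runs without network access.
fake_dns = {"www.someschool.edu": "192.0.2.10"}
print(locate_web_server("http://www.someschool.edu/index.html",
                        resolve=fake_dns.__getitem__))
```

Leaving `resolve` at its default would hand the hostname to the operating system's resolver through `socket.gethostbyname`.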
  • 277. 2.5.1 Services Provided by DNS •Caching the desired IP address in a nearby DNS server helps to reduce DNS network traffic as well as the average DNS delay. •DNS provides a few other important services in addition to translating hostnames to IP addresses: •Host aliasing. •Mail server aliasing. •Load distribution.
  • 278. Host Aliasing •A host with a complicated hostname can have one or more alias names. •For example, a hostname such as relay1.westcoast.enterprise.com could have two aliases such as • enterprise.com and •www.enterprise.com 1/2
  • 279. Host Aliasing •The hostname relay1.westcoast.enterprise.com is said to be a canonical hostname. •Alias hostnames are more mnemonic than canonical hostnames. •DNS can be invoked by an application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host. Contd… 2/2
  • 280. Mail Server Aliasing •E-mail addresses are mnemonic. •For example, if Bob has an account with Hotmail, Bob’s e-mail address might be as simple as bob@hotmail.com. 1/4
  • 281. Mail Server Aliasing •The hostname of the Hotmail mail server is more complicated and much less mnemonic than simply hotmail.com •For example, the canonical hostname might be something like relay1.west-coast.hotmail.com Contd… 2/4
  • 282. Mail Server Aliasing •DNS can be invoked by a mail application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host. Contd… 3/4
  • 283. Mail Server Aliasing •The MX record permits a company’s mail server and Web server to have identical (aliased) hostnames. •For example, a company’s Web server and mail server can both be called enterprise.com. Contd… 4/4
  • 284. Load Distribution. •DNS is also used to perform load distribution among replicated servers, such as replicated Web servers. •Busy sites, such as cnn.com, are replicated over multiple servers, with each server running on a different end system and each having a different IP address. 1/4
  • 285. Load Distribution. •For replicated Web servers, a set of IP addresses is thus associated with one canonical hostname. •The DNS database contains this set of IP addresses. •Clients make a DNS query for a name mapped to a set of addresses. Contd… 2/4
  • 286. Load Distribution. •The server responds with the entire set of IP addresses, but rotates the ordering of the addresses within each reply. •Because a client sends its HTTP request message to the IP address that is listed first in the set, DNS rotation distributes the traffic among the replicated servers. Contd… 3/4
  • 287. Load Distribution. •DNS rotation is also used for e-mail so that multiple mail servers can have the same alias name. •Content distribution companies such as Akamai have used DNS in more sophisticated ways to provide Web content distribution. Contd… 4/4
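The DNS rotation described above can be sketched with a rotating queue. This is a minimal sketch: the class name `RotatingDnsEntry` is hypothetical and the three addresses are made-up documentation addresses.

```python
from collections import deque


class RotatingDnsEntry:
    """One hostname mapped to a set of replica addresses, rotated per reply."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def reply(self):
        answer = list(self._addresses)   # the entire set, in the current order
        self._addresses.rotate(-1)       # the next reply starts one address later
        return answer


entry = RotatingDnsEntry(["192.0.2.1", "192.0.2.2", "192.0.2.3"])

# Clients use the first address in each reply, so consecutive clients
# are spread across the replicated servers.
for _ in range(4):
    print(entry.reply()[0])
```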
  • 288. 2.5.2 Overview of How DNS Works: hostname-to-IP-address translation service •Suppose that some application (such as a Web browser or a mail reader) running in a user’s host needs to translate a hostname to an IP address. •The application will invoke the client side of DNS, specifying the hostname that needs to be translated. 1/4
  • 289. 2.5.2 Overview of How DNS Works: hostname-to-IP-address translation service •On many UNIX-based machines, gethostbyname() is the function call that an application calls in order to perform the translation. •DNS in the user’s host then takes over, sending a query message into the network. Contd… 2/4
  • 290. 2.5.2 Overview of How DNS Works: hostname-to-IP-address translation service •All DNS query and reply messages are sent within UDP datagrams to port 53. •After a delay, ranging from milliseconds to seconds, DNS in the user’s host receives a DNS reply message that provides the desired mapping. Contd… 3/4
  • 291. 2.5.2 Overview of How DNS Works: hostname-to-IP-address translation service •This mapping is then passed to the invoking application. •Thus, from the perspective of the invoking application in the user’s host, DNS is a black box providing a simple, straightforward translation service. Contd… 4/4
  • 292. A Simple Design for DNS •A simple design for DNS would have one DNS server that contains all the mappings. •In this centralized design, •Clients simply direct all queries to the single DNS server, and •The DNS server responds directly to the querying clients. •Although the simplicity of this design is attractive, it is inappropriate for today’s Internet, with its vast (and growing) number of hosts.
  • 293. The Problems with a Centralized Design •A single point of failure. If the DNS server crashes, so does the entire Internet. •Traffic volume. A single DNS server would have to handle all DNS queries (for all the HTTP requests and e-mail messages generated from hundreds of millions of hosts). 1/3
  • 294. The Problems with a Centralized Design •Distant centralized database. A single DNS server cannot be “close to” all the querying clients. If we put the single DNS server in New York City, then all queries from Australia must travel to the other side of the globe, perhaps over slow and congested links. This can lead to significant delays. Contd… 2/3
  • 295. The Problems with a Centralized Design •Maintenance. The single DNS server would have to keep records for all Internet hosts. Not only would this centralized database be huge, but it would have to be updated frequently to account for every new host. Contd… 3/3
  • 296. A Distributed, Hierarchical Database •The DNS uses a large number of servers, organized in a hierarchical fashion and distributed around the world. •No single DNS server has all of the mappings for all of the hosts in the Internet. Instead, the mappings are distributed across the DNS servers.
  • 297. Three Classes of DNS Servers •There are Three Classes of DNS Servers: •Root DNS Servers, •Top-Level Domain (TLD) DNS Servers, and •Authoritative DNS Servers •All these are organized in a hierarchy.
  • 298. Portion of the hierarchy of DNS servers
  • 299. How these three classes of servers interact •Suppose a DNS client wants to determine the IP address for the hostname www.amazon.com. •To a first approximation, the following events will take place. •The client first contacts one of the root servers, which returns IP addresses for TLD servers for the top-level domain com. •The client then contacts one of these TLD servers, which returns the IP address of an authoritative server for amazon.com. •Finally, the client contacts one of the authoritative servers for amazon.com, which returns the IP address for the hostname www.amazon.com.
  • 300. Root DNS Servers •In the Internet, there are 13 root DNS servers, most of which are located in North America. •The list of root servers on the next slide is as of 2012. •Each entry is given in (organization, location) format. 1/3
  • 301. Root DNS Servers (organization, location) 1. Verisign, Los Angeles, CA (5 other sites) 2. USC-ISI, Marina del Rey, CA 3. Cogent, Herndon, VA (5 other sites) 4. U Maryland, College Park, MD 5. NASA, Mt View, CA 6. Internet Software C., Palo Alto, CA (and 48 other sites) 7. US DoD, Columbus, OH (5 other sites) 8. ARL, Aberdeen, MD 9. Netnod, Stockholm (37 other sites) 10. Verisign, Dulles, VA (69 other sites) 11. RIPE, London (17 other sites) 12. ICANN, Los Angeles, CA (41 other sites) 13. WIDE, Tokyo (5 other sites) Contd… 2/3
  • 302. Root DNS Servers •Each “server” is actually a network of replicated servers, for both security and reliability purposes. • All together, there are 247 root servers as of fall 2011 Contd… 3/3
  • 303. Top-Level Domain (TLD) Servers •These servers are responsible for top-level domains such as com, org, net, edu, and gov, and all of the country top-level domains such as uk, fr, ca, and jp. •The company Verisign Global Registry Services maintains the TLD servers for the com top-level domain. •The company Educause maintains the TLD servers for the edu top-level domain.
  • 304. Authoritative DNS Servers •Every organization with publicly accessible hosts (such as Web servers and mail servers) on the Internet must provide publicly accessible DNS records that map the names of those hosts to IP addresses. 1/3
  • 305. Authoritative DNS Servers •An organization’s authoritative DNS server houses these DNS records. •An organization can choose to implement its own authoritative DNS server to hold these records. Contd… 2/3
  • 306. Authoritative DNS Servers •The organization can pay to have these records stored in an authoritative DNS server of some service provider. •Most universities and large companies implement and maintain their own primary and secondary (backup) authoritative DNS server. Contd… 3/3
  • 307. Local DNS Server •A local DNS server does not strictly belong to the hierarchy of servers but is nevertheless central to the DNS architecture. •Each ISP—such as a university, an academic department, an employee’s company, or a residential ISP—has a local DNS server (also called a default name server). Contd… 1/4
  • 308. Local DNS Server •When a host connects to an ISP, the ISP provides the host with the IP addresses of one or more of its local DNS servers. Contd… 2/4
  • 309. Local DNS Server •A host’s local DNS server is “close to” the host. •For an institutional ISP, local DNS server may be on the same LAN as the host; •For a residential ISP, it is separated from the host by no more than a few routers. Contd… 3/4
  • 310. Local DNS Server •When a host makes a DNS query, the query is sent to the local DNS server, which acts as a proxy, forwarding the query into the DNS server hierarchy. Contd… 4/4
  • 311. Interaction of the various DNS Servers 1/6
  • 312. Interaction of the various DNS Server •Suppose the host cis.poly.edu desires the IP address of gaia.cs.umass.edu. •Suppose that Polytechnic’s local DNS server is called dns.poly.edu and that an authoritative DNS server for gaia.cs.umass.edu is called dns.umass.edu. Contd… 2/6
  • 313. Interaction of the various DNS Server •The host cis.poly.edu first sends a DNS query message to its local DNS server, dns.poly.edu. •The query message contains the hostname to be translated, namely, gaia.cs.umass.edu. •The local DNS server forwards the query message to a root DNS server. Contd… 3/6
  • 314. Interaction of the various DNS Server •The root DNS server takes note of the edu suffix and returns to the local DNS server a list of IP addresses for TLD servers responsible for edu. •The local DNS server then resends the query message to one of these TLD servers. Contd… 4/6
  • 315. Interaction of the various DNS Server •The TLD server takes note of the umass.edu suffix and responds with the IP address of the authoritative DNS server for the University of Massachusetts, namely, dns.umass.edu. Contd… 5/6
  • 316. Interaction of the various DNS Server •Finally, the local DNS server resends the query message directly to dns.umass.edu, which responds with the IP address of gaia.cs.umass.edu. •In order to obtain the mapping for one hostname, eight DNS messages were sent: four query messages and four reply messages. Contd… 6/6
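The interaction walked through above can be sketched with three dictionaries standing in for the root, TLD, and authoritative servers. This is a minimal sketch: the server label `tld-edu`, the function name `resolve_via_local_server`, and the address 192.0.2.7 are assumptions made for illustration, and no DNS messages are actually sent.

```python
# Each dictionary plays the role of one class of DNS server.
ROOT = {"edu": "tld-edu"}                                 # suffix -> TLD server
TLD = {"tld-edu": {"umass.edu": "dns.umass.edu"}}         # domain -> authoritative server
AUTHORITATIVE = {"dns.umass.edu": {"gaia.cs.umass.edu": "192.0.2.7"}}


def resolve_via_local_server(hostname):
    """Return (ip_address, number_of_dns_messages) for one lookup."""
    labels = hostname.split(".")
    messages = 1                                  # query: host -> local DNS server

    tld_server = ROOT[labels[-1]]                 # root notes the "edu" suffix
    messages += 2                                 # query + reply: local <-> root

    auth_server = TLD[tld_server][".".join(labels[-2:])]   # TLD notes "umass.edu"
    messages += 2                                 # query + reply: local <-> TLD

    ip_address = AUTHORITATIVE[auth_server][hostname]
    messages += 2                                 # query + reply: local <-> authoritative

    messages += 1                                 # reply: local DNS server -> host
    return ip_address, messages


print(resolve_via_local_server("gaia.cs.umass.edu"))
```

The count comes to four queries and four replies, matching the eight messages noted in the slide.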
  • 318. Recursive Queries and Iterative Queries •The query sent from cis.poly.edu to dns.poly.edu is a recursive query, since the query asks dns.poly.edu to obtain the mapping on its behalf. •But the subsequent three queries are iterative since all of the replies are directly returned to dns.poly.edu. •In theory, any DNS query can be iterative or recursive. •In practice, queries typically follow this pattern: the query from the requesting host to the local DNS server is recursive, and the remaining queries are iterative.
  • 319. DNS Caching •In a query chain, when a DNS server receives a DNS reply (containing a mapping from a hostname to an IP address), it can cache the mapping in its local memory. •For example, each time the local DNS server dns.poly.edu receives a reply from some DNS server, it can cache any of the information contained in the reply. 1/6
  • 320. DNS Caching •If a hostname/IP address pair is cached in a DNS server and another query arrives to the DNS server for the same hostname, the DNS server can provide the desired IP address, even if it is not authoritative for the hostname. Contd… 2/6
  • 321. DNS Caching •Because hosts and mappings between hostnames and IP addresses are by no means permanent, DNS servers discard cached information after a period of time. Contd… 3/6
  • 322. DNS Caching •Suppose that a host apricot.poly.edu queries dns.poly.edu for the IP address for the hostname cnn.com. •Suppose that a few hours later, another Polytechnic University host, say, kiwi.poly.edu, also queries dns.poly.edu with the same hostname. Contd… 4/6
  • 323. DNS Caching •Because of caching, the local DNS server will be able to immediately return the IP address of cnn.com to this second requesting host without having to query any other DNS servers. Contd… 5/6
  • 324. DNS Caching •A local DNS server can also cache the IP addresses of TLD servers, thereby allowing the local DNS server to bypass the root DNS servers in a query chain. Contd… 6/6
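The caching behaviour described above can be sketched as a small cache whose entries expire. This is a minimal sketch: the class name `DnsCache`, the one-hour lifetime, and the address 192.0.2.20 are assumptions, and the clock is injectable so the expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Hostname-to-IP cache that discards entries after a period of time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}                    # hostname -> (ip, expiry time)

    def put(self, hostname, ip_address, lifetime_seconds):
        self._entries[hostname] = (ip_address, self._clock() + lifetime_seconds)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None                       # never cached: must query other servers
        ip_address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]       # stale mapping is discarded
            return None
        return ip_address


now = [0.0]                                   # fake clock for the demonstration
cache = DnsCache(clock=lambda: now[0])

cache.put("cnn.com", "192.0.2.20", lifetime_seconds=3600)  # after apricot's query
print(cache.get("cnn.com"))                   # kiwi's query is answered from the cache

now[0] = 3600.0                               # the lifetime has passed
print(cache.get("cnn.com"))                   # None: the entry was discarded
```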
  • 325. 2.5.3 DNS Records and Messages •The DNS servers that together implement the DNS distributed database store Resource Records (RRs), including RRs that provide hostname-to-IP address mappings. •Each DNS reply message carries one or more resource records.
  • 326. A Resource Record is a four-tuple that contains the following fields (Name, Value, Type, TTL) •TTL is the time to live of the resource record. •It determines when a resource record should be removed from a cache. •The meaning of Name and Value depends on Type. •The example records on the following slides omit the TTL field.
  • 327. Four values of the TYPE field •TYPE = A •TYPE = NS •TYPE = CNAME •TYPE = MX
  • 328. When TYPE = A •If Type=A, then Name is a hostname and Value is the IP address for the hostname. •Thus, a Type A record provides the standard hostname-to-IP address mapping. •As an example, (relay1.bar.foo.com, 145.37.93.126, A) is a Type A record.
  • 329. When TYPE = NS •If Type=NS, then Name is a domain (such as foo.com) and Value is the hostname of an authoritative DNS server •This record is used to route DNS queries further along in the query chain. •As an example, (foo.com, dns.foo.com, NS) is a Type NS record.
  • 330. When TYPE = CNAME •If Type=CNAME, then Value is a canonical hostname for the alias hostname Name. •This record can provide querying hosts the canonical name for a hostname. •As an example, (foo.com, relay1.bar.foo.com, CNAME) is a CNAME record.
  • 331. When TYPE = MX •If Type = MX, then Value is the canonical name of a mail server that has an alias hostname Name. •As an example, (foo.com, mail.bar.foo.com, MX) is an MX record. •MX records allow the hostnames of mail servers to have simple aliases. 1/2
  • 332. When TYPE = MX •Note that by using the MX record, a company can have the same aliased name for its mail server and for one of its other servers. •To obtain the canonical name for the mail server, a DNS client would query for an MX record; to obtain the canonical name for the other server, the DNS client would query for the CNAME record. Contd… 2/2
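The four record types above can be sketched as tuples held in a small table. This is a minimal sketch that mirrors the simplified model of the slides: the names and values are the slide examples, while the TTL of 3600 seconds and the helper name `lookup` are assumptions.

```python
from collections import namedtuple

ResourceRecord = namedtuple("ResourceRecord", ["name", "value", "type", "ttl"])

RECORDS = [
    ResourceRecord("relay1.bar.foo.com", "145.37.93.126", "A", 3600),
    ResourceRecord("foo.com", "dns.foo.com", "NS", 3600),
    ResourceRecord("foo.com", "relay1.bar.foo.com", "CNAME", 3600),
    ResourceRecord("foo.com", "mail.bar.foo.com", "MX", 3600),
]


def lookup(name, rr_type):
    """Return the Value fields of all records matching Name and Type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rr_type]


# The same alias name gives different canonical names depending on the type queried.
print(lookup("foo.com", "MX"))              # canonical name of the mail server
print(lookup("foo.com", "CNAME"))           # canonical name of the other server
print(lookup("relay1.bar.foo.com", "A"))    # hostname-to-IP address mapping
```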
  • 333. DNS Server may be authoritative for a particular hostname •If a DNS server is authoritative for a particular hostname, then the DNS server will contain a Type A record for the hostname. (Even if the DNS server is not authoritative, it may contain a Type A record in its cache). 1/3
  • 334. DNS Server may be authoritative for a particular hostname •If a server is not authoritative for a hostname, then the server will contain a Type NS record for the domain that includes the hostname; it will also contain a Type A record that provides the IP address of the DNS server in the Value field of the NS record. Contd… 2/3
  • 335. DNS Server may be authoritative for a particular hostname • As an example, suppose an edu TLD server is not authoritative for the host gaia.cs.umass.edu. Then this server will contain a record for a domain that includes the host gaia.cs.umass.edu. • The edu TLD server would also contain a Type A record, which maps the DNS server dns.umass.edu to an IP address, for example, (dns.umass.edu, 128.119.40.111, A). Contd… 3/3
  • 336. DNS Messages •There are two kinds of DNS messages: •DNS query messages, and •DNS reply messages. •Both query and reply messages have the same format.
  • 337. DNS Message Format (figure) •Header section (12 bytes): Identification, Flags, Number of questions, Number of answer RRs, Number of authority RRs, Number of additional RRs. •Questions (variable number of questions): name and type fields for a query. •Answers (variable number of resource records): RRs in response to the query. •Authority (variable number of resource records): records for authoritative servers. •Additional information: additional “helpful” info.
  • 338. Sections in the DNS Message Format •Header Sections (The first 12 bytes) •Data Sections
  • 339. DNS Message Format Header Sections • Identifier Field. • Question Count field. • Answer Count field. • Authority Count field. • Additional Information Count field. Data Sections •Question Section •Answer Section or Reply Section •Authority Section •Additional Information Section
  • 340. Identifier Field •It is the first field in the Header Section. •It is a 16-bit number that identifies the query. •This identifier is copied into the reply message to a query, allowing the client to match received replies with sent queries. •There are a number of flags in the flag field.
  • 341. Flags in the Flag field •A 1-bit query/reply flag indicates whether the message is a query (0) or a reply (1). •A 1-bit authoritative flag is set in a reply message when a DNS server is an authoritative server for a queried name. 1/2
  • 342. Flags in the Flag field •A 1-bit recursion-desired flag is set when a client (host or DNS server) desires that the DNS server perform recursion when it doesn’t have the record. •A 1-bit recursion available flag is set in a reply if the DNS server supports recursion. Contd… 2/2
  • 343. Other Header Fields in the DNS Message Format •The header section also contains four “number-of” fields. •These fields indicate the number of occurrences of the four types of data sections that follow the header.
  • 344. Four Fields in the Data Sections of DNS Message Format •Question Section •Answer Section (or Reply Section or Response Section) •Authority Section •Additional Information Section
  • 345. Question Section • This Section contains information about the query that is being made. • This section includes •A name field that contains the name that is being queried, and •A type field that indicates the type of question being asked about the name. • For Example, a host address associated with a name (Type A) or the mail server for a name (Type MX).
  • 346. Answer Section Or Reply Section Or Response Section •In a reply from a DNS server, the answer section contains the resource records for the name that was originally queried. •Each resource record contains the Type (for example, A, NS, CNAME, or MX), the Value, and the TTL. •A reply can return multiple RRs in the answer, since a hostname can have multiple IP addresses.
  • 347. Authority Section •The Authority Section contains records of other authoritative servers.
  • 348. Additional Section •The additional section contains other helpful records. •For example, the answer section in a reply to an MX query contains a resource record providing the canonical hostname of a mail server. •The additional section then contains a Type A record providing the IP address for the canonical hostname of the mail server.
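The header and question section described above can be made concrete with a short encoder. This is a minimal sketch, not a full DNS implementation: the flag bit positions, the numeric type codes, the class field, and the label encoding are taken from RFC 1035 rather than from the slides, and `build_query`, `encode_name`, and `parse_header` are hypothetical helper names.

```python
import struct

# Bit masks within the 16-bit flags field (positions per RFC 1035).
QR_REPLY = 0x8000                # 0 = query, 1 = reply
AA_AUTHORITATIVE = 0x0400        # set in replies from an authoritative server
RD_RECURSION_DESIRED = 0x0100    # set by a client that wants recursion
RA_RECURSION_AVAILABLE = 0x0080  # set in replies if the server supports recursion

TYPE_CODES = {"A": 1, "NS": 2, "CNAME": 5, "MX": 15}
CLASS_IN = 1  # Internet class; not discussed in the slides


def encode_name(name: str) -> bytes:
    """Encode a hostname as length-prefixed labels terminated by a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(identifier: int, name: str, rtype: str = "A") -> bytes:
    """Build a query: 12-byte header followed by one question."""
    # Header: identifier, flags, then the four "number-of" fields.
    header = struct.pack("!HHHHHH", identifier, RD_RECURSION_DESIRED, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", TYPE_CODES[rtype], CLASS_IN)
    return header + question


def parse_header(message: bytes) -> dict:
    """Decode the first 12 bytes of a DNS message."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AA_AUTHORITATIVE),
        "recursion_desired": bool(flags & RD_RECURSION_DESIRED),
        "recursion_available": bool(flags & RA_RECURSION_AVAILABLE),
        "questions": qd,
        "answer_rrs": an,
        "authority_rrs": ns,
        "additional_rrs": ar,
    }


query = build_query(0x1234, "gaia.cs.umass.edu", "MX")
print(parse_header(query))
```

A reply would carry the same identifier with the query/reply flag set, which is how the client matches it to the query it sent.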
  • 349. How would you like to send a DNS query message directly from the host you’re working on to some DNS server? •This can easily be done with the nslookup program, which is available from most Windows and UNIX platforms.
  • 350. nslookup in the Windows Host •Open the Command Prompt. •Invoke the nslookup program by simply typing “nslookup.” •Send a DNS query to any DNS server (root, TLD, or authoritative). •Receive the reply message from the DNS server. •nslookup then displays the records included in the reply (in a human-readable format).
  • 351. End of 2.5 DNS -- The Internet's Directory Service.
  • 353. Peer-to-Peer Applications •P2P File Distribution •Distributed Hash Tables
  • 354. 2.6.1 P2P File Distribution •In Client-Server File Distribution, the server must send a copy of the file to each of the peers, placing an enormous burden on the server and consuming a large amount of server bandwidth. •In P2P File Distribution, each peer can redistribute any portion of the file it has received to any other peers, thereby assisting the server in the distribution process.
  • 355. Most Popular P2P File Distribution Protocol •As of 2012, the most popular P2P file distribution protocol is BitTorrent. •It was originally developed by Bram Cohen. •Now, there are many different independent BitTorrent clients conforming to the BitTorrent protocol, just as there are a number of Web browser clients that conform to the HTTP protocol.
  • 356. Scalability of P2P Architectures •Suppose the server and the peers are connected to the Internet with access links. •Let • us be the upload rate of the server’s access link. • ui be the upload rate of the ith peer’s access link. • di be the download rate of the ith peer’s access link. • F be the size of the file to be distributed (in bits). • N be the number of peers that want to obtain a copy of the file.
  • 357. The Distribution Time •The distribution time is the time it takes to get a copy of the file to all N peers. •Assume that the server and clients are not participating in any other network applications, so that all of their upload and download access bandwidth can be fully devoted to distributing this file.
  • 359. Let’s first determine the distribution time for the Client-Server Architecture •Let DCS be the distribution time for the client-server architecture. •We make some observations 1/12
  • 360. Let’s first determine the distribution time for the Client-Server Architecture •We make some observations •The server must transmit one copy of the file to each of the N peers. Thus the server must transmit___bits. 2/12 Contd…
  • 361. Let’s first determine the distribution time for the Client-Server Architecture •We make some observations •The server must transmit one copy of the file to each of the N peers. Thus the server must transmit NF bits. 3/12 Contd…
  • 362. Let’s first determine the distribution time for the Client-Server Architecture •We make some observations •The server must transmit one copy of the file to each of the N peers. Thus the server must transmit NF bits. •The server’s upload rate is us, •So, the time to distribute the file must be at least ____ 4/12 Contd…
  • 363. Let’s first determine the distribution time for the Client-Server Architecture •We make some observations •The server must transmit one copy of the file to each of the N peers. Thus the server must transmit NF bits. •The server’s upload rate is us, •So, the time to distribute the file must be at least NF/us 5/12 Contd… 1
  • 364. Let’s first determine the distribution time for the Client-Server Architecture •Let dmin denote the download rate of the peer with the lowest download rate, that is, dmin = _______ 6/12 Contd…
  • 365. Let’s first determine the distribution time for the Client-Server Architecture •Let dmin denote the download rate of the peer with the lowest download rate, that is, dmin = min{d1,d2,...,dN}. 7/12 Contd…
  • 366. Let’s first determine the distribution time for the Client-Server Architecture •Let dmin denote the download rate of the peer with the lowest download rate, that is, dmin = min{d1,d2,...,dN}. •The peer with the lowest download rate cannot obtain all F bits of the file in less than _____ seconds. 8/12 Contd…
  • 367. Let’s first determine the distribution time for the Client-Server Architecture •Let dmin denote the download rate of the peer with the lowest download rate, that is, dmin = min{d1,d2,...,dN}. •The peer with the lowest download rate cannot obtain all F bits of the file in less than F/dmin seconds. •Thus the minimum distribution time is at least ____ 9/12 Contd…
  • 368. Let’s first determine the distribution time for the Client-Server Architecture •Let dmin denote the download rate of the peer with the lowest download rate, that is, dmin = min{d1,d2,...,dN}. •The peer with the lowest download rate cannot obtain all F bits of the file in less than F/dmin seconds. •Thus the minimum distribution time is at least F/dmin. 10/12 Contd…
  • 370. Let’s first determine the distribution time for the Client-Server Architecture •Putting these two observations together, we obtain DCS ≥ max{NF/us, F/dmin}. •This provides a lower bound on the minimum distribution time for the client-server architecture. •For N large enough, the bound is dominated by NF/us, so the distribution time increases linearly with the number of peers N. 12/12
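The client-server bound can be evaluated numerically with a few lines of Python. This is a sketch under the slide's notation (F, us, di); the function name and the sample rates are assumptions chosen only to make the arithmetic easy to follow.

```python
def client_server_time(F, u_s, d):
    """Lower bound on client-server distribution time: max(N*F/u_s, F/d_min).

    F   -- file size in bits
    u_s -- server upload rate in bits per second
    d   -- list of peer download rates in bits per second (one per peer)
    """
    N = len(d)
    d_min = min(d)
    return max(N * F / u_s, F / d_min)


# Hypothetical numbers: a 1000-bit file, a 100 bit/s server, three peers.
# N*F/u_s = 30 s and F/d_min = 20 s, so the server upload is the bottleneck.
print(client_server_time(1000, 100, [50, 200, 400]))  # 30.0
```

Doubling the number of peers doubles the NF/us term, which is the linear growth the slide refers to.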
  • 371. Now Let’s determine the distribution time for the Peer-To-Peer Architecture
  • 372. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) •In P2P architecture, each peer can assist the server in distributing the file. In particular, when a peer receives some file data, it can use its own upload capacity to redistribute the data to other peers. •Calculating the distribution time for the P2P architecture is more complicated than for the client-server architecture, since the distribution time depends on how each peer distributes portions of the file to the other peers. 1/11
  • 373. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •At the beginning of the distribution, only the server has the file. To get this file into the community of peers, the server must send each bit of the file at least once into its access link. •If us is the upload rate of the server, then the time required to upload all F bits of the file once is ____ 2/11 Contd…
  • 374. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •At the beginning of the distribution, only the server has the file. To get this file into the community of peers, the server must send each bit of the file at least once into its access link. •If us is the upload rate of the server, then the time required to upload all F bits of the file once is F/us 3/11 Contd…
  • 375. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •At the beginning of the distribution, only the server has the file. To get this file into the community of peers, the server must send each bit of the file at least once into its access link. •If us is the upload rate of the server, then the time required to upload all F bits of the file once is F/us •Thus, the minimum distribution time is at least F/us. 4/11 Contd… 1
  • 376. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •At the beginning of the distribution, only the server has the file. To get this file into the community of peers, the server must send each bit of the file at least once into its access link. Thus, the minimum distribution time is at least F/us. (Unlike the client-server scheme, a bit sent once by the server may not have to be sent by the server again, as the peers may redistribute the bit among themselves.) 5/11 Contd…
  • 377. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •Let di be the download rate of the ith peer. •Then the time required for that peer to download the file of F bits is at least _____ 6/11 Contd…
  • 378. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •Let di be the download rate of the ith peer. •Then the time required for that peer to download the file of F bits is at least F/di 7/11 Contd…
  • 379. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: •Let di be the download rate of the ith peer. •Then the time required for that peer to download the file of F bits is at least F/di •The peer with the lowest download rate cannot obtain all F bits of the file in less than F/dmin seconds. Thus the minimum distribution time is at least F/dmin. 8/11 Contd…
  • 381. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) We first make the following observations: • Finally, observe that the total upload capacity of the system as a whole is equal to the upload rate of the server plus the upload rates of each of the individual peers, that is, utotal = us + u1 + … + uN. The system must deliver (upload) F bits to each of the N peers, thus delivering a total of NF bits. This cannot be done at a rate faster than utotal. •Thus, the minimum distribution time is also at least NF/(us + u1 + … + uN). 10/11 Contd… 3
  • 382. Determine the distribution time for the Peer-To-Peer Architecture (P2P Architecture) •Putting these three observations together, we obtain the minimum distribution time for P2P, denoted by DP2P: DP2P ≥ max{F/us, F/dmin, NF/(us + u1 + … + uN)}. •This provides a lower bound for the minimum distribution time for the P2P architecture. •If we imagine that each peer can redistribute a bit as soon as it receives the bit, then there is a redistribution scheme that actually achieves this lower bound. 11/11
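The three P2P observations combine into a single maximum, which can be checked numerically. The sketch below uses the slide's symbols; the function name and the sample rates are hypothetical.

```python
def p2p_time(F, u_s, u, d):
    """Lower bound on P2P distribution time:
    max(F/u_s, F/d_min, N*F/(u_s + u_1 + ... + u_N)).

    F   -- file size in bits
    u_s -- server upload rate in bits per second
    u   -- list of peer upload rates in bits per second
    d   -- list of peer download rates in bits per second
    """
    N = len(d)
    server_term = F / u_s                  # server must push every bit at least once
    slowest_peer_term = F / min(d)         # slowest downloader needs F/d_min
    capacity_term = N * F / (u_s + sum(u)) # N*F bits through the total upload capacity
    return max(server_term, slowest_peer_term, capacity_term)


# Hypothetical numbers: 1000-bit file, 100 bit/s server, three peers.
# The three terms are 10 s, 20 s and 10 s, so the slowest downloader dominates.
print(p2p_time(1000, 100, [50, 50, 100], [50, 200, 400]))  # 20.0
```

Because every joining peer adds its own upload rate to the denominator of the third term, that term grows much more slowly with N than NF/us does.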
  • 383. Distribution time for P2P and client-server architecture
  • 384. Assumptions in the graph •All peers are assumed to have the same upload rate u. •We have set F/u = 1 hour, us = 10u, and dmin ≥ us. •Thus, a peer can transmit the entire file in one hour, the server transmission rate is 10 times the peer upload rate, and (for simplicity) the peer download rates are set large enough so as not to have an effect.
  • 385. Comparison of Client Server Architecture with P2P Architecture using the Graph •For the client-server architecture, the distribution time increases linearly and without bound as the number of peers increases. •For the P2P architecture, the minimal distribution time is not only always less than the distribution time of the client-server architecture; it is also less than one hour for any number of peers N. Thus, applications with the P2P architecture can be self-scaling.
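The comparison in the graph can be reproduced from the two lower bounds under the stated assumptions (F/u = 1 hour, us = 10u, dmin ≥ us). The sketch below measures time in hours by setting F = 1 and u = 1, and takes dmin = us as the smallest allowed value; the function names are hypothetical.

```python
def client_server_time(N, F, u_s, d_min):
    """Client-server lower bound: max(N*F/u_s, F/d_min)."""
    return max(N * F / u_s, F / d_min)


def p2p_time(N, F, u_s, u, d_min):
    """P2P lower bound when all N peers share the same upload rate u."""
    return max(F / u_s, F / d_min, N * F / (u_s + N * u))


F = 1.0       # one file
u = 1.0       # peer upload rate in files per hour, so F/u = 1 hour
u_s = 10 * u  # server uploads ten times faster than a peer
d_min = u_s   # smallest download rate allowed by the assumption d_min >= u_s

print("N   client-server (h)   P2P (h)")
for N in (1, 5, 10, 20, 30):
    cs = client_server_time(N, F, u_s, d_min)
    p2p = p2p_time(N, F, u_s, u, d_min)
    print(f"{N:<3} {cs:<19.2f} {p2p:.2f}")
```

With these settings the P2P bound is max(0.1, N/(10 + N)) hours, which approaches but never reaches one hour, while the client-server bound is N/10 hours.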
  • 386. Bit Torrent •BitTorrent is a popular P2P protocol for file distribution. •In BitTorrent lingo, the collection of all peers participating in the distribution of a particular file is called a torrent. •Peers in a torrent download equal-size chunks of the file from one another, with a typical chunk size of 256 KBytes. •When a peer first joins a torrent, it has no chunks. Over time it accumulates more and more chunks. 1/2
  • 387. Bit Torrent •While it downloads chunks it also uploads chunks to other peers. •Once a peer has acquired the entire file, it may leave the torrent, or remain in the torrent and continue to upload chunks to other peers. •Any peer may leave the torrent at any time with only a subset of chunks, and later rejoin the torrent. 2/2 Contd…
  • 388. Operation of Bit Torrent Protocol •Each torrent has an infrastructure node called a tracker. •When a peer joins a torrent, it registers itself with the tracker and periodically informs the tracker that it is still in the torrent. Thus the tracker keeps track of the peers that are participating in the torrent. •A given torrent may have fewer than ten or more than a thousand peers participating at any instant of time.
  • 389. File distribution with BitTorrent
  • 390. Working of Bit Torrent Protocol •When a new peer, Alice, joins the torrent, the tracker randomly selects a subset of peers (for concreteness, say 50) from the set of participating peers, and sends the IP addresses of these 50 peers to Alice. •Possessing this list of peers, Alice attempts to establish concurrent TCP connections with all the peers on this list. 1/5
  • 391. Working of Bit Torrent Protocol •Let’s call all the peers with which Alice succeeds in establishing a TCP connection “neighboring peers.” •As time evolves, some of these peers may leave and other peers (outside the initial 50) may attempt to establish TCP connections with Alice. •So a peer’s neighboring peers will fluctuate over time. 2/5 Contd…
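The tracker's role when a peer joins can be sketched as a registry that hands out a random subset of current members. This is an illustration only: the `Tracker` class and its method are hypothetical, the subset size of 50 is the example figure used in the slides, and the periodic "still here" messages and the TCP connection attempts are left out.

```python
import random


class Tracker:
    """Toy tracker: remembers which peers are in the torrent."""

    def __init__(self, list_size=50):
        self.peers = set()
        self.list_size = list_size

    def register(self, peer_addr):
        """Register a joining peer and return a random subset of the other peers."""
        others = sorted(self.peers - {peer_addr})
        self.peers.add(peer_addr)
        k = min(self.list_size, len(others))
        return random.sample(others, k)


tracker = Tracker()
for i in range(200):
    tracker.register(f"10.0.0.{i}")  # placeholder peer addresses

# Alice joins and receives up to 50 addresses to try connecting to.
alice_candidates = tracker.register("alice")
print(len(alice_candidates))
```

The peers Alice actually reaches over TCP become her neighboring peers, so that set is usually a subset of the list returned here.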
  • 392. Working of Bit Torrent Protocol •At any given time, each peer will have a subset of chunks from the file, with different peers having different subsets. •Periodically, Alice will ask each of her neighboring peers for the list of the chunks they have. •If Alice has L different neighbors, she will obtain L lists of chunks. With this knowledge, Alice will issue requests for chunks she currently does not have. 3/5 Contd…
  • 393. Working of Bit Torrent Protocol •So at any given instant of time, Alice will have a subset of chunks and will know which chunks her neighbors have. •With this information, Alice will have two important decisions to make. •First, which chunks should she request first from her neighbors? •Second, to which of her neighbors should she send requested chunks? 4/5 Contd…
  • 394. Working of Bit Torrent Protocol •In deciding which chunks to request, Alice uses a technique called rarest first. •The idea is to determine, from among the chunks she does not have, the chunks that are the rarest among her neighbors (that is, the chunks that have the fewest repeated copies among her neighbors) and then request those rarest chunks first. •In this manner, the rarest chunks get more quickly redistributed, aiming to (roughly) equalize the numbers of copies of each chunk in the torrent. 5/5 Contd…
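The rarest-first rule can be sketched by counting, for every chunk Alice lacks, how many neighbors hold it. The function name and the data layout (a dict from neighbor to chunk set) are assumptions of this sketch, and breaking ties by chunk number is a simplification made here so the result is deterministic.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Return the chunks Alice is missing, rarest among her neighbors first.

    my_chunks       -- set of chunk ids Alice already has
    neighbor_chunks -- dict mapping neighbor name to the set of chunk ids it has
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        # Only chunks Alice does not have are candidates for a request.
        counts.update(set(chunks) - set(my_chunks))
    # Fewest copies first; ties broken by chunk id (an assumption of this sketch).
    return sorted(counts, key=lambda chunk: (counts[chunk], chunk))


neighbors = {
    "bob": {0, 1, 2},
    "carol": {1, 2, 3},
    "dave": {1},
}
print(rarest_first({0}, neighbors))  # [3, 2, 1]
```

Chunk 3 is held by one neighbor, chunk 2 by two, and chunk 1 by three, so Alice requests them in that order.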
  • 395. Clever Trading Algorithm •To determine which requests Alice responds to, BitTorrent uses a Clever Trading Algorithm. •The basic idea is that Alice gives priority to the neighbors that are currently supplying her data at the highest rate. 1/6
  • 396. Clever Trading Algorithm •Specifically, for each of her neighbors, Alice continually measures the rate at which she receives bits and determines the four peers that are feeding her bits at the highest rate. •She then reciprocates by sending chunks to these same four peers. Every 10 seconds, she recalculates the rates and possibly modifies the set of four peers. 2/6 Contd…
  • 397. Clever Trading Algorithm •In BitTorrent lingo, these four peers are said to be unchoked. •Every 30 seconds, she also picks one additional neighbor at random and sends it chunks. Let’s call the randomly chosen peer Bob. •In BitTorrent lingo, Bob is said to be optimistically unchoked. 3/6 Contd…
  • 398. Clever Trading Algorithm •Because Alice is sending data to Bob, she may become one of Bob’s top four uploaders, in which case Bob would start to send data to Alice. •If the rate at which Bob sends data to Alice is high enough, Bob could then become one of Alice’s top four uploaders. 4/6 Contd…
  • 399. Clever Trading Algorithm •In other words, every 30 seconds, Alice will randomly choose a new trading partner and initiate trading with that partner. •If the two peers are satisfied with the trading, they will put each other in their top four lists and continue trading with each other until one of the peers finds a better partner. •The effect is that peers capable of uploading at compatible rates tend to find each other. 5/6 Contd…
  • 400. Clever Trading Algorithm •The random neighbor selection also allows new peers to get chunks, so that they can have something to trade. •All other neighboring peers besides these five peers (four “top” peers and one probing peer) are “choked,” that is, they do not receive any chunks from Alice. •BitTorrent has a number of interesting mechanisms including pieces (mini-chunks), pipelining, random first selection, endgame mode, and anti-snubbing. 6/6 Contd…
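The selection step of the trading algorithm can be sketched as one function that picks the four fastest suppliers plus one random extra neighbor. The function name, the rate dictionary, and the sample rates are hypothetical; the 10-second and 30-second timers from the slides are left to the caller, and the other mechanisms listed above are not modeled.

```python
import random


def choose_unchoked(receive_rates, top_n=4, rng=random):
    """Pick the neighbors Alice will send chunks to.

    receive_rates -- dict mapping neighbor name to the measured rate at which
                     Alice currently receives bits from that neighbor
    Returns (unchoked, optimistic): the top_n fastest suppliers, and one other
    neighbor chosen at random (None if there is no other neighbor).
    """
    ranked = sorted(receive_rates, key=receive_rates.get, reverse=True)
    unchoked = ranked[:top_n]
    remaining = ranked[top_n:]
    optimistic = rng.choice(remaining) if remaining else None
    return unchoked, optimistic


# Hypothetical measured rates in kbit/s.
rates = {"bob": 5, "carol": 90, "dave": 70, "erin": 40, "frank": 60, "grace": 10}
unchoked, optimistic = choose_unchoked(rates)
print(unchoked)    # ['carol', 'dave', 'frank', 'erin']
print(optimistic)  # 'bob' or 'grace', chosen at random
```

Every neighbor that is neither in the top four nor the optimistic pick stays choked until a later round changes the ranking.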
  • 401. 2.6.2 Distributed Hash Tables (DHTs)