CS 640 1
The World Wide Web
Outline
Background
Structure
Protocols
WWW Background
• 1989-1990 – Tim Berners-Lee invents the World Wide
Web at CERN
– Means for transferring text and graphics simultaneously
– Client/Server data transfer protocol
• Communication via application level protocol
• System ran on top of standard networking infrastructure
– Text mark up language
• Not invented by Berners-Lee (HTML builds on the earlier SGML)
• Simple and easy to use
• Requires a client application to render text/graphics
WWW History contd.
• 1993 – Marc Andreessen and colleagues develop Mosaic at the National Center for
Supercomputing Applications (NCSA)
– First widely used graphical browser
– Internet’s first “killer app”
– Freely distributed
– Its developers went on to found Netscape in 1994
• 1995 (approx.) – Web traffic becomes dominant
– Exponential growth
– E-commerce
– Web infrastructure companies
– World Wide Web Consortium
• Reference: “Web Protocols and Practice”, Krishnamurthy and
Rexford
WWW Components
• Structural Components
– Clients/browsers – two dominant implementations
– Servers – run on sophisticated hardware
– Caches – many interesting implementations
– Internet – the global infrastructure which facilitates data transfer
• Semantic Components
– Hyper Text Transfer Protocol (HTTP)
– Hyper Text Markup Language (HTML)
• eXtensible Markup Language (XML)
– Uniform Resource Identifiers (URIs)
Quick Aside – Web server use
Source: Netcraft Server Survey, 2001
WWW Structure
• Clients use browser application to send URIs via HTTP to servers
requesting a Web page
• Web pages constructed using HTML (or other markup language)
and consist of text, graphics, sounds plus embedded files
• Servers (or caches) respond with requested Web page
– Or with error message
• Client’s browser renders Web page returned by server
– Page is written using Hyper Text Markup Language (HTML)
– Displaying text, graphics and sound in browser
– Writing data as well
• The entire system runs over standard networking protocols
(TCP/IP, DNS,…)
Uniform Resource Identifiers
• Web resources need names/identifiers – Uniform Resource
Identifiers (URIs)
– Resource can reside anywhere on the Internet
• URIs are a somewhat abstract notion
– A pointer to a resource to which request methods can be applied to
generate potentially different responses
• Request methods include, e.g., fetching or changing the object
• Example: http://www.foo.com/index.html
– Protocol (http), server (www.foo.com), resource (/index.html)
• Most popular form of a URI is the Uniform Resource Locator
(URL)
– Differences between URI and URL are beyond scope
– RFC 2396
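A minimal sketch of the protocol/server/resource split, using Python's standard `urllib.parse`. The function name `uri_parts` and the dictionary keys are illustrative choices, not part of the original slides.

```python
from urllib.parse import urlsplit

def uri_parts(uri: str) -> dict:
    """Split a URI into the three parts named on the slide."""
    parts = urlsplit(uri)
    return {
        "protocol": parts.scheme,        # e.g. "http"
        "server": parts.netloc,          # host name (plus port, if given)
        "resource": parts.path or "/",   # path of the resource on that server
    }

print(uri_parts("http://www.foo.com/index.html"))
# {'protocol': 'http', 'server': 'www.foo.com', 'resource': '/index.html'}
```

An empty path is mapped to "/" here because a client has to name some resource when it contacts the server.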
HTTP Basics
• Protocol for client/server communication
– The heart of the Web
– Very simple request/response protocol
• Client sends request message, server replies with response message
– Stateless
– Relies on URI naming mechanism
• Three versions have been used
– 0.9/1.0 – very close to Berners-Lee’s original
• RFC 1945 (original RFC is now expired)
– 1.1 – developed to enhance performance, caching, compression
• RFC 2068 (later revised as RFC 2616)
– 1.0 dominates today but 1.1 is catching up
HTTP Request Messages
• GET – retrieve document specified by URL
• PUT – store specified document under given URL
• HEAD – retrieve info. about document specified by URL
• OPTIONS – retrieve information about available options
• POST – give information (e.g., annotation) to the server
• DELETE – remove document specified by URL
• TRACE – loopback request message
• CONNECT – for use by proxies (to set up a tunnel)
HTTP Request Format
• First type of HTTP message: requests
– Client browsers construct and send message
• Typical HTTP request:
– GET http://www.cs.wisc.edu/index.html HTTP/1.0
request-line (method request-URI HTTP-version)
headers (0 or more)
<blank line>
body (only for requests that carry data, e.g. POST or PUT)
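The request layout above can be sketched as a small function that assembles the bytes a client would send. The name `build_request`, its parameters, and the `User-Agent` value are hypothetical, chosen only for illustration. Lines are separated by CRLF, as the HTTP specifications require.

```python
from typing import Dict, Optional

def build_request(method: str, request_uri: str,
                  headers: Optional[Dict[str, str]] = None,
                  body: bytes = b"", version: str = "HTTP/1.0") -> bytes:
    """Assemble request-line, headers, blank line, and optional body."""
    lines = [f"{method} {request_uri} {version}"]      # request-line
    for name, value in (headers or {}).items():        # 0 or more headers
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"              # blank line ends the head
    return head.encode("ascii") + body

message = build_request("GET", "http://www.cs.wisc.edu/index.html",
                        {"User-Agent": "sketch/0.1"})
print(message.decode("ascii"))
```

A GET request leaves `body` empty. A POST would pass the data as `body` and would normally add a `Content-Length` header so the server knows where the body ends.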
HTTP Response Format
• Second type of HTTP message: response
– Web servers construct and send response messages
• Typical HTTP response:
– HTTP/1.0 301 Moved Permanently
Location: http://www.wisc.edu/cs/index.html
status-line (HTTP-version response-code response-phrase)
headers (0 or more)
<blank line>
body
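As a rough illustration of the response layout, the sketch below splits a raw response into its status line, header lines, and body. `StatusLine` and `split_response` are hypothetical names, and the code assumes a well-formed response with CRLF line endings.

```python
from typing import List, NamedTuple, Tuple

class StatusLine(NamedTuple):
    version: str   # e.g. "HTTP/1.0"
    code: int      # e.g. 301
    phrase: str    # e.g. "Moved Permanently"

def split_response(raw: bytes) -> Tuple[StatusLine, List[str], bytes]:
    """Split a raw response into (status line, header lines, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")          # blank line separates head and body
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)      # phrase may contain spaces
    return StatusLine(version, int(code), phrase), lines[1:], body

raw = (b"HTTP/1.0 301 Moved Permanently\r\n"
       b"Location: http://www.wisc.edu/cs/index.html\r\n"
       b"\r\n")
status, header_lines, body = split_response(raw)
print(status.code, status.phrase)   # 301 Moved Permanently
```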
HTTP Response Codes
• 1xx – Informational – request received, processing
• 2xx – Success – action received, understood, accepted
• 3xx – Redirection – further action necessary
• 4xx – Client Error – bad syntax or cannot be fulfilled
• 5xx – Server Error – server failed
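Because the class of a response is carried by the first digit of its code, a client can classify any code with an integer division. A minimal sketch follows; the names `STATUS_CLASSES` and `status_class` are illustrative.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code: int) -> str:
    """Map a three-digit response code to its class via the leading digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")

print(status_class(301))  # Redirection
print(status_class(404))  # Client Error
```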
HTTP Headers
• Both requests and responses can contain a variable
number of header fields
– Each field consists of field name, colon, space, field value
– 17 possible header types divided into three categories
• Request
• Response
• Body
• Example: Date: Friday, 27-Apr-01 13:30:01 GMT
• Example: Content-length: 3001
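The "name, colon, space, value" layout can be parsed with a single split on the first colon. The sketch below is one possible approach. `parse_headers` is a hypothetical helper, and lower-casing the names is a design choice that reflects the fact that HTTP header names are case-insensitive.

```python
from typing import Dict, Iterable

def parse_headers(lines: Iterable[str]) -> Dict[str, str]:
    """Turn 'Name: value' lines into a dict keyed by lower-cased field name."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")   # split at the first colon only
        if not sep:
            raise ValueError(f"not a header field: {line!r}")
        headers[name.strip().lower()] = value.strip()
    return headers

fields = parse_headers([
    "Date: Friday, 27-Apr-01 13:30:01 GMT",
    "Content-length: 3001",
])
print(int(fields["content-length"]))  # 3001
```

Splitting only at the first colon matters for values such as dates and URLs, which contain colons of their own.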
HTTP/1.0 Network Interaction
• Clients make requests to port 80 on servers
– Uses DNS to resolve server name
• Clients make separate TCP connection for each URL
– Some browsers open multiple TCP connections
• Netscape default = 4
• Server returns HTML page
– Many types of servers with a variety of implementations
– Apache is the most widely used
• Freely available in source form
• Client parses page
– Requests embedded objects
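The interaction above can be sketched with a raw TCP socket: resolve the name, connect to port 80, send one request, and read until the server closes the connection. The function names `exchange` and `fetch_http10` are hypothetical. The `Host` header is not required by HTTP/1.0 and is included here only as an assumption, because many servers expect it.

```python
import socket

def exchange(sock: socket.socket, request: bytes) -> bytes:
    """Send one request, then read until the peer closes the connection."""
    sock.sendall(request)
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:              # empty read: server closed, response is complete
            break
        chunks.append(data)
    return b"".join(chunks)

def fetch_http10(host: str, path: str = "/", port: int = 80,
                 timeout: float = 5.0) -> bytes:
    """Fetch one resource the HTTP/1.0 way: a fresh TCP connection per URL."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    # create_connection resolves the server name (DNS) and opens the TCP connection.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, request)
```

Calling `fetch_http10` once for the page and once for each embedded object reproduces the one-connection-per-URL behaviour described above.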
HTTP/1.1 Performance Enhancements
• HTTP/1.0 is a “stop and wait” protocol
– Separate TCP connection for each file
• Connection setup and teardown are incurred for each file
• Inefficient use of packets
• Server must maintain many connections in TIME_WAIT
• Mogul and Padmanabhan studied these issues in ’95
– Resulted in HTTP/1.1 specification focused on performance
enhancements
• Persistent connections
• Pipelining
• Enhanced caching options
• Support for compression
Persistent Connections and Pipelining
• Persistent connections
– Use the same TCP connection(s) for transfer of multiple files
– Reduces packet traffic significantly
– May or may not increase performance from client perspective
• Load on server increases
• Pipelining
– Send multiple requests back to back on one connection without waiting for each response, packing as much data into each packet as possible
– Requires length field(s) within header
– May or may not reduce packet traffic or increase performance
• Page structure is critical
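To make the trade-off concrete, here is a deliberately simplified round-trip count for loading one page plus its embedded objects. It is a textbook-style model under stated assumptions, not a measurement: one connection at a time, one round trip per TCP handshake, one per request/response, and no accounting for transmission time, slow start, teardown, or server load. The function and mode names are hypothetical.

```python
def round_trips(embedded_objects: int, mode: str) -> int:
    """Round trips to fetch a page and its embedded objects (simplified model)."""
    n = embedded_objects
    if mode == "http10":
        # New connection per file: handshake + request/response, for page and each object.
        return 2 * (1 + n)
    if mode == "persistent":
        # One handshake, then one request/response per file, strictly in turn.
        return 1 + (1 + n)
    if mode == "pipelined":
        # One handshake, the page, then all embedded requests sent back to back.
        return 1 + 1 + (1 if n else 0)
    raise ValueError(f"unknown mode: {mode}")

for mode in ("http10", "persistent", "pipelined"):
    print(mode, round_trips(10, mode))
# http10 22
# persistent 12
# pipelined 3
```

With no embedded objects all three modes cost the same two round trips, which is one way to see why page structure is critical.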
HTML Basics
• Hyper-Text Markup Language
– An application of the Standard Generalized Markup Language (SGML)
– Facilitates a hyper-media environment
• Embedded links to other documents and applications
• Documents use elements to “mark up” or identify sections of text
for different purposes or display characteristics
• Mark up elements are not seen by the user when page is displayed
• Documents are rendered by browsers
• NOTE: Not all documents in the Web are HTML!
• Many people use WYSIWYG editors (e.g., MS Word) to generate
HTML
HTML Example
<HTML>
<HEAD>
<TITLE> PB’s HomePage </TITLE>
</HEAD>
<BODY>
<CENTER><IMG SRC = "bad_picture.gif" ALT = " "><BR></CENTER>
<P><CENTER><H1>UW Computer Science Department</H1></CENTER>
Welcome to my goofy HomePage!
…
<A HREF = "http://www.cs.wisc.edu/~pb/mydogs_page.html"> Spot’s Page </A>
</BODY>
</HTML>
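After receiving a page like this one, a client parses it to find the embedded objects it still has to request and the links a user may follow. A minimal sketch using Python's standard `html.parser` is shown below. The class name `ResourceFinder` and the shortened sample page are illustrative assumptions.

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collect image sources (embedded objects) and link targets from a page."""

    def __init__(self):
        super().__init__()
        self.images = []   # objects the browser would fetch right away
        self.links = []    # targets fetched only if the user follows them

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])
        elif tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])

page = """<HTML><BODY>
<CENTER><IMG SRC="bad_picture.gif" ALT=""><BR></CENTER>
<A HREF="http://www.cs.wisc.edu/~pb/mydogs_page.html">Spot's Page</A>
</BODY></HTML>"""

finder = ResourceFinder()
finder.feed(page)
print(finder.images)  # ['bad_picture.gif']
print(finder.links)   # ['http://www.cs.wisc.edu/~pb/mydogs_page.html']
```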
