1-Introduction                                                         2/2
                                                   Internet Engineering Course
                                                           University of Tehran
                                                                 Abbas Nayebi

These slides are based on the presentation files provided by Lennart Herlaar, Utrecht University, the
Netherlands (http://www.cs.uu.nl/people/lennart)




Internet prog. Environment characteristics
Client/server architecture
Protocols
Addressing under IP
Names versus numbers
IP Packet, TCP
WWW: Document Formats, Markup,
 HTML, Browsers



Summary of the last session
Web servers
The http protocol
URLs, URIs, and URNs
MIME
Scripting languages
 ◦ Client side
 ◦ Server side: SSI, CGI, servlets




Summary for this session
 • A web server is a specialized program that handles document requests and
   executes scripts and programs (written in PHP, Perl, Java (JSP), ASP, ...).
 • JavaScript, HTML and CSS are handled by the browser (the client).
 • Several servers exist: Apache (52%), Microsoft’s IIS (33%), and IIS’s
   smaller brother PWS.
 • The number of servers used to double about every half year, but...
 • The browser sends its request to a port on the server (usually port 80).
 • Other port numbers are possible: http://knor.glob.nl:810/index.html
 • Apache’s Tomcat supports servlets: Java programs that extend the
   capabilities of a server. Tomcat typically listens on port 8080, so it can
   run alongside another server.
 • Internet applications can be programmed without using web servers: set
   up your own protocols and use sockets.
 • Web servers can be clients of a database server: a three-tier organization.




Web servers
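
The port rule on this slide can be illustrated with a small sketch. It uses Python's standard urllib.parse module; the helper name host_and_port, the DEFAULT_PORTS table and the www.example.org address are assumptions made for illustration and are not part of the slides.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}

def host_and_port(url):
    """Return the (host, port) pair a browser would connect to for this URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port

print(host_and_port("http://knor.glob.nl:810/index.html"))   # explicit port 810
print(host_and_port("http://www.example.org/index.html"))    # falls back to 80
```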
 • Possible requests to a web server:
   ◦ GET – retrieve a document
   ◦ HEAD – retrieve only the headers of the document
   ◦ POST – submit the enclosed data to the document (typically a script)
     for processing
   ◦ PUT – replace the document with the enclosed data
   ◦ DELETE – delete the document
 • The most used are GET and POST.
 • Other information passed along with a request:
   ◦ a list of acceptable MIME types, e.g. Accept: text/* and
     Accept: image/gif
   ◦ for caching: If-Modified-Since: date
   ◦ parameters for server-side scripts
 • GET, HEAD and DELETE requests normally carry no message body.




The http protocol
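
As a rough sketch of what such a request looks like on the wire, the function below assembles a request line, headers and an optional body. The name build_request, the host www.example.org, the date value and the choice of HTTP/1.1 are assumptions for illustration only.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble a raw HTTP request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # A body must be announced, because the connection may stay open.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # the empty line ends the headers
    return head.encode("ascii") + body

request = build_request(
    "GET", "/index.html", "www.example.org",
    {"Accept": "text/*", "If-Modified-Since": "Sat, 01 Jan 2005 00:00:00 GMT"},
)
print(request.decode("ascii"))
```

A GET built this way ends directly after the empty line, which matches the remark that GET carries no message body.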
A status line: 200 OK, 404 Not Found, ...
The response date and various header fields
A blank line
Hopefully the requested document as the body, if any




The return of the http protocol
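
The four parts of a response can be separated with a few lines of code. This is a minimal sketch, assuming a complete response is already in memory; parse_response and the sample response text are made up for illustration.

```python
def parse_response(raw):
    """Split a raw HTTP response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")        # the blank line
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

raw = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Date: Sat, 01 Jan 2005 00:00:00 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<h1>Not Found</h1>"
)
code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], body)
```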
 • URI = Uniform Resource Identifier, which generalizes URL.
 • A grammar for Uniform Resource Locators (URLs):
   ◦ url ::= scheme address
   ◦ scheme ::= http:// | ftp:// | mailto: | file:// | ...
   ◦ address ::= full-domain-name/path-to-document[#anchor]
   ◦ full-domain-name ::= host.domainname[:portnr]
   ◦ domainname ::= domain . domainname | domain
   ◦ domain ::= string-without-whitespace-and-{; : &}
   ◦ path-to-document ::= /-separated-strings-without-whitespace-and-{; : &}
 • Spaces in filenames should be encoded as %20 (the ASCII code of the space
   in hexadecimal).
 • If no HTML filename is given for the http protocol, the file index.html,
   index.php or index.htm may be used (this depends on the server, though).
 • URN: Uniform Resource Name, a location-independent name for a resource
   (formally a URI with the urn: scheme, e.g. urn:isbn:0451450523).
 • Example (informal usage):
   ◦ URL http://www.pierobon.org/iis/review1.htm
   ◦ URN www.pierobon.org/iis/review1.htm#one
   ◦ URI http://www.pierobon.org/iis/review1.htm.html#one




URLs, URIs, and URNs
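
The pieces named in the grammar, and the %20 rule, can be tried out with Python's standard urllib.parse. The URL below combines the slide's example address with its #one anchor, and the path "/docs/my file.html" is a made-up filename; both are assumptions for illustration.

```python
from urllib.parse import urlsplit, quote, unquote

parts = urlsplit("http://www.pierobon.org/iis/review1.htm#one")
print(parts.scheme)     # http
print(parts.netloc)     # www.pierobon.org
print(parts.path)       # /iis/review1.htm
print(parts.fragment)   # one

# A space is ASCII 32, which is 20 in hexadecimal.
encoded = quote("/docs/my file.html")
print(encoded)            # /docs/my%20file.html
print(unquote(encoded))   # /docs/my file.html
```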
 • How does a browser know what kind of file it is retrieving?
 • A GIF file is displayed differently from an HTML or PDF file.
 • For some types, like PDF, a separate helper application must be started.
 • Sometimes a plug-in or filter is available, e.g. Macromedia Flash or some
   to-HTML converter.
 • It would be nice to start the right one automatically.
 • The same problem occurred with attachments to e-mail.
 • Multipurpose Internet Mail Extensions (MIME) were developed for that case.
 • A MIME specification has the form type/subtype.
 • Common examples are text/html, text/plain, image/jpeg,
   image/gif, audio/x-realaudio and application/x-shockwave-flash.
 • Experimental MIME types are of the form type/x-subtype.
 • When a MIME type is not present, the file extension is usually the
   deciding factor.




MIME
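
The fallback from file extension to MIME type can be sketched with Python's standard mimetypes module, which keeps such an extension table. The helper describe and the filenames are assumptions for illustration; the exact table can vary slightly between systems.

```python
import mimetypes

def describe(filename):
    """Guess the MIME type from the extension and split it into type/subtype."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return "unknown"
    major, _, minor = mime_type.partition("/")
    return f"{mime_type} (type={major}, subtype={minor})"

for name in ["index.html", "logo.gif", "notes.txt", "data.unknownext"]:
    print(name, "->", describe(name))
```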
 • WWW for dummies: use elaborate tools to construct large websites.
 • Little programming involved, just plug and play.
 • We don’t want to make fancy/professional websites. We try to capture the
   concepts and build dependable knowledge of web programming.
 • You may use drawing and audio tools like Gimp, Photoshop, CoolEdit and
   the like for making sounds, pictures, banners, icons and movies.
 • You may not use web authoring tools like Macromedia Dreamweaver,
   Microsoft FrontPage, Adobe PageMill, ...
 • What is allowed depends on the assignment: Notepad++, PHP Ed, Eclipse, …
 • Documentation about the tools you use: include it.
 • Of course, in real life you should use tools wherever applicable, but
   whatever you do: make sure the result is maintainable.




Authoring tools
 • In this course, we program the Internet using the Web.
 • Question: why develop for the Web?
 • Possible answers:
   ◦ The browser handles displaying
   ◦ “Everybody” has a browser
   ◦ Servers handle large parts of the transmission details
   ◦ Database servers do much of the rest
 • Custom client software brings maintenance/upgrade problems in a large
   organization.
 • Custom clients also raise security issues: the client program keeps a
   password to connect to the database server.
 • A lot of work has been done. Why not use it?




Why use the Web?
 • Invent your own protocol.
 • Write your own server.
 • Write client programs for as many platforms as you can.
 • Try to get client and server software in the right places.
 • Examples: (s)ftp, MSN, BitTorrent (peer-to-peer), MMORPG (massively
   multiplayer online role-playing game).
 • The main problem is to get the proper software in the proper places.
 • People have to download your client software and install it on their
   local (home) computers.
 • Your server software must be installed, e.g., on the computers of an
   Internet Service Provider.


Making your own Internet
applications
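
To make "invent your own protocol" concrete, here is a minimal sketch of a made-up one-line protocol spoken over sockets. The UPPER command, the OK/ERR replies and the function names are all invented for this example. socket.socketpair() stands in for a real network connection so the sketch runs on a single machine.

```python
import socket
import threading

def serve(conn):
    """Server side: answer one request of the made-up form 'UPPER <text>'."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        line = reader.readline()
        command, _, argument = line.rstrip("\n").partition(" ")
        if command == "UPPER":
            reply = "OK " + argument.upper()
        else:
            reply = "ERR unknown command"
        conn.sendall((reply + "\n").encode("utf-8"))

def ask(conn, text):
    """Client side: send one request and return the one-line reply."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        conn.sendall(f"UPPER {text}\n".encode("utf-8"))
        return reader.readline().rstrip("\n")

# Two connected sockets play the roles of server and client endpoints.
server_end, client_end = socket.socketpair()
worker = threading.Thread(target=serve, args=(server_end,))
worker.start()
answer = ask(client_end, "hello")
worker.join()
print(answer)  # OK HELLO
```

Even this toy needs both halves written, distributed and kept in step, which is the deployment problem the slide points at.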
 • Scripting languages are used to make web sites dynamic.
 • A fully dynamic website could be called a web application.
 • Scripting languages exist on both the client side and the server side.
 • Client-side scripts are executed by the browser.
   ◦ These are used mainly to overcome shortcomings in the
     protocol / presentation (!).
   ◦ Dynamic HTML
   ◦ Examples are JavaScript (and Java applets to a lesser extent).
 • Server-side scripts are executed by the web server.
 • Several means of integration exist.
 • Examples of languages are PHP, Perl, Python, C++ and ASP.



On scripting languages
 • Some executable code is transferred to the browser for execution.
 • Nowadays this is generally discouraged.
 • Some technologies:
   ◦ Java applets (executed on a VM, secured by a sandbox, multi-platform)
   ◦ Microsoft ActiveX (executed natively, dangerous!, single-OS,
     single-browser)
 • Both can be digitally signed.
   ◦ Normally, looser security rules are applied to signed applets.



Client side executables
Interpreted languages
No (insistence on) declarations
Run-time typing
Exceptions exist
Libraries for accessing various databases
Regular expressions
Easy reporting/printing
Many libraries available



General aspects of scripting
languages
 • Most server-side scripting languages use the one-page-at-a-time
   philosophy: they process one scripted page per request.
 • On request, a web server can execute a program (CGI). The program
   generates the whole page.
 • Or it can replace inline code by its output (PHP).
 • The result is usually an HTML document, which is returned to the client.
 • Input to server-side scripts usually comes from a form.
 • “Running a web application” usually consists of a long string of script
   invocations.
 • Data ping-pongs between client and server one page at a time.
 • Form, submit, form, submit, form, submit, etc.
 • Page-based requests generate lots of overhead.
 • Lack of state generates lots of overhead (and security issues!).



Server side scripting
 • Communication between client and server (and vice versa)
   ◦ HTTP, HTMLx
   ◦ Forms, parameters
 • Communication between server and script (and vice versa)
   ◦ Inline, CGI, servlets
   ◦ Direct, standard input/output, environment
 • Communication between script and database (and vice versa)
   ◦ SQL, resultset


Interesting parts of server side
scripting
 • Snippets of program code are scattered through the HTML.
 • They are executed from top to bottom.
 • The web server embeds the results into the returned document.
 • Examples:
   ◦ Server Side Includes (SSI)
   ◦ PHP mainly uses inlining, i.e., code within HTML between <?php
     and ?>.
   ◦ Java Server Pages (JSP)
   ◦ Active Server Pages (used with VBScript or JavaScript)
 • Often used when only small parts of the document are computed.
 • Thin separation between code and presentation.
 • Code quickly becomes unreadable.
 • Solutions to this problem exist (templates).


Method 1: Inlining code into a
document
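
The inlining idea can be imitated in a few lines. This is a toy sketch, not how PHP or JSP work internally. The <?py ... ?> marker and the render function are invented for this example. Snippets are evaluated left to right and replaced by their output.

```python
import re

# Hypothetical inline marker, modelled loosely on PHP's <?php ... ?>.
INLINE = re.compile(r"<\?py(.*?)\?>", re.DOTALL)

def render(page, variables):
    """Replace every inline snippet by the text it evaluates to."""
    def run(match):
        # eval() is tolerable here only because the page is written by the
        # site author; it must never be applied to text supplied by a visitor.
        value = eval(match.group(1).strip(), {"__builtins__": {}}, dict(variables))
        return str(value)
    return INLINE.sub(run, page)

page = "<p>Hello <?py name.upper() ?>, 2 + 3 = <?py 2 + 3 ?></p>"
print(render(page, {"name": "world"}))  # <p>Hello WORLD, 2 + 3 = 5</p>
```

Even in this small page the markup and the code are hard to tell apart, which illustrates the readability problem that templates try to solve.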
 • HTML files may include preprocessing directives.
 • Such files are recognized by their extension (e.g., .shtml) or by having
   the exec flag on.
 • The server parses these HTML files and executes the directives.
 • General form:
   ◦ <!--#element attribute=value attribute=value ... -->
 • Variables can be used as well, and so can #if directives.




Server Side Includes
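
The general directive form can be recognized with a regular expression. The sketch below expands only an echo directive from a dictionary and leaves every other directive untouched; real servers support many more. The function expand_echo, the sample page and the "(none)" placeholder for unset variables are assumptions for illustration.

```python
import re

# <!--#element attribute=value attribute=value ... -->
DIRECTIVE = re.compile(r'<!--#(\w+)((?:\s+\w+=(?:"[^"]*"|\S+))*)\s*-->')
ATTRIBUTE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')

def expand_echo(page, variables):
    """Replace <!--#echo var="NAME" --> by the value of NAME."""
    def replace(match):
        element = match.group(1)
        attributes = {
            name: quoted or bare
            for name, quoted, bare in ATTRIBUTE.findall(match.group(2))
        }
        if element == "echo" and "var" in attributes:
            return str(variables.get(attributes["var"], "(none)"))
        return match.group(0)   # leave unsupported directives as they are
    return DIRECTIVE.sub(replace, page)

page = 'Served on <!--#echo var="DATE_LOCAL" --> <!--#include virtual="/footer.html" -->'
print(expand_echo(page, {"DATE_LOCAL": "2005-01-01"}))
```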
 • A protocol that lets web browsers interface with applications via a web
   server.
 • Mainly used for running programs on the server (with certain permissions).
 • Identifiable by a reference to a file in a special directory
   (.../cgi-bin/xxxx) or by the file extension.
 • Parameters are passed to the program.
 • The result is usually an HTML document, which is sent to the client.
 • Perl, Python or shell scripts are often used in this fashion.
 • But all languages are possible, even C++.
 • There are slight differences between execution under CGI and ordinary
   execution of programs.
 • CGI is considered deprecated nowadays.




Method 2: Common Gateway
Interface
 • The client (browser) sends a request to the server over TCP/IP (a socket).
 • We focus here on the commands GET, HEAD, POST and PUT.
 • GET = give a document
 • HEAD = give only the headers of the document (compare an e-mail/news
   header)
 • POST = send the contents of a form (upload form data)
 • PUT = upload a file
 • For CGI, mostly GET or POST is used.
 • The request line is followed by zero or more headers and an empty line;
   with POST/PUT, a body follows (the contents of the form and/or file).




HTTP requests
The URL used for this request was:
http://sunshine.cs.uu.nl:8000/docs/vakken/inp/dwarf.html?M=November&Y=2004

The body was empty.

Example request
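
The captured request itself is not reproduced in the text of this slide. As a sketch, the parts of it that follow directly from the URL can be derived as below. The HTTP version is an assumption, and any further headers the original request carried are unknown.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://sunshine.cs.uu.nl:8000/docs/vakken/inp/dwarf.html?M=November&Y=2004"
parts = urlsplit(url)

# The request line names only the path and query; the host travels in a header.
target = parts.path + ("?" + parts.query if parts.query else "")
request_line = f"GET {target} HTTP/1.1"      # version assumed
host_header = f"Host: {parts.netloc}"
parameters = parse_qs(parts.query)

print(request_line)
print(host_header)
print(parameters)   # {'M': ['November'], 'Y': ['2004']}
```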
 • With GET, the contents of the form, or a parameter, are given as part of
   the URL (URL=cgi-bin/xxx?parameters).
 • Special characters are encoded in hexadecimal (as %XX).
 • With POST, the contents of the form are given as the body of the request.
 • The server puts all relevant information in environment variables.
 • The body (POST) is given to the standard input of the script.
 • Hence, scripts reacting to a POST should read from standard input.
 • The server does not indicate end-of-file; the script must rely on the
   length field in the headers instead.
 • In most cases, a special CGI module makes access to this information
   transparent.



GET versus POST
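
The difference between the two methods shows up when the same form data is placed in each kind of request. In this sketch the form fields, the host www.example.org and the HTTP version are assumptions; cgi-bin/xxx is the placeholder path from the slide.

```python
from urllib.parse import urlencode, quote

form = {"name": "Jan de Vries", "q": "a&b"}

# quote_via=quote encodes a space as %20 (the default would use "+").
encoded = urlencode(form, quote_via=quote)
print(encoded)  # name=Jan%20de%20Vries&q=a%26b

# GET: the form data is part of the URL and there is no body.
get_request = (
    f"GET /cgi-bin/xxx?{encoded} HTTP/1.1\r\n"
    "Host: www.example.org\r\n"
    "\r\n"
).encode("ascii")

# POST: the same data travels as the body, announced by Content-Length.
body = encoded.encode("ascii")
post_request = (
    "POST /cgi-bin/xxx HTTP/1.1\r\n"
    "Host: www.example.org\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
).encode("ascii") + body

print(get_request.decode("ascii"))
print(post_request.decode("ascii"))
```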
 • Important environment variables:
   ◦ QUERY_STRING: the query (the part after ? in the URL). Usually a
     series of the form field=value, separated by &.
   ◦ CONTENT_LENGTH: length of the data (POST)
   ◦ CONTENT_TYPE: type of the data (MIME)
 • Not all of the normal environment is passed along: the server filters it.
 • Test your program by faking the parameters (environment variables) and
   running it from the prompt.
 • The CGI program should write a correct document (with header and data)
   to standard output (say, using printf()).
 • First thing to do: print the Content-type header, followed by a blank
   line.
 • The server sends this document to the client (browser).




CGI Environment
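
The points above fit in one small function. This is a sketch that takes the environment and the input/output streams as parameters, so the environment can be faked as the slide suggests. A real script would pass os.environ, sys.stdin and sys.stdout. The name run_cgi is made up, and REQUEST_METHOD is a standard CGI variable that the slide does not list.

```python
import io
from urllib.parse import parse_qs

def run_cgi(environ, stdin, stdout):
    """Read form data the CGI way and write a plain-text document."""
    if environ.get("REQUEST_METHOD") == "POST":
        # No end-of-file is signalled: read exactly CONTENT_LENGTH characters
        # (equal to bytes for the ASCII form data used here).
        length = int(environ.get("CONTENT_LENGTH") or 0)
        data = stdin.read(length)
    else:
        data = environ.get("QUERY_STRING", "")
    fields = parse_qs(data)

    # First thing to do: the Content-Type header and a blank line.
    stdout.write("Content-Type: text/plain\r\n\r\n")
    for name in sorted(fields):
        stdout.write(f"{name} = {', '.join(fields[name])}\n")

# Faking a GET request from the prompt.
out = io.StringIO()
run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "M=November&Y=2004"},
        io.StringIO(""), out)
print(out.getvalue())
```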
 • Many languages offer libraries for CGI programming.
 • These libraries take care of tedious tasks, such as accessing the
   environment variables and decoding QUERY_STRING.
 • Always look for, and use, libraries for CGI.




Libraries for CGI
 • Independent of the programming language.
 • Isolation of processes: a CGI program cannot damage the server.
 • CGI programs can be given limited permissions (only for part of the
   web site).
 • Performance: for each request, a CGI script must be started as a
   separate process.
 • Starting a new process is very expensive, especially on Windows.
 • No possibility to pass other information to the server (such as logging;
   the program should write its logs in its own repository).
 • One-time request/response: stateless (this holds not just for CGI, but
   also for inline code).
 • Example: consider an electronic shopping cart (e-commerce): where do
   I maintain the shopping list?




CGI advantages/disadvantages
 • With a stateless server, the client must keep the state.
 • In a browser, this could be done with JavaScript, Java, etc.
 • In forms, “hidden” fields can be filled in.
 • When the form is submitted, these fields are included.
 • Cumulative information (like a shopping cart) could be stored in this way.
 • Problem: when the user goes back (with the BACK button) or enters the
   URL directly in the browser, the fields get lost. This is not bad when a
   user wants to keep multiple workflows going simultaneously.
 • If GET is used, it is also possible to put extra information in the URL
   if no form is present.
 • Exercise: find a website that gets parameters from the URL. Try to
   change the parameters and retrieve the information without navigating
   through the pages, e.g., zzzzz.com?picId=100&newsId=56
 • Keeping state in the client or URL is prone to security issues.
 • Other (better) solutions exist for the problem of statelessness.

Client side state
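
The hidden-field technique can be sketched as a round trip: the server writes the cart into the form, and the next submission brings it back. The function cart_form, the /cgi-bin/cart address and the cart items are assumptions made for illustration.

```python
from html import escape
from urllib.parse import parse_qs

def cart_form(items, action="/cgi-bin/cart"):
    """Render a form whose hidden fields carry the cart contents so far."""
    hidden = "".join(
        f'  <input type="hidden" name="item" value="{escape(item)}">\n'
        for item in items
    )
    return (
        f'<form method="post" action="{escape(action)}">\n'
        f"{hidden}"
        '  <input type="text" name="item">\n'
        '  <input type="submit" value="Add">\n'
        "</form>"
    )

page = cart_form(["book", 'pen "deluxe"'])
print(page)

# What the script would find in the POST body after the user adds "eraser".
submitted = "item=book&item=pen+%22deluxe%22&item=eraser"
cart = parse_qs(submitted)["item"]
print(cart)  # ['book', 'pen "deluxe"', 'eraser']
```

Nothing stops a visitor from editing the hidden values before submitting, which is one reason this approach is prone to security issues.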
Four steps: article selection, personal
 data, credit card details, confirmation.
Four scripts or one script?
Maintaining state.
Validating input.
Security.




Example: order process of an
online shop
(Reading)
SQL injection.
Cross site scripting.




Some other security issues
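
Both issues come from treating user input as code. The sketch below uses Python's built-in sqlite3 and html modules to show each attack next to a common countermeasure. The table, the user names and the input strings are made up for illustration.

```python
import sqlite3
from html import escape

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?)",
               [("alice", "a1"), ("bob", "b2")])

user_input = "nobody' OR '1'='1"

# SQL injection: pasting the input into the statement changes its meaning.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = db.execute(unsafe_sql).fetchall()
print(len(unsafe_rows))   # 2: every user matches

# Countermeasure: a placeholder keeps the input as plain data.
safe_rows = db.execute("SELECT name FROM users WHERE name = ?",
                       (user_input,)).fetchall()
print(safe_rows)          # []

# Cross-site scripting: input echoed into a page would run in the browser.
comment = "<script>alert('x')</script>"
safe_html = "<p>" + escape(comment) + "</p>"
print(safe_html)
```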
 • A more recent development.
 • Servlet = applet run on a server.
 • Servlets, in general, define extensions to servers (WWW or other).
 • Many people know Java and its many libraries.
 • Java is typed.
 • No need to learn a new language.
 • The basic packages are javax.servlet and javax.servlet.http.
 • JSP is used when only a little code is included in HTML documents.
 • For JSP, use javax.servlet.jsp and javax.servlet.jsp.tagext.
 • Together, JSP and servlets constitute the web side of J2EE.
 • Databases are accessed through Java DataBase Connectivity (JDBC).

Method 3: Java Servlets
 • Exercise: answer the following question:
   ◦ What is a keep-alive connection?
 • Exercise: list three enhancements in HTTP 1.1 over HTTP 1.0.




Keep-Alive Connections
Web servers
The http protocol
URLs, URIs, and URNs
MIME
Scripting languages
 ◦ Client side
 ◦ Server side: SSI, CGI, servlets




Summary for this session

More Related Content

PPTX
Introduction to Web Technology
PPTX
Unit 1 introduction to web programming
PPT
Introduction to web technology
PPTX
HTML, CSS and XML
PPT
introduction to web technology
PDF
Internet programming lecture 1
PPT
introduction to Web system
PPTX
Web Technology Fundamentals
Introduction to Web Technology
Unit 1 introduction to web programming
Introduction to web technology
HTML, CSS and XML
introduction to web technology
Internet programming lecture 1
introduction to Web system
Web Technology Fundamentals

What's hot (20)

PDF
CS6501 - Internet programming
PDF
Slides 1 - Internet and Web
PPT
Overview of TCP IP
PPT
Internet
PPT
Html
PDF
Intro to Dynamic Web Pages
DOC
Prog db-and-web-with-html-php-and-my sql
PDF
Introduction to Web Technology
PPTX
0 csc 3311 slide internet programming
PPT
Lecture3 introduction towebpages
PPT
PPT
Internet
PPT
Internet
PPTX
PPTX
Eba ppt rajesh
PPTX
Introduction to Web Architecture
PPTX
Web technology unit I - Part B
PPT
Howthe internet
PPT
Intro. to the internet and web
PPTX
Web technology Unit-I Part E
CS6501 - Internet programming
Slides 1 - Internet and Web
Overview of TCP IP
Internet
Html
Intro to Dynamic Web Pages
Prog db-and-web-with-html-php-and-my sql
Introduction to Web Technology
0 csc 3311 slide internet programming
Lecture3 introduction towebpages
Internet
Internet
Eba ppt rajesh
Introduction to Web Architecture
Web technology unit I - Part B
Howthe internet
Intro. to the internet and web
Web technology Unit-I Part E
Ad

Viewers also liked (20)

PPTX
One IOTA at a time: A Case Study of OpenURL Success Metrics
DOCX
PPTX
What Provosts Want Librarians to Know, Beth Paul, Stetson University
PPTX
Titans (iv 2)
PDF
5kount brochure booklet
PDF
Ndp rekonstruksi
PPTX
Hatzalah ppt deck design
PDF
Nilai-Nilai Dasar Perjuangan versi Konstitusi
PPTX
NY Equity Investments PPT Deck design
PPTX
Getoomi PPT Deck design
PDF
CMS presentation deck design
PPTX
Status26 Pubcon PPT deck design
PPTX
Help Me See PPT Deck design
PPTX
ComScore Suite PPT Deck design
PPTX
Acquiring Small Press Monographs: Trends and Analyses
PPTX
Managing Journals by Committee
PPTX
2013 smrf-nodexl-sna-socialmedia-fr version -130320011951-phpapp01-1 2
PDF
Catalogue formations altasys_conseil2013_
PDF
Diario Paka y Nacho
One IOTA at a time: A Case Study of OpenURL Success Metrics
What Provosts Want Librarians to Know, Beth Paul, Stetson University
Titans (iv 2)
5kount brochure booklet
Ndp rekonstruksi
Hatzalah ppt deck design
Nilai-Nilai Dasar Perjuangan versi Konstitusi
NY Equity Investments PPT Deck design
Getoomi PPT Deck design
CMS presentation deck design
Status26 Pubcon PPT deck design
Help Me See PPT Deck design
ComScore Suite PPT Deck design
Acquiring Small Press Monographs: Trends and Analyses
Managing Journals by Committee
2013 smrf-nodexl-sna-socialmedia-fr version -130320011951-phpapp01-1 2
Catalogue formations altasys_conseil2013_
Diario Paka y Nacho
Ad

Similar to 02 intro (20)

PPTX
Lec 01 Introduction.pptx
PPT
Ch-1_.ppt
PPT
introduction to web application development
PPTX
CN UNIT5.pptxCN unit5CN unit5CN unit5CN unit5CN unit5CN unit5CN unit5CN unit5...
PPTX
Lecture 1- Introduction to Computers and the Internet.pptx
PDF
Application_layer.pdf
PPTX
Introduction_to_computershfffffffffffffffffffffffffffffffffffffffffffffff_and...
PPT
1 web overview
PDF
Natural Language processing and web deigning notes
PDF
WEB I - 01 - Introduction to Web Development
PPT
A detailed presentation on the World Wide Web
PDF
Fundamental Internet Programming.pdf
PPTX
Web Database
PPTX
Web application development ( basics )
PPTX
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
PPTX
Www(alyssa) (2)
PDF
Web Introduction
PPTX
Servlet & jsp
PPTX
Www and http
Lec 01 Introduction.pptx
Ch-1_.ppt
introduction to web application development
CN UNIT5.pptxCN unit5CN unit5CN unit5CN unit5CN unit5CN unit5CN unit5CN unit5...
Lecture 1- Introduction to Computers and the Internet.pptx
Application_layer.pdf
Introduction_to_computershfffffffffffffffffffffffffffffffffffffffffffffff_and...
1 web overview
Natural Language processing and web deigning notes
WEB I - 01 - Introduction to Web Development
A detailed presentation on the World Wide Web
Fundamental Internet Programming.pdf
Web Database
Web application development ( basics )
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
Www(alyssa) (2)
Web Introduction
Servlet & jsp
Www and http

Recently uploaded (20)

PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Modernizing your data center with Dell and AMD
PDF
Encapsulation theory and applications.pdf
PPTX
Big Data Technologies - Introduction.pptx
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
NewMind AI Monthly Chronicles - July 2025
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
“AI and Expert System Decision Support & Business Intelligence Systems”
Encapsulation_ Review paper, used for researhc scholars
Review of recent advances in non-invasive hemoglobin estimation
Per capita expenditure prediction using model stacking based on satellite ima...
Building Integrated photovoltaic BIPV_UPV.pdf
Electronic commerce courselecture one. Pdf
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
NewMind AI Weekly Chronicles - August'25 Week I
Advanced methodologies resolving dimensionality complications for autism neur...
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Spectral efficient network and resource selection model in 5G networks
Modernizing your data center with Dell and AMD
Encapsulation theory and applications.pdf
Big Data Technologies - Introduction.pptx
Unlocking AI with Model Context Protocol (MCP)
Network Security Unit 5.pdf for BCA BBA.
NewMind AI Monthly Chronicles - July 2025
Mobile App Security Testing_ A Comprehensive Guide.pdf

02 intro

  • 1. 1-Introduction 2/2 Internet Engineering Course University of Tehran Abbas Nayebi These slides are based on the presentation files provided by Lennart Herlaar, Utrecht University, the Netherlands (http://www.cs.uu.nl/people/lennart) 1
  • 2. Internet prog. Environment characteristics Client/server architecture Protocols Addressing under IP Names versus numbers IP Packet, TCP WWW: Document Formats, Markup, HTML, Browsers Summary of the last session 2
  • 3. Web servers The http protocol URLs, URIs, and URNs MIME Scripting languages ◦ Client side ◦ Server side: SSI, CGI, servlets Summary for this session 3
  • 4. Specialized program for handling document requests and executing scripts and programs (written in PHP, Perl, Java (JSP), ASP, ...).  Javascript, HTML and CSS are handled by the browser (the client).  Several servers exist: Apache (52%), Microsoft’s IIS (33%), and IIS’s smaller brother PWS.  Number of servers used to double about every half year, but...  Browser communicates request to server via port numbers (usually 80).  Other numbers are possible: http://knor.glob.nl:810/index.html  Apache’s Tomcat supports servlets, Java programs which extend the capabilities of a server. Tomcat works on port 8080 alongside another server.  Internet applications can be programmed without using Web servers: set up your own protocols and use sockets.  Webservers can be clients of a database server: three-tier organization. Web servers 4
  • 5.  Possible requests to a web server: ◦ GET – retrieve document ◦ HEAD – retrieve the head of the document ◦ POST – execute the document, using enclosed data ◦ PUT – replace document with enclosed data ◦ DELETE – delete the document  Most used are GET and POST.  Other information passed along with request ◦ a list of acceptable mime types, ◦ e.g. Accept: text/* and Accept: image/gif ◦ For caching: If-Modified-since: date ◦ parameters for server side scripts  GET, DELETE and HEAD do not have message bodies. The http protocol 5
  • 6. A status line: 200 OK, 404 File not found,.... Retrieval date and various header fields A blank line Hopefully the requested body of text, if any The return of the http protocol 6
  • 7. URI = Uniform Resource Identifier which generalizes URL  A grammar for Uniform Resource Locators (URLs) ◦ url ::= scheme address ◦ scheme ::= http:// | ftp:// | mailto: | file:// |... ◦ address ::= full-domain-name/path-to-document[#anchor] ◦ full-domain-name ::= host.domainname[:portnr] ◦ domainname ::= domain . domainname | domain ◦ domain ::= string-without-whitespace-and-{; : &} ◦ path-to-document ::= /-separated-strings-without-whitespace-and-{; : &}  Spaces in filenames should be coded as %20 (ASCII in hexadecimal).  If no html filename is given for the http protocol, the file index.html, index.php or index.htm may be used (this depends on the server, though).  URN: Uniform Resource Name  Example: ◦ URL http://www.pierobon.org/iis/review1.htm ◦ URN www.pierobon.org/iis/review1.htm#one ◦ URI http://www.pierobon.org/iis/review1.htm.html#one URLs, URIs, and URNs 7
  • 8. How does a browser know what kind of file it is retrieving?  A GIF file is displayed differently from a HTML or PDF file.  For some types, like PDF, a separate helper application must be started.  Sometimes a plug-in or filter is available, e.g. Macromedia Flash or some to-HTML converter.  Would be nice to automatically start the right one.  The same problem occurred with attachments to e-mail.  Here Multipurpose Internet Mail Extensions (MIME) were developed.  A MIME specificiation is type/subtype.  Common examples are text/html, text/plain, image/jpeg, image/gif, audio/x-realaudio and application/x-shockwave-flash.  Experimental MIME types are of the form type/x-subtype  When a MIME type is not present, the extension is usually the deciding factor. MIME 8
  • 9. WWW for dummies: use elaborate tools to construct large websites  Little programming involved, just plug and play.  We don’t want to make fancy/professional websites. We try to capture the concepts and make a dependable knowledge for web programming.  You may use drawing tools like Gimp, Photoshop, CoolEdit and the like for making sounds, pictures, banners, icons and movies.  You may not use web authoring tools like Macromedia Dreamweaver, Microsoft Frontpage, Adobe PageMill,....  What is allowed depends on the assignment: notepad++, PHP Ed, Eclipse, …  Documentation about the tools you use: include it.  Of course, in real life you should use tools wherever applicable, but whatever you do: make sure the result is maintainable. Authoring tools 9
  • 10.  Inthis course, we program the Internet using the Web.  Question: why develop for the Web?  Possible answers: ◦ Browser handles displaying ◦ “Everybody” has a browser ◦ Servers handle large parts of the transmission details ◦ Database servers do much of the rest  Software maintenance/upgrade problems in a large organization.  Security issues: client program keeps a password to connect to the database server.  A lot of work has been done. Why not use it? Why use the Web? 10
  • 11.  Invent your own protocol  Write your own server  Write client programs for as many platforms as you can  Try to get client and server software at the right places  Examples: (s)ftp, MSN, BitTorrent (peer-to-peer), MMORPG (Massively multiplayer online role-playing game)  The main problem is to get the proper software in the proper places.  People have to download your client software and install it on their local (home) computers  Your server software must be installed e.g. at Internet Service Provider computers. Making your own Internet applications 11
  • 12.  Used to make web sites dynamic.  A fully dynamic website could be called a web application.  Scripting languages exist on both the client side and the server side.  Client side scripts are executed by the browser. ◦ These are used mainly to overcome shortcomings in the protocol / presentation (!). ◦ Dynamic HTML ◦ Examples are: JavaScript (and Java Applets to a lesser extent).  Server side scripts are executed by the webserver.  Several means of integration exist.  Examples of languages are: PHP, Perl, Python, C++, ASP. On scripting languages 12
  • 13. Some executables codes transferred to the browser for execution. Now a days, are not appreciated generally Some technologies: ◦ Java Applets (executed on a VM, secured by a sandbox, multi-platform) ◦ Microsoft ActiveX (executed natively, dangerous !, single-OS, single-browser) Both of them can be signed digitally ◦ Normally, looser security rules are applied to the signed applets. Client side executables 13
  • 14. Interpreted languages No (insistence on) declarations Run-time typing Exceptions exist Libraries for accessing various databases Regular expressions Easy reporting/printing Many libraries available General aspects of scripting languages 14
  • 15.  Most server side scripting languages use the one-page-at-a -time philosophy: they process an scripted page.  On request, a webserver can execute a program (CGI). The program generates the whole page.  Or replace inline code by its output (PHP).  The result is usually an HTML file which is returned to the client.  Input to server side scripts is usually a form.  ”Running a web application” usually consists of a long string of script incarnations.  Page based (in form of page) ping ponging of data between client and server.  Form, submit, form, submit, form, submit, etc.  Page based requests generate lots of overhead.  Lack of state generates lots of overhead (and security issues!). Server side scripting 15
  • 16. Communication between client and server (and vice versa) ◦ HTTP, HTMLx ◦ Forms, parameters Communication between server and script (and vice versa) ◦ Inline, CGI, servlets ◦ Direct, standard input/output, environment Communication between script and database (and vice versa) ◦ SQL, resultset Interesting parts of server side scripting 16
  • 17. Snippets of program code are scattered through the HTML. Executed from top to bottom. Results are embedded into the returned document by the web server. Examples ◦ Server Side Includes (SSI) ◦ PHP mainly uses inlining, i.e., code within HTML between <?php and ?>. ◦ Java Server Pages. ◦ Active Server Pages (used with VBScript or Javascript) Often used when only small parts of the document are computed. Thin separation between code and presentation. Code quickly becomes unreadable. Solutions to this problem exist (templates). Method 1: Inlining code into a document 17
  • 18. HTML files may include preprocessing directives. Such files are recognized by their extension (e.g., shtml) or by having the exec flag on. The server parses these HTML files and executes the directives. General form: ◦ <!--#element attribute=value attribute=value ... --> Variables and #if directives can be used as well. Server Side Includes 18
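The directive handling described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular web server implements SSI: the function name `expand_ssi`, the choice to support only the `echo` directive, and the `(none)` placeholder for unknown variables are all assumptions of the sketch.

```python
import re

# Matches directives of the general form <!--#element attribute="value" ... -->
DIRECTIVE = re.compile(r'<!--#(\w+)((?:\s+\w+="[^"]*")*)\s*-->')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def expand_ssi(html_text, variables):
    """Replace each #echo directive by the value of the variable it names."""
    def replace(match):
        element = match.group(1)
        attrs = dict(ATTRIBUTE.findall(match.group(2)))
        if element == "echo" and "var" in attrs:
            # "(none)" is this sketch's placeholder for an unknown variable.
            return variables.get(attrs["var"], "(none)")
        # Any other directive is left untouched in this sketch.
        return match.group(0)
    return DIRECTIVE.sub(replace, html_text)


page = '<p>Today is <!--#echo var="DATE_LOCAL" --></p>'
print(expand_ssi(page, {"DATE_LOCAL": "Monday"}))
```

The server-side idea is the same at any scale: scan the document, find the directives, and substitute their output before the page is sent to the client.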
  • 19. A protocol for web browsers to interface with applications via a webserver. Mainly used for running programs on the server (with certain permissions). Identifiable by a reference to a file in a special directory (.../cgi-bin/xxxx) or by file extension. Parameters are passed to the program. The result is usually an HTML document which is sent to the client. Perl, Python or shell scripts are often used in this fashion. But all languages are possible, even C++. There are slight differences between CGI execution and ordinary execution of programs. CGI is deprecated now. Method 2: Common Gateway Interface 19
  • 20.  Client (browser) sends a request to the server by TCP/IP (socket)  We focus here on the commands GET, HEAD, POST, PUT  GET = give a document  HEAD = give only the header of document (email/news header)  POST = send the contents of a form (upload form data)  PUT = upload of a file  For CGI mostly GET or POST  The request line will be followed by 0 or more headers and an empty line  with POST/PUT followed by a body (contents of form and/or file) HTTP requests 20
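The request layout on this slide (request line, zero or more headers, an empty line, then an optional body) can be made concrete with a small Python sketch. The helper name `build_request`, the host `www.example.org`, and the default content type are assumptions chosen for illustration; the function only builds the bytes and does not open a connection.

```python
def build_request(method, path, host, body=b"",
                  content_type="application/x-www-form-urlencoded"):
    """Return the raw bytes of an HTTP/1.0 request."""
    lines = [f"{method} {path} HTTP/1.0", f"Host: {host}"]
    if body:
        # POST and PUT carry a body, so its type and length are announced.
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body)}")
    # Headers end with an empty line; the body (if any) follows it.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


print(build_request("GET", "/index.html", "www.example.org").decode("ascii"))
print(build_request("POST", "/cgi-bin/xxx", "www.example.org",
                    b"M=November&Y=2004").decode("ascii"))
```

A GET request ends right after the empty line, while the POST request carries the form data after it.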
  • 21. The URL used for this request was: http://sunshine.cs.uu.nl:8000/docs/vakken/inp/dwarf.html?M=November&Y=2004 The body was empty. Example request 21
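To see which parts of such a URL end up where, the following Python sketch splits a URL of the same shape with the standard `urllib.parse` module. The host `www.example.org` and the shortened path are hypothetical stand-ins, not the URL from the slide.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL with the same structure: host, port, path, query.
url = "http://www.example.org:8000/docs/dwarf.html?M=November&Y=2004"

parts = urlsplit(url)
print(parts.hostname)         # the server to connect to
print(parts.port)             # the port number (other than the usual 80)
print(parts.path)             # the document named in the request line
print(parts.query)            # what a CGI script would see as QUERY_STRING
print(parse_qs(parts.query))  # the query decoded into field -> list of values
```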
  • 22. With GET, the contents of the form, or a parameter, are given as part of the URL (URL=cgi-bin/xxx?parameters). Special characters are percent-encoded as hex (e.g. a space becomes %20). With POST, the contents of the form are given as the body of the request. The server puts all relevant information in environment variables. The body (POST) is given to the standard input of the script. Hence, scripts reacting to a POST should read from standard input. The server does not signal end-of-file: the script must read exactly as many bytes as the length field in the headers says. In most cases, a special CGI module makes access to this information transparent. GET versus POST 22
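The difference between the two methods can be sketched from the script's point of view. In this Python sketch the function name `read_form` is an assumption, and the environment and input stream are passed in as parameters so the example can run without a web server; a real CGI script would pass `os.environ` and `sys.stdin.buffer`. `REQUEST_METHOD` is the standard CGI variable naming the method.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return the form fields of a GET or POST request as a dict of lists."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        # No end-of-file is signalled, so read exactly CONTENT_LENGTH bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length).decode("ascii")
    else:
        # With GET the form data arrives in the URL, after the '?'.
        raw = environ.get("QUERY_STRING", "")
    # parse_qs also decodes percent-encoded special characters.
    return parse_qs(raw)


get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "M=November&Y=2004"}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "17"}
print(read_form(get_env, io.BytesIO(b"")))
print(read_form(post_env, io.BytesIO(b"M=November&Y=2004")))
```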
  • 23. Important environment variables: ◦ QUERY_STRING: the query (the part after ? in the URL), usually a series of field=value pairs separated by &. ◦ CONTENT_LENGTH: length of the data (POST). ◦ CONTENT_TYPE: type of the data (MIME). Not all of the normal environment is passed along: the server filters it. Test your program by faking the parameters (environment variables) and running it from the prompt. The CGI program should write a correct document (with header and data) to standard output (say, using printf()). First thing to do: print the Content-type. The server sends this document to the client (browser). CGI Environment 23
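A minimal CGI-style program along these lines might look as follows in Python. The function name `respond` and the field name `M` are assumptions for illustration. The environment is faked with a plain dictionary, as the slide suggests for testing; a real script would read `os.environ` instead.

```python
import html
from urllib.parse import parse_qs


def respond(environ):
    """Build the complete CGI output: header, empty line, then the document."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    month = fields.get("M", ["unknown"])[0]
    # Escape the value before embedding it in HTML.
    body = f"<html><body><p>Month: {html.escape(month)}</p></body></html>\n"
    # First thing to print: the Content-Type header, followed by an empty line.
    return "Content-Type: text/html\n\n" + body


# Fake the parameters and run from the prompt.
print(respond({"QUERY_STRING": "M=November&Y=2004"}), end="")
```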
  • 24. Many languages offer libraries for CGI programming. These libraries take care of tedious tasks, such as accessing the environment variables and decoding QUERY_STRING. Always look for, and use libraries for CGI. Libraries for CGI 24
  • 25. Independent of the programming language. Isolation of processes: a CGI program cannot damage the server. CGI programs can get limited permissions (only for part of the Web site). Performance: for each request a CGI script must be started as a separate process. Starting a new process is very expensive, especially on Windows. No possibility to pass other information to the server (such as log entries; the program should write its logs in its own repository). One-time request/response: stateless (this holds not just for CGI but also for inline code). Ex: consider an electronic shopping cart (e-commerce): where do I maintain the shopping list? CGI advantages/disadvantages 25
  • 26. For a stateless server, the client must keep the state. In a browser, this could be done with Javascript, Java, etc. In forms, “hidden” fields can be filled. When the form is sent, these fields are included. Cumulative information (like a shopping cart) could be stored in this way. Problem: when the user goes back (with the BACK key) or enters the URL directly in the browser, the fields get lost. This is not bad when a user wants to keep multiple workflows going simultaneously. If GET is used, it is also possible to put extra information in the URL if no form is present. Exe: Find a website that gets parameters from the URL. Try to change the parameters and retrieve the information without navigating through the pages, e.g., zzzzz.com?picId=100&newsId=56 Keeping state in the client or URL is prone to security issues. Other (better) solutions exist to the problem of statelessness. Client side state 26
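The hidden-field technique can be sketched as a server-side script that writes the accumulated state back into the next form. In this Python sketch the function name `cart_form`, the field name `item`, and the action path `/cgi-bin/shop` are hypothetical. Since the client can edit hidden fields freely, a real application would have to validate them on every request.

```python
import html


def cart_form(items, action="/cgi-bin/shop"):
    """Return an HTML form that carries the cart contents in hidden fields."""
    hidden = "\n".join(
        f'  <input type="hidden" name="item" value="{html.escape(item)}">'
        for item in items
    )
    return (
        f'<form method="post" action="{html.escape(action)}">\n'
        f"{hidden}\n"
        '  <input type="submit" value="Continue">\n'
        "</form>"
    )


# Each submit sends the whole cart back to the (stateless) server.
print(cart_form(["book", "pen"]))
```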
  • 27. Four steps: article selection, personal data, credit card details, confirmation. Four scripts or one script? Maintaining state. Validating input. Security. Example: order process of an online shop 27
  • 28. (Reading) SQL injection. Cross site scripting. Some other security issues 28
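As a small companion to this reading, the following Python sketch shows the core of both issues using only the standard library. The table, the function names, and the attack strings are made up for illustration; the point is only the contrast between pasting user input into SQL or HTML and passing it as data.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a1"), ("bob", "b2")])


def find_unsafe(name):
    # Vulnerable: user input becomes part of the SQL text itself.
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(name):
    # Placeholder: the value is passed separately from the SQL text.
    return conn.execute("SELECT secret FROM users WHERE name = ?",
                        (name,)).fetchall()


attack = "x' OR '1'='1"
print(len(find_unsafe(attack)))  # the injected condition matches every row
print(len(find_safe(attack)))    # no user has this literal name

# Cross site scripting: escape user input before echoing it into a page.
print(html.escape("<script>alert(1)</script>"))
```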
  • 29. A more recent development. Servlet = applet run on a server. Servlets, in general, define extensions to servers (WWW or other). Many people know Java and its many libraries. Java is typed. No need to learn a new language. Basic packages are javax.servlet and javax.servlet.http. JSP is used when only a little code is included in HTML documents. Use javax.servlet.jsp and javax.servlet.jsp.tagext. Together, JSP and servlets constitute the Web side of J2EE. Database connections use Java DataBase Connectivity (JDBC). Method 3: Java Servlets 29
  • 30. Exe: Answer the following question: ◦ What is a keep-alive connection? Exe: List three enhancements in HTTP 1.1 over HTTP 1.0. Keep-Alive Connections 30
  • 31. Web servers The http protocol URLs, URIs, and URNs MIME Scripting languages ◦ Client side ◦ Server side: SSI, CGI, servlets Summary for this session 31

Editor's Notes

  • #7: Ex5: Connect to a web server using the telnet command: C:\>telnet www.google.com 80 GET / HTTP/1.0 HOST:www.google.com <blank line>
  • #14: The sandbox is a set of rules that are used when creating an applet that prevents certain functions when the applet is sent as part of a Web page
  • #16: CGI=Common Gateway Interface is a standard protocol that defines how webserver software can delegate the generation of webpages to a console application
  • #19: Text in a pre element is displayed in a fixed-width font, and it preserves both spaces and line breaks. Useful for code samples.
  • #27: http://www.tehran.ir/Default.aspx?tabid=536 … 537