2 December 2005 
Web Information Systems 
Web Architectures 
Prof. Beat Signer 
Department of Computer Science 
Vrije Universiteit Brussel 
http://www.beatsigner.com
October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 2 
Web Information Systems 
 A web information system uses web technologies 
for information and service delivery 
 Modern web information systems and web architectures 
have to 
 be extensible to cater for emerging technologies and new forms of
interaction (e.g. multimodal interaction)
 manage heterogeneous information such as documents, structured 
data, multimedia resources, semi-structured information, ... 
 integrate various sources (e.g. DBs) via multi-tier architectures 
 offer a notion of state to reflect the current application context 
 deal with information about users and their environment (context) 
 ...
Basic Client-Server Web Architecture 
 Effect of typing http://www.vub.ac.be in the browser address bar
(1) use the Domain Name System (DNS) to get the IP address for
www.vub.ac.be (answer 134.184.129.2)
(2) create a TCP connection to 134.184.129.2 
(3) send an HTTP request message over the TCP connection 
(4) visualise the received HTTP response message in the browser 
[Figure: a client and a server connected via the Internet; the client sends an HTTP request and the server returns an HTTP response]
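The four steps above can be sketched in plain Java. This is only an illustration under assumptions: the class and method names are hypothetical, the request carries an added Connection: close header so that the server ends the stream after one response, and the response is returned as raw text instead of being rendered.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative and not part of the lecture.
final class BasicHttpExchange {

    // Step (3): the text of a minimal HTTP/1.1 GET request.
    // "Connection: close" is an assumption made to keep the reading loop simple.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Steps (1) to (4), except that the raw response is returned instead of visualised.
    static String fetch(String host, int port, String path) throws IOException {
        InetAddress address = InetAddress.getByName(host);      // (1) DNS lookup
        try (Socket socket = new Socket(address, port)) {       // (2) TCP connection
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII)); // (3)
            out.flush();
            InputStream in = socket.getInputStream();           // (4) read until the server closes
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        // Prints the request text only, so no network access is needed.
        System.out.print(buildRequest("www.vub.ac.be", "/"));
    }
}
```

Calling fetch("www.vub.ac.be", 80, "/") would contact a real server; whether that host still answers on port 80 is not something this sketch can guarantee.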
Web Server 
 Tasks of a web server 
(1) set up connection
(2) receive and process HTTP request
(3) fetch resource
(4) create and send HTTP response
(5) logging
 The most prominent web servers are the Apache HTTP 
Server and Microsoft's Internet Information Services (IIS) 
 A lot of devices have an embedded web server 
 printers, WLAN routers, TVs, ... 
[Figure: Worldwide Web Servers, http://news.netcraft.com]
Example HTTP Request Message 
GET / HTTP/1.1 
Host: www.vub.ac.be 
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 
Accept-Language: en-gb,en;q=0.5 
Accept-Encoding: gzip, deflate 
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 
Connection: keep-alive
Example HTTP Response Message 
HTTP/1.1 200 OK 
Date: Thu, 03 Oct 2013 17:02:19 GMT 
Server: Apache/2.2.14 (Ubuntu) 
X-Powered-By: PHP/5.3.2-1ubuntu4.15 
Content-Language: nl 
Set-Cookie: lang=nl; path=/; domain=.vub.ac.be; expires=Mon, 18-Sep-2073 17:02:16 GMT
Content-Type: text/html; charset=utf-8
Keep-Alive: timeout=15, max=987
Connection: Keep-Alive
Transfer-Encoding: chunked

<!DOCTYPE html>
<html lang="nl" dir="ltr"> 
<head> 
... 
<title>Vrije Universiteit Brussel | Redelijk eigenzinnig</title> 
<meta name="Description" content="Welkom aan de VUB" /> 
... 
</html>
HTTP Protocol 
 Request/response communication model 
 HTTP Request 
 HTTP Response 
 Communication always has to be initiated by the client 
 Stateless protocol 
 HTTP can be used on top of various reliable protocols 
 TCP is by far the most commonly used one 
 runs on TCP port 80 by default 
 Latest version: HTTP/1.1 (at the time of this lecture; HTTP/2 and HTTP/3 have since been standardised)
 HTTPS scheme used for encrypted connections
Uniform Resource Identifier (URI) 
 A Uniform Resource Identifier (URI) uniquely 
identifies a resource 
 There are two types of URIs 
 Uniform Resource Locator (URL) 
- contains information about the exact location of a resource 
- consists of a scheme, a host and the path (resource name) 
- e.g. http://wise.vub.ac.be/beat-signer/
- problem: the URL changes if the resource is moved!
• idea of Persistent Uniform Resource Locators (PURLs) [https://purl.oclc.org]
 Uniform Resource Name (URN) 
- unique and location independent name for a resource 
- consists of a scheme name, a namespace identifier and a namespace-specific 
string (separated by colons) 
- e.g. urn:ISBN:3837027139
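The difference between the two URI types can be made visible with the standard java.net.URI class. The following is a minimal sketch; the class UriParts and its describe method are hypothetical names, and the returned text format is chosen only for this illustration.

```java
import java.net.URI;

// Hypothetical helper; only java.net.URI is a real library class here.
final class UriParts {

    // Splits a URI into the parts named on the slide.
    static String describe(String text) {
        URI uri = URI.create(text);
        if ("urn".equalsIgnoreCase(uri.getScheme())) {
            // URN: scheme name, namespace identifier and namespace-specific string
            String[] parts = uri.getSchemeSpecificPart().split(":", 2);
            String name = parts.length > 1 ? parts[1] : "";
            return "URN namespace=" + parts[0] + " name=" + name;
        }
        // URL: scheme, host and path
        return "URL scheme=" + uri.getScheme()
             + " host=" + uri.getHost()
             + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://wise.vub.ac.be/beat-signer/"));
        System.out.println(describe("urn:ISBN:3837027139"));
    }
}
```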
HTTP Message Format 
 Request and response messages have the same format
HTTP/1.1 200 OK                            <- start line
Date: Thu, 03 Oct 2013 17:02:19 GMT        <- header field(s)
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.15
Transfer-Encoding: chunked
Content-Type: text/html
                                           <- blank line (CRLF)
<html>                                     <- message body (optional)
...
</html>
HTTP_message = start_line , {header} , CRLF , [body];
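The message layout (start line, header fields, blank line, optional body) can be illustrated with a small parser. This is a sketch, not a complete HTTP parser: the HttpMessage class is hypothetical, repeated header fields overwrite each other, and a chunked body is returned undecoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not a complete HTTP parser.
final class HttpMessage {
    final String startLine;
    final Map<String, String> headers;
    final String body;

    private HttpMessage(String startLine, Map<String, String> headers, String body) {
        this.startLine = startLine;
        this.headers = headers;
        this.body = body;
    }

    static HttpMessage parse(String raw) {
        // The first empty line (CRLF CRLF) separates the head from the optional body.
        int headEnd = raw.indexOf("\r\n\r\n");
        String head = headEnd >= 0 ? raw.substring(0, headEnd) : raw;
        String body = headEnd >= 0 ? raw.substring(headEnd + 4) : "";

        String[] lines = head.split("\r\n");
        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {   // line 0 is the start line
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
        return new HttpMessage(lines[0], headers, body);
    }

    public static void main(String[] args) {
        HttpMessage m = parse("HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html\r\n"
                + "\r\n"
                + "<html>...</html>");
        System.out.println(m.startLine);
        System.out.println(m.headers);
        System.out.println(m.body);
    }
}
```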
HTTP Request Message 
 Request-specific start line
start_line = method , " " , resource , " " , version;
method = "GET" | "HEAD" | "POST" | "PUT" | "TRACE" |
"OPTIONS" | "DELETE";
resource = complete_URL | path;
version = "HTTP/" , major_version , "." , minor_version;
 Methods
 GET : get a resource from the server
 HEAD : get the header only (no body)
 POST : send data (in the body) to the server
 PUT : store request body on server
 TRACE : get the "final" request (after it has potentially been modified by proxies)
 OPTIONS : get a list of methods supported by the server
 DELETE : delete a resource on the server
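A request start line can be checked against the grammar on this slide with a few lines of Java. The RequestLine class below is a hypothetical sketch; it only accepts the seven methods listed here and does not validate the resource part.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors the start_line grammar of this slide.
final class RequestLine {
    static final List<String> METHODS =
            Arrays.asList("GET", "HEAD", "POST", "PUT", "TRACE", "OPTIONS", "DELETE");

    final String method;
    final String resource;
    final String version;

    private RequestLine(String method, String resource, String version) {
        this.method = method;
        this.resource = resource;
        this.version = version;
    }

    // start_line = method , " " , resource , " " , version
    static RequestLine parse(String line) {
        String[] parts = line.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected: method resource version");
        }
        if (!METHODS.contains(parts[0])) {
            throw new IllegalArgumentException("unknown method: " + parts[0]);
        }
        // version = "HTTP/" , major_version , "." , minor_version
        if (!parts[2].matches("HTTP/\\d+\\.\\d+")) {
            throw new IllegalArgumentException("bad version: " + parts[2]);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine line = parse("GET /beat-signer HTTP/1.1");
        System.out.println(line.method + " | " + line.resource + " | " + line.version);
    }
}
```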
HTTP Response Message 
 Response-specific start line
start_line = version , " " , status_code , " " , reason;
version = "HTTP/" , major_version , "." , minor_version;
status_code = digit , digit , digit;
reason = string_phrase;
 Status codes
 100-199 : informational
 200-299 : success (e.g. 200 for 'OK')
 300-399 : redirection
 400-499 : client error (e.g. 404 for 'Not Found')
 500-599 : server error (e.g. 503 for 'Service Unavailable')
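Since the first digit of the status code determines its class, a client can classify any response without knowing the individual code. A minimal sketch, with StatusCodes as a hypothetical helper name:

```java
// Hypothetical helper that maps a status code to the classes listed on this slide.
final class StatusCodes {

    // Extracts the three-digit code from a start line such as "HTTP/1.1 404 Not Found".
    static int parseStatusCode(String startLine) {
        String[] parts = startLine.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("not a response start line: " + startLine);
        }
        return Integer.parseInt(parts[1]);
    }

    static String category(int code) {
        if (code < 100 || code > 599) {
            throw new IllegalArgumentException("status code out of range: " + code);
        }
        switch (code / 100) {
            case 1:  return "informational";
            case 2:  return "success";
            case 3:  return "redirection";
            case 4:  return "client error";
            default: return "server error";
        }
    }

    public static void main(String[] args) {
        System.out.println(category(parseStatusCode("HTTP/1.1 200 OK")));
        System.out.println(category(parseStatusCode("HTTP/1.1 503 Service Unavailable")));
    }
}
```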
HTTP Header Fields 
 There exist general headers (for requests and 
responses), request headers, response headers, entity 
headers and extension headers 
 Some important headers 
 Accept 
- request header defining the MIME (Multipurpose Internet Mail Extensions)
types that the client will accept
 User-Agent
- request header specifying the type of client
 Connection: keep-alive (HTTP/1.0) and persistent connections (default in HTTP/1.1)
- general header helping to improve performance, since otherwise a new
TCP connection has to be established for every single webpage element
 Content-Type
- entity header specifying the body's MIME type
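The Accept header from the earlier request example lists MIME types with optional quality values (q). The sketch below shows how a server might rank them; the AcceptHeader class is hypothetical, a missing q is treated as 1.0, and other parameters are ignored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; a simplified reading of the Accept request header.
final class AcceptHeader {

    static final class Preference {
        final String mimeType;
        final double quality;

        Preference(String mimeType, double quality) {
            this.mimeType = mimeType;
            this.quality = quality;
        }
    }

    // Returns the accepted MIME types, most preferred first.
    static List<Preference> parse(String value) {
        List<Preference> result = new ArrayList<>();
        for (String item : value.split(",")) {
            String[] parts = item.trim().split(";");
            String mimeType = parts[0].trim();
            if (mimeType.isEmpty()) {
                continue;
            }
            double q = 1.0; // default quality when no q parameter is given
            for (int i = 1; i < parts.length; i++) {
                String parameter = parts[i].trim();
                if (parameter.startsWith("q=")) {
                    q = Double.parseDouble(parameter.substring(2));
                }
            }
            result.add(new Preference(mimeType, q));
        }
        // List.sort is stable, so equal q values keep the order sent by the client.
        result.sort(Comparator.comparingDouble((Preference p) -> p.quality).reversed());
        return result;
    }

    public static void main(String[] args) {
        for (Preference p : parse("text/html,application/xml;q=0.9,*/*;q=0.8")) {
            System.out.println(p.mimeType + " q=" + p.quality);
        }
    }
}
```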
HTTP Header Fields ... 
 Some important headers ... 
 If-Modified-Since 
- request header that is used in combination with a GET request (conditional
GET); the resource is only returned if it has been modified since the specified
date, otherwise the server answers with 304 Not Modified
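The server-side decision behind a conditional GET can be sketched as follows. ConditionalGet and statusFor are hypothetical names; the sketch assumes the header uses the RFC 1123 date format seen in the Date header of the earlier response example, and it returns only the status code (200 or 304).

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper showing the decision a server makes for a conditional GET.
final class ConditionalGet {

    // Returns 200 if the resource has to be sent, 304 (Not Modified) otherwise.
    static int statusFor(String ifModifiedSince, ZonedDateTime lastModified) {
        if (ifModifiedSince == null) {
            return 200; // ordinary, unconditional GET
        }
        ZonedDateTime since =
                ZonedDateTime.parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME);
        return lastModified.isAfter(since) ? 200 : 304;
    }

    public static void main(String[] args) {
        ZonedDateTime lastModified = ZonedDateTime.parse("2013-10-01T12:00:00Z");
        System.out.println(statusFor("Thu, 03 Oct 2013 17:02:19 GMT", lastModified));
        System.out.println(statusFor("Mon, 30 Sep 2013 08:00:00 GMT", lastModified));
    }
}
```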
MIME Types 
 The MIME type defines the request or response body's 
content and is used for the appropriate processing
 Standard MIME types are registered with the Internet 
Assigned Numbers Authority (IANA) [RFC-2045] 
mime = toplevel_type , "/" , subtype; 
MIME Type     Description
text/plain    Human-readable text without formatting information
text/html     HTML document
image/jpeg    JPEG-encoded image
...           ...
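The mime = toplevel_type / subtype rule can be applied to a Content-Type value as in the following sketch. The MimeType class is hypothetical; parameters such as charset are simply dropped, and both parts are lower-cased because MIME type names are case-insensitive.

```java
import java.util.Locale;

// Hypothetical helper that splits a MIME type into its two parts.
final class MimeType {
    final String topLevelType;
    final String subtype;

    private MimeType(String topLevelType, String subtype) {
        this.topLevelType = topLevelType;
        this.subtype = subtype;
    }

    // mime = toplevel_type , "/" , subtype
    static MimeType parse(String value) {
        String essence = value.split(";", 2)[0].trim(); // drop parameters such as charset
        int slash = essence.indexOf('/');
        if (slash <= 0 || slash == essence.length() - 1) {
            throw new IllegalArgumentException("not a MIME type: " + value);
        }
        return new MimeType(essence.substring(0, slash).toLowerCase(Locale.ROOT),
                            essence.substring(slash + 1).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        MimeType type = parse("text/html; charset=utf-8");
        System.out.println(type.topLevelType + " / " + type.subtype);
    }
}
```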
HTTP Message Information 
 Various tools for HTTP message logging 
 e.g. HttpFox add-on for Firefox browser 
 Simple telnet connection
telnet wise.vub.ac.be 80 (press Enter)
GET /beat-signer HTTP/1.1 (press Enter)
Host: wise.vub.ac.be (press Enter 2 times)
 Until 1999 the W3C worked on the HTTP Next
Generation (HTTP-NG) protocol as a replacement for
HTTP/1.1
 never introduced
 recently some work on HTTP/2.0
Proxies 
 A web proxy is situated between the client and the server 
 acts as a server to the client and as a client to the server 
 can for example be specified in the browser settings; used for 
- firewalls and content filters 
- transcoding (on the fly transformation of HTTP message body) 
- content router (e.g. select optimal server in content distribution networks) 
- anonymous browsing, ... 
[Figure: a proxy placed between the client and the server on the Internet]
Caches 
 A proxy cache is a special type of proxy server 
 can reduce server load if multiple clients share the same cache 
 often multi-level hierarchies of caches (e.g. continent, country 
and regional level) with communication between sibling and 
parent caches as defined by the Internet Cache Protocol (ICP) 
 passive or active (prefetching) caches 
[Figure: Client 1 and Client 2 reach the server via a shared proxy cache on the Internet]
Caches ... 
 Special HTTP cache control header fields 
 Expires 
- expiration date after which the cached resource has to be refetched 
 Cache-Control: max-age 
- maximum age of a document (in seconds) after it has been added to the cache 
 Cache-Control: no-cache 
- response cannot be directly served from the cache (has to be revalidated first) 
 ... 
 Validators 
 Last-modified time as validator 
- cache with resource that has been last modified at time t uses an 
If-Modified-Since t request for updates 
 Entity tags (ETag) 
- changed by the publisher if content has changed; If-None-Match etag request
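How a cache might combine these header fields can be sketched as below. The CachePolicy class and its method names are hypothetical, only the max-age and no-cache directives are handled, and a response without max-age is always revalidated, which is a conservative simplification and not what every real cache does.

```java
import java.util.Locale;

// Hypothetical helper; a simplified view of cache freshness and validation.
final class CachePolicy {
    enum Action { SERVE_FROM_CACHE, REVALIDATE }

    // Returns the max-age value in seconds, or -1 if the directive is absent.
    static long maxAgeSeconds(String cacheControl) {
        for (String directive : cacheControl.split(",")) {
            String d = directive.trim().toLowerCase(Locale.ROOT);
            if (d.startsWith("max-age=")) {
                return Long.parseLong(d.substring("max-age=".length()));
            }
        }
        return -1;
    }

    static boolean hasNoCache(String cacheControl) {
        for (String directive : cacheControl.split(",")) {
            if (directive.trim().equalsIgnoreCase("no-cache")) {
                return true;
            }
        }
        return false;
    }

    // Cache side: may the stored response be used without asking the server?
    static Action decide(String cacheControl, long ageSeconds) {
        if (hasNoCache(cacheControl)) {
            return Action.REVALIDATE;
        }
        long maxAge = maxAgeSeconds(cacheControl);
        return (maxAge >= 0 && ageSeconds < maxAge) ? Action.SERVE_FROM_CACHE
                                                    : Action.REVALIDATE;
    }

    // Server side of an If-None-Match request: 304 if the entity tag is unchanged.
    static int revalidate(String ifNoneMatch, String currentEtag) {
        return currentEtag.equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        System.out.println(decide("max-age=3600", 120));
        System.out.println(decide("no-cache", 120));
        System.out.println(revalidate("\"v1\"", "\"v2\""));
    }
}
```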
Caches ... 
 Advantages 
 reduces latency and network bandwidth usage
 reduces server load (client and reverse proxy caches) 
 transparent to client and server 
 Disadvantages 
 additional resources (hardware) required 
 might get stale data out of the cache 
 creates additional network traffic if we use an active caching 
approach (prefetching) but achieve a low cache hit rate 
 server loses control (e.g. access statistics) since no longer all 
requests have to be sent to the server
Tunnels 
 Implement one protocol on top of another protocol 
 e.g. HTTP as a carrier for SSL connections 
 Often used to "open" a firewall to protocols that would 
otherwise be blocked 
 e.g. tunneling of SSL connections through an open HTTP port 
[Figure: an SSL client and an SSL server communicate across the Internet; the SSL traffic is wrapped in HTTP (HTTP[SSL]) along the way]
Gateways 
 A gateway can act as a kind of "glue" between 
applications (client) and resources (server) 
 translate between two protocols (e.g. from HTTP to FTP) 
 security accelerator (e.g. HTTPS/HTTP on the server side) 
 often the gateway and destination server are combined in a single 
application server (HTTP to server application translator) 
[Figure: an HTTP client talks HTTP to an HTTP/FTP gateway, which talks FTP to the FTP server]
Session Management 
 HTTP is a stateless protocol 
 Session (state) tracking solutions 
 use of IP address 
- problem: IP address is often not uniquely assigned to a single user 
 browser login 
- use of special HTTP authentication headers (WWW-Authenticate, Authorization)
- after a login the browser sends the user information in each request 
 URL rewriting 
- add information to the URL in each request 
 hidden form fields 
- similar to URL rewriting but information can also be in body (POST request) 
 cookies 
- the server stores a piece of information on the client which is then sent back to 
the server with each request
Cookies 
 Introduced by Netscape in June 1994 
 A cookie is a piece of information that is 
assigned to a client on their first visit 
 list of <key,value> pairs 
 often just a unique identifier 
 sent via the Set-Cookie HTTP response header (Set-Cookie2 is obsolete)
 Browser stores the information in a "cookie database" and 
sends it back every time the same server is accessed 
 Potential privacy issues 
 third-party websites might use persistent cookies for user tracking 
 Cookies can be disabled in the browser settings
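The round trip of a cookie (stored from Set-Cookie, sent back in a Cookie request header) can be sketched with a tiny "cookie database". The CookieJar class is hypothetical and heavily simplified: it keeps only the name and value and ignores attributes such as path, domain and expires, so it does not check which server a cookie belongs to. The session cookie in the example is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, heavily simplified "cookie database" of a browser.
final class CookieJar {
    private final Map<String, String> cookies = new LinkedHashMap<>();

    // Stores the name=value pair of a Set-Cookie header value.
    // Attributes (path, domain, expires, ...) are ignored in this sketch.
    void store(String setCookieValue) {
        String pair = setCookieValue.split(";", 2)[0];
        int eq = pair.indexOf('=');
        if (eq > 0) {
            cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
    }

    // Builds the value of the Cookie request header sent back to the server.
    String cookieHeader() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CookieJar jar = new CookieJar();
        jar.store("lang=nl; path=/; domain=.vub.ac.be; expires=Mon, 18-Sep-2073 17:02:16 GMT");
        jar.store("session=abc123"); // made-up second cookie
        System.out.println("Cookie: " + jar.cookieHeader());
    }
}
```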
Hypertext Markup Language (HTML) 
 Dominant markup language for webpages 
 If you have never heard of HTML, have a look at
 http://www.w3schools.com/html/
 More details in the exercise and in the next lecture
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Beat Signer: Interactive Paper, PaperWorks, Paper++, ...</title>
  </head>
  <body>
    Beat Signer is Associate Professor of Computer Science at the VUB ...
  </body>
</html>
Dynamic Web Content 
 Often it is not enough to serve static web pages but 
content should be changed on the client or server side 
 Server-side processing 
 Common Gateway Interface (CGI) 
 Java Servlets 
 JavaServer Pages (JSP) 
 PHP: Hypertext Preprocessor (PHP) 
 ... 
 Client-side processing 
 JavaScript 
 Java Applets 
 Adobe Flash 
 ...
Common Gateway Interface (CGI) 
 CGI was the first server-side processing solution 
 transparent to the user 
 certain requests (e.g. /account.pl) are forwarded via CGI to a 
program by creating a new process 
 program processes the request and creates an answer with 
optional HTTP response headers 
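[Figure: the client sends an HTTP request to the server; the server either serves HTML pages directly or forwards the request via CGI to a program written in Perl, Tcl, C, C++, Java, ...; the result is returned as the HTTP response]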
Common Gateway Interface (CGI) ... 
 CGI Problems 
 a new process has to be started for each request 
 if the CGI program for example acts as a gateway to a database, 
a new DB connection has to be established for each request 
which results in a very poor performance 
 FastCGI solves some of the problems by introducing 
persistent processes and process pools 
 CGI/FastCGI is increasingly being replaced by other
technologies (e.g. Java Servlets)
Java Servlets 
 A Java servlet is a Java class that has to extend the 
abstract HttpServlet class
 The Java servlet class is loaded by a servlet container 
and relevant requests (based on a servlet binding) are 
forwarded to the servlet instance for further processing 
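[Figure: the client sends an HTTP request to the server, which serves HTML pages directly or passes the request to a servlet container hosting the servlets; the result is returned as the HTTP response]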
Java Servlets ... 
 Main HttpServlet methods
doGet(HttpServletRequest req, HttpServletResponse resp)
doPost(HttpServletRequest req, HttpServletResponse resp)
init(ServletConfig config)
destroy()
 Servlet life cycle
 a servlet is initialised once via the init() method
 the doGet(), doPost() methods may be executed multiple
times (by different HTTP requests)
 finally the servlet container may unload a servlet (upcall of the
destroy() method before that happens)
 Servlet container (e.g. Apache Tomcat) either integrated
with web server or as standalone component
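The life cycle (one init, many requests, one destroy) can be mimicked without a servlet container. Note that the following is a toy stand-in and NOT the javax.servlet API: the MiniServlet interface, the HelloServlet class and the run method are all hypothetical and only show the order of the calls.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the life cycle only; this is NOT the javax.servlet API.
final class LifecycleDemo {

    // Hypothetical, much reduced counterpart of a servlet.
    interface MiniServlet {
        void init();
        String doGet(String path);
        void destroy();
    }

    static final class HelloServlet implements MiniServlet {
        final List<String> log = new ArrayList<>();

        public void init() {
            log.add("init");
        }

        public String doGet(String path) {
            log.add("doGet " + path);
            return "<html><body>Hello " + path + "</body></html>";
        }

        public void destroy() {
            log.add("destroy");
        }
    }

    // What a container does in essence: init once, serve many requests, destroy once.
    static List<String> run(HelloServlet servlet, String... paths) {
        servlet.init();
        for (String path : paths) {
            servlet.doGet(path);
        }
        servlet.destroy();
        return servlet.log;
    }

    public static void main(String[] args) {
        System.out.println(run(new HelloServlet(), "/a", "/b"));
    }
}
```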
Java Servlet Example 
 In the exercise you will learn how to process parameters etc. 
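package org.vub.wise;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Date;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HelloWorldServlet extends HttpServlet {
    // Called by the servlet container for each HTTP GET request
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
        res.setContentType("text/html"); // set the MIME type before writing the body
        PrintWriter out = res.getWriter();
        out.println("<html>");
        out.println("<head><title>Hello World</title></head>");
        out.println("<body>The time is " + new Date().toString() + "</body>");
        out.println("</html>");
        out.close();
    }
}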
JavaServer Pages (JSP) 
 A "drawback" of Java servlets is that the whole page 
(e.g. HTML) has to be defined within the servlet 
 not easy to share tasks between web designer and programmer 
 Add program code through scriptlets and markup to 
existing HTML pages 
 These JSP documents are then translated into Java servlets, either
on the fly when a page is first requested (e.g. by Apache Tomcat) or
precompiled before deployment
 The JSP approach is similar to PHP or Active Server 
Pages (ASP) 
 Note that Java servlets are increasingly becoming an
enabling technology (as with JSP)
JavaScript 
 Interpreted scripting language for client-side processing 
 JavaScript functionality often embedded in HTML 
documents but can also be provided in separate files 
 JavaScript often used to 
 validate data (e.g. in a form) 
 dynamically add content to a webpage 
 process events (onLoad, onFocus, etc.) 
 change parts of the original HTML document 
 create cookies 
 ... 
 Note: Java and JavaScript are completely different 
languages!
JavaScript Example 
 Please have a look at the following JavaScript tutorial to 
learn some of the basic constructs (operators, control 
statements, etc.) 
 http://www.w3schools.com/JS/
 In the exercise session you will use JavaScript to 
implement a web application 
<html> 
<body> 
<script type="text/javascript"> 
document.write("<h1>Hello World!</h1>"); 
</script> 
</body> 
</html>
Java Applets 
 A Java applet is a program delivered to the client side in 
the form of Java bytecode 
 executed in the browser using a Java Virtual Machine (JVM) 
 an applet has to extend the Applet or JApplet class 
 runs in the sandbox 
 Advantages 
 the user automatically always has the most recent version 
 high security for untrusted applets 
 full Java API available 
 Disadvantages 
 requires a browser Java plug-in
Java Applets ... 
 Disadvantages ... 
 only signed applets can get more advanced functionality 
- e.g. network connections to other machines than the source machine 
 More recently Java Web Start (JavaWS) is replacing 
Java Applets 
 program no longer runs within the browser 
- less problematic security restrictions 
- fewer browser compatibility issues
 Java Chess Applet Example 
 http://english.op.org/~peter/ChessApp/
Exercise 2 
 Hands-on experience with the HTTP protocol
References 
 David Gourley et al., HTTP: The Definitive
Guide, O'Reilly Media, September 2002
 R. Fielding et al., RFC2616 - Hypertext Transfer
Protocol - HTTP/1.1
 http://www.faqs.org/rfcs/rfc2616.html
 N. Freed et al., RFC2045 - Multipurpose Internet Mail
Extensions (MIME)
 http://www.faqs.org/rfcs/rfc2045.html
 HTML and JavaScript Tutorials
 http://www.w3schools.com
References ... 
 Mick Knutson, HTTP: The Hypertext Transfer
Protocol (refcardz #172)
 http://refcardz.dzone.com/refcardz/http-hypertext-transfer-0
 W. Jason Gilmore, PHP 5.4 (refcardz #23)
 http://refcardz.dzone.com/refcardz/php-54-scalable
 Java Servlet Tutorial
 http://www.tutorialspoint.com/servlets/
Next Lecture 
HTML5 and the Open Web Platform

Web Architectures - Lecture 02 - Web Information Systems (4011474FNR)

  • 1. 2 December 2005 Web Information Systems Web Architectures Prof. Beat Signer Department of Computer Science Vrije Universiteit Brussel http://www.beatsigner.com
  • 2. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 2 Web Information Systems  A web information system uses web technologies for information and service delivery  Modern web information systems and web architectures have to  be extensible to cater for emerging technolgies and new forms of interaction (e.g. multimodal interaction)  manage heterogeneous information such as documents, structured data, multimedia resources, semi-structured information, ...  integrate various sources (e.g. DBs) via multi-tier architectures  offer a notion of state to reflect the current application context  deal with information about users and their environment (context)  ...
  • 3. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 3 Basic Client-Server Web Architecture  Effect of typing http://www.vub.ac.be in the broswer bar (1) use a Domain Name Service (DNS) to get the IP address for www.vub.ac.be (answer 134.184.129.2) (2) create a TCP connection to 134.184.129.2 (3) send an HTTP request message over the TCP connection (4) visualise the received HTTP response message in the browser Internet Client Server HTTP Request HTTP Response
  • 4. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 4 Web Server  Tasks of a web server (1) setup connection (2) receive and process HTTP request (3) fetch resource (4) create and send HTTP response (5) logging  The most prominent web servers are the Apache HTTP Server and Microsoft's Internet Information Services (IIS)  A lot of devices have an embedded web server  printers, WLAN routers, TVs, ... Worldwide Web Servers, http://news.netcraft.com
  • 5. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 5 Example HTTP Request Message GET / HTTP/1.1 Host: www.vub.ac.be User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-gb,en;q=0.5 Accept-Encoding: gzip, deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Connection: keep-alive
  • 6. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 6 Example HTTP Response Message HTTP/1.1 200 OK Date: Thu, 03 Oct 2013 17:02:19 GMT Server: Apache/2.2.14 (Ubuntu) X-Powered-By: PHP/5.3.2-1ubuntu4.15 Content-Language: nl Set-Cookie: lang=nl; path=/; domain=.vub.ac.be; expires=Mon, 18-Sep-2073 17:02:16 GMT Content-Type: text/html; charset=utf-8 Keep-Alive: timeout=15, max=987 Connection: Keep-Alive Transfer-Encoding: chunked <!DOCTYPE html> <html lang="nl" dir="ltr"> <head> ... <title>Vrije Universiteit Brussel | Redelijk eigenzinnig</title> <meta name="Description" content="Welkom aan de VUB" /> ... </html>
  • 7. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 7 HTTP Protocol  Request/response communication model  HTTP Request  HTTP Response  Communication always has to be initiated by the client  Stateless protocol  HTTP can be used on top of various reliable protocols  TCP is by far the most commonly used one  runs on TCP port 80 by default  Latest version: HTTP/1.1  HTTPS scheme used for encrypted connections
  • 8. October 3, 2014 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 8 Uniform Resource Identifier (URI)  A Uniform Resource Identifier (URI) uniquely identifies a resource  There are two types of URIs  Uniform Resource Locator (URL) - contains information about the exact location of a resource - consists of a scheme, a host and the path (resource name) - e.g. http://wise.vub.ac.be/beat-signer/ - problem: the URL changes if resource is moved! • idea of Persistent Uniform Resource Locators (PURLs) [https://purl.oclc.org]  Uniform Resource Name (URN) - unique and location independent name for a resource - consists of a scheme name, a namespace identifier and a namespace-specific string (separated by colons) - e.g. urn:ISBN:3837027139
  • 9. HTTP Message Format
       Request and response messages have the same format
          start line:              HTTP/1.1 200 OK
          header field(s):         Date: Thu, 03 Oct 2013 17:02:19 GMT
                                   Server: Apache/2.2.14 (Ubuntu)
                                   X-Powered-By: PHP/5.3.2-1ubuntu4.15
                                   Transfer-Encoding: chunked
                                   Content-Type: text/html
          blank line (CRLF)
          message body (optional): <html> ... </html>

          HTTP_message = start_line , {header} , "CRLF" , [body];
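The message layout above (start line, header fields, blank line, optional body) can be sketched as a small parser. This is an illustrative sketch only: the function name parseHttpMessage is hypothetical, header names are lower-cased for lookup, and chunked transfer encoding is not decoded.

```javascript
// Split a raw HTTP message into start line, header fields and body.
function parseHttpMessage(raw) {
  // The first blank line (CRLF CRLF) ends the header section.
  const separator = raw.indexOf("\r\n\r\n");
  const head = separator < 0 ? raw : raw.slice(0, separator);
  const body = separator < 0 ? "" : raw.slice(separator + 4);

  const lines = head.split("\r\n");
  const startLine = lines[0];
  const headers = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    if (colon < 0) continue; // this sketch silently skips malformed lines
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body };
}
```

Since requests and responses share this format, the same function works for both; only the interpretation of the start line differs.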
  • 10. HTTP Request Message
       Request-specific start line
          start_line = method , " " , resource , " " , version;
          method     = "GET" | "HEAD" | "POST" | "PUT" | "TRACE" | "OPTIONS" | "DELETE";
          resource   = complete_URL | path;
          version    = "HTTP/" , major_version , "." , minor_version;
       Methods
         GET : get a resource from the server
         HEAD : get the header only (no body)
         POST : send data (in the body) to the server
         PUT : store request body on server
         TRACE : get the "final" request (after it has potentially been modified by proxies)
         OPTIONS : get a list of methods supported by the server
         DELETE : delete a resource on the server
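The request start line grammar translates almost directly into a regular expression. The sketch below is illustrative: parseRequestLine is a hypothetical name, and only the seven methods listed on the slide are accepted.

```javascript
// The methods listed on the slide.
const METHODS = ["GET", "HEAD", "POST", "PUT", "TRACE", "OPTIONS", "DELETE"];

// Parse "method SP resource SP version"; returns null if the line does not match.
function parseRequestLine(line) {
  const match = /^([A-Z]+) (\S+) HTTP\/(\d+)\.(\d+)$/.exec(line);
  if (!match || !METHODS.includes(match[1])) {
    return null;
  }
  return {
    method: match[1],
    resource: match[2], // complete URL or path
    major: Number(match[3]),
    minor: Number(match[4]),
  };
}
```

For the telnet-style request line GET /beat-signer HTTP/1.1 this yields the method GET, the resource /beat-signer and version 1.1; an unknown method or a missing version yields null.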
  • 11. HTTP Response Message
       Response-specific start line
          start_line  = version , " " , status_code , " " , reason;
          version     = "HTTP/" , major_version , "." , minor_version;
          status_code = digit , digit , digit;
          reason      = string_phrase;
       Status codes
         100-199 : informational
         200-299 : success (e.g. 200 for 'OK')
         300-399 : redirection
         400-499 : client error (e.g. 404 for 'Not Found')
         500-599 : server error (e.g. 503 for 'Service Unavailable')
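The status code ranges can be expressed as a small lookup. This is a sketch with hypothetical helper names (statusClass, parseStatusLine); it assumes the parts of the status line are separated by single spaces.

```javascript
// Map a three-digit status code to the class given on the slide.
function statusClass(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) {
    return "unknown";
  }
  const classes = ["informational", "success", "redirection", "client error", "server error"];
  return classes[Math.floor(code / 100) - 1];
}

// Parse "version SP status_code SP reason"; returns null if the line does not match.
function parseStatusLine(line) {
  const match = /^HTTP\/(\d+)\.(\d+) (\d{3}) (.*)$/.exec(line);
  if (!match) return null;
  const code = Number(match[3]);
  return {
    version: match[1] + "." + match[2],
    code,
    reason: match[4],
    category: statusClass(code),
  };
}
```

A client typically branches on the class first (for example, follow a redirection or report a server error) and only then on the individual code.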
  • 12. HTTP Header Fields
       There exist general headers (for requests and responses), request headers, response headers, entity headers and extension headers
       Some important headers
         Accept
          - request header defining the Multipurpose Internet Mail Extensions (MIME) types that the client will accept
         User-Agent
          - request header specifying the type of client
         Connection: Keep-Alive (HTTP/1.0) and persistent connections (default in HTTP/1.1)
          - general header helping to improve the performance since otherwise a new TCP connection has to be established for every single webpage element
         Content-Type
          - entity header specifying the body's MIME type
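The Accept header in the example request carries q parameters (quality values) that rank the MIME types the client will accept. The following sketch shows one way a server could read that ranking; parseAccept is a hypothetical name, and a missing q parameter is treated as q=1.

```javascript
// Parse an Accept-style header value into entries ordered by preference.
function parseAccept(value) {
  return value
    .split(",")
    .map((item, index) => {
      const [type, ...params] = item.split(";").map((part) => part.trim());
      let q = 1; // default quality when no q parameter is given
      for (const param of params) {
        const [name, val] = param.split("=");
        const parsed = parseFloat(val);
        if (name.trim() === "q" && !Number.isNaN(parsed)) q = parsed;
      }
      return { type, q, index };
    })
    // Highest quality first; ties keep the order used in the header.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map(({ type, q }) => ({ type, q }));
}
```

The same format is used by Accept-Language and Accept-Charset, so the function can be reused for those headers as well.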
  • 13. HTTP Header Fields ...
       Some important headers ...
         If-Modified-Since
          - request header that is used in combination with a GET request (conditional GET); the resource is only returned if it has been modified since the specified date, otherwise the server answers with 304 Not Modified and no body
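The conditional GET decision on the server side can be sketched as follows. The function name conditionalGet and the shape of its result are assumptions made for illustration; the date is parsed with Date.parse, which accepts the HTTP date format shown in the example messages.

```javascript
// Decide how a server could answer a GET that carries If-Modified-Since.
// lastModified: Date of the resource's last change
// ifModifiedSince: header value sent by the client (may be undefined)
function conditionalGet(lastModified, ifModifiedSince) {
  const since = Date.parse(ifModifiedSince); // NaN if missing or unparsable
  // HTTP dates have a resolution of one second, so drop the milliseconds.
  const modified = Math.floor(lastModified.getTime() / 1000) * 1000;
  if (!Number.isNaN(since) && modified <= since) {
    return { status: 304, reason: "Not Modified", sendBody: false };
  }
  return { status: 200, reason: "OK", sendBody: true };
}
```

If the header is absent or cannot be parsed, the sketch falls back to a normal 200 response with the full body.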
  • 14. MIME Types
       The MIME type defines the request or response body's content and is used for the appropriate processing
       Standard MIME types are registered with the Internet Assigned Numbers Authority (IANA) [RFC-2045]
          mime = toplevel_type , "/" , subtype;

          MIME Type     Description
          text/plain    Human-readable text without formatting information
          text/html     HTML document
          image/jpeg    JPEG-encoded image
          ...           ...
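A Content-Type value such as text/html; charset=utf-8 combines the MIME type from the grammar above with optional parameters. The sketch below splits such a value; parseMimeType is a hypothetical helper name and quoted parameter values are not handled.

```javascript
// Split a MIME type (optionally followed by parameters) into its parts.
function parseMimeType(value) {
  const [essence, ...params] = value.split(";").map((part) => part.trim());
  const slash = essence.indexOf("/");
  if (slash <= 0 || slash === essence.length - 1) {
    return null; // both toplevel_type and subtype are required
  }
  const parameters = {};
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq > 0) {
      parameters[param.slice(0, eq).toLowerCase()] = param.slice(eq + 1);
    }
  }
  return {
    type: essence.slice(0, slash).toLowerCase(),
    subtype: essence.slice(slash + 1).toLowerCase(),
    parameters,
  };
}
```

A receiver could then select its processing based on the type and subtype, for example rendering text/html as a page and decoding image/jpeg as an image.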
  • 15. HTTP Message Information
       Various tools for HTTP message logging
         e.g. HttpFox add-on for Firefox browser
       Simple telnet connection
          telnet wise.vub.ac.be 80    (press Enter)
          GET /beat-signer HTTP/1.1   (press Enter)
          Host: wise.vub.ac.be        (press Enter 2 times)
       Until 1999 the W3C worked on the HTTP Next Generation (HTTP-NG) protocol as a replacement for HTTP/1.1
         never introduced
         recently some work on HTTP/2.0
  • 16. Proxies
       A web proxy is situated between the client and the server
         acts as a server to the client and as a client to the server
         can for example be specified in the browser settings; used for
          - firewalls and content filters
          - transcoding (on the fly transformation of HTTP message body)
          - content router (e.g. select optimal server in content distribution networks)
          - anonymous browsing, ...
      [Figure: client, proxy, Internet and server]
  • 17. Caches
       A proxy cache is a special type of proxy server
         can reduce server load if multiple clients share the same cache
         often multi-level hierarchies of caches (e.g. continent, country and regional level) with communication between sibling and parent caches as defined by the Internet Cache Protocol (ICP)
         passive or active (prefetching) caches
      [Figure: Client 1 and Client 2 access the server over the Internet via a shared proxy cache]
  • 18. Caches ...
       Special HTTP cache control header fields
         Expires
          - expiration date after which the cached resource has to be refetched
         Cache-Control: max-age
          - maximum age of a document (in seconds) after it has been added to the cache
         Cache-Control: no-cache
          - response cannot be directly served from the cache (has to be revalidated first)
         ...
       Validators
         Last-modified time as validator
          - cache with resource that has been last modified at time t uses an If-Modified-Since t request for updates
         Entity tags (ETag)
          - changed by the publisher if content has changed; If-None-Match etag request
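The interplay of the cache control headers and the validators can be sketched as a decision function for a cache. The names parseCacheControl and cacheDecision and the layout of the cache entry are assumptions for illustration, and the age calculation is simplified to the time since the entry was stored.

```javascript
// Parse a Cache-Control value such as "max-age=60" or "no-cache" into an object.
function parseCacheControl(value) {
  const directives = {};
  for (const part of (value || "").split(",")) {
    const [name, val] = part.trim().split("=");
    if (name) directives[name.toLowerCase()] = val === undefined ? true : val;
  }
  return directives;
}

// entry: { storedAt: <ms timestamp>, headers: { lower-cased response headers } }
// now:   current time as ms timestamp
function cacheDecision(entry, now) {
  const cc = parseCacheControl(entry.headers["cache-control"]);
  let fresh = false;
  if (cc["no-cache"]) {
    fresh = false; // always revalidate before serving
  } else if (cc["max-age"] !== undefined) {
    fresh = (now - entry.storedAt) / 1000 < Number(cc["max-age"]);
  } else if (entry.headers["expires"]) {
    fresh = now < Date.parse(entry.headers["expires"]);
  }
  if (fresh) return { action: "serve-from-cache" };

  // Stale or no-cache: build a conditional request from the validators.
  const headers = {};
  if (entry.headers["etag"]) headers["If-None-Match"] = entry.headers["etag"];
  if (entry.headers["last-modified"]) headers["If-Modified-Since"] = entry.headers["last-modified"];
  return { action: "revalidate", headers };
}
```

In this sketch max-age takes precedence over Expires when both are present, and an entry without any freshness information is always revalidated.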
  • 19. Caches ...
       Advantages
         reduces latency and used network bandwidth
         reduces server load (client and reverse proxy caches)
         transparent to client and server
       Disadvantages
         additional resources (hardware) required
         might get stale data out of the cache
         creates additional network traffic if we use an active caching approach (prefetching) but achieve a low cache hit rate
         server loses control (e.g. access statistics) since no longer all requests have to be sent to the server
  • 20. Tunnels
       Implement one protocol on top of another protocol
         e.g. HTTP as a carrier for SSL connections
       Often used to "open" a firewall to protocols that would otherwise be blocked
         e.g. tunneling of SSL connections through an open HTTP port
      [Figure: SSL client and SSL server connected over the Internet; the SSL traffic is carried inside HTTP (HTTP[SSL])]
  • 21. Gateways
       A gateway can act as a kind of "glue" between applications (client) and resources (server)
         translate between two protocols (e.g. from HTTP to FTP)
         security accelerator (e.g. HTTPS/HTTP on the server side)
         often the gateway and destination server are combined in a single application server (HTTP to server application translator)
      [Figure: HTTP client, HTTP/FTP gateway and FTP server; HTTP between client and gateway, FTP between gateway and server]
  • 22. Session Management
       HTTP is a stateless protocol
       Session (state) tracking solutions
         use of IP address
          - problem: IP address is often not uniquely assigned to a single user
         browser login
          - use of special HTTP authenticate headers
          - after a login the browser sends the user information in each request
         URL rewriting
          - add information to the URL in each request
         hidden form fields
          - similar to URL rewriting but information can also be in body (POST request)
         cookies
          - the server stores a piece of information on the client which is then sent back to the server with each request
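URL rewriting can be illustrated with a pair of helper functions. This is a sketch under stated assumptions: the query parameter name sid and the function names rewriteUrl and sessionFromUrl are hypothetical, and the standard URL class is used to edit the query string.

```javascript
// Name of the query parameter carrying the session identifier (hypothetical).
const SESSION_PARAM = "sid";

// Server side: add the session identifier to a link before it is sent to the client.
function rewriteUrl(link, sessionId) {
  const url = new URL(link);
  url.searchParams.set(SESSION_PARAM, sessionId);
  return url.toString();
}

// Server side: recover the session identifier from the URL of an incoming request.
function sessionFromUrl(link) {
  return new URL(link).searchParams.get(SESSION_PARAM); // null if absent
}
```

Every link in a generated page has to be rewritten this way, which is one reason why cookies are usually the more convenient mechanism.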
  • 23. Cookies
       Introduced by Netscape in June 1994
       A cookie is a piece of information that is assigned to a client on their first visit
         list of <key,value> pairs
         often just a unique identifier
         sent via Set-Cookie or Set-Cookie2 HTTP response headers
       Browser stores the information in a "cookie database" and sends it back every time the same server is accessed
       Potential privacy issues
         third-party websites might use persistent cookies for user tracking
       Cookies can be disabled in the browser settings
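The cookie round trip (server sets, browser stores and sends back) can be sketched with two small functions. The names parseSetCookie and cookieHeader are hypothetical, and the sketch ignores domain and path matching as well as expiry handling.

```javascript
// Parse the value of a Set-Cookie response header into name, value and attributes.
function parseSetCookie(value) {
  const [pair, ...attrs] = value.split(";").map((part) => part.trim());
  const eq = pair.indexOf("=");
  if (eq <= 0) throw new Error("missing cookie name: " + value);
  const cookie = { name: pair.slice(0, eq), value: pair.slice(eq + 1), attributes: {} };
  for (const attr of attrs) {
    const i = attr.indexOf("=");
    const key = (i < 0 ? attr : attr.slice(0, i)).toLowerCase();
    cookie.attributes[key] = i < 0 ? true : attr.slice(i + 1);
  }
  return cookie;
}

// Build the Cookie request header value that the browser sends back
// from the cookies it has stored for a server.
function cookieHeader(jar) {
  return jar.map((cookie) => cookie.name + "=" + cookie.value).join("; ");
}
```

For the Set-Cookie header of the example response, the stored cookie has the name lang and the value nl, and subsequent requests to the server would carry Cookie: lang=nl.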
  • 24. Hypertext Markup Language (HTML)
       Dominant markup language for webpages
       If you have never heard of HTML, have a look at
         http://www.w3schools.com/html/
       More details in the exercise and in the next lecture
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <title>Beat Signer: Interactive Paper, PaperWorks, Paper++, ...</title>
      </head>
      <body>
        Beat Signer is Associate Professor of Computer Science at the VUB ...
      </body>
      </html>
  • 25. Dynamic Web Content
       Often it is not enough to serve static web pages but content should be changed on the client or server side
       Server-side processing
         Common Gateway Interface (CGI)
         Java Servlets
         JavaServer Pages (JSP)
         PHP: Hypertext Preprocessor (PHP)
         ...
       Client-side processing
         JavaScript
         Java Applets
         Adobe Flash
         ...
  • 26. Common Gateway Interface (CGI)
       CGI was the first server-side processing solution
         transparent to the user
         certain requests (e.g. /account.pl) are forwarded via CGI to a program by creating a new process
         program processes the request and creates an answer with optional HTTP response headers
      [Figure: client and server exchange HTTP request and response over the Internet; the server serves HTML pages or forwards requests via CGI to a program in Perl, Tcl, C, C++, Java, ...]
  • 27. Common Gateway Interface (CGI) ...
       CGI Problems
         a new process has to be started for each request
         if the CGI program for example acts as a gateway to a database, a new DB connection has to be established for each request, which results in very poor performance
       FastCGI solves some of the problems by introducing persistent processes and process pools
       CGI/FastCGI is increasingly being replaced by other technologies (e.g. Java Servlets)
  • 28. Java Servlets
       A Java servlet is a Java class that has to extend the abstract HttpServlet class
       The Java servlet class is loaded by a servlet container and relevant requests (based on a servlet binding) are forwarded to the servlet instance for further processing
      [Figure: client and server exchange HTTP request and response over the Internet; the server serves HTML pages or forwards requests to servlets running in a servlet container]
  • 29. Java Servlets ...
       Main HttpServlet methods
          doGet(HttpServletRequest req, HttpServletResponse resp)
          doPost(HttpServletRequest req, HttpServletResponse resp)
          init(ServletConfig config)
          destroy()
       Servlet life cycle
         a servlet is initialised once via the init() method
         the doGet(), doPost() methods may be executed multiple times (by different HTTP requests)
         finally the servlet container may unload a servlet (upcall of the destroy() method before that happens)
       Servlet container (e.g. Apache Tomcat) either integrated with web server or as standalone component
  • 30. Java Servlet Example
       In the exercise you will learn how to process parameters etc.
      package org.vub.wise;

      import java.io.*;
      import java.util.Date;
      import javax.servlet.http.*;
      import javax.servlet.*;

      public class HelloWorldServlet extends HttpServlet {
        // Called by the servlet container for every HTTP GET request
        @Override
        public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
          res.setContentType("text/html"); // MIME type of the response body
          PrintWriter out = res.getWriter();
          out.println("<html>");
          out.println("<head><title>Hello World</title></head>");
          out.println("<body>The time is " + new Date().toString() + "</body>");
          out.println("</html>");
          out.close();
        }
      }
  • 31. JavaServer Pages (JSP)
       A "drawback" of Java servlets is that the whole page (e.g. HTML) has to be defined within the servlet
         not easy to share tasks between web designer and programmer
       Add program code through scriptlets and markup to existing HTML pages
       These JSP documents are then either interpreted on the fly (Apache Tomcat) or compiled into Java servlets
       The JSP approach is similar to PHP or Active Server Pages (ASP)
       Note that Java servlets increasingly serve as an enabling technology for other solutions (as with JSP)
  • 32. JavaScript
       Interpreted scripting language for client-side processing
       JavaScript functionality often embedded in HTML documents but can also be provided in separate files
       JavaScript often used to
         validate data (e.g. in a form)
         dynamically add content to a webpage
         process events (onLoad, onFocus, etc.)
         change parts of the original HTML document
         create cookies
         ...
       Note: Java and JavaScript are completely different languages!
  • 33. JavaScript Example
       Please have a look at the following JavaScript tutorial to learn some of the basic constructs (operators, control statements, etc.)
         http://www.w3schools.com/JS/
       In the exercise session you will use JavaScript to implement a web application
      <html>
      <body>
        <script type="text/javascript">
          document.write("<h1>Hello World!</h1>");
        </script>
      </body>
      </html>
  • 34. Java Applets
       A Java applet is a program delivered to the client side in the form of Java bytecode
         executed in the browser using a Java Virtual Machine (JVM)
         an applet has to extend the Applet or JApplet class
         runs in the sandbox
       Advantages
         the user automatically always has the most recent version
         high security for untrusted applets
         full Java API available
       Disadvantages
         requires a browser Java plug-in
  • 35. Java Applets ...
       Disadvantages ...
         only signed applets can get more advanced functionality
          - e.g. network connections to other machines than the source machine
       More recently Java Web Start (JavaWS) is replacing Java Applets
         program no longer runs within the browser
          - less problematic security restrictions
          - fewer browser compatibility issues
       Java Chess Applet Example
         http://english.op.org/~peter/ChessApp/
  • 36. Exercise 2
       Hands-on experience with the HTTP protocol
  • 37. References
       David Gourley et al., HTTP: The Definitive Guide, O'Reilly Media, September 2002
       R. Fielding et al., RFC2616 - Hypertext Transfer Protocol - HTTP/1.1
         http://www.faqs.org/rfcs/rfc2616.html
       N. Freed et al., RFC2045 - Multipurpose Internet Mail Extensions (MIME)
         http://www.faqs.org/rfcs/rfc2045.html
       HTML and JavaScript Tutorials
         http://www.w3schools.com
  • 38. References ...
       Mick Knutson, HTTP: The Hypertext Transfer Protocol (refcardz #172)
         http://refcardz.dzone.com/refcardz/http-hypertext-transfer-0
       W. Jason Gilmore, PHP 5.4 (refcardz #23)
         http://refcardz.dzone.com/refcardz/php-54-scalable
       Java Servlet Tutorial
         http://www.tutorialspoint.com/servlets/
  • 39. Next Lecture
      HTML5 and the Open Web Platform