World Wide Web
  • Hypertext documents: text and links
  • The Web: billions of documents, authored by millions of diverse people, edited by no one in particular
  • Distributed over millions of computers, connected by a variety of media
History of Hypertext
  • Citation, hyperlinking
  • Ramayana, Mahabharata, Talmud: branching, non-linear discourse, nested commentary
  • Dictionary, encyclopedia: self-contained networks of textual nodes joined by referential links
Hypertext systems
  • Memex [Vannevar Bush]: stands for "memory extension"; a photoelectrical-mechanical storage and computing device
  • Aim of Memex: to create and help follow hyperlinks across documents
  • Hypertext: term coined by Ted Nelson
  • Xanadu hypertext: system with robust two-way hyperlinks, version management, controversy management, annotation and copyright management
World-wide Web
  • Initiated at CERN (the European Organization for Nuclear Research) by Tim Berners-Lee
  • GUIs: Berners-Lee (1990); Erwise and Viola (1992); Midas (1993); Mosaic (1993), a hypertext GUI for the X Window System
  • HTML: markup language for rendering hypertext
  • HTTP: Hypertext Transfer Protocol, for sending HTML and other data over the Internet
  • CERN HTTPD: server of hypertext documents
The early days of the Web: CERN HTTP traffic grows by a factor of 1000 between 1991 and 1994 (image courtesy W3C)
The early days of the Web: The number of servers grows from a few hundred to a million between 1991 and 1997 (image courtesy Nielsen)
1994: the landmark year
  • Foundation of the "Mosaic Communications Corporation"
  • First World-wide Web conference
  • MIT and CERN agreed to set up the World-wide Web Consortium (W3C)
Web: a populist, participatory medium
  • Number of writers ≈ number of readers
  • The evolution of memes: ideas, theories, etc. that spread from person to person by imitation
  • Now they have constructed the Internet!
  • E.g.: "free speech online", chain letters, and email viruses
Abundance and authority crisis
  • Liberal and informal culture of content generation and dissemination
  • Very little uniform civil code
  • Redundancy and non-standard form and content
  • Millions of qualifying pages for most broad queries (example: java or kayaking)
  • No authoritative information about the reliability of a site
Problems due to uniform accessibility
  • Little support for adapting to the background of specific users
  • Commercial interests routinely influence the operation of Web search: "search engine optimization"
Hypertext data
  • Semi-structured or unstructured
  • No schema
  • Large number of attributes
Crawling and indexing
  • Purpose: quick fetching of a large number of Web pages into a local repository
  • Indexing based on keywords
  • Ordering responses to maximize the chance that the first few responses satisfy the user's information need
  • Among the earliest search engines: Lycos (Jan 1994)
  • Followed by AltaVista (1995), HotBot and Inktomi, Excite
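The keyword indexing and response ordering described on this slide can be sketched with a toy inverted index. This is only an illustration, not how any of the named search engines worked: the page URLs, the `tokenize`, `build_index` and `search` helpers, and the term-count scoring are all assumptions made for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each keyword to {url: number of occurrences on that page}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term][url] = index[term].get(url, 0) + 1
    return index

def search(index, query):
    """Return URLs containing every query term, best match first.

    Scoring is a plain sum of term counts (an assumption for this sketch);
    ties are broken alphabetically by URL.
    """
    terms = tokenize(query)
    if not terms:
        return []
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    scores = {url: sum(index[t][url] for t in terms) for url in candidates}
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical local repository of already-fetched pages.
PAGES = {
    "a.example/java": "Java programming language tutorial java",
    "b.example/island": "Java island travel guide",
    "c.example/kayak": "kayaking travel guide",
}
INDEX = build_index(PAGES)
print(search(INDEX, "java"))
```

Even in this toy repository the broad query "java" matches pages on two unrelated topics, which is the ordering problem the slide refers to.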
Topic directories
  • Yahoo! directory to locate useful Web sites
  • Efforts for organizing knowledge into ontologies
  • Centralized: Yahoo!
  • Decentralized: About.com and the Open Directory
Clustering and classification
  • Clustering: discover groups in the set of documents such that documents within a group are more similar to each other than to documents in other groups
  • Subjective disagreements due to different similarity measures
  • Large feature sets
  • Classification: for assisting human efforts in maintaining taxonomies
  • E.g.: IBM's Lotus Notes text processing system and Universal Database text extenders
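To make the point about similarity measures concrete, here is a minimal sketch in which two common measures, cosine similarity on term counts and Jaccard similarity on term sets, group the same documents differently. The three documents, the 0.6 threshold and the greedy single-pass `cluster` helper are assumptions chosen for illustration, not a method taken from the slides.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between the term-count vectors of two texts."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard similarity between the term sets of two texts."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def cluster(docs, similarity, threshold):
    """Greedy single pass: join the first group whose first member is
    similar enough, otherwise start a new group."""
    groups = []
    for doc in docs:
        for group in groups:
            if similarity(doc, group[0]) >= threshold:
                group.append(doc)
                break
        else:
            groups.append([doc])
    return groups

# Hypothetical documents, already reduced to keywords.
D1 = "java code compiler"
D2 = "java code class"
D3 = "kayaking river paddle"
DOCS = [D1, D2, D3]

print(cluster(DOCS, cosine, 0.6))   # D1 and D2 share a group
print(cluster(DOCS, jaccard, 0.6))  # every document is its own group
```

With the same threshold, cosine similarity (2/3 for D1 and D2) merges the two Java documents while Jaccard similarity (1/2) keeps them apart.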
Hyperlink analysis
  • Take advantage of the structure of the Web graph
  • Indicators of prestige of a page (e.g. citations)
  • HITS and PageRank
  • Bibliometry: bibliographic citation graph of academic papers
  • Topic distillation
  • Adapting to idioms of Web authorship and linking styles
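As a rough illustration of how link structure can indicate prestige, the following is a simplified power-iteration sketch in the style of PageRank. The four-page graph, the damping factor of 0.85, the fixed iteration count and the even redistribution of rank from pages without outlinks are assumptions for this example, not details given in the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical Web graph: a, b and d all cite c; c cites a.
GRAPH = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
RANKS = pagerank(GRAPH)
for page in sorted(RANKS, key=RANKS.get, reverse=True):
    print(page, round(RANKS[page], 3))
```

Page c, which is cited by every other page, ends up with the highest score, and page a inherits prestige from being the only page c links to.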
Resource discovery and vertical portals
  • Federations of crawling and search services, each specializing in specific topical areas
  • Goal-driven Web resource discovery
  • Language analysis does not scale to billions of documents; countered by throwing more hardware at the problem
Structured vs. Web data mining
  • Traditional data mining: data is structured and relational, with well-defined tables, columns, rows, keys, and constraints
  • Web data: readily available, rich in features and patterns
  • Spontaneous formation and evolution of topic-induced graph clusters and hyperlink-induced communities
  • Goal of book: discovering patterns which are spontaneously driven by semantics
63demo dfa
