Deep-Hidden-Invisible Web
Prepared by Prof. K. Prabhakar, assisted by P. Subha

Background of the Invisible Web
"Invisible" is used here in the context of the World Wide Web. The content is not invisible in the sense that it cannot be seen; rather, it does not appear in the results of the most commonly used search engines, such as Google, MSN, or Yahoo. That is why many prefer the term "deep web" to "invisible web".

The size of the Invisible Web
The size is difficult to measure, and no reliable estimate is possible, because content is created dynamically and search engines index web pages at a fast pace. However, both its size and its content are too large to ignore.

Will the situation change?
Google has said that it is dedicated to indexing the world's content, however long it takes. In addition, more previously invisible pages are being indexed because of manually added links to them from visible pages.

Why "Invisibility"?
Invisible does not mean inaccessible. It means that the content is not indexed by a search engine and is therefore invisible to a person searching the net. Finding no results in a search engine does not mean that the content is unavailable.

Reasons for Invisibility
Dynamic URLs: Pages whose URLs carry long strings of parameters, equal signs, and question marks are typically generated on demand from a database, so search engines may skip them to avoid indexing the same database content repeatedly.
Form-controlled entry: Pages are displayed only after a human takes some action, such as filling in a form.
Hidden pages: No sequence of hyperlink clicks leads to such a page. The pages are accessible, but only people who already know they exist know how to view them.
New pages: Pages that are new have not yet been indexed by any engine.

Reasons for Invisibility (continued)
Flash presentations: Text content in Flash presentations is not indexed.
Geo-tagged content: Computers from certain regions may be blocked, and the block may extend to search engines. For example, many American TV broadcasters show programs online, but these are not available for searching.
There may be other reasons for invisibility as well.

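The "dynamic URL" reason above can be illustrated with a small heuristic. This is only a sketch: the function name looks_dynamic and the threshold of two parameters are assumptions made for illustration, not rules that any particular search engine is known to apply.

```python
from urllib.parse import urlsplit, parse_qsl


def looks_dynamic(url: str, max_params: int = 2) -> bool:
    """Return True if the URL has more than max_params query parameters.

    The threshold is a hypothetical cut-off chosen for this example.
    """
    query = urlsplit(url).query
    # keep_blank_values=True so that "?a=&b=" still counts as two parameters
    params = parse_qsl(query, keep_blank_values=True)
    return len(params) > max_params


static_url = "http://example.com/about.html"
dynamic_url = "http://example.com/item?id=7&cat=3&session=abc&sort=asc"

print(looks_dynamic(static_url))
print(looks_dynamic(dynamic_url))
```

A page like the second one is the kind a cautious crawler might skip, which is why the later advice about static HTML pages matters.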
Some Ways to Make Invisible Content Visible (for the site owner)
There are ways to make deep web content visible. We first look at what a site owner can do and then at what a searcher can do.
Link it to a visible or indexed page: If you have content available, put it on a static HTML page with relevant formatting and the necessary hyperlinks, then link to this static page from an already "visible" (indexed) page.
Convert formats: For Flash and similar files, transcribe the content into words and add it as text. Audio content such as a podcast may likewise be transcribed and published as supplementary text.
Build links: Link to your own pages from other related pages. If you write about, say, trees on page A and then write about trees again on page B, link from page B to page A to give A more relevance. If page A has not been indexed, a crawler can find it through that link once B is indexed. The points that follow (topic pyramids and social bookmarking) are alternative ways to build links, and so also help make content visible.
Build a topic pyramid: This is a specialized form of sitemap that spans many pages. The apex (top-most) page has general topics and links to the next layer of pages, which have more specific topics and link to the layer below them. The bottom-most layer of the pyramid consists of your original web pages or blog posts, which have the most specific content. This serial linking builds page relevance, which encourages spiders to visit and index the pages.
Socially bookmark it: If you find something you like, say a book at Project Gutenberg, bookmark the URL at a social bookmarking site such as Del.icio.us, with a brief description.
Remove access restrictions: Get rid of the need to log in, and do not apply time limits.

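The link-building advice above (linking page B to page A, and the topic pyramid) comes down to making every page reachable by following links from an indexed page. The following is a minimal sketch of that idea. The page names and the dictionary pyramid_links are hypothetical, and the breadth-first walk is a simplified stand-in for a crawler, not a description of how any real spider works.

```python
from collections import deque

# Hypothetical topic pyramid: each page maps to the pages it links to.
pyramid_links = {
    "topics": ["trees", "flowers"],                 # apex page, general topics
    "trees": ["post-pruning", "post-planting"],     # middle layer
    "flowers": ["post-roses"],
    "post-pruning": [],                             # bottom layer: original posts
    "post-planting": [],
    "post-roses": [],
    "post-orphan": [],                              # no page links here yet
}


def reachable(links, start):
    """Pages a link-following crawler could discover starting from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def hidden_pages(links, start):
    """Pages that exist but cannot be reached by clicking links from `start`."""
    return set(links) - reachable(links, start)


print(hidden_pages(pyramid_links, "topics"))

# Adding one link from a visible page makes the orphan discoverable.
fixed_links = dict(pyramid_links, flowers=["post-roses", "post-orphan"])
print(hidden_pages(fixed_links, "topics"))
```

The first call reports the orphaned post as hidden; after a single link is added from the "flowers" page, nothing remains hidden.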
What the user can do
Use a site's search engine: Sometimes the site's own search engine provides better information.
Use site archive navigation: On weblogs in particular, you can use the archive links to find information, albeit through manual searching.

What the user can do (continued)
Add the word "database" to a regular search engine query to find information that is otherwise difficult to locate. For example, if you are looking for a database of images, type the search string "images database" into Google or one of the other engines. Somewhere down the results list in Google, you will find Full-Text Database Images from the USPTO (US Patent and Trademark Office). You can then use the Quick or Advanced search forms to find patents relating to one or more terms. If there are images to be seen, there will be links to them.

What the user can do (continued)
Use an "invisible web" directory, portal, or specialized search engine such as Google Book Search, Google Scholar, Librarian's Internet Index, or BrightPlanet's Complete Planet (70,000 searchable databases and specialty search engines).

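The "database" keyword tip above can be scripted when the same kind of search is run for many topics. This is a small sketch: the helper name database_query_url is made up for this example, and it only builds the query URL rather than fetching or parsing any results.

```python
from urllib.parse import urlencode


def database_query_url(topic: str, base: str = "https://www.google.com/search") -> str:
    """Build a search URL for '<topic> database' (hypothetical helper)."""
    # urlencode escapes spaces and special characters in the query string
    return f"{base}?{urlencode({'q': f'{topic} database'})}"


print(database_query_url("images"))
# prints: https://www.google.com/search?q=images+database
```

The base parameter can point at any other engine that accepts a q query parameter.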
Invisible web search tools
Deep Web Search Engine: Clusty.
Art: Musée du Louvre.
Books Online: The Online Books Page.
Business: Explorit Now!.
Consumer: US Consumer Product Safety Commission Recalled Products.
Economic and Job Data: FreeLunch.com, a searchable directory of free economic data.
Finance and Investing: Bankrate.com.

Invisible web search tools (continued)
General Research: GPO's Catalog of US Government Publications.
Government Data: Copyright Records (LOCIS).
International: International Data Base (IDB).
Law and Politics: THOMAS (Library of Congress).
Library of Congress: Library of Congress.
Medical and Health: PubMed.
Science: ScienceResearch.com.
Transportation: FAA Flight Delay Information.

Further research tools
About WebSearch: Christmas 2006 web search guide.
About WebSearch: The deep web (find out more about the deep web; deep web search).
ALA: American Library Association.
BrightPlanet: FAQ.

Further Research
Deep Web Research: a gigantic list of resources.
Deep Web Technologies.
Ellipsis: Metadata, Google, and the Invisible Web.
Envisional.

Further Research (continued)
Google Librarian Center.
Google Library Project.
Lifehacker: How to search the invisible web.
MediaBistro: Some resources for freelancers.

Further Research (continued)
MetaQuerier: Exploring and integrating the deep web.
QProber: Classifying and searching hidden-web text databases.
The Invisible Web Weblog.
University of California, Berkeley: Invisible or deep web.

One of the most important sites
Please go through http://oedb.org/library. This is an online education database that provides information on various areas relating to careers and education.

