1
PRESENTATION
ON
GOOGLE
PRESENTED BY
Sehrish
Akram
2
3
Google, the leading search engine worldwide.
Founded in 1998 by Stanford University graduate
students Larry Page and Sergey Brin.
4
5
WHAT IS A QUERY?
6
SEARCHING TECHNIQUES
The Google search engine uses these techniques:
It is a full-text search engine.
When we do a Google search, we are actually
searching Google’s index of the web, not the web itself.
The index is built by software programs called
“spiders”.
7
SEARCHING TECHNIQUES
Spiders start by fetching a few web pages; then
they follow the links on those pages and fetch the
pages they point to.
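The fetch-and-follow loop described above can be sketched as a breadth-first traversal. This is only a minimal illustration, not Google's crawler: the toy link graph `TOY_WEB`, the function name `crawl` and the `max_pages` limit are all assumptions made for the example, and no real network requests are made.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/"],
    "c.example/": [],
}

def crawl(seeds, fetch_links, max_pages=100):
    """Breadth-first crawl: fetch the seed pages, then the pages they link to."""
    seen = set(seeds)
    queue = deque(seeds)
    fetched = []
    while queue and len(fetched) < max_pages:
        url = queue.popleft()
        fetched.append(url)              # "fetch" the page
        for link in fetch_links(url):    # follow each link found on it
            if link not in seen:         # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return fetched

order = crawl(["a.example/"], lambda url: TOY_WEB.get(url, []))
print(order)
```

Starting from one seed page, the sketch reaches every page that is linked, directly or indirectly, from that seed.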
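CASE FOLDING technique: upper-case and lower-case
letters are treated as the same.
Normalization technique: variant forms are mapped
to one term, e.g. U.S.A. becomes USA.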
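A rough sketch of case folding and normalization applied to a single term. The function name `normalize` is hypothetical, and dropping every period is a deliberate simplification that a real system would not apply blindly (it would damage terms such as decimal numbers).

```python
def normalize(term):
    """Case-fold a term and drop the periods inside abbreviations."""
    return term.casefold().replace(".", "")

print(normalize("U.S.A."))
print(normalize("Seven") == normalize("SEVEN"))
```

Applying the same function to both the indexed text and the query is what makes the two forms meet in the index.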
8
SEARCHING TECHNIQUES
Google search is not case sensitive: whether the
user searches for seven, SEVEN, Seven or even 7,
the results are the same.
Singular is different from plural: searches for
apple and apples turn up different pages.
The order of words matters: Google considers
the first word most important, the second word
next, and so on.
Google ignores most little words, including “I”,
“an”, “how”, “the” and “of”.
9
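Ignoring little words can be illustrated with a small filter. The stop-word set below contains only the examples given on the slide, and the name `strip_stop_words` is hypothetical; Google's actual list and handling are not described in the source.

```python
# Only the example words from the slide; not Google's real list.
STOP_WORDS = {"i", "an", "how", "the", "of"}

def strip_stop_words(query):
    """Lower-case the query and drop the words found in STOP_WORDS."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

print(strip_stop_words("How the history of Google"))
```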
SEARCHING TECHNIQUES
A Google search query is limited to 32 words.
Wildcard searching generally places the symbol
"*" after a word stem.
It tells the database to look for variations of that
word.
For example, investigat* might pull sites
with words such as investigation, investigator,
and investigative.
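Trailing-wildcard matching can be tried out with Python's standard `fnmatch` module. The vocabulary list and the helper name `expand_wildcard` are made up for this sketch, and the pattern uses the stem `investigat*` so that all three example words share the prefix.

```python
from fnmatch import fnmatchcase

def expand_wildcard(pattern, vocabulary):
    """Return the vocabulary words matching a pattern such as 'investigat*'."""
    pattern = pattern.lower()
    return sorted(w for w in vocabulary if fnmatchcase(w.lower(), pattern))

vocab = ["investigation", "investigator", "investigative", "invest", "Google"]
print(expand_wildcard("investigat*", vocab))
```

Note that a pattern written as the full word followed by "*" would only match words beginning with that full word, which is why the stem matters.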
10
INFORMATION RETRIEVAL AND THE WEB
What We Do
Google wanted to organize the web into
something searchable. Its early prototype was
based upon a few basic principles, including:
The best pages tend to be the ones that people
link to the most.
The best description of a page is often derived
from the anchor text associated with the links to
that page.
11
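Both principles can be illustrated on a handful of links. The link triples in `LINKS` and the function `summarize_links` are hypothetical; the sketch only counts incoming links and collects anchor text per target page, which is far simpler than any real ranking.

```python
from collections import Counter, defaultdict

# Hypothetical links: (source page, target page, anchor text).
LINKS = [
    ("blog.example", "maps.example", "online maps"),
    ("news.example", "maps.example", "route planner"),
    ("blog.example", "news.example", "daily news"),
]

def summarize_links(links):
    """Count incoming links per page and collect the anchor text describing it."""
    inlinks = Counter()
    anchors = defaultdict(list)
    for _source, target, text in links:
        inlinks[target] += 1          # principle 1: more links, better page
        anchors[target].append(text)  # principle 2: anchor text describes the target
    return inlinks, dict(anchors)

inlinks, anchors = summarize_links(LINKS)
print(inlinks.most_common(1))
print(anchors["maps.example"])
```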
Anchor text
12
DOCUMENT ACQUISITION AND STORAGE:
Google searches more than 3 billion Web documents,
which include Web pages, images and Usenet
postings.
Google uses a standalone Web crawler, distributed
across several machines, to create indexes and copies
of the documents.
Besides standard .html files, Google also indexes
other file types, including:
________
_________
__________
__________
13
DOCUMENT ACQUISITION AND STORAGE:
A copy of each crawled page is stored in
Google’s repository.
Indexes are created from the stored words, each
pointing to an entry in an inverted index file.
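An inverted index maps each word to the documents that contain it. The sketch below builds one from a tiny in-memory repository; the `REPOSITORY` contents and the function name `build_inverted_index` are assumptions for illustration, and a real index would also store positions and be kept on disk.

```python
from collections import defaultdict

# Hypothetical repository: document id -> stored page text.
REPOSITORY = {
    1: "Google indexes the web",
    2: "Spiders crawl the web",
    3: "Google stores a copy of each page",
}

def build_inverted_index(repository):
    """Map every word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in repository.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

index = build_inverted_index(REPOSITORY)
print(index["google"])
print(index["web"])
```

Answering a one-word query then becomes a dictionary lookup instead of a scan over every stored page.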
14
QUERY INTRODUCTION AND USER OPTIONS:
Since its foundation, Google has been steadily
introducing new features.
Google uses Boolean search without support for
nested expressions and with some variations.
By default, it automatically uses the AND operator
between terms; the minus symbol can be used to
perform a NOT function, and the OR operation is
supported (using OR in upper case).
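The operator rules above (implicit AND, a leading minus for NOT, upper-case OR, no nesting) can be mimicked with a small evaluator. This is a sketch under simplifying assumptions: the `DOCS` collection and the `search` function are hypothetical, and matching is done on whole lower-cased words only.

```python
# Hypothetical document collection: id -> text.
DOCS = {
    1: "google search engine",
    2: "yahoo search directory",
    3: "google maps",
}

def search(query, docs):
    """Evaluate a flat query: implicit AND, upper-case OR, leading '-' for NOT."""
    words = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    required = []      # list of OR-groups; every group must match (AND)
    excluded = set()   # terms that must not appear (NOT)
    join_next = False
    for token in query.split():
        if token == "OR":
            join_next = True                     # join the neighbouring terms
        elif token.startswith("-") and len(token) > 1:
            excluded.add(token[1:].lower())
        elif join_next and required:
            required[-1].add(token.lower())      # extend the previous OR-group
            join_next = False
        else:
            required.append({token.lower()})     # a new AND-ed term
            join_next = False
    return sorted(
        doc_id for doc_id, doc_words in words.items()
        if all(group & doc_words for group in required)
        and not (excluded & doc_words)
    )

print(search("google search", DOCS))
print(search("google OR yahoo", DOCS))
print(search("search -yahoo", DOCS))
```

In this sketch a lower-case "or" is treated as an ordinary search term, mirroring the requirement that OR be written in upper case.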
15
QUERY INTRODUCTION AND USER OPTIONS:
Google does not use stemming or truncation,
but allows the use of ‘*’ as a wildcard in the
middle of a phrase. For example, searching for
“Search Engine” yields quite different results
from “Search * Engine”.
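To see why the two phrases return different results, the sketch below treats a lone '*' inside a phrase as a placeholder for exactly one whole word. That one-word reading is an assumption made for the example, as are the helper names `phrase_to_regex` and `matches`; the source does not say how many words the wildcard may stand for.

```python
import re

def phrase_to_regex(phrase):
    """Turn a phrase into a regex; a lone '*' stands for one whole word."""
    parts = [r"\w+" if tok == "*" else re.escape(tok) for tok in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def matches(phrase, text):
    """True if the phrase (with optional '*' placeholders) occurs in the text."""
    return phrase_to_regex(phrase).search(text) is not None

print(matches("Search Engine", "a search engine"))
print(matches("Search * Engine", "search marketing engine"))
print(matches("Search * Engine", "a search engine"))
```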
16
RESULTS SELECTION AND PRESENTATION
To select which documents are presented, Google
combines a document’s PageRank value, anchor
text and the proximity of the query terms.
Results are clustered by server, with two visible
results and a link to “More results from server”.
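Clustering by server can be sketched by grouping an already ranked result list by host name. The URLs, the function name `cluster_by_server` and the output layout are illustrative assumptions; only the limit of two visible results per server comes from the slide.

```python
from urllib.parse import urlsplit

def cluster_by_server(ranked_urls, visible=2):
    """Group ranked URLs by host, keeping `visible` results shown per host."""
    clusters = {}
    for url in ranked_urls:
        host = urlsplit(url).netloc
        clusters.setdefault(host, []).append(url)
    return [
        {
            "server": host,
            "shown": urls[:visible],
            "more": max(0, len(urls) - visible),  # hidden behind "More results"
        }
        for host, urls in clusters.items()
    ]

ranked = [
    "http://a.example/1",
    "http://b.example/x",
    "http://a.example/2",
    "http://a.example/3",
]
for cluster in cluster_by_server(ranked):
    print(cluster)
```

Each server appears at the position of its best-ranked result, since Python dictionaries keep insertion order.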
17
RESULTS SELECTION AND PRESENTATION
Google helps users by correcting misspelled words
in their search queries using not a predetermined
dictionary, but its own index of the entire web.
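One well-known way to correct a word from index statistics rather than a fixed dictionary is to generate every string one edit away and pick the candidate that occurs most often in the index. The sketch below follows that general idea; it is not Google's actual method, and the term counts in `INDEX_COUNTS` are invented for the example.

```python
from collections import Counter

# Hypothetical term counts taken from an index.
INDEX_COUNTS = Counter({"google": 50, "goggle": 2, "search": 40, "engine": 30})

LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one delete, swap, replace or insert away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)

def correct(word, counts):
    """Return the most frequent indexed word within one edit, else the word itself."""
    word = word.lower()
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    return max(candidates, key=counts.__getitem__) if candidates else word

print(correct("gogle", INDEX_COUNTS))
print(correct("serch", INDEX_COUNTS))
```

Because the candidates come from the index itself, names and new words that appear on the web can be suggested even if no dictionary lists them.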
Google’s visual interface is one of the simplest and,
according to many, one of the reasons for
Google’s success: “it’s simple and it works”.
18
LOGICAL DIAGRAM
Web Crawling, Extraction, and Indexing
19

Information Retrieval Techniques of Google
