Ms. T. Primya
Assistant Professor
Department of Computer Science and Engineering
Dr. N. G. P. Institute of Technology
Coimbatore
Information:
• Facts provided or learned about something or someone.
• What is conveyed or represented by a particular arrangement or sequence of things.
• Informing, telling; thing told, knowledge, items of knowledge, news.
• Knowledge communicated or received concerning a particular fact or circumstance.
Knowledge:
• Knowing; familiarity gained by experience.
• A person’s range of information.
• A theoretical or practical understanding of the sum of what is known.
Information  retrieval (introduction)
• Data
◦ The raw material of information
• Information
◦ Data organized and presented in a particular manner
• Knowledge
◦ “Justified true belief”
◦ Information that can be acted upon
• Wisdom
◦ Distilled and integrated knowledge
◦ Demonstrative of high-level “understanding”
• Data
◦ 98.6 °F, 99.5 °F, 100.3 °F, 101 °F, …
• Information
◦ Hourly body temperature: 98.6 °F, 99.5 °F, 100.3 °F, 101 °F, …
• Knowledge
◦ If you have a temperature above 100 °F, you most likely have a fever
• Wisdom
◦ If you don’t feel well, go see a doctor
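The step from data to information to knowledge in the temperature example can be sketched in a few lines of Python. This is only an illustration: the `Reading` class, the `hour` field, and the function name `likely_fever` are assumptions introduced here, and the 100 °F threshold is the rule of thumb from the example, not medical guidance.

```python
from dataclasses import dataclass

FEVER_THRESHOLD_F = 100.0  # rule-of-thumb threshold taken from the example

# Data: raw numbers with no context.
readings = [98.6, 99.5, 100.3, 101.0]


@dataclass
class Reading:
    hour: int      # hours since the first measurement (assumed labelling)
    temp_f: float  # body temperature in degrees Fahrenheit


# Information: the same numbers organized as hourly body temperatures.
hourly = [Reading(hour=h, temp_f=t) for h, t in enumerate(readings)]


# Knowledge: a rule that lets us act on the information.
def likely_fever(reading: Reading) -> bool:
    return reading.temp_f > FEVER_THRESHOLD_F


fever_hours = [r.hour for r in hourly if likely_fever(r)]
print(fever_hours)  # [2, 3]
```

Wisdom ("go see a doctor") is deliberately left out: it is a judgment about what to do, not something this sketch tries to compute.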
• Information as process
• Information as communication
• Information as message transmission and reception
• Information = characteristics of the output of a process
◦ Tells us something about the process and the input
• Information-generating processes do not occur in isolation (separation)
• Communication = transmission of information
• Communication = producing the same message at the destination that was sent at the source
◦ The message must be encoded for transmission across a medium (called a channel)
◦ But the channel is noisy and can distort the message
• Semantics (meaning) is irrelevant in this model
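The encode, transmit, decode view of communication above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model taken from the slides: it assumes a binary message, a channel that flips each bit independently with some probability, and a simple repetition code with majority-vote decoding. The function names are hypothetical.

```python
import random


def encode(bits, n=3):
    """Repetition code: send every bit n times."""
    return [b for b in bits for _ in range(n)]


def noisy_channel(bits, flip_prob, rng):
    """Flip each transmitted bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]


def decode(bits, n=3):
    """Majority vote over each group of n received bits."""
    return [int(sum(bits[i:i + n]) > n // 2) for i in range(0, len(bits), n)]


message = [1, 0, 1, 1, 0, 0, 1, 0]
rng = random.Random(0)  # fixed seed so a run is repeatable
received = noisy_channel(encode(message), flip_prob=0.05, rng=rng)
recovered = decode(received)
print("message recovered intact:", recovered == message)
```

Note that nothing in the code looks at what the bits mean. The only goal is that the destination reproduces the bit sequence the source sent, which is the sense in which semantics plays no role here.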
• Fetch something that’s been stored
• Recover a stored state of knowledge
• Search through stored messages to find some messages relevant to the task at hand
• The tracing and recovery of specific information from stored data.
• Information retrieval is the activity of obtaining information system resources relevant to an information need from a collection of such resources. Searches can be based on full-text or other content-based indexing.
• Information retrieval is the science of searching for information in a document, for documents themselves, for metadata that describe data, and for databases of texts, images or sounds.
• An information retrieval process begins when a user enters a query into the system.
• Queries are formal statements of information needs, for example search strings in web search engines.
• In information retrieval, a query does not uniquely identify a single object in the collection.
• Instead, several objects may match the query, perhaps with different degrees of relevance.
• An object is an entity that is represented by information in a content collection or database. User queries are matched against the database information.
• In information retrieval, the returned results may or may not match the query, so they are typically ranked.
• This ranking of results is a key difference between information retrieval searching and database searching.
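The contrast between exact matching and ranked results can be sketched as follows. The scoring function is an assumption chosen for brevity (count of distinct query terms found in a document); real systems use more refined weighting. The document strings and function names are hypothetical.

```python
def tokenize(text):
    """Lowercase and split on whitespace (no punctuation handling)."""
    return text.lower().split()


def score(query, doc):
    """Number of distinct query terms that occur in the document."""
    return len(set(tokenize(query)) & set(tokenize(doc)))


def ranked_search(query, docs):
    """Return (score, doc) pairs with score > 0, best match first."""
    scored = [(score(query, d), d) for d in docs]
    return sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])


docs = [
    "endangered mammals and their habitat",
    "habitat loss threatens many species",
    "chocolate makers of the world",
]
query = "endangered mammals habitat"

# Database-style lookup: a record either equals the key or it does not.
exact = [d for d in docs if d == query]

# IR-style search: several documents match to different degrees.
ranked = ranked_search(query, docs)

print(exact)   # []
print(ranked)  # first document scores 3, second scores 1, third is dropped
```

The exact lookup returns nothing, while the ranked search returns two documents ordered by how well they match.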
• Retrospective
◦ “Searching the past”
◦ Different queries posed against a static collection
◦ Time invariant
• Prospective
◦ “Searching the future”
◦ Static query posed against a dynamic collection
◦ Time dependent
Ad hoc retrieval: find documents “about this”
• Compile a list of mammals that are considered to be endangered, identify their habitat and, if possible, specify what threatens them.
Known item search
• Find Jimmy Lin’s homepage.
• What’s the ISBN number of “Introduction to Information Retrieval”?
Directed exploration
• Who makes the best chocolates?
Question answering
“Factoid”
• Who discovered America?
• When did Tamil Nadu become a state?
• What team won the World Series in 1998?
“List”
• What countries export oil?
• Name Indian cities that have tourist spots.
“Definition”
• What is information?
• What is retrieval?
• Filtering:
◦ Make a binary decision about each incoming document
◦ Ex: Spam or not
• Routing:
◦ Sort incoming documents into different bins
◦ Ex: Categorize news headlines: World, Nation, Metro, Sports
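Filtering and routing can both be sketched with simple keyword overlap. This is an illustrative assumption, not the method from the slides: the keyword sets, the `Unsorted` fallback bin, and the function names are all hypothetical, and tokenization ignores punctuation.

```python
SPAM_WORDS = {"lottery", "winner", "prize"}  # hypothetical spam vocabulary

BINS = {  # hypothetical keywords for each news bin
    "World": {"summit", "treaty", "embassy"},
    "Nation": {"parliament", "election", "budget"},
    "Metro": {"traffic", "city", "metro"},
    "Sports": {"match", "cricket", "tournament"},
}


def words(text):
    """Set of lowercase whitespace-separated tokens."""
    return set(text.lower().split())


def is_spam(doc):
    """Filtering: one yes/no decision per incoming document."""
    return bool(words(doc) & SPAM_WORDS)


def route(doc):
    """Routing: pick the bin whose keywords overlap the document most."""
    overlaps = {name: len(words(doc) & kws) for name, kws in BINS.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else "Unsorted"


print(is_spam("You are a lottery winner"))            # True
print(route("India wins cricket match"))              # Sports
print(route("City traffic eases after metro opens"))  # Metro
```

In both cases the query (the keyword sets) stays fixed while documents keep arriving, which is the prospective setting described earlier.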
Definition of a database:
A structured set of data held in a computer, especially one that is accessible in various ways.
Examples:
• Banks storing account information
• Retailers storing inventories
• Universities storing student grades
Database vs. IR
What we’re retrieving
◦ Database: Structured data. Clear semantics based on a formal model.
◦ IR: Mostly unstructured. Free text with some metadata.
Queries we’re posing
◦ Database: Formally defined queries. Unambiguous.
◦ IR: Vague, imprecise information needs.
Results we get
◦ Database: Exact. Always correct in a formal sense.
◦ IR: Sometimes relevant, often not.
Interaction with system
◦ Database: One-shot queries.
◦ IR: Interaction is important.
Other issues
◦ Database: Concurrency, recovery, atomicity are all critical.
◦ IR: Issues downplayed.
• Precision: What fraction of the returned results is relevant to the information need?
• Recall: What fraction of the relevant documents in the collection was returned by the system?
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

                 Relevant                Non-relevant
Retrieved        True positives (TP)     False positives (FP)
Not retrieved    False negatives (FN)    True negatives (TN)
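A worked example may make the two formulas concrete. The sketch below computes both measures from a set of retrieved document IDs and a set of relevant document IDs. The IDs are made up for illustration, and returning 0.0 when a denominator would be zero is a convention assumed here, not something the formulas prescribe.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) from two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)   # relevant and retrieved
    fp = len(retrieved - relevant)   # retrieved but not relevant
    fn = len(relevant - retrieved)   # relevant but missed
    precision = tp / (tp + fp) if retrieved else 0.0  # assumed convention
    recall = tp / (tp + fn) if relevant else 0.0      # assumed convention
    return precision, recall


retrieved = ["d1", "d2", "d3", "d4"]        # hypothetical system output
relevant = ["d1", "d3", "d5", "d6", "d7"]   # hypothetical ground truth

p, r = precision_recall(retrieved, relevant)
print(p, r)  # 0.5 0.4
```

Here TP = 2, FP = 2 and FN = 3, so precision is 2/4 and recall is 2/5. True negatives do not appear in either formula.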
Crawling:
• The system browses the document collection and fetches documents
Indexing:
• The system builds an index of the documents fetched during crawling
Ranking:
• The system retrieves documents that are relevant to the query from the index and displays them to the user in ranked order
Relevance feedback:
• The initial results returned for a given query may be used to refine the query itself
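The four stages can be sketched end to end on a toy collection. This is a minimal illustration under several assumptions: "crawling" is replaced by an in-memory dictionary, the index is a plain inverted index with term counts, ranking sums the counts of query terms, and relevance feedback simply appends frequent new terms from the top result. All names and documents are hypothetical.

```python
from collections import Counter, defaultdict

# "Crawling": stand-in for fetched documents (hypothetical data).
collection = {
    "d1": "information retrieval ranks documents",
    "d2": "databases answer exact queries",
    "d3": "search engines rank web documents",
}


def build_index(docs):
    """Indexing: map each term to the documents containing it, with counts."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


def rank(query, index):
    """Ranking: order documents by the summed counts of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        scores.update(index.get(term, {}))
    return [doc_id for doc_id, _ in scores.most_common()]


def expand_query(query, top_doc_text, extra_terms=2):
    """Relevance feedback: append frequent unseen terms from a top result."""
    seen = set(query.lower().split())
    new = [t for t in top_doc_text.lower().split() if t not in seen]
    added = [t for t, _ in Counter(new).most_common(extra_terms)]
    return " ".join([query] + added)


index = build_index(collection)
first = rank("retrieval documents", index)
expanded = expand_query("retrieval documents", collection[first[0]])
second = rank(expanded, index)
print(first, expanded, second, sep="\n")
```

Document d2 never appears in the results because it shares no term with the query, while d1 ranks above d3 because it matches more query terms.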
