Target Types Identification
Type-aware Entity Retrieval
Query Suggestions
Task-based Information Retrieval
Darío Garigliotti
University of Stavanger
March 13, 2017
About me
I am a Ph.D. candidate in Information Technology
I started my Ph.D. in November 2015
I hold an M.Sc. in Computer Science from Fa.M.A.F., National University of Córdoba, Argentina
Outline:
1 Target Types Identification
2 Type-aware Entity Retrieval
3 Query Suggestions
Overview: from the library to the assistant
Task: the underlying information need of a user
E.g., wanting to plan a trip and issuing the query paris
Document Retrieval: a ranked list of relevant documents
Entity-oriented Search: the entity Paris, with its properties and relations
Task-completion Search: a booking/planning assistant
Query Understanding: Target Types Identification
[Diagram of Task-based Information Retrieval. Labels: Query Understanding; Target types; Target Types Identification]
Motivation
Problem Definition
Approaches
Results and Insights
Target Types Identification: Motivation
A large proportion of queries are entity-bearing
Query target types are detected automatically rather than provided
- Target types help to reduce the search space
- Types are organized in hierarchies (or taxonomies, or ontologies)
E.g. Buying a book on Amazon
Target Types Identification: Problem Definition
Hierarchical Target Type Identification (HTTI) problem: find the most specific single target type that is still general enough to cover all relevant entities
Many queries were discarded since they had no types
Some queries do not have a clear single type
Our alternative definition relaxes both of these restrictions
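The strict HTTI definition can be illustrated with a small sketch. It assumes a toy taxonomy stored as a child-to-parent map (type names borrowed from the test-collection example in this deck) and one assigned type per relevant entity; the function names and the input format are hypothetical, not the authors' implementation. The target type is taken to be the deepest type shared by the paths of all relevant entities, and `None` stands for NIL.

```python
# Toy taxonomy as a child -> parent map (None marks a top-level type).
# This fragment is an illustrative assumption, not a real ontology dump.
PARENT = {
    "Agent": None, "Person": "Agent", "Artist": "Person",
    "Musical artist": "Artist",
    "Work": None, "Musical work": "Work",
    "Album": "Musical work", "Single": "Musical work",
}


def path_to_top(t):
    """Return [t, parent(t), ..., top-level type]."""
    path = []
    while t is not None:
        path.append(t)
        t = PARENT[t]
    return path


def most_specific_covering_type(entity_types):
    """entity_types: one assigned type per relevant entity (hypothetical input).

    Returns the deepest type that covers all of them, or None (NIL)
    when no single type does.
    """
    if not entity_types:
        return None
    common = set(path_to_top(entity_types[0]))
    for t in entity_types[1:]:
        common &= set(path_to_top(t))
    if not common:
        return None
    # Common ancestors lie on one chain, so the longest path is the deepest.
    return max(common, key=lambda t: len(path_to_top(t)))
```

With relevant entities typed Album and Single, the sketch returns Musical work; with Album and Musical artist it returns `None`, which is the kind of query the strict definition cannot handle.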
Target Types Identification: Test collection
A new test collection with around 500 queries, built with a crowdsourcing experiment
Human annotators chose a most specific type, possibly NIL
Example annotation task:
Query: ratt albums
Candidate types:
1. Agent
  1.1. Person
    1.1.1. Artist
      1.1.1.1. Musical artist
2. Work
  2.1. Musical work
    2.1.1. Album
    2.1.2. Single
- None of these types
Correct type: 2.1.1. Album
[Figure: number of queries (y-axis, 0 to 300) by number of main types (x-axis, 1 to 4); series: No NIL type, Has NIL type]
Target Types Identification: Approaches
Baselines
- Entity-centric (EC): rank entities based on their relevance
to the query, then look at the types of the top-k entities
- Type-centric (TC): build a term-based representation for
each type, by aggregating descriptions of assigned entities
Our approach: a Learning-to-rank (LTR) method, with a
variety of features
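The two baselines can be sketched as follows. This is a minimal illustration under stated assumptions: entity retrieval scores are given as input, the entity-centric aggregation is a plain sum over the top-k entities, and the type-centric score uses relative query-term frequency as a stand-in for BM25 or LM. All entity identifiers, descriptions, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def entity_centric(entity_scores, entity_types, k):
    """Score types from the top-k ranked entities.

    entity_scores: {entity: retrieval score}; entity_types: {entity: [types]}.
    """
    top_k = sorted(entity_scores, key=entity_scores.get, reverse=True)[:k]
    type_scores = defaultdict(float)
    for e in top_k:
        for t in entity_types.get(e, []):
            type_scores[t] += entity_scores[e]  # assumed aggregation: sum
    return dict(type_scores)


def type_centric(query, entity_descriptions, entity_types):
    """Build one term-based pseudo-document per type by aggregating the
    descriptions of its entities, then score it against the query."""
    type_docs = defaultdict(Counter)
    for e, text in entity_descriptions.items():
        for t in entity_types.get(e, []):
            type_docs[t].update(text.lower().split())
    q_terms = query.lower().split()
    scores = {}
    for t, doc in type_docs.items():
        total = sum(doc.values())
        # Relative frequency of query terms: a simple stand-in for BM25/LM.
        scores[t] = sum(doc[w] for w in q_terms) / total if total else 0.0
    return scores


# Toy data (hypothetical).
entity_types = {"e1": ["Album"], "e2": ["Album"], "e3": ["Musical artist"]}
entity_scores = {"e1": 2.0, "e2": 1.5, "e3": 1.0}
descriptions = {
    "e1": "studio album by ratt",
    "e2": "live album",
    "e3": "american band ratt",
}

ec = entity_centric(entity_scores, entity_types, k=2)
tc = type_centric("ratt albums", descriptions, entity_types)
```

In this toy run the entity-centric scorer only sees the types of the two top-ranked entities, while the type-centric scorer compares the query with every type's aggregated text, which shows how the two baselines can disagree on the same query.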
Features for learning to rank target types
#      Feature              Description
Baseline features
1-5    EC_{BM25,K}(t, q)    Entity-centric type score with K ∈ {5, 10, 20, 50, 100}, using BM25
6-10   EC_{LM,K}(t, q)      Entity-centric type score with K ∈ {5, 10, 20, 50, 100}, using LM
11     TC_{BM25}(t, q)      Type-centric score using BM25
12     TC_{LM}(t, q)        Type-centric score using LM
Knowledge base features
13     DEPTH(t)             Hierarchical level of type t, normalized by the taxonomy depth
14     CHILDREN(t)          Number of children of type t in the taxonomy
15     SIBLINGS(t)          Number of siblings of type t in the taxonomy
16     ENTITIES(t)          Number of entities mapped to type t
Type label features
17     LENGTH(t)            Length of (the label of) type t in words
18     IDFSUM(t)            Sum of IDF for terms in (the label of) type t
19     IDFAVG(t)            Average of IDF for terms in (the label of) type t
20-21  JTERMS_n(t, q)       Query-type Jaccard similarity for sets of n-grams, for n ∈ {1, 2}
22     JNOUNS(t, q)         Query-type Jaccard similarity using only nouns
23     SIMAGGR(t, q)        Cosine similarity between the q and t word2vec vectors aggregated over all terms
24     SIMMAX(t, q)         Maximum cosine similarity of word2vec vectors over all pairs of query (q) and type (t) terms
25     SIMAVG(t, q)         Average cosine similarity of word2vec vectors over all pairs of query (q) and type (t) terms
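A few of the type label features can be sketched in plain Python. This is an illustrative sketch, not the authors' feature extractor: tokenization is whitespace splitting, and the embedding vectors are passed in as a hypothetical dictionary rather than loaded from a real word2vec model.

```python
import math


def ngrams(text, n):
    """Set of word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jterms(type_label, query, n):
    """JTERMS_n: Jaccard similarity between the n-gram sets of label and query."""
    a, b = ngrams(type_label, n), ngrams(query, n)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def sim_max_avg(type_label, query, vectors):
    """SIMMAX and SIMAVG over all (query term, type term) pairs.

    vectors: {term: embedding}; terms without a vector are skipped (assumption).
    """
    sims = [cosine(vectors[q], vectors[t])
            for q in query.lower().split() if q in vectors
            for t in type_label.lower().split() if t in vectors]
    if not sims:
        return 0.0, 0.0
    return max(sims), sum(sims) / len(sims)
```

For example, `jterms("musical album", "ratt album", 1)` compares the unigram sets {musical, album} and {ratt, album}, which share one of three distinct terms.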
Target Types Identification: Results and Insights
[Figure: NDCG@5 (y-axis, 0.0 to 0.8) of EC (LM), TC (LM), and LTR for the query groups INEX_LD, ListSearch, QALD2, and SemSearch_ES]
Identification performance across query groups
[Figure: Gini score (left axis, 0.0 to 0.16) and NDCG@5 (right axis, 0.35 to 0.70) per feature. Features on the x-axis, in plotted order: SIMMAX(t,q), SIMAGGR(t,q), SIMAVG(t,q), TC_{BM25}(t,q), ENTITIES(t), EC_{BM25,100}(t,q), EC_{BM25,50}(t,q), SIBLINGS(t), EC_{BM25,20}(t,q), CHILDREN(t), IDFSUM(t), IDFAVG(t), JNOUNS(t,q), EC_{BM25,10}(t,q), JTERMS_1(t,q), EC_{BM25,5}(t,q), DEPTH(t), EC_{LM,100}(t,q), LENGTH(t), EC_{LM,50}(t,q), TC_{LM}(t,q), EC_{LM,20}(t,q), EC_{LM,10}(t,q), EC_{LM,5}(t,q), JTERMS_2(t,q)]
Type Taxonomies
Type Representations
Retrieval Models
Task-based IR: Types and Entity Retrieval
[Diagram of Task-based Information Retrieval. Labels: Query Understanding; Entities; Types; Target types; Type-aware Entity Retrieval; Target Types Identification]
Type-aware Entity Retrieval
A characteristic property of entities is that they are typed
Types naturally appear in many queries
countries where one can pay with the euro
art museums in Amsterdam
Types have been shown to improve Entity Retrieval
Dimensions of Type Information
We systematically identified and compared all combinations of
3 dimensions
Type taxonomies
Type representations
Retrieval models
Dimensions: Type Taxonomies
Which type taxonomy to use?
DBpedia Ontology (7 levels, 600 types)
Freebase Types (2 levels, 2K types)
Wikipedia Categories (34 levels, 600K types)
YAGO Taxonomy (19 levels, 500K types)
These vary a lot in terms of hierarchical structure and in how
entity-type assignments are recorded
Dimensions: Type Representations
How to represent the hierarchical information?
[Three diagrams of the same type hierarchy (types t0 to t12) with an entity e, each highlighting one representation of the types of e:]
Type(s) along path to top
Top-level type(s)
Most specific type(s)
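The three representations can be sketched on a toy hierarchy. The child-to-parent map below is a hypothetical example (it is not the hierarchy drawn on the slide), top-level types are assumed to be those without a parent, and the function names are illustrative only.

```python
# Hypothetical toy hierarchy: child -> parent (None marks a top-level type).
PARENT = {
    "t1": None, "t2": None,
    "t3": "t1", "t4": "t1",
    "t5": "t2", "t6": "t3",
}


def path_to_top(t):
    """Return [t, parent(t), ..., top-level type]."""
    path = []
    while t is not None:
        path.append(t)
        t = PARENT[t]
    return path


def along_path_to_top(assigned):
    """All types on the paths from the assigned types up to the top."""
    return {a for t in assigned for a in path_to_top(t)}


def top_level(assigned):
    """Only the top-level ancestor of each assigned type."""
    return {path_to_top(t)[-1] for t in assigned}


def most_specific(assigned):
    """Assigned types that are not a proper ancestor of another assigned type."""
    ancestors = {a for t in assigned for a in path_to_top(t)[1:]}
    return set(assigned) - ancestors


entity_types = {"t3", "t6", "t5"}  # types assigned to a hypothetical entity e
```

For this entity, the path representation keeps five types, the top-level representation keeps t1 and t2, and the most specific representation keeps t5 and t6, because t3 is an ancestor of t6.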
Dimensions: Retrieval Models
How to add type information into entity retrieval?
Retrieval task defined in a generative probabilistic framework: P(q | e)
[Diagram: a query (Olympic games) with its target types and an entity (Rio de Janeiro) with its entity types, connected by a term-based similarity and a type-based similarity]
Both query and entity are considered in the term space as well as in the type space
(Strict) Filtering model
$$P(q \mid e) = P(\theta^{T}_q \mid \theta^{T}_e) \cdot \chi[\mathrm{types}(q) \cap \mathrm{types}(e) \neq \emptyset]$$
[Diagram: the sets types(q) and types(e)]
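(Soft) Filtering model
$$P(q \mid e) = P(\theta^{T}_q \mid \theta^{T}_e) \cdot P(\theta^{\mathcal{T}}_q \mid \theta^{\mathcal{T}}_e)$$
($\theta^{T}$: term-based model; $\theta^{\mathcal{T}}$: type-based model)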
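Interpolation model
$$P(q \mid e) = (1 - \lambda) \cdot P(\theta^{T}_q \mid \theta^{T}_e) + \lambda \cdot P(\theta^{\mathcal{T}}_q \mid \theta^{\mathcal{T}}_e)$$
(the first term is the term-based similarity, the second the type-based similarity)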
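The strict filtering, soft filtering, and interpolation models differ only in how they combine a term-based and a type-based similarity. A minimal sketch, assuming both similarities have already been computed and are passed in as plain numbers (the names `p_term`, `p_type`, and `lam` are illustrative):

```python
def strict_filtering(p_term, query_types, entity_types):
    """Keep the term-based score only if query and entity share a type."""
    return p_term if set(query_types) & set(entity_types) else 0.0


def soft_filtering(p_term, p_type):
    """Multiply the term-based and type-based similarities."""
    return p_term * p_type


def interpolation(p_term, p_type, lam):
    """Linear mixture; lam in [0, 1] weights the type-based component."""
    return (1 - lam) * p_term + lam * p_type
```

When type information is missing for an entity (no shared type, or `p_type` equal to 0), both filtering variants score it 0, while interpolation still keeps `(1 - lam) * p_term`. This is one way to read the later observation that interpolation is the most robust of the three.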
Type-aware Entity Retrieval: Lessons Learned
Summary of insights: Type information proves most useful
when larger, deeper type taxonomies provide very specific
types.
How to represent hierarchical entity type information?
Using the most specific types is the most effective way
What (kind of) type taxonomies to use? Wikipedia
performs best in most of the cases
What combination model to choose? All models suffer
from missing type information, but interpolation appears
to be the most robust
Task-based IR: Query Suggestions
[Diagram of Task-based Information Retrieval. Labels: Query Suggestions; Query Understanding; Entities; Types; Target types; Type-aware Entity Retrieval; Target Types Identification]
Query Suggestions
Investigated using the setup of the TREC Tasks track
Task understanding: given an initial query, return a ranked list of query suggestions that cover all the possible subtasks of the task
Participation in the track
Formalization and analysis of our approach
Query Suggestions: Architecture and Model
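[Diagram: initial query q0; boxes labeled QS, WS, WD, WH; keyphrases; query suggestions]
Components:
Source importance
Document importance
Keyphrase relevance
Query suggestion
$$P(q \mid q_0) = \sum_{s} \sum_{d} \sum_{k} P(q \mid q_0, s, k)\, P(k \mid s, d)\, P(d \mid q_0, s)\, P(s \mid q_0)$$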
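The model marginalizes over sources s, documents d, and keyphrases k. Reading the conditionals against the component list, P(s|q0) plausibly corresponds to source importance, P(d|q0, s) to document importance, P(k|s, d) to keyphrase relevance, and P(q|q0, s, k) to the query suggestion component; that mapping is an inference from the notation. The sketch below evaluates the sum over nested dictionaries of toy probabilities. The data layout, the source and document identifiers, and the keyphrases are all hypothetical.

```python
def suggestion_probability(q, p_source, p_doc, p_keyphrase, p_suggestion):
    """Evaluate P(q|q0) by summing over sources, documents and keyphrases.

    p_source:     {s: P(s|q0)}
    p_doc:        {s: {d: P(d|q0, s)}}
    p_keyphrase:  {(s, d): {k: P(k|s, d)}}
    p_suggestion: {(s, k): {q: P(q|q0, s, k)}}
    """
    total = 0.0
    for s, ps in p_source.items():
        for d, pd in p_doc.get(s, {}).items():
            for k, pk in p_keyphrase.get((s, d), {}).items():
                pq = p_suggestion.get((s, k), {}).get(q, 0.0)
                total += pq * pk * pd * ps
    return total


# Toy setup (hypothetical): one source, uniform document importance,
# and keyphrases used as-is, so P(q|q0, s, k) = 1 when q == k.
p_source = {"s1": 1.0}
p_doc = {"s1": {"d1": 0.5, "d2": 0.5}}
p_keyphrase = {
    ("s1", "d1"): {"paris hotels": 1.0},
    ("s1", "d2"): {"paris hotels": 0.5, "paris flights": 0.5},
}
p_suggestion = {
    ("s1", "paris hotels"): {"paris hotels": 1.0},
    ("s1", "paris flights"): {"paris flights": 1.0},
}

p_hotels = suggestion_probability(
    "paris hotels", p_source, p_doc, p_keyphrase, p_suggestion)
p_flights = suggestion_probability(
    "paris flights", p_source, p_doc, p_keyphrase, p_suggestion)
```

In this toy setup the suggestion that appears as a keyphrase in both documents receives the larger share of the probability mass, so it would be ranked first.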
Query Suggestions: Component Estimations
Insights from best component estimations:
Query suggestion: keyphrases as-is (vs. generations)
Document importance: uniform (vs. rank-based)
Source importance: proportional to the best document importance per individual source (vs. uniform, or source group-based)
Overall, a high contribution of API query suggestions
Task-based IR: Future work
[Diagram of Task-based Information Retrieval. Labels: Query Suggestions; Query Understanding; Entities; Types; Target types; Subtasks; Type-aware Entity Retrieval; Target Types Identification; Subtasks Identification; Linking subtasks to target types]
Task-based IR: Future work
Towards a formal task model
- Subtasks, i.e., clusters of information needs
- Relationship with query types involved
- Specific entities involved (?)
- ...
Thanks!
Questions?
Dar´ıo Garigliotti Task-based Information Retrieval

More Related Content

PDF
On Type-Aware Entity Retrieval
PPT
L14 l15 Object Oriented DBMS
PPTX
Entity Linking in Queries: Tasks and Evaluation
PPT
Ontology Engineering for the Semantic Web and beyond
PPTX
Corso di Statistica Inferenziale per Data Scientist
PDF
Learning-to-Rank Target Types for Entity-Bearing Queries
PDF
Type-Aware Entity Retrieval
PDF
Type-Aware Entity Retrieval
On Type-Aware Entity Retrieval
L14 l15 Object Oriented DBMS
Entity Linking in Queries: Tasks and Evaluation
Ontology Engineering for the Semantic Web and beyond
Corso di Statistica Inferenziale per Data Scientist
Learning-to-Rank Target Types for Entity-Bearing Queries
Type-Aware Entity Retrieval
Type-Aware Entity Retrieval

Similar to Task-Based Information Retrieval (20)

PDF
Entity Retrieval (WWW 2013 tutorial)
PPTX
WISS QA Do it yourself Question answering over Linked Data
PDF
Lecture-1-Introduction-to-Data-Mining.pdf
PPT
Advanced Use of Properties and Scripts in TIBCO Spotfire
PPTX
Resume_Clasification.pptx
PDF
Artificial Intelligence in Data Curation
PPTX
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
PPT
Slide3.ppt
PDF
Type-Aware Entity Retrieval
PDF
Survey Research in Software Engineering
PPTX
Reflected Intelligence: Lucene/Solr as a self-learning data system
PDF
Tutorial: Context-awareness In Information Retrieval and Recommender Systems
PPTX
Resume_Clasification.pptx
PPT
week9_Machine_Learning.ppt
PPT
Data science: DATA MINING AND DATA WHEREHOUSE.ppt
PPT
Its all about data mining
PPTX
Predicting the Perfect Purchase: Student Presentation on Customer Transaction...
PPTX
Text mining and analytics v6 - p1
PDF
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
Entity Retrieval (WWW 2013 tutorial)
WISS QA Do it yourself Question answering over Linked Data
Lecture-1-Introduction-to-Data-Mining.pdf
Advanced Use of Properties and Scripts in TIBCO Spotfire
Resume_Clasification.pptx
Artificial Intelligence in Data Curation
Reflected Intelligence - Lucene/Solr as a self-learning data system: Presente...
Slide3.ppt
Type-Aware Entity Retrieval
Survey Research in Software Engineering
Reflected Intelligence: Lucene/Solr as a self-learning data system
Tutorial: Context-awareness In Information Retrieval and Recommender Systems
Resume_Clasification.pptx
week9_Machine_Learning.ppt
Data science: DATA MINING AND DATA WHEREHOUSE.ppt
Its all about data mining
Predicting the Perfect Purchase: Student Presentation on Customer Transaction...
Text mining and analytics v6 - p1
Algorithmic Impact Assessment: Fairness, Robustness and Explainability in Aut...
Ad

More from Darío Garigliotti (20)

PDF
Task-Based Support in Search Engines
PDF
Task Recommendation
PDF
About "Towards Better Text Understanding and Retrieval through Kernel Entity ...
PDF
A Semantic Search Approach to Task-Completion Engines
PDF
A Summary of ECIR'18
PDF
A Semantic Search Approach to Task-Completion Engines
PDF
A Knowledge Base of Entity-Oriented Search Intents
PDF
Type Information in Entity Retrieval
PDF
Dive into Deep Learning
PDF
If this is the answer, what was the question?
PDF
Semi-supervised Learning for Word Sense Disambiguation
PDF
Semi-supervised Learning for Word Sense Disambiguation
PDF
Type-Aware Entity Retrieval
PDF
Semi-supervised Learning for Word Sense Disambiguation
PDF
FACT-IR. Fairness, Accountability, Confidentiality and Transparency in Inform...
PDF
Machine Learning - Clustering
PDF
Machine Learning - Classification (ctd.)
PDF
Machine Learning - Classification
PDF
Data Mining - Exploring Data
PDF
Data Mining - Introduction and Data
Task-Based Support in Search Engines
Task Recommendation
About "Towards Better Text Understanding and Retrieval through Kernel Entity ...
A Semantic Search Approach to Task-Completion Engines
A Summary of ECIR'18
A Semantic Search Approach to Task-Completion Engines
A Knowledge Base of Entity-Oriented Search Intents
Type Information in Entity Retrieval
Dive into Deep Learning
If this is the answer, what was the question?
Semi-supervised Learning for Word Sense Disambiguation
Semi-supervised Learning for Word Sense Disambiguation
Type-Aware Entity Retrieval
Semi-supervised Learning for Word Sense Disambiguation
FACT-IR. Fairness, Accountability, Confidentiality and Transparency in Inform...
Machine Learning - Clustering
Machine Learning - Classification (ctd.)
Machine Learning - Classification
Data Mining - Exploring Data
Data Mining - Introduction and Data
Ad

Recently uploaded (20)

PPTX
ANEMIA WITH LEUKOPENIA MDS 07_25.pptx htggtftgt fredrctvg
PPTX
TOTAL hIP ARTHROPLASTY Presentation.pptx
PPTX
Taita Taveta Laboratory Technician Workshop Presentation.pptx
PDF
Placing the Near-Earth Object Impact Probability in Context
PDF
IFIT3 RNA-binding activity primores influenza A viruz infection and translati...
PDF
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
PPTX
cpcsea ppt.pptxssssssssssssssjjdjdndndddd
PDF
lecture 2026 of Sjogren's syndrome l .pdf
PPTX
Cell Membrane: Structure, Composition & Functions
PPTX
2Systematics of Living Organisms t-.pptx
PPTX
DRUG THERAPY FOR SHOCK gjjjgfhhhhh.pptx.
PDF
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
PDF
Unveiling a 36 billion solar mass black hole at the centre of the Cosmic Hors...
PDF
ELS_Q1_Module-11_Formation-of-Rock-Layers_v2.pdf
PDF
Phytochemical Investigation of Miliusa longipes.pdf
PDF
Biophysics 2.pdffffffffffffffffffffffffff
PPTX
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
PPTX
Vitamins & Minerals: Complete Guide to Functions, Food Sources, Deficiency Si...
PDF
Sciences of Europe No 170 (2025)
PDF
SEHH2274 Organic Chemistry Notes 1 Structure and Bonding.pdf
ANEMIA WITH LEUKOPENIA MDS 07_25.pptx htggtftgt fredrctvg
TOTAL hIP ARTHROPLASTY Presentation.pptx
Taita Taveta Laboratory Technician Workshop Presentation.pptx
Placing the Near-Earth Object Impact Probability in Context
IFIT3 RNA-binding activity primores influenza A viruz infection and translati...
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
cpcsea ppt.pptxssssssssssssssjjdjdndndddd
lecture 2026 of Sjogren's syndrome l .pdf
Cell Membrane: Structure, Composition & Functions
2Systematics of Living Organisms t-.pptx
DRUG THERAPY FOR SHOCK gjjjgfhhhhh.pptx.
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
Unveiling a 36 billion solar mass black hole at the centre of the Cosmic Hors...
ELS_Q1_Module-11_Formation-of-Rock-Layers_v2.pdf
Phytochemical Investigation of Miliusa longipes.pdf
Biophysics 2.pdffffffffffffffffffffffffff
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
Vitamins & Minerals: Complete Guide to Functions, Food Sources, Deficiency Si...
Sciences of Europe No 170 (2025)
SEHH2274 Organic Chemistry Notes 1 Structure and Bonding.pdf

Task-Based Information Retrieval

  • 1. Target Types Identification Type-aware Entity Retrieval Query Suggestions Task-based Information Retrieval Dar´ıo Garigliotti University of Stavanger March 13, 2017 Dar´ıo Garigliotti Task-based Information Retrieval
  • 2. Target Types Identification Type-aware Entity Retrieval Query Suggestions About me I am a Ph.D. candidate in Information Technology I started my Ph.D. on November 2015 I hold a M.Sc. in Computer Science from Fa.M.A.F. - National University of C´ordoba, Argentina Dar´ıo Garigliotti Task-based Information Retrieval
  • 3. Target Types Identification Type-aware Entity Retrieval Query Suggestions Outline: 1 Target Types Identification 2 Type-aware Entity Retrieval 3 Query Suggestions Dar´ıo Garigliotti Task-based Information Retrieval
  • 4. Target Types Identification Type-aware Entity Retrieval Query Suggestions Overview: from the library to the assistant Task: Underlying information need of an user E.g. wanting to plan a travel, issuing paris Dar´ıo Garigliotti Task-based Information Retrieval
  • 5. Target Types Identification Type-aware Entity Retrieval Query Suggestions Overview: from the library to the assistant Task: Underlying information need of an user E.g. wanting to plan a travel, issuing paris Document Retrieval: ranked list of relevant documents Dar´ıo Garigliotti Task-based Information Retrieval
  • 6. Target Types Identification Type-aware Entity Retrieval Query Suggestions Overview: from the library to the assistant Task: Underlying information need of an user E.g. wanting to plan a travel, issuing paris Document Retrieval: ranked list of relevant documents Entity-oriented Search: entity : Paris , properties, relations Dar´ıo Garigliotti Task-based Information Retrieval
  • 7. Target Types Identification Type-aware Entity Retrieval Query Suggestions Overview: from the library to the assistant Task: Underlying information need of an user E.g. wanting to plan a travel, issuing paris Document Retrieval: ranked list of relevant documents Entity-oriented Search: entity : Paris , properties, relations Task-completion Search: booking/planning assistant Dar´ıo Garigliotti Task-based Information Retrieval
  • 8. Target Types Identification Type-aware Entity Retrieval Query Suggestions Query Understanding: Target Types Identification Task-based Information Retrieval Query Understanding Target types Target Types Identification Dar´ıo Garigliotti Task-based Information Retrieval
  • 9. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Target Types Identification: Motivation Large proportion of entity-bearing queries Query target types automatically detected rather than provided - Target types help to reduce the space of search - Types are organized in hierarchies (or taxonomies, or ontologies) Dar´ıo Garigliotti Task-based Information Retrieval
  • 10. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights E.g. Buying a book on Amazon Dar´ıo Garigliotti Task-based Information Retrieval
  • 11. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Target Types Identification: Problem Definition Hierarchical Target Type Identification (HTTI) problem: To find the most specific single target type, general enough to cover all relevant entities Many queries discarded since they had no types Some queries don’t have a clear single type Our alternative definition relaxes on those issues Dar´ıo Garigliotti Task-based Information Retrieval
  • 12. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Target Types Identification: Test collection A new test collection with around 500 queries, built with a crowdsourcing experiment Human annotators chose a most specific type, possibly NIL Query: ratt albums Candidate types: 1. Agent 1.1. Person 1.1.1. Artist 1.1.1.1. Musical artist 2. Work 2.1. Musical work 2.1.1. Album 2.1.2. Single - None of these types Correct type: 2.1.1. Album 1 2 3 4 Number of main types 0 50 100 150 200 250 300 Numberofqueries No NIL type Has NIL type Dar´ıo Garigliotti Task-based Information Retrieval
  • 13. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Target Types Identification: Approaches Baselines - Entity-centric (EC): rank entities based on their relevance to the query, then look at the types of the top-k entities - Type-centric (TC): build a term-based representation for each type, by aggregating descriptions of assigned entities Our approach: a Learning-to-rank (LTR) method, with a variety of features Dar´ıo Garigliotti Task-based Information Retrieval
  • 14. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Features for learning to rank target types # Feature Description Baseline features 1-5 ECBM25,K (t, q) Entity-centric type score with K ∈ {5, 10, 20, 50, 100} using BM25 6-10 ECLM,K (t, q) Entity-centric type score with K ∈ {5, 10, 20, 50, 100} using LM 11 TCBM25(t, q) Type-centric score using BM25 12 TCLM (t, q) Type-centric score using LM Knowledge base features 13 DEPTH(t) The hierarchical level of type t, normalized by the taxonomy depth 14 CHILDREN(t) Number of children of type t in the taxonomy 15 SIBLINGS(t) Number of siblings of type t in the taxonomy 16 ENTITIES(t) Number of entities mapped to type t Type label features 17 LENGTH(t) Length of (the label of) type t in words 18 IDFSUM(t) Sum of IDF for terms in (the label of) type t 19 IDFAVG(t) Avg of IDF for terms in (the label of) type t 20-21 JTERMSn(t, q) Query-type Jaccard similarity for sets of n-grams, for n ∈ {1, 2} 22 JNOUNS(t, q) Query-type Jaccard similarity using only nouns 23 SIMAGGR(t, q) Cosine sim. between the q and t word2vec vectors aggregated over all terms 24 SIMMAX(t, q) Max. cosine sim. of w2v vectors between each pair of query (q) and type (t) terms 25 SIMAVG(t, q) Avg. of cosine sim. of w2v vectors between each pair of query (q) and type (t) terms Dar´ıo Garigliotti Task-based Information Retrieval
  • 15. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Target Types Identification: Results and Insights INEX_LD ListSearch QALD2 SemSearch_ES 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8NDCG@5 EC, LM TC, LM LTR Identification performances across query groups Dar´ıo Garigliotti Task-based Information Retrieval
  • 16. Target Types Identification Type-aware Entity Retrieval Query Suggestions Motivation Problem Definition Approaches Results and Insights Target Types Identification: Results and Insights SIMMAX(t,q) SIMAGGR(t,q) SIMAVG(t,q) TCBM25(t,q) ENTITIES(t) ECBM25,100(t,q) ECBM25,50(t,q) SIBLINGS(t) ECBM25,20(t,q) CHILDREN(t) IDFSUM(t) IDFAVG(t) JNOUNS(t,q) ECBM25,10(t,q) JTERMS1(t,q) ECBM25,5(t,q)DEPTH(t) ECLM,100(t,q) LENGTH(t) ECLM,50(t,q)TCLM(t,q) ECLM,20(t,q) ECLM,10(t,q) ECLM,5(t,q) JTERMS2(t,q) Features 0.0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 Giniscore 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 NDCG@5 Gini score NDCG@5 Dar´ıo Garigliotti Task-based Information Retrieval
  • 17. Target Types Identification Type-aware Entity Retrieval Query Suggestions Type Taxonomies Type Representations Retrieval Models Task-based IR Task-based Information Retrieval Query Understanding Target types Target Types Identification Dar´ıo Garigliotti Task-based Information Retrieval
  • 18. Target Types Identification Type-aware Entity Retrieval Query Suggestions Type Taxonomies Type Representations Retrieval Models Task-based IR: Types and Entity Retrieval Task-based Information Retrieval Query Understanding Entities TypesTarget types Type-aware Entity Retrieval Target Types Identification Dar´ıo Garigliotti Task-based Information Retrieval
  • 19. Type-aware Entity Retrieval. A characteristic property of entities is that they are typed. Types naturally appear in many queries, e.g., "countries where one can pay with the euro" or "art museums in Amsterdam". Types have been shown to improve entity retrieval.
  • 20. Dimensions of Type Information. We systematically identified and compared all combinations of three dimensions: type taxonomies, type representations, and retrieval models.
  • 21. Dimensions: Type Taxonomies. Which type taxonomy to use? DBpedia Ontology (7 levels, 600 types); Freebase Types (2 levels, 2K types); Wikipedia Categories (34 levels, 600K types); YAGO Taxonomy (19 levels, 500K types). These vary considerably in hierarchical structure and in how entity-type assignments are recorded.
  • 22. Dimensions: Type Representations. How to represent the hierarchical information? [Figure: the same taxonomy tree (types t0 to t12, entity e) shown three times, highlighting in turn: (1) type(s) along the path to the top; (2) top-level type(s); (3) most specific type(s).]
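
The three representations named on this slide can be illustrated on a toy taxonomy. This is only a sketch under stated assumptions: the taxonomy, the type names, and the helper functions are hypothetical, and the taxonomy is assumed to be a tree stored as a child-to-parent map.

```python
# Hypothetical toy taxonomy as a child -> parent map; "root" marks the taxonomy root.
PARENT = {
    "Agent": "root",
    "Place": "root",
    "Person": "Agent",
    "Artist": "Person",
    "Painter": "Artist",
    "City": "Place",
}


def ancestors(t):
    """Types strictly above t, ordered from its parent up to a top-level type."""
    out = []
    while PARENT.get(t, "root") != "root":
        t = PARENT[t]
        out.append(t)
    return out


def along_path_to_top(assigned):
    """Assigned types plus every type on their paths to the top."""
    result = set(assigned)
    for t in assigned:
        result.update(ancestors(t))
    return result


def top_level(assigned):
    """Only the top-level type above each assigned type (or the type itself)."""
    return {(ancestors(t) or [t])[-1] for t in assigned}


def most_specific(assigned):
    """Assigned types that are not an ancestor of another assigned type."""
    covered = set()
    for t in assigned:
        covered.update(ancestors(t))
    return set(assigned) - covered
```

For an entity assigned the types Person and Painter, the three functions return {Person, Painter, Artist, Agent}, {Agent}, and {Painter}, respectively.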
  • 23. Dimensions: Retrieval Models. How to add type information into entity retrieval? The retrieval task is defined in a generative probabilistic framework, $P(q \mid e)$. Both the query and the entity are considered in the term space as well as in the type space. [Diagram labels: query; entity; Olympic games; Rio de Janeiro; target types; entity types; term-based similarity; type-based similarity.]
  • 24. Dimensions: Retrieval Models. (Strict) filtering model: $P(q \mid e) = P(\theta^{T}_q \mid \theta^{T}_e) \cdot \chi[\mathrm{types}(q) \cap \mathrm{types}(e) \neq \emptyset]$, where $\theta^{T}$ denotes the term-based representation and $\chi$ is the indicator function. [Diagram: overlap of Types(q) and Types(e).]
  • 25. Dimensions: Retrieval Models. (Soft) filtering model: $P(q \mid e) = P(\theta^{T}_q \mid \theta^{T}_e) \cdot P(\theta^{\mathcal{T}}_q \mid \theta^{\mathcal{T}}_e)$, where $\theta^{T}$ denotes the term-based and $\theta^{\mathcal{T}}$ the type-based representation.
  • 26. Dimensions: Retrieval Models. Interpolation model: $P(q \mid e) = (1 - \lambda) \cdot P(\theta^{T}_q \mid \theta^{T}_e) + \lambda \cdot P(\theta^{\mathcal{T}}_q \mid \theta^{\mathcal{T}}_e)$, with $\theta^{T}$ the term-based and $\theta^{\mathcal{T}}$ the type-based representation.
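
The three combination models on slides 24 to 26 can be sketched as plain functions. This is an illustration only, assuming the term-based and type-based similarity scores have already been estimated elsewhere; the function and parameter names are hypothetical, and the default `lam=0.5` is an arbitrary placeholder rather than a value from the slides.

```python
def strict_filtering(term_score, query_types, entity_types):
    """Keep the term-based score only if query and entity share at least one type."""
    return term_score if set(query_types) & set(entity_types) else 0.0


def soft_filtering(term_score, type_score):
    """Multiply the term-based score by the type-based score."""
    return term_score * type_score


def interpolation(term_score, type_score, lam=0.5):
    """Linear mixture of the term-based and type-based scores, weighted by lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * term_score + lam * type_score
```

Note how the models differ when type information is missing: with no shared types or a type-based score of zero, both filtering variants drive the final score to zero, whereas interpolation still retains the (1 - lam) share of the term-based score.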
  • 27. Type-aware Entity Retrieval: Lessons Learned. Summary of insights: type information proves most useful when larger, deeper type taxonomies provide very specific types. How to represent hierarchical entity type information? Using the most specific types is the most effective way. What (kind of) type taxonomy to use? Wikipedia performs best in most cases. What combination model to choose? All models suffer from missing type information, but interpolation appears to be the most robust.
  • 28. Task-based IR: Query Suggestions. [Diagram: Task-based Information Retrieval; Query Understanding; Entities; Types; Target types; Type-aware Entity Retrieval; Target Types Identification.]
  • 29. Task-based IR: Query Suggestions. [Diagram: Task-based Information Retrieval; Query Suggestions; Query Understanding; Entities; Types; Target types; Type-aware Entity Retrieval; Target Types Identification.]
  • 30. Query Suggestions. Investigated using the setup of the TREC Tasks track. Task understanding: given an initial query, return a ranked list of query suggestions that cover all the possible subtasks of the task. Participation in the track. Formalization and analysis of our approach.
  • 31. Query Suggestions: Architecture and Model. [Diagram: initial query q0; sources QS, WS, WD, WH; keyphrases; query suggestions.] Components: source importance, document importance, keyphrase relevance, query suggestion.
  • 32. Query Suggestions: Architecture and Model. [Diagram: initial query q0; sources QS, WS, WD, WH; keyphrases; query suggestions.] Components: source importance, document importance, keyphrase relevance, query suggestion. Model: $P(q \mid q_0) = \sum_{s} \sum_{d} \sum_{k} P(q \mid q_0, s, k) \, P(k \mid s, d) \, P(d \mid q_0, s) \, P(s \mid q_0)$.
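
The generative model on this slide sums, over sources s, documents d, and keyphrases k, the product of the four component probabilities. A minimal sketch of that computation follows. The nested-dictionary layout, the function names, and the toy estimators are assumptions made for illustration; the source labels QS and WS are taken from the diagram and treated as opaque identifiers.

```python
def suggestion_score(q, q0, sources, p_source, p_doc, p_keyphrase, p_suggestion):
    """Sum over sources, documents, and keyphrases of the four component probabilities.

    sources: {source: {document: [keyphrase, ...]}} (hypothetical layout).
    The four p_* arguments are callables supplied by the caller.
    """
    total = 0.0
    for s, docs in sources.items():
        for d, keyphrases in docs.items():
            for k in keyphrases:
                total += (
                    p_suggestion(q, q0, s, k)   # query suggestion: P(q | q0, s, k)
                    * p_keyphrase(k, s, d)      # keyphrase relevance: P(k | s, d)
                    * p_doc(d, q0, s)           # document importance: P(d | q0, s)
                    * p_source(s, q0)           # source importance: P(s | q0)
                )
    return total


# Toy data and estimators, purely illustrative.
toy_sources = {
    "QS": {"d1": ["paris hotels"]},
    "WS": {"d2": ["paris hotels", "paris flights"]},
}


def uniform_source(s, q0):
    return 1.0 / len(toy_sources)


def uniform_doc(d, q0, s):
    return 1.0 / len(toy_sources[s])


def uniform_keyphrase(k, s, d):
    return 1.0 / len(toy_sources[s][d])


def keyphrase_as_is(q, q0, s, k):
    # "Keyphrases as-is": a keyphrase can only yield itself as a suggestion.
    return 1.0 if q == k else 0.0
```

With these toy estimators, "paris hotels" receives 0.5 from QS plus 0.25 from WS, and "paris flights" receives 0.25, so the scores over all candidate suggestions sum to 1.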
  • 37. Query Suggestions: Component Estimations. Insights from the best component estimations. Query suggestion: keyphrases as-is (vs. generations). Document importance: uniform (vs. rank-based). Source importance: proportional to the best document importance per individual source (vs. uniform or source group-based). Overall, API query suggestions contribute strongly.
  • 38. Task-based IR. [Diagram: Task-based Information Retrieval; Query Suggestions; Query Understanding; Entities; Types; Target types; Type-aware Entity Retrieval; Target Types Identification.]
  • 39. Task-based IR: Future work. [Diagram: Task-based Information Retrieval; Query Suggestions; Query Understanding; Entities; Types; Target types; Subtasks; Type-aware Entity Retrieval; Target Types Identification; Subtasks Identification.]
  • 40. Task-based IR: Future work. [Diagram: Task-based Information Retrieval; Query Suggestions; Query Understanding; Entities; Types; Target types; Subtasks; Type-aware Entity Retrieval; Target Types Identification; Subtasks Identification; new element: Linking subtasks to target types.]
  • 41. Task-based IR: Future work. Towards a formal task model: subtasks, i.e., clusters of information needs; relationship with the query types involved; specific entities involved (?); ...
  • 42. Thanks! Questions?