Types and Entity Retrieval
Environment Dimensions
Type-aware Entity Retrieval
Type-aware Entity Retrieval
Darío Garigliotti
University of Stavanger
June 14, 2016
Darío Garigliotti Type-aware Entity Retrieval
Outline:
1 Types and Entity Retrieval
2 Environment Dimensions
Type taxonomies
Type representations
Retrieval models
3 Type-aware Entity Retrieval
Types and Entity Retrieval
Traditional information retrieval has recently been extended to entity-oriented search
It revolves around satisfying more complex information needs
Several entity-related elements from knowledge bases appear naturally in queries
Example query: Countries where one can pay with the euro
Related entities (via a relation or predicate)
Types or categories or classes
Why think about types?
Entities are typed
Types are useful for retrieval, presentation, summarization...
Related tasks, e.g.
Entity ranking (given a query and target categories)
List completion (given a query and entity examples, and optionally types)
Query target type identification
Our focus is on emergent dimensions to explore
Type taxonomies
There are different type taxonomies from various knowledge bases
DBpedia Ontology
Freebase Types
Wikipedia Categories
YAGO Taxonomy
These vary a lot in terms of hierarchical structure and in how entity-type assignments are recorded
Normalisation efforts are needed
DBpedia Ontology
A well-designed hierarchy
Created manually by considering the most frequently used infoboxes in Wikipedia
Clean and consistent, but with limited coverage
[Figure: taxonomy levels 0 to 7]
|Level 1| = 58 types
|Level 2| = 114 types
|Level 3| = 142 types
|Level 4| = 213 types
|Level 5| = 45 types
|Level 6| = 17 types
|Level 7| = 1 type
Freebase Types
A two-layer categorization system: types and domains
Entities are assigned only to types; most of them have “same as” links to DBpedia entities
[Figure: taxonomy levels 0 to 2]
|Level 1| = 92 types
|Level 2| = 1,626 types
Wikipedia Categories
It consists of textual labels known as categories
It’s not a well-defined “is-a” hierarchy, but a graph
Category assignments are neither consistent nor complete
It requires a major normalisation strategy
[Figure: taxonomy levels 0, 1, 2-10, 11-24, 25-34]
|Level 1| = 27 types
|Level 2 ∪ ... ∪ Level 10| = 121,657 types
|Level 11 ∪ ... ∪ Level 24| = 410,697 types
|Level 25 ∪ ... ∪ Level 34| = 14,564 types
YAGO Taxonomy
A deep subsumption hierarchy
Its classification schema is constructed by taking leaf categories from Wikipedia categories and then using WordNet synsets to establish the hierarchy
[Figure: taxonomy levels 0, 1, 2-5, 6-10, 11-19]
|Level 1| = 61 types
|Level 2 ∪ ... ∪ Level 5| = 80,384 types
|Level 6 ∪ ... ∪ Level 10| = 461,843 types
|Level 11 ∪ ... ∪ Level 19| = 26,383 types
Type representations
How to represent the hierarchical information?
[Figure: a type hierarchy (types t0 to t12) with an entity e, highlighting three alternative representations]
Type(s) along path to top
Top-level type(s)
Most specific type(s)
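The three representations can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the talk: it assumes a tree-shaped taxonomy given as a child-to-parent map, assumes the root itself is never counted as a type, and uses a made-up toy hierarchy (`parent`, `ROOT`, and the type names are hypothetical).

```python
from typing import Dict, Set

# Hypothetical toy taxonomy: child -> parent, with "t0" as the root.
ROOT = "t0"
parent: Dict[str, str] = {
    "t1": "t0", "t2": "t0",   # top-level types
    "t3": "t2", "t4": "t3",   # a chain t4 -> t3 -> t2 -> t0
    "t5": "t1",               # a second branch t5 -> t1 -> t0
}

def path_to_top(assigned: Set[str], parent: Dict[str, str], root: str) -> Set[str]:
    """Assigned types plus all their ancestors, excluding the root (assumption)."""
    path: Set[str] = set()
    for t in assigned:
        # Walk upwards until the root or an already visited type is reached.
        while t != root and t not in path:
            path.add(t)
            t = parent[t]
    return path

def top_level(assigned: Set[str], parent: Dict[str, str], root: str) -> Set[str]:
    """Only the ancestors that sit directly below the root."""
    return {t for t in path_to_top(assigned, parent, root) if parent[t] == root}

def most_specific(assigned: Set[str], parent: Dict[str, str], root: str) -> Set[str]:
    """Types on the path that are not the parent of another type on the path."""
    path = path_to_top(assigned, parent, root)
    return path - {parent[t] for t in path}

entity_types = {"t4", "t5"}
print(sorted(path_to_top(entity_types, parent, ROOT)))
print(sorted(top_level(entity_types, parent, ROOT)))
print(sorted(most_specific(entity_types, parent, ROOT)))
```

For a graph-shaped taxonomy such as Wikipedia Categories, the single-parent map would have to become a set of parents per type, which is one reason normalisation matters before any of these representations can be computed.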
Retrieval models
Retrieval task defined in a generative probabilistic framework: P(q | e)
[Figure: a query with its target types and an entity with its entity types, compared through term-based similarity and type-based similarity; example labels: Olympic games, Rio de Janeiro]
Both query and entity are considered in the term space as well as in the type space
(Strict) Filtering model
$$P(q \mid e) = P(\theta_q^{T} \mid \theta_e^{T}) \cdot \chi\left[\mathrm{types}(q) \cap \mathrm{types}(e) \neq \emptyset\right]$$
[Figure: the sets Types(q) and Types(e)]
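A minimal sketch of strict filtering, assuming the term-based score has already been computed by some text retrieval model. The function name, the parameter names and the example type labels are hypothetical; only the rule "keep the term-based score if the two type sets share at least one type, otherwise zero" is taken from the slide.

```python
from typing import Iterable

def strict_filter_score(term_score: float,
                        query_types: Iterable[str],
                        entity_types: Iterable[str]) -> float:
    """Term-based score multiplied by a 0/1 indicator of type overlap."""
    overlap = set(query_types) & set(entity_types)
    indicator = 1.0 if overlap else 0.0
    return term_score * indicator

# Hypothetical example: the entity shares the type "City" with the query.
print(strict_filter_score(0.5, {"City", "Place"}, {"City"}))
# No shared type: the entity is filtered out regardless of its term score.
print(strict_filter_score(0.5, {"City", "Place"}, {"Person"}))
```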
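(Soft) Filtering model
$$P(q \mid e) = P(\theta_q^{T} \mid \theta_e^{T}) \cdot P(\theta_q^{\mathcal{T}} \mid \theta_e^{\mathcal{T}})$$
(term-based similarity multiplied by type-based similarity)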
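Interpolation model
$$P(q \mid e) = (1 - \lambda) \cdot P(\theta_q^{T} \mid \theta_e^{T}) + \lambda \cdot P(\theta_q^{\mathcal{T}} \mid \theta_e^{\mathcal{T}})$$
(term-based and type-based similarities mixed with weight λ)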
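The practical difference between soft filtering and interpolation can be seen with a small sketch. It assumes both component scores are already available as numbers in [0, 1]; how the type-based similarity is estimated is not stated on the slides, so it is simply passed in. Function and parameter names are hypothetical.

```python
def soft_filter_score(term_score: float, type_score: float) -> float:
    """Soft filtering: product of term-based and type-based similarity."""
    return term_score * type_score

def interpolation_score(term_score: float, type_score: float, lam: float) -> float:
    """Interpolation: linear mixture of the two similarities with weight lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * term_score + lam * type_score

# An entity with a good term match but no type evidence at all:
term, typ = 0.5, 0.0
print(soft_filter_score(term, typ))          # the product collapses to zero
print(interpolation_score(term, typ, 0.25))  # the term evidence survives
```

With a type-based score of zero the product vanishes, while the mixture keeps a share of the term-based score; this is consistent with the later observation that interpolation is more robust to missing or noisy type information.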
What did we do?
Lessons learned
Query target type detection
Future work draft
What did we do?
We systematically identified and compared all combinations of those dimensions
4 type taxonomies: DBpedia Ontology (3.9), Freebase Types (2015-03-31), Wikipedia Categories (for DBpedia 3.9) and YAGO Taxonomy (3.0.2)
3 type representations: path-to-top, top-level, most specific
3 models: strict and soft filtering, interpolation
Environment: from idealized to realistic
query types oracle
entities fully typed in all the taxonomies
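As a quick illustration of the size of this experimental grid, the combinations of the three dimensions can be enumerated directly. The string labels below are shorthand chosen for this sketch, not identifiers from the experiments.

```python
from itertools import product

taxonomies = ["DBpedia Ontology", "Freebase Types", "Wikipedia Categories", "YAGO Taxonomy"]
representations = ["path-to-top", "top-level", "most specific"]
models = ["strict filtering", "soft filtering", "interpolation"]

# Every (taxonomy, representation, model) triple is one configuration.
configurations = list(product(taxonomies, representations, models))
print(len(configurations))  # 4 * 3 * 3 = 36
print(configurations[0])
```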
What did we do? Results
Lessons learned
Summary of insights:
How to represent hierarchical entity type information? (RQ1) Using the most specific types appears to be the best way
What (kind of) type taxonomies to use? (RQ2) Wikipedia, in combination with most specific types, performs the best in both the idealized and the more realistic scenarios
What combination model to choose? (RQ3) The interpolation model appears to be more robust
Further analysis: strict filtering vs interpolation models
Strict filtering treats target types as a set
Interpolation operates with a probability distribution over types
When we drop from the oracle every type assigned to fewer than 3 entities, interpolation adapts considerably better
[Figure: two panels, each comparing DBpedia, Freebase, Wikipedia and YAGO with most-specific types]
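The effect of pruning the oracle on a set-based filter can be sketched as follows. All names, type labels and counts are invented for illustration, and how "assigned to N entities" is counted in the experiments is an assumption left to the caller through the `entity_count` map.

```python
from typing import Dict, Set

def prune_rare_types(target_types: Set[str],
                     entity_count: Dict[str, int],
                     min_entities: int = 3) -> Set[str]:
    """Keep only target types assigned to at least min_entities entities."""
    return {t for t in target_types if entity_count.get(t, 0) >= min_entities}

def passes_strict_filter(target_types: Set[str], entity_types: Set[str]) -> bool:
    """Strict filtering only asks whether the two sets intersect."""
    return bool(target_types & entity_types)

# Hypothetical oracle types for one query, with hypothetical assignment counts.
oracle_types = {"type_a", "type_b", "type_c"}
entity_count = {"type_a": 120, "type_b": 2, "type_c": 40}

# A relevant entity whose only shared type is the rare one.
entity_types = {"type_b", "type_x"}

pruned = prune_rare_types(oracle_types, entity_count)
print(sorted(pruned))
print(passes_strict_filter(oracle_types, entity_types))  # passes with the full oracle
print(passes_strict_filter(pruned, entity_types))        # filtered out after pruning
```

Under strict filtering such an entity drops to a score of zero once its only shared type is pruned, whereas a mixture of term-based and type-based scores would lose only the type-based part.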
Further analysis: query-level ranking details
E.g. performance for (Interpolation, Most specific level, Wikipedia-3.9)
query = “Which books by Kerouac were published by Viking Press?”
Types: 90 (including Viking Press books)
Types of the relevant entities that were hurt: all contain Viking Press books
E.g. performance for (Interpolation, Most specific level, Wikipedia-3.9)
query = “Give me all actors starring in Batman Begins”
All 7 relevant entities are improved
Query target type detection
Automatic query target type detection
Baselines
Entity-centric: first rank entities based on their relevance to the query, then look at what types the top-k ranked entities have
Type-centric: build a direct term-based representation for each type by aggregating descriptions of entities of that type
Learning-to-rank with several features
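The two baselines can be sketched as below. This is an illustration under stated assumptions, not the configuration used in the experiments: the entity-centric aggregation (each top-k entity spreads its retrieval score uniformly over its types) is one common choice picked for this sketch, the type-centric part only builds the per-type pseudo-documents and leaves their ranking to any text retrieval model, and all entity names, types and scores are made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

def entity_centric_type_scores(ranked_entities: Sequence[Tuple[str, float]],
                               entity_types: Dict[str, List[str]],
                               k: int = 10) -> List[Tuple[str, float]]:
    """Score types by aggregating the scores of the top-k ranked entities."""
    scores: Dict[str, float] = defaultdict(float)
    for entity, score in ranked_entities[:k]:
        types = entity_types.get(entity, [])
        for t in types:
            # Assumption: an entity's score is split evenly among its types.
            scores[t] += score / len(types)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

def type_centric_documents(entity_descriptions: Dict[str, str],
                           entity_types: Dict[str, List[str]]) -> Dict[str, str]:
    """Build one pseudo-document per type from its entities' descriptions."""
    parts: Dict[str, List[str]] = defaultdict(list)
    for entity, text in entity_descriptions.items():
        for t in entity_types.get(entity, []):
            parts[t].append(text)
    return {t: " ".join(texts) for t, texts in parts.items()}

# Hypothetical toy data.
ranking = [("e1", 0.5), ("e2", 0.25), ("e3", 0.25)]
types_of = {"e1": ["Actor", "Person"], "e2": ["Actor"], "e3": ["Film"]}
descriptions = {"e1": "british actor", "e2": "american actor", "e3": "superhero film"}

print(entity_centric_type_scores(ranking, types_of, k=2))
print(type_centric_documents(descriptions, types_of)["Actor"])
```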
Future work draft
Automatic query target type detection must be further analysed: experiments revisited with additional features and an expanded set of candidate types
Query classification, to decide whether retrieval for a query is likely to be improved by a type-aware approach
Its performance by itself, and its impact on the full system