Using the Structure of DBpedia for Exploratory Search
Speaker: Samantha Lam
Supervisor: Conor Hayes
Motivating Work
DBpedia - heterogeneous graph
Motivating Work
Background
Network Similarity: PathSim, NetClus, RankClus
Faceted Search: Facets for refining search
specific schema, (semi) supervised
→ good for search when user is familiar with query
→ ...but what about complete beginners?
→ Requires Exploratory Search – Unsupervised
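For context on the network-similarity work listed above: PathSim scores two nodes x and y of the same type under a symmetric meta-path as 2 × (paths from x to y) / (paths from x to x + paths from y to y). The sketch below is a minimal illustration under stated assumptions, not the method used in this talk; the artist-genre toy graph and the helper names `path_counts` and `pathsim` are hypothetical.

```python
from collections import defaultdict

def path_counts(adj_ab):
    """Count A-B-A meta-path instances between every pair of A-nodes.

    adj_ab maps each A-node to the list of B-nodes it links to.
    """
    b_to_a = defaultdict(list)
    for a, bs in adj_ab.items():
        for b in bs:
            b_to_a[b].append(a)
    counts = defaultdict(int)
    for members in b_to_a.values():
        # Every ordered pair of A-nodes sharing this B-node is one path instance.
        for x in members:
            for y in members:
                counts[(x, y)] += 1
    return counts

def pathsim(counts, x, y):
    """PathSim: 2 * |x~y| / (|x~x| + |y~y|), or 0.0 if neither node has paths."""
    denom = counts[(x, x)] + counts[(y, y)]
    return 2 * counts[(x, y)] / denom if denom else 0.0

# Hypothetical toy data (not from the talk): artists linked to genres.
artist_genre = {
    "artist_a": ["folk", "indie"],
    "artist_b": ["folk", "indie"],
    "artist_c": ["folk"],
    "artist_d": ["jazz"],
}
counts = path_counts(artist_genre)
print(pathsim(counts, "artist_a", "artist_b"))  # identical genre profiles
print(pathsim(counts, "artist_a", "artist_c"))  # partial overlap
print(pathsim(counts, "artist_a", "artist_d"))  # no shared genre
```

Note that the meta-path (here artist-genre-artist) has to be chosen in advance, which is one way such measures depend on a specific schema.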
Exploratory Search?
Given a query, how can the results be organised in a ‘useful’ manner,
i.e. one that aids exploratory search?
E.g. suppose you hear a song on the radio...
Solution:
Classify results according to their contexts
Why? Reduces the need for in-depth reading and guides the user
Assumption
similarity ⊂ relatedness
Research Questions
1 Can we provide an effective graph-based framework that can
aid exploratory search?
2 To do this, what are DBpedia’s graph structures with respect to its
different datasets?
DBpedia graphs summary
Infobox properties
emergent, crowd-sourced
heterogeneous ‘types’
dense
Infobox ontology, SKOS/Wiki Category, YAGO
agreed rules
is-A structure
sparse, tree-like
Infobox good for → Relatedness
Ontology good for → Labelling similar items
Research Q1 Proposition
General Framework:
Sample Query & Results
Query: Lisa Hannigan
Two methods Weighted (W) and Uniform (U), 6 clusters
Cluster 1 (W, U) instruments
Top label: (W, U) Musical instruments
Cluster 2 (W) songs (U) album and songs
Top label: (W) Songs by artist (U) Albums by artist
Cluster 3 (W) albums (U) album, music genres and songs
Top label: (W) Albums by artist (U) Music subgenres by genre
Sample Query & Results
Query: Lisa Hannigan
Cluster 4 (W) mixed, (U) mixed
Top label: (W) Songs by artist (U) Missing people
Cluster 5 (W) mixed, (U) mixed
Top label: (W) Albums by artist (U) Towns and villages in the Republic of Ireland by county
Cluster 6 (W) musicians and bands, (U) musicians and bands
Top label: (W, U) Place of birth missing (living people)
Sample Query & Results
Summary:
Weighted produced 4 coherent clusters out of 6, whereas
Uniform (unweighted) produced only 2.
DBpedia Ontology labelling (see paper) provided broader
labelling for messier clusters, e.g. top label was MusicalWork
for mixed clusters
→ Categories better for more specific clusters.
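To make the idea of a cluster's "top label" concrete, here is a minimal sketch that picks the category shared by the most items in a cluster. The slides do not say how the top label is actually chosen, so this frequency count is an assumption for illustration only; the function `top_label` and the toy items are hypothetical.

```python
from collections import Counter

def top_label(cluster, categories_of):
    """Return the category attached to the most items in the cluster.

    Assumption: "top label" means most frequent category; the talk's
    actual ranking may differ. Returns None if no item has a category.
    """
    counts = Counter(
        cat for item in cluster for cat in categories_of.get(item, [])
    )
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical toy data (not from the talk).
categories_of = {
    "song_1": ["Songs by artist"],
    "song_2": ["Songs by artist", "2011 songs"],
    "album_1": ["Albums by artist"],
}
print(top_label(["song_1", "song_2", "album_1"], categories_of))
```

With a broader vocabulary such as ontology classes in place of categories, the same count would tend to surface a more general label for a mixed cluster.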
Ongoing Challenges
Evaluation
User Study:
- compare only Weighted versus Uniform results,
or also different labelling methods?
Comparison:
- possible to compare against other faceted methods?
- compare with plain list for recall?
Summary
Investigated graph structure of DBpedia datasets
Framework to utilise this finding in exploratory search, gave
example results
Ongoing challenge: evaluation
Thanks for listening! Questions welcome!
