Generating Compact and Relaxable Answers to
Keyword Queries over Knowledge Graphs
Gong Cheng¹, Shuxin Li¹, Ke Zhang¹, Chengkai Li²
¹ State Key Laboratory for Novel Software Technology, Nanjing University, China
² Department of Computer Science and Engineering, University of Texas at Arlington, United States
ISWC 2020
Keyword Search over Knowledge Graphs
[Figure: example knowledge graph G. Vertices: United States (US), United Kingdom (UK), Montana (MT), Yellowstone National Park (YSNP), Yellowstone (YS), The Trip (TT), BBC, London, Ohio, G7. Edge labels: isLocatedIn, country, headquarters, producedBy, member.]
Q: united states yellowstone park trip
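As a minimal sketch of how the query Q maps onto G, the snippet below matches each keyword against the vertex labels shown in the figure. The label table is transcribed from the slide; the function name `keyword_hits` and the whole-word, case-insensitive matching rule are assumptions made for illustration, not necessarily the matching used by the authors.

```python
# Vertex labels transcribed from the example figure (abbreviations as on the slide).
LABELS = {
    "US": "United States",
    "UK": "United Kingdom",
    "MT": "Montana",
    "YSNP": "Yellowstone National Park",
    "YS": "Yellowstone",
    "TT": "The Trip",
    "BBC": "BBC",
    "London": "London",
    "Ohio": "Ohio",
    "G7": "G7",
}


def keyword_hits(query, labels):
    """Map each query keyword to the set of vertices whose label contains it.

    Assumption: whole-word, case-insensitive matching on labels only.
    """
    hits = {}
    for keyword in query.lower().split():
        hits[keyword] = {
            vertex
            for vertex, label in labels.items()
            if keyword in label.lower().split()
        }
    return hits


hits = keyword_hits("united states yellowstone park trip", LABELS)
for keyword, vertices in hits.items():
    print(keyword, sorted(vertices))
```

Under this matching rule, "united" hits both US and UK, and "yellowstone" hits both YSNP and YS, so an answer can choose which matching vertex to include for each keyword.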
• Two paradigms
  • For lookup tasks: semantic parsing (keyword query → SPARQL query)
  • For exploratory tasks: answer subgraph extraction (keyword query → Group Steiner Tree, GST)

Motivation --- Pros and Cons of GST
• Answer completeness?
  • Covering all the query keywords
• Answer compactness?
  • Having a compact structure (e.g., a small diameter)
[Figure: a Group Steiner Tree (GST) answer to Q in G, connecting US, UK, MT, YSNP, BBC, TT, London, and G7 via isLocatedIn, country, member, headquarters, and producedBy edges. Diameter: 7; uncovered keywords: 0.]

Main Idea
• Computing compact but relaxable subgraphs
  • Guaranteed answer compactness: having a bounded diameter (D)
  • Maximized answer completeness: covering the largest number of query keywords
[Figure: the GST answer (diameter: 7; uncovered keywords: 0) contrasted with a Minimally Relaxed Answer (MRA) under D=2 (diameter: 2; uncovered keywords: 1).]
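The two numbers reported for each answer (diameter and uncovered keywords) can be recomputed with a short sketch. The edge lists below are read off the figure and treated as undirected, which is an assumption: the slide text gives the vertices and edge labels but not the endpoints of each edge. The keyword matches and the helper names `diameter` and `uncovered` are likewise illustrative.

```python
from collections import deque

# Keyword-to-vertex matches for Q (assumed: whole-word matches on vertex labels).
HITS = {
    "united": {"US", "UK"},
    "states": {"US"},
    "yellowstone": {"YSNP", "YS"},
    "park": {"YSNP"},
    "trip": {"TT"},
}

# Edges of the two answers as reconstructed from the figure (undirected).
GST_EDGES = [("YSNP", "MT"), ("MT", "US"), ("US", "G7"), ("G7", "UK"),
             ("UK", "London"), ("London", "BBC"), ("BBC", "TT")]
MRA_EDGES = [("YSNP", "MT"), ("MT", "US")]


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def eccentricity(adj, source):
    """Largest hop distance from source to any vertex (plain BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def diameter(edges):
    """Diameter of a connected answer: the largest eccentricity."""
    adj = adjacency(edges)
    return max(eccentricity(adj, v) for v in adj)


def uncovered(edges, hits):
    """Keywords that match no vertex of the answer."""
    vertices = set(adjacency(edges))
    return {kw for kw, matched in hits.items() if not (matched & vertices)}


for name, edges in [("GST", GST_EDGES), ("MRA", MRA_EDGES)]:
    print(name, diameter(edges), sorted(uncovered(edges, HITS)))
```

With these reconstructed edges the sketch reproduces the figures on the slide: the GST is complete but has diameter 7, while the MRA stays within D=2 at the cost of leaving "trip" uncovered.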

Approach --- Theoretical Foundations
• A necessary and sufficient condition for the existence of a compactness-bounded complete answer to a keyword query
[Figure: subgraph of G in which YSNP, MT, and US are joined by isLocatedIn and country edges. Formula: the condition on a vertex v, shown on the slide but not recoverable from its text.]
We refer to v as a certificate vertex for Q.
• E.g., Montana for "united states yellowstone park" under D=2
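The Montana example can be checked with a small sketch. Since the slide's formal condition is not available here, the code uses a simplified test that holds for even D only: v qualifies if every keyword has a matching vertex within D/2 hops of v, because the shortest paths from v to those vertices then form a tree of diameter at most D. The odd-D case needs a different condition and is not covered. The edge list is reconstructed from the figure and treated as undirected; it, the keyword matches, and the name `is_certificate` are assumptions.

```python
from collections import deque

# Edges of G as reconstructed from the figure labels (assumption), undirected.
EDGES = [("YSNP", "MT"), ("MT", "US"), ("Ohio", "US"), ("US", "G7"),
         ("UK", "G7"), ("London", "UK"), ("BBC", "London"),
         ("TT", "BBC"), ("YS", "BBC")]

# Assumed keyword matches for the query "united states yellowstone park".
HITS = {
    "united": {"US", "UK"},
    "states": {"US"},
    "yellowstone": {"YSNP", "YS"},
    "park": {"YSNP"},
}


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def distances_from(adj, source):
    """Hop distances from source to every reachable vertex (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_certificate(adj, hits, v, D):
    """Even-D test: every keyword has a matching vertex within D/2 hops of v."""
    if D % 2 != 0:
        raise ValueError("this sketch handles even D only")
    dist = distances_from(adj, v)
    radius = D // 2
    return all(
        any(dist.get(u, radius + 1) <= radius for u in matched)
        for matched in hits.values()
    )


adj = adjacency(EDGES)
print("MT:", is_certificate(adj, HITS, "MT", 2))
print("US:", is_certificate(adj, HITS, "US", 2))
```

In the reconstructed graph, Montana is one hop from both US and YSNP and so passes the test under D=2, whereas US fails because YSNP is two hops away.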

Approach --- Algorithm CORE
• A best-first search algorithm
[Figure: pseudocode of CORE, annotated with the callouts below.]
  • one independent search starting from each keyword vertex
  • a shared priority queue keeping search frontiers (priority: potentially uncovered keywords, based on distances)
  • a more complete answer which the current vertex is a certificate vertex for
  • early stop
  • unvisited neighbors
• Running example
[Figure: step-by-step run of CORE on G for Q: united states yellowstone park trip.]
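To make the goal of the search concrete, here is an exhaustive stand-in written for illustration. It is not the CORE algorithm: it has no shared priority queue and no early stop, and it handles even D only. It tries every vertex as a candidate center, runs a BFS bounded to D/2 hops, and keeps the vertex that reaches matches for the most keywords; the answer is the union of shortest paths from that vertex, so its diameter is at most D. The edge list is reconstructed from the figure, and all names are hypothetical.

```python
from collections import deque

# Edges of G as reconstructed from the figure labels (assumption), undirected.
EDGES = [("YSNP", "MT"), ("MT", "US"), ("Ohio", "US"), ("US", "G7"),
         ("UK", "G7"), ("London", "UK"), ("BBC", "London"),
         ("TT", "BBC"), ("YS", "BBC")]

# Assumed keyword matches for Q: united states yellowstone park trip.
HITS = {
    "united": {"US", "UK"},
    "states": {"US"},
    "yellowstone": {"YSNP", "YS"},
    "park": {"YSNP"},
    "trip": {"TT"},
}


def adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bounded_bfs(adj, root, radius):
    """BFS from root limited to `radius` hops; returns distances and parents."""
    dist, parent = {root: 0}, {}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the radius
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w], parent[w] = dist[u] + 1, u
                queue.append(w)
    return dist, parent


def best_relaxed_answer(adj, hits, D):
    """Exhaustively pick the center covering the most keywords (even D only)."""
    if D % 2 != 0:
        raise ValueError("this sketch handles even D only")
    best = None
    for v in sorted(adj):
        dist, parent = bounded_bfs(adj, v, D // 2)
        # For each keyword, choose the nearest matching vertex in range, if any.
        chosen = {}
        for kw, matched in hits.items():
            in_range = [u for u in matched if u in dist]
            if in_range:
                chosen[kw] = min(in_range, key=lambda u: (dist[u], u))
        if best is None or len(chosen) > len(best[1]):
            best = (v, chosen, parent)
    center, chosen, parent = best
    # Collect the edges on the shortest paths from the center to chosen vertices.
    tree = set()
    for u in chosen.values():
        while u != center:
            tree.add(tuple(sorted((u, parent[u]))))
            u = parent[u]
    return center, set(chosen), tree


adj = adjacency(EDGES)
center, covered, tree = best_relaxed_answer(adj, HITS, 2)
print(center, sorted(set(HITS) - covered), sorted(tree))
```

On the reconstructed graph with D=2 this selects Montana as the center and leaves only "trip" uncovered, matching the MRA shown earlier in the deck. The callouts on the slide indicate that CORE reaches such an answer through prioritized frontier expansion with early stopping rather than a full BFS per vertex.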

Experiment Settings
• Datasets
• Baselines
  • CertQR+ (adapted to our problem)
  • GST-based answers

Results --- Compactness of GST-Based Answers
(doc = diameter; |K| = max keyword hits)
• Main finding
  • Trading off answer completeness for compactness is necessary.

Results --- Completeness of Relaxable Answers
(dor = number of uncovered keywords; |K| = max keyword hits)
• Main finding
  • The completeness of our computed answers is very high.

Results --- Efficiency of CORE
• Main finding
  • CORE is efficient and significantly outperforms CertQR+.

Conclusion
• Take-home messages
  • Necessity of trading off answer completeness for compactness
  • Polynomial-time algorithm for generating compact but relaxable answers
  • https://github.com/nju-websoft/CORE
• Future work
  • Vertex and/or edge weights
