Knowledge Graph Maintenance
Prof. Paul Groth | @pgroth | pgroth.com | indelab.org

Thanks to Daniel Daza, Thiviyan Thanapalsingam and Frank van Harmelen

Knowledge Graph Conference 2020
Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure
Written by Nadia Eghbal
Source: https://www.fordfoundation.org/work/learning/research-reports/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure/
Source: Azzaoui, K., Jacoby, E., Senger, S., Rodríguez, E. C., Loza, M., Zdrazil, B., … Ecker, G. F. (2013). Scientific competency questions as the basis for semantically enriched open pharmacological space development. Drug Discovery Today, 18(17–18), 843–852. https://doi.org/10.1016/j.drudis.2013.05.008
Source: https://www.biocuration2019.org/about
Source: https://www.wired.com/story/inside-the-alexa-friendly-world-of-wikidata/
Source: https://stats.wikimedia.org/v2/#/en.wikipedia.org/contributing/user-edits/normal||2001-01-01~2019-09-01|~total|
Crowdsourcing
100,000s of hand-annotated examples
The TAC Relation Extraction Dataset
Sources:

Zhang, Yuhao, et al. "Position-aware attention and supervised data improve slot filling." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.

Fort, Karen, Gilles Adda, and Kevin Bretonnel Cohen. "Amazon Mechanical Turk: Gold Mine or Coal Mine?" Computational Linguistics, MIT Press, 2011, pp. 413-420. doi:10.1162/COLI_a_00057
Data Work == People Work
[Figure: a KOS containing Concept1, Concept2 and Concept3, with labels: Professional Curators, Literature, Software, Non-professional contributors]
1. dealing with changing cultural and societal norms, specifically to address or correct bias
2. political influence
3. new concepts and terminology arising from discoveries or change in perspective within a technical/scientific community
4. gardening
5. incremental contributorship
6. progressive formalization
7. software and automation
8. integration of large numbers of data sources
9. variance in algorithm training data
[Figure labels: Data; Society & Politics. Groups of the numbered sources above: (4, 5, 6), (7, 8, 9), (3), (1, 2)]
Source: Michael Lauruhn and Paul Groth. "Sources of Change for Modern Knowledge Organization Systems." Knowledge Organization 43, no. 8 (2016).
Apply ML
[Figure: universal schema pipeline. Labels shown: Content; Open Information Extraction; Triple Extraction; Entity Resolution; Concept Resolution; Taxonomy; Surface form relations; Structured relations; Universal schema; Matrix Construction; Matrix Factorization; Factorization model; Matrix Completion; Predicted relations; Curation; Knowledge graph]
[Figure annotations: 14M SD articles; 475M triples; 3.3 million relations; 49M relations; ~15k -> 1M entries]
Paul Groth, Sujit Pal, Darin McBeath, Brad Allen, Ron Daniel
“Applying Universal Schemas for Domain Specific Ontology Expansion”
5th Workshop on Automated Knowledge Base Construction (AKBC) 2016
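
The matrix construction, factorization and completion steps named on this slide can be illustrated with a toy example. What follows is only a minimal sketch under stated assumptions: the entity pairs, relation names and threshold are hypothetical, and a plain truncated SVD is swapped in for the factorization model, so this is not the method of the cited paper. It shows how cells that were never observed can still receive a score, and how the high-scoring ones could be handed to curators.

```python
import numpy as np

# Rows are entity pairs, columns are relations. All names are hypothetical.
pairs = [
    ("aspirin", "headache"),
    ("ibuprofen", "fever"),
    ("paracetamol", "pain"),
    ("amsterdam", "netherlands"),
]
relations = [
    "surface: is used against",  # surface form relation extracted from text
    "structured: treats",        # structured relation from a taxonomy
    "surface: is located in",    # surface form relation extracted from text
]

# Matrix construction: 1 means the pair was observed with that relation.
observed = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],  # "treats" was not observed for this pair
    [0, 0, 1],
], dtype=float)


def complete_matrix(matrix, rank):
    """Low-rank reconstruction via truncated SVD (a stand-in factorization)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def curation_candidates(matrix, scores, threshold):
    """Unobserved cells scoring above the threshold, best first."""
    rows, cols = np.where((matrix == 0) & (scores > threshold))
    cells = list(zip(rows.tolist(), cols.tolist()))
    return sorted(cells, key=lambda cell: -scores[cell])


scores = complete_matrix(observed, rank=2)
candidates = curation_candidates(observed, scores, threshold=0.3)
for row, col in candidates:
    print(pairs[row], relations[col], round(float(scores[row, col]), 3))
```

In this toy matrix the only candidate is the "treats" cell for the third pair, because that pair shares a surface form relation with pairs for which "treats" was observed. The predictions are suggestions for a curator to accept or reject, not facts to be added automatically.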
Link Prediction & KG Curation
Link Prediction
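
The slides name link prediction without giving a formulation. One common formulation scores a candidate triple (head, relation, tail) by how closely head + relation lands on tail in an embedding space (a TransE-style score). The sketch below uses hand-set, hypothetical two-dimensional embeddings rather than trained ones, and is not claimed to be the model used in the talk.

```python
import numpy as np

# Hypothetical, hand-set embeddings; a real system would learn these.
entities = {
    "aspirin": np.array([0.0, 0.0]),
    "headache": np.array([1.0, 0.0]),
    "amsterdam": np.array([0.0, 5.0]),
}
relations = {"treats": np.array([1.0, 0.1])}


def transe_score(head, relation, tail):
    """Higher is better: negative distance between head + relation and tail."""
    return -float(np.linalg.norm(head + relation - tail))


def rank_tails(head_name, relation_name):
    """Rank every other entity as a candidate tail for (head, relation, ?)."""
    head = entities[head_name]
    relation = relations[relation_name]
    scored = [
        (name, transe_score(head, relation, vector))
        for name, vector in entities.items()
        if name != head_name
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = rank_tails("aspirin", "treats")
for name, score in ranking:
    print(name, round(score, 3))
```

For curation, the top of such a ranking would typically be shown to a person as a suggested edge rather than written straight into the graph.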
Inductive Prediction
Future: Sub-graph Prediction
Future: Learning KG Pipelines End-to-End
Paul T. Groth, Antony Scerri, Ron Daniel, Bradley P. Allen: End-to-End Learning for Answering Structured Queries Directly over Text. DL4KG@ESWC 2019: 57-70
Data Work == People Work
Knowledge Engineering Revisited
• Knowledge graphs are built ad hoc
• 100s of components (extractors, scrapers, quality, scoring, user feedback, ...)
• Unique for each organization
• Existing knowledge engineering theory does not apply:
  • Assumes small scale
  • Assumes slow change
  • People-centric
  • Expressive representations
• An updated theory and methods for knowledge engineering, designed for the demands of modern knowledge graphs
knowledgescientist.org
Conclusion
• Knowledge graphs require maintenance
• Maintenance is frequently people work
• New ML-based methods & new human + machine workflows
• Interested? Happy to talk more
