Paolo Pareti
University of Edinburgh
ACM Web Science Conference 29/6/2015
The Semantic Richness
of Linked Data Concepts
Vocabulary Reuse Damages Semantics!
The Problem
A Linked Data Scalability Challenge: Frequently Reused Concepts Lose their Meaning
What does class membership tell us?
:x is a Cat
Semantic Richness
The more facts we can infer about :x,
knowing that :x is a Cat,
the more Semantically Rich the concept Cat is.
Does it have a tail?
Is it a mammal?
Semantic Richness is NOT Specificity / Information Content
For example, this might have been the set of entities
in the original definition of the concept Cat.
However, after some time,
people started using the term Cat in a more generic way.
Some entities were defined as Cats,
despite not being animals.
Even t-shirts could be defined as Cats.
And why not, maybe even some trees...
So what do you actually know about :x,
if on the Web anything can be a Cat?
:x is a Cat
A Linked Data Challenge
The more a concept gets reused…
… the less Semantically Rich it becomes.
Frequently reused concepts lose their meaning.
http://www.w3.org/2002/07/owl#sameAs
This problem already affects highly reused concepts,
such as owl:sameAs *
* H. Halpin, P. J. Hayes, J. P. McCusker, D. L. McGuinness, and H. S. Thompson. When owl:sameAs Isn’t the Same: An
Analysis of Identity in Linked Data. In The Semantic Web - ISWC 2010, volume 6496 of Lecture Notes in Computer Science,
pages 305–320. Springer Berlin Heidelberg, 2010.
[Diagram: http://dbpedia.org/resource/Edinburgh linked to two other resources via owl:sameAs]
Originally designed to represent strict equality,
owl:sameAs is often (mis)used to represent weaker relations.
In this example, the usage of owl:sameAs is incorrect,
as Edinburgh, a picture of Edinburgh
and the location of Edinburgh are three different things.
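To illustrate why this matters, here is a minimal sketch of what happens when a consumer treats owl:sameAs as strict equality and merges the linked resources. The picture and location URIs, the `ex:` properties, and the `merge_same_as` helper are all hypothetical, chosen only for illustration; they do not come from the slides or from DBpedia.

```python
from collections import defaultdict

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
EDINBURGH = "http://dbpedia.org/resource/Edinburgh"
# Hypothetical identifiers and properties, used only for this illustration.
PICTURE = "http://example.org/picture/edinburgh.jpg"
LOCATION = "http://example.org/location/edinburgh"

triples = [
    (EDINBURGH, SAME_AS, PICTURE),
    (EDINBURGH, SAME_AS, LOCATION),
    (EDINBURGH, "ex:type", "City"),
    (PICTURE, "ex:format", "JPEG"),
    (LOCATION, "ex:type", "GeographicPoint"),
]


def merge_same_as(triples):
    """Group facts by owl:sameAs equivalence class (simple union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    # Strict-equality reading: every sameAs link merges two classes.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Attach every remaining fact to the representative of its class.
    facts = defaultdict(set)
    for s, p, o in triples:
        if p != SAME_AS:
            facts[find(s)].add((p, o))
    return dict(facts)


merged = merge_same_as(triples)
for representative, facts in merged.items():
    print(representative, sorted(facts))
```

Under the strict reading, the city, its picture, and its location collapse into a single resource that is at once a City, a GeographicPoint, and a JPEG, which is the kind of conclusion the weaker intended relation never licensed.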
A Simple Measure of Semantic Richness
We define a measure based on:
● the number of common patterns,
● and their frequency.
For example:
if X is a cat, what can we say about X?
● X is a mammal (frequency: 1.00)
● X has a tail (frequency: 0.99)
● ...
Intuitively:
● The more patterns, and the more frequent they are,
the more semantically rich the concept is.
Measure motivated by:
● Number of Features theory
● Inductive Learning
Main advantage:
● Can be automatically and efficiently computed over
large datasets.
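The measure described above can be sketched roughly as follows. The slides do not give the exact formula, so this sketch assumes one plausible reading: a pattern counts as "common" when its frequency among the members of a concept reaches a threshold, and richness is the sum of the frequencies of the common patterns. The function names, the `min_frequency` parameter, and the toy data are assumptions made for illustration.

```python
from collections import Counter


def pattern_frequencies(members):
    """Map each pattern to the fraction of concept members that exhibit it."""
    total = len(members)
    if total == 0:
        return {}
    counts = Counter(p for patterns in members for p in set(patterns))
    return {pattern: count / total for pattern, count in counts.items()}


def semantic_richness(members, min_frequency=0.5):
    """Assumed measure: sum of the frequencies of the common patterns."""
    freqs = pattern_frequencies(members)
    return sum(f for f in freqs.values() if f >= min_frequency)


# Toy data: each member of the concept Cat is described by a set of patterns.
cats = [
    {"is a mammal", "has a tail"},
    {"is a mammal", "has a tail"},
    {"is a mammal", "has a tail"},
    {"is a mammal"},
]

print(pattern_frequencies(cats))
print(semantic_richness(cats))
```

Both steps are a single pass over the members, which is consistent with the claim that such a measure can be computed automatically over large datasets.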
DBpedia Ontology
The DBpedia ontology tree, plotted according to the Semantic Richness
of its concepts (each line represents a subclass relation). As we would
expect, Semantic Richness is highly correlated with specificity.
Loss of Semantic Richness in foaf:Person
How quickly does Semantic Richness decrease when reusing
a concept? We looked at the concept of foaf:Person as defined in ten
different datasets.
As we add external entities of type foaf:Person into a dataset, the
Semantic Richness of this concept quickly decreases.
In particular, it falls below the average Semantic Richness
of the original datasets (dotted line).
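The effect can be reproduced on toy data. The sketch below is not the experiment from the slides: it assumes a simple richness score (the sum of the frequencies of patterns shared by at least half of the members) and uses made-up patterns for the original and external foaf:Person entities.

```python
from collections import Counter


def richness(members, min_frequency=0.5):
    """Assumed score: sum of frequencies of patterns held by enough members."""
    total = len(members)
    counts = Counter(p for patterns in members for p in set(patterns))
    freqs = (count / total for count in counts.values())
    return sum(f for f in freqs if f >= min_frequency)


# Hypothetical original dataset: every foaf:Person is described the same way.
person_members = [{"has name", "has email"} for _ in range(4)]

# Hypothetical external entities typed foaf:Person but described differently.
external = [
    {"has nickname"},
    {"has homepage"},
    {"has title"},
    {"has account"},
    {"has label"},
]

scores = [richness(person_members)]
for entity in external:
    person_members.append(entity)
    scores.append(richness(person_members))

for added, score in enumerate(scores):
    print(f"{added} external entities added: richness = {score:.3f}")
```

Each external entity dilutes the patterns that held for every original member, so the score drops at every step; once the original patterns hold for fewer than half of the members, nothing can be inferred from class membership under this score.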
The Challenge
How can concepts be openly reused on the Web,
while at the same time remaining semantically rich?
The end,
any questions?
