1. Capturing emerging relations between schema ontologies on the Web of Data
  • Andriy Nikolov
  • Enrico Motta
2. Public linked data
  • “Linking” in the Linked Data cloud:
      • References to instance URIs described in external sources
      • Special case: identity links between equivalent resources
  • [Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/]
3. Motivation
  • Schema heterogeneity is an obstacle both for creating and for utilising these links
      • Extracting information on the same topic from different repositories
      • Discovering equivalence links between individuals
  • Motivation for our work: discovering instance-level links
      • How to choose the repositories to connect a new one to?
      • Which subsets of repositories contain co-referring instances?
  • [Figure: repositories LinkedMDB, DBPedia, Freebase and MusicBrainz, with instance subsets “TV programs”, “movies” and “pieces of music”, marked “?”]
4. Schema-level interlinks
  • [Figure: two layers, “Schema-level” (marked “?”) and “Data-level”]
5. Matching approaches
  • “Top-down”
      • Analyzing schema ontologies and generating alignments (manually or automatically)
      • UMBEL
          • Using CYC as a “backbone”
          • Mapping commonly used schema ontologies
  • “Bottom-up”
      • Inferring schema mappings based on instance-level information
6. Our approach
  • Constructing a large-scale network of schema mappings
      • Applying a light-weight instance-based matcher
  • Analysing the resulting network
      • What does it tell us about the use of ontologies?
7. Motivating factors
  • Potential use case scenarios
      • Discovering relevant sources for connection
      • Discovering relevant subsets of comparable instances
  • Tolerance to the quality of mappings
      • A mapping between “strongly overlapping” classes is still useful even if there is no strict equivalence/subsumption
8. Instance-based matching
  • Use of instance-based matching
      • Some implicit schema-level assumptions cannot be captured using only schema-level evidence
  • Interpretation mismatches
      • dbpedia:Actor = professional actor (film or stage)
      • movie:actor = anybody who participated in a movie
  • Class interpretation “as used” vs “as designed”
      • FOAF: foaf:Person = any person
      • DBLP: foaf:Person = computer scientist
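9. Instance set overlaps
  • Co-typing (within DBPedia)
      • dbpedia:Ennio_Morricone is_a dbpedia:Artist
      • dbpedia:Ennio_Morricone is_a yago:ItalianComposers
  • Declared association (across LinkedMDB, DBPedia and MusicBrainz)
      • movie:music_contributor/2490 is_a movie:music_contributor (LinkedMDB)
      • dbpedia:Ennio_Morricone is_a dbpedia:Artist (DBPedia)
      • music:artist/a16…9fdf is_a mo:MusicArtist (MusicBrainz)
      • movie:music_contributor/2490 = dbpedia:Ennio_Morricone = music:artist/a16…9fdf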
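The two kinds of instance set overlap on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' matcher: it assumes triples are given as (subject, predicate, object) tuples of prefixed names, that identity links use owl:sameAs, and the function names `cotyping_overlaps` and `association_overlaps` are hypothetical. The sample data reuses the Ennio Morricone example; the MusicBrainz instance is left out because its URI is elided on the slide.

```python
from collections import defaultdict
from itertools import combinations

RDF_TYPE = "rdf:type"
OWL_SAME_AS = "owl:sameAs"


def types_by_instance(triples):
    """Map each instance to the set of classes it is typed with."""
    types_of = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types_of[s].add(o)
    return types_of


def cotyping_overlaps(triples):
    """Co-typing: count instances that are typed with both classes of a pair."""
    overlap = defaultdict(int)
    for classes in types_by_instance(triples).values():
        for a, b in combinations(sorted(classes), 2):
            overlap[(a, b)] += 1
    return dict(overlap)


def association_overlaps(triples):
    """Declared association: count identity links between instances of two classes."""
    types_of = types_by_instance(triples)
    overlap = defaultdict(int)
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            continue
        for a in types_of.get(s, ()):
            for b in types_of.get(o, ()):
                if a != b:
                    overlap[tuple(sorted((a, b)))] += 1
    return dict(overlap)


# Illustrative data taken from the slide (MusicBrainz URI omitted, see above).
triples = [
    ("dbpedia:Ennio_Morricone", RDF_TYPE, "dbpedia:Artist"),
    ("dbpedia:Ennio_Morricone", RDF_TYPE, "yago:ItalianComposers"),
    ("movie:music_contributor/2490", RDF_TYPE, "movie:music_contributor"),
    ("movie:music_contributor/2490", OWL_SAME_AS, "dbpedia:Ennio_Morricone"),
]

print(cotyping_overlaps(triples))
print(association_overlaps(triples))
```

Note that in this sketch a single identity link contributes to every pair of classes its two endpoints are typed with, so the LinkedMDB class overlaps with both dbpedia:Artist and yago:ItalianComposers.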
10. Dataset
  • Billion Triple Challenge 2009
      • about 1.14 billion triples
      • contains
          • core LOD repositories (DBPedia, Freebase, Geonames, MusicBrainz, LinkedMDB, …)
          • smaller semantic datasets retrieved by search servers (Falcon-S, Sindice)
  • ≈3.6M co-typing-based overlapping pairs of classes
  • ≈1M association-based pairs
11. Inferring mappings
  • Classification task
      • Classes A, B: is there a mapping?
      • Boolean classification
      • Type of mapping assigned by comparing the sizes of the instance sets
  • Features
      • $ns_1$, $ns_2$: namespaces of the class URIs
      • $|e_{A \cap B}|$: size of the overlap
      • $|e_A|$, $|e_B|$: sizes of the instance sets
      • $\frac{|e_{A \cap B}|}{|e_A|}$, $\frac{|e_{A \cap B}|}{|e_B|}$: ratio of the overlapping subset to each complete instance set
      • direct/indirect: whether classes have instances explicitly declared to be equivalent
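The feature list above can be made concrete with a small sketch. It assumes that class URIs are written as prefixed names (so the prefix stands in for the namespace) and that both instance sets are already expressed in a shared identifier space, so that the overlap is a plain set intersection. The names `PairFeatures` and `pair_features` are hypothetical, and the classifier that consumes these features is not specified on the slides.

```python
from dataclasses import dataclass


@dataclass
class PairFeatures:
    ns1: str          # namespace of class A
    ns2: str          # namespace of class B
    overlap: int      # |e_{A ∩ B}|
    size_a: int       # |e_A|
    size_b: int       # |e_B|
    ratio_a: float    # |e_{A ∩ B}| / |e_A|
    ratio_b: float    # |e_{A ∩ B}| / |e_B|
    direct: bool      # instances explicitly declared equivalent?


def namespace(class_uri):
    """Assumption: prefixed names such as 'dbpedia:Artist'; the prefix is the namespace."""
    return class_uri.split(":", 1)[0]


def pair_features(class_a, class_b, instances_a, instances_b, direct):
    """Compute the slide's features for one candidate pair of classes."""
    shared = instances_a & instances_b
    return PairFeatures(
        ns1=namespace(class_a),
        ns2=namespace(class_b),
        overlap=len(shared),
        size_a=len(instances_a),
        size_b=len(instances_b),
        # Guard against empty instance sets to avoid division by zero.
        ratio_a=len(shared) / len(instances_a) if instances_a else 0.0,
        ratio_b=len(shared) / len(instances_b) if instances_b else 0.0,
        direct=direct,
    )


# Made-up instance identifiers, for illustration only.
features = pair_features(
    "dbpedia:Artist", "mo:MusicArtist",
    {"i1", "i2", "i3", "i4"}, {"i3", "i4"},
    direct=True,
)
print(features)
```

In this made-up example every instance of the second class also belongs to the first, while only half of the first class is covered. Asymmetric ratios of this kind are presumably what the comparison of instance set sizes looks at when assigning a mapping type, though the slides do not give the exact rule.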
12. Test
  • Training
      • Training set: 6000 overlapping pairs of classes
      • Test: 10-fold cross-validation
  • Applying
      • 2 networks of class mappings
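For readers unfamiliar with the evaluation protocol, 10-fold cross-validation over the labelled pairs might look roughly as follows. The slides do not name the classifier, so `train_fn` is a hypothetical placeholder for any function that takes training examples and labels and returns a predictor; the shuffling seed and the accuracy measure are likewise assumptions of this sketch.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def cross_validate(examples, labels, train_fn, k=10, seed=0):
    """Return the mean accuracy of train_fn's predictors over k folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(examples), k, seed):
        predict = train_fn(
            [examples[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(predict(examples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# With 6000 labelled pairs, each fold trains on 5400 pairs and tests on 600.
splits = list(k_fold_indices(6000, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each labelled pair is used for testing exactly once, so the reported score does not depend on a single lucky split.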
13. Observations: class mappings
  • Association-based network: classes involved in the largest number of mappings
      • High-level classes representing concepts covered in many repositories
      • … and describing categories with very fine-grained class decomposition
      • Usually also the most populated ones
  • geonames:Feature, freebase:people.person, yago:PhysicalEntity, linkedmdb:film, umbel:Person, akt:Person, akt:ArticleReference, …
  • “under-linked” ones?
14. Observations: class mappings
  • Co-typing-based network: classes involved in the largest number of mappings
      • Popular classes reused in many repositories
      • … or in DBPedia
      • … and describing categories with fine-grained class decomposition
      • Usually also the most populated ones
  • foaf:Person, umbel:Person, dbpedia:Person, dbpedia:FootballPlayer, wordnet:Person, dbpedia:Album, sioc:WikiArticle, geonames:Feature, …
15. Links between ontologies
  • Aggregated network: connections between ontologies
  • Mapping-based links between ontologies
      • At least 1 mapping between corresponding classes must exist
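The aggregation step described above can be sketched as a simple roll-up from class mappings to ontology pairs. It assumes that a class's ontology can be read off its namespace prefix and that mappings within a single ontology are ignored; both are assumptions of this illustration. The function name `ontology_links` is hypothetical, and the class mappings listed are made up from class names that appear elsewhere in the slides.

```python
from collections import Counter


def namespace(class_uri):
    """Assumption: prefixed names such as 'dbpedia:Artist'; the prefix names the ontology."""
    return class_uri.split(":", 1)[0]


def ontology_links(class_mappings):
    """Count class mappings per unordered pair of distinct ontologies.

    Two ontologies are linked as soon as their count is at least 1.
    """
    counts = Counter()
    for class_a, class_b in class_mappings:
        ns_a, ns_b = namespace(class_a), namespace(class_b)
        if ns_a != ns_b:
            counts[tuple(sorted((ns_a, ns_b)))] += 1
    return counts


# Illustrative mappings only; not results reported in the slides.
mappings = [
    ("dbpedia:Artist", "mo:MusicArtist"),
    ("dbpedia:Artist", "movie:music_contributor"),
    ("dbpedia:Person", "foaf:Person"),
    ("dbpedia:Person", "umbel:Person"),
    ("foaf:Person", "umbel:Person"),
    ("dbpedia:Person", "dbpedia:Artist"),  # same ontology, ignored
]

links = ontology_links(mappings)
print(sorted(links))
```

Keeping the counts rather than a plain set of edges leaves room for weighting the links later, although the slide itself only requires that at least one mapping exists.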
[Figures: association-based network between ontologies, with groups labelled “Generic” (YAGO, Freebase, UMBEL, OpenCYC, DBPedia) and “Domain-specific”]

25. Association-based network
  • Main factor: topic coverage
  • Popularity for linking is not reflected
      • Data-level: DBPedia has more connections than Freebase
      • Schema-level: no substantial difference
      • Effect of exploiting composed links
26. Co-typing-based network
  • Main factor: popularity for reuse
  • FOAF and WordNet: the most popular
  • DBPedia, YAGO, OpenCYC, UMBEL
      • Reused for DBPedia instances
27. Outcomes
  • Possible usage scenarios for mappings
      • Selecting suitable sources to connect
          • “LinkedMDB contains more movies than DBPedia – more likely to cover all my instances”
      • Selecting an ontology to reuse to structure new instances
          • Which sources use this ontology? Do I want my data to be integrated with them?
      • Other data-driven tasks
          • E.g., exploratory search
  • Generic challenges
      • How to take into account task requirements in ontology matching?
          • Recall vs precision, fuzzy vs exact
      • How to capture changes in the data?
          • BTC 2009 is almost obsolete by now
28. Limitations and future work
  • Limitations
      • Light-weight matcher can lead to lower quality mappings
          • OK for our scenario but not others
      • Pre-existing instance-level mappings are not always available
  • Future work
      • Combining with schema-based ontology matching techniques
      • Taking into account properties and complex correspondences
30. Disjoint but overlapping
  • Spurious owl:sameAs link
      • dbpedia:Hippocrates (Hippocrates) = bookmashup:9004095748 (Hippocratic Lives and Legends (Studies in Ancient Medicine, Vol 4))
  • Spurious rdf:type assignment
      • dbpedia:Celtic_Frost (band) defined as Person in DBPedia (fixed in the current version of DBPedia)
  • Modelling assumptions
      • dbpedia:Masada describes both the geographical place and the battle