How To Make Linked Data More than Data
  Semantic Technology Conference 2010, June 23, 2010, San Francisco
  Prateek Jain, Pascal Hitzler, Amit Sheth
  Kno.e.sis: Ohio Center of Excellence on Knowledge-enabled Computing, Wright State University, Dayton, OH
  http://www.knoesis.org
  Peter Z. Yeh, Kunal Verma
  Accenture Technology Labs, San Jose, CA
What Is Semantic Web Semantics?
  Semantic Web semantics is shareable (independent of your particular software), declarative (not dependent on imperative algorithms), and computable (otherwise we don't gain much) meaning.
  - You can do mashups without Semantic Web semantics.
  - You can do information integration without Semantic Web semantics.
  - You can do most things without Semantic Web semantics.
  But then it will be one-off, less scalable, and less reusable.
What Is Semantic Web Semantics?
  The Semantic Web requires a shareable, declarative, and computable semantics.
  That is, the semantics must be a formal entity which is clearly defined and automatically computable.
  Ontology languages provide this by means of their formal semantics.
  Semantic Web semantics is given by a relation: the logical consequence relation.
  Note: This is considerably more than saying that the semantics of an ontology is the set of its logical consequences!
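To make the idea of a computable consequence relation concrete, here is a minimal sketch in Python. It is not taken from the talk: it assumes a toy knowledge base of triples and applies only two RDFS entailment rules (rdfs9 and rdfs11). The class and instance names (`ex:Conductor`, `ex:alice`, and so on) are invented for illustration.

```python
# Toy illustration of a logical consequence relation over triples.
# Only two RDFS rules are applied; all ex: names are hypothetical.

def consequences(triples):
    """Return the closure of `triples` under rules rdfs9 and rdfs11."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                # rdfs11: rdfs:subClassOf is transitive
                if p == "rdfs:subClassOf" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdfs:subClassOf", o2))
                # rdfs9: an instance of a subclass is an instance of the superclass
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
        if not new <= closed:
            closed |= new
            changed = True
    return closed


def entails(triples, triple):
    """The consequence relation: does the knowledge base entail `triple`?"""
    return triple in consequences(triples)


kb = {
    ("ex:Conductor", "rdfs:subClassOf", "ex:Musician"),
    ("ex:Musician", "rdfs:subClassOf", "ex:Artist"),
    ("ex:alice", "rdf:type", "ex:Conductor"),
}

print(entails(kb, ("ex:alice", "rdf:type", "ex:Artist")))
print(entails(kb, ("ex:alice", "rdf:type", "ex:Country")))
```

The point of the sketch is that `entails` is a relation between a set of statements and a single statement, which any software can recompute independently. A real ontology language defines many more rules than the two shown here.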
In other words
  We capture the meaning of information not by specifying its meaning directly (which is impossible) but by specifying, precisely, how information interacts with other information.
  We describe the meaning indirectly through its effects.
  - An example (from LoD) of unintended errors when adequate semantics is not used: LinkedMDB links to the DBpedia URI for Hollywood as a country.
Linked Open Data
  Where is the semantics?
Example: GeoNames
  Where is the semantics?
Example: GovTrack
  "Nancy Pelosi voted in favor of the Health Care Bill."
  [Figure: RDF graph of the vote. Nodes: Vote: 2009-887; Votes:2009-887/+; people/P000197; Bills:h3962. Edge labels: vote:hasOption, vote:vote, vote:votedBy, vote:hasAction, rdfs:label, dc:title, name. Literals: "Aye"; "Nancy Pelosi"; "On Passage: H R 3962 Affordable Health Care for America Act"; "H.R. 3962: Affordable Health Care for America Act".]
  Where is the semantics?
Don't get us wrong
  Linked Open Data is great, useful, cool, and a very important step.
  But if we stay semantics-free, Linked Open Data will be of limited usefulness!
The Semantic Data Web Layer Cake
  To leverage LoD, we require schema knowledge that is
  - application-type driven (reusable for the same kind of application)
  - less messy than LoD (as required by the application)
  - overarching several LoD datasets (as required by the application)
  [Figure: layer cake. Layer labels: Applications; Schemas (less messy); Linked Open Data (messy); Traditional Web content (human eyes only).]
Schema on top of the LoD cloud
Schema on top of the LOD Cloud
  - The obvious solution is to create an ontology capturing the relationships on top of the LOD schema datasets.
  - Perform a matching of the LOD schemas using state-of-the-art ontology matching tools.
  - The datasets can be mapped to an upper-level ontology which can capture the relationships.
  - Considering the size, heterogeneity, and complexity of LOD, the results should at least be ones that can be curated by a human being.
LOD Schema Alignment using state of the art tools
LOD Schema Alignment
  State-of-the-art ontology alignment systems have difficulty in matching LOD schemas!
  - Example of a reported match: Nation = Menstruation, Confidence = 0.9
  - They are tuned to perform well on the established benchmarks, but do not seem to work well in more unconstrained cases that have not been preselected. Most current systems excel on the Ontology Alignment Evaluation Initiative benchmark.
  - LOD schemas are of a very different nature.
  - Created by the community for the community.
  - LOD has so far emphasized the number of instances, not the number of meaningful relationships.
  - This requires solutions beyond syntactic and structural matching.
Research Agenda
  Two components:
  - Enrich schemas to capture semantics: how data in different datasets/bubbles are logically related (BLOOMS)
  - Support federated queries: a system that automates query processing involving multiple, related datasets (LOQUS)
Step 1: Enrich Schemas
  BLOOMS: Bootstrapping-based Linked Open Data Ontology Matching Systems.
Step 1: Semantic Enrichment
  At the highest level of abstraction, our approach takes in two different ontologies and tries to match them using the following steps:
  (1) Use the Alignment API to identify direct correspondences.
  (2) Use the categorization of concepts in Wikipedia.
  (3) Run a reasoner on the results found in step (2) and directly on the ontologies.
Creation of the Wikipedia Category Hierarchy
  BLOOMS utilizes the Wikipedia Web service to identify the matching concepts.
  For the term Conductor, for example, the following definitions are obtained:
  - Electrical Conductor
  - Conducting
  - Conductor_(album)
  - Conductor (architecture)
  - Mr. Conductor
  - Conductor (ring theory)
  These terms correspond to articles on Wikipedia for the concepts in the ontology.
Build Category Tree
  The next step utilizes the Web service to identify Wikipedia categories and build the Wikipedia category tree.
  [Figure: category tree for Conductor. Senses: Electrical conductor; Conducting; Conductor (album). Categories: cat:Occupations_in_music; cat:Musical_Terminology; cat:Musical_Notation; cat:Music performance.]
Compare Senses
  For each different sense of concept c, match it with the different possible senses of c'.
  [Figure: concepts Artist and Conductor; sense Conducting; categories cat:Arts occupations, cat:Occupations_in_music, cat:Music performance, cat:Arts_occupations.]
Connected Classes
  Using the position of the categories, identify the relationships.
  [Figure: nodes Conductor, Conducting, Artist; edge label Is-a; categories cat:Music performance, cat:Occupations_in_music, cat:Arts_occupations; source: Ponzetto & Strube, 2007.]
  This helps in approximately identifying the relationship between the various concepts.
Disconnected Classes
  Some senses do not relate to each other.
  [Figure: nodes Conductor, Artist, Conductor_(transportation); categories cat:Occupations_in_music, cat:Bus_Transport, cat:Transportation occupations, cat:Arts_occupations, cat:Transportation.]
  This helps in identifying disconnected relationships.
Equivalent Classes
  Some senses are identical to each other.
  [Figure: nodes Lady_Finger and Okra; sense Okra; categories cat:Abelmoschus, cat:Hibisceae, cat:Malvoideae.]
  This helps in identifying equivalence relationships.
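The three cases above (connected, disconnected, equivalent) can be sketched as a comparison of category sets. This is a rough illustration, not the BLOOMS implementation: the overlap measure, the default `threshold`, and the assignment of categories to each sense are assumptions made for the example. Only the category names come from the slides.

```python
# Sketch: classify the relationship between two senses from their
# Wikipedia category sets. The overlap formula is an assumption.

def overlap(cats_a, cats_b):
    """Fraction of the smaller category set that also appears in the other."""
    if not cats_a or not cats_b:
        return 0.0
    shared = cats_a & cats_b
    return len(shared) / min(len(cats_a), len(cats_b))


def classify(cats_a, cats_b, threshold=0.5):
    """Return 'equivalent', 'disconnected', 'related' or 'below-threshold'."""
    if cats_a and cats_a == cats_b:
        return "equivalent"
    score = overlap(cats_a, cats_b)
    if score == 0.0:
        return "disconnected"
    return "related" if score >= threshold else "below-threshold"


# Category sets below are assumed groupings of the labels on the slides.
conducting = {"Occupations_in_music", "Music_performance", "Arts_occupations"}
artist = {"Arts_occupations"}
conductor_transport = {"Bus_Transport", "Transportation_occupations", "Transportation"}
lady_finger = {"Abelmoschus", "Hibisceae", "Malvoideae"}
okra = {"Abelmoschus", "Hibisceae", "Malvoideae"}

print(classify(conducting, artist))           # connected case
print(classify(conductor_transport, artist))  # disconnected case
print(classify(lady_finger, okra))            # equivalent case
```

A higher `threshold` would favor precision and a lower one recall, which matches the heuristic described in the editor's notes. The direction of a relationship (which concept is the more general one) is not derived here.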
LOD Schema Alignment using BLOOMS
  Testing was done on 10 different pairs of LOD schemas.
Linked Schemas
  [Figure: linked schemas. Labels: DBpedia Ontology; Music Ontology Schema; Jamendo; Music Brainz; DBTunes; Geonames; SWC; Pisa; IEEE; BBC Program; ACM; FOAF; SIOC; AKT Portal Ontology.]
Observations
  - Heavy connections at the instance level do not translate to the schema level. Case in point: Geonames and DBpedia. Only SpatialThing in Geonames matches DBpedia concepts.
  - No connections at the instance level DOES NOT mean anything. Case in point: DBpedia and the AKT Reference Ontology have over 100 relationships between concepts.
  - It is possible to create links at the instance level. Example: the DBpedia "Scientist" class can contain "Computer Scientist".
  - Schema-level connections and reasoning can be used for cleaning up the LOD Cloud:
      dbpedia:Hollywood rdf:type dbpedia:Country
      dbpedia:Country disjointWith uscensus:Community
      uscensus:Hollywood rdf:type uscensus:Community
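The Hollywood example can be sketched as a simple disjointness check. This is an illustration under stated assumptions, not the authors' tooling: the three statements come from the slide, while the `same_as` link between `uscensus:Hollywood` and `dbpedia:Hollywood` is assumed here, because the slide does not show how the two URIs are connected.

```python
# Sketch: flag individuals typed with two classes declared disjoint.
# The same_as link is an assumption; the slide does not show it.

def find_conflicts(type_triples, disjoint_pairs, same_as):
    """Return (individual, class_a, class_b) for every disjointness violation."""
    classes = {}
    for individual, cls in type_triples:
        # Treat linked URIs as one individual.
        canonical = same_as.get(individual, individual)
        classes.setdefault(canonical, set()).add(cls)
    conflicts = []
    for individual, cls_set in sorted(classes.items()):
        for a, b in disjoint_pairs:
            if a in cls_set and b in cls_set:
                conflicts.append((individual, a, b))
    return conflicts


types = [
    ("dbpedia:Hollywood", "dbpedia:Country"),
    ("uscensus:Hollywood", "uscensus:Community"),
]
disjoint = [("dbpedia:Country", "uscensus:Community")]
same_as = {"uscensus:Hollywood": "dbpedia:Hollywood"}  # assumed identity link

conflicts = find_conflicts(types, disjoint, same_as)
print(conflicts)
```

A conflict only says that at least one of the statements involved is wrong. Deciding which one to retract (here, presumably the `dbpedia:Country` typing) still needs a curator or an additional heuristic.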
Step 2: Integrated Access/Federated Querying
  LOQUS: Linked Open Data SPARQL Querying System
Federated Querying
  - Transform a query and broadcast it to a group of disparate and relevant datasets with the appropriate syntax.
  - Merge the results collected from the datasets.
  - Present them succinctly in a unified format with the least duplication.
  - Automatically sort the merged result set.
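The merge, de-duplicate, and sort steps listed above can be sketched as follows. This is a minimal illustration assuming each dataset returns a list of dictionaries that share a key column. The function name `merge_results` and the sample rows are invented for the example.

```python
# Sketch: merge result sets from several datasets, collapse duplicates
# that share a key, and return the rows sorted by that key.

def merge_results(result_sets, key):
    """Combine rows with the same `key` value and sort the merged rows."""
    merged = {}
    for rows in result_sets:
        for row in rows:
            # Rows describing the same entity are folded into one record.
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Hypothetical partial results from two datasets.
from_music = [
    {"artist": "Band B", "genre": "punk"},
    {"artist": "Band A", "genre": "punk"},
]
from_other = [
    {"artist": "Band A", "based_near": "Springfield"},
    {"artist": "Band A", "genre": "punk"},  # duplicate of a row above
]

rows = merge_results([from_music, from_other], key="artist")
print(rows)
```

The sketch assumes that equal key values denote the same entity. In practice, that is exactly the entity disambiguation problem the next slide lists as a challenge.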
Federated Querying Challenges
  - The user is required to have intimate knowledge of the domain of the datasets.
  - The user needs to understand the exact structure of the datasets.
  - For each relevant dataset, the user needs to form a separate query.
  - Entity disambiguation has to be performed on similar entities.
  - Retrieved results have to be processed and merged.
Querying Federated Sources
  Identify artists whose albums have been tagged as punk, and the population of the places they are based near.
Relevant Datasets
  - Geonames Data
  - Music Ontology
  - Census Data
Querying the Datasets
  - Music Ontology: Give me artists with punk as genre and their locations.
  - Geonames Data: Give me the identifier used by the Census Bureau for geographic locations.
  - Census Data: Give me population figures of geographical entities.
LOQUS
  Linked Open Data SPARQL Querying System.
  - The user can pose federated queries without having to know the exact structure of, and links between, the different datasets.
  - It automatically maps the user's query to the relevant datasets using a mapping repository created using BLOOMS.
  - It executes the individual queries and merges the results into a single, complete answer.
Traditionally to Retrieve Results
  The user has to:
  - query Music Data, Geographic Data, and Census Data
  - perform disambiguation
  - perform union and join
  - process results
LOQUS Architecture
  - A single source of reference consisting of mappings to the specific LOD datasets.
  - A module to identify the concepts contained in the query and translate them to the LOD cloud datasets.
  - A module to split the query, once mapped to LOD dataset concepts, into sub-queries corresponding to the different datasets.
  - A module to execute the queries remotely, process the results, and deliver the final result to the user.
Querying using LOQUS
  User query: Identify artists whose albums have been tagged as punk, and the population of the places they are based near.
  - The user looks up the mapping repository to identify concepts of interest and formulates the query.
  - The query is decomposed into sub-queries.
  - Each sub-query is routed to the appropriate dataset:
      Music Data: Give me artists with punk as genre and their locations.
      Geographic Data: Give me the identifier used by the Census Bureau for geographic locations.
      Census Data: Give me population figures of geographical entities.
Querying Using LOQUS
  Results are returned to LOQUS for the sub-queries sent to Music Data, Geographic Data, and Census Data.
LOQUS Processes Partial Results
  Partial results are processed for union, join, and disambiguation by LOQUS.
Results are Returned to User
  LOQUS combines the results and presents them back to the user.
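The walkthrough above can be sketched end to end in a few lines. This is a toy model, not LOQUS itself. In-memory lists stand in for the remote SPARQL endpoints, the mapping repository is a hand-written dictionary, the join order is hard-coded, and every identifier and figure in the data is invented for illustration.

```python
# Toy model of the LOQUS flow: map concepts, decompose, execute, join.
# All data, dataset names and the census: term are hypothetical.

# Mapping repository: upper-level concept -> (dataset, dataset-specific term).
MAPPING = {
    "Artist": ("music", "mo:MusicArtist"),
    "Place": ("geo", "gn:Feature"),
    "Population": ("census", "census:population"),
}

# Stand-ins for remote endpoints; a real system would send sub-queries.
DATASETS = {
    "music": [{"artist": "Band A", "place": "geo:1"},
              {"artist": "Band B", "place": "geo:2"}],
    "geo": [{"place": "geo:1", "census_id": "c:10"},
            {"place": "geo:2", "census_id": "c:20"}],
    "census": [{"census_id": "c:10", "population": 5000},
               {"census_id": "c:20", "population": 120000}],
}


def decompose(concepts):
    """Group the query's concepts into one sub-query per dataset."""
    subqueries = {}
    for concept in concepts:
        dataset, term = MAPPING[concept]
        subqueries.setdefault(dataset, []).append(term)
    return subqueries


def join(left, right, key):
    """Inner join of two lists of rows on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def answer(concepts):
    """Run the sub-queries and combine the partial results."""
    subqueries = decompose(concepts)
    partial = {name: DATASETS[name] for name in subqueries}
    rows = join(partial["music"], partial["geo"], "place")
    rows = join(rows, partial["census"], "census_id")
    return [(row["artist"], row["population"]) for row in rows]


result = answer(["Artist", "Place", "Population"])
print(result)
```

The user only names upper-level concepts. Which dataset holds which concept is looked up in `MAPPING`, which plays the role of the BLOOMS-derived mapping repository in this sketch.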


  • 44. Technology Stack
      [Figure: technology stack. Labels: Proprietary software; LOQUS; BLOOMS; Open Source Technologies; Jena/ARQ; SPARQL; RDF; Linked Open Data cloud; Java.]
  • 45. LOQUS Advantage
      LOQUS expects just the query from the user and does the rest of the work.
  • 46. Pre-requisites
      LOQUS requires an upper-level ontology for query federation.
  • 47. This requires a mapping of an upper-level ontology such as SUMO to the various LOD datasets.
      Why not use existing ontology mapping tools for this?
      - Ontology mapping tools work well on benchmarks, but perform poorly outside of them.
      - There is a need for tools which go beyond lexical analysis and the use of dictionaries.
      Conclusions
      - The LOD cloud is an important start, but more needs to be done to make it useful, especially to make integrated use of multiple datasets.
      - Semantic relationships and descriptions across ontologies are a key enabler of integrated access/use (for example, federated queries).
  • 48. Conclusions (continued)
      - BLOOMS is one approach for semi-automatically linking different ontologies: a new approach for ontology mapping that leverages knowledge in DBpedia.
      - A more semantic LOD cloud can enable more intelligent applications such as open question answering.
      - LOQUS shows how enriched schemas can enable automatic federated queries, making LOD significantly more useful.
  • 49. References
      Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, "Linked Data is Merely More Data", AAAI Spring Symposium "Linked Data Meets Artificial Intelligence", March 22-24, 2010.
      Prateek Jain, Kunal Verma, Pascal Hitzler, Peter Z. Yeh, Amit P. Sheth, "LOQUS: Linked Open Data SPARQL Querying System".
  • 50. Thanks!
      This work is funded primarily by NSF Award IIS-0842129, titled "III-SGER: Spatio-Temporal-Thematic Queries of Semantic Web Data: a Study of Expressivity and Efficiency".
      More at Kno.e.sis, Ohio Center of Excellence on Knowledge-enabled Computing: http://knoesis.org

Editor's Notes

  • #19: For each concept in the ontology, do a text search using the Wikipedia Web service and use the results to identify the articles related to the term. Once these articles are identified, build their category trees. The category trees are built up to level 4, since beyond that the categories are too abstract to be useful for ontology matching.
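The depth-limited construction described in note #19 can be sketched as a breadth-first walk. This is an illustration only: a hand-written `PARENTS` dictionary stands in for the Wikipedia Web service, its contents are invented, and counting an article's direct categories as level 1 is an assumption about how the levels are numbered.

```python
# Sketch: build a category tree up to a maximum level.
# PARENTS is a hypothetical stand-in for Wikipedia category lookups.
from collections import deque


def category_tree(start_categories, parents_of, max_depth=4):
    """Return {category: level}, following parent links up to max_depth."""
    depth = {cat: 1 for cat in start_categories}
    queue = deque(start_categories)
    while queue:
        cat = queue.popleft()
        if depth[cat] >= max_depth:
            continue  # deeper categories are considered too abstract
        for parent in parents_of.get(cat, []):
            if parent not in depth:
                depth[parent] = depth[cat] + 1
                queue.append(parent)
    return depth


PARENTS = {
    "Occupations_in_music": ["Arts_occupations", "Music"],
    "Arts_occupations": ["Occupations"],
    "Music": ["Performing_arts"],
    "Occupations": ["Employment"],
    "Employment": ["Economy"],
}

tree = category_tree(["Occupations_in_music"], PARENTS)
print(tree)
```

With the default `max_depth=4`, the walk stops at `Employment` and never reaches `Economy`, which sits at level 5 in this toy hierarchy.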
  • #20: Take the categories of each of these senses and compare them. For example, the senses of Conductor include Conducting, Conductor_(album), and so on. Compare each of these senses with the senses of the other concept. Here, the sense Conducting is being matched to the term Artist.
  • #21: Wikipedia categorization has been shown to work as a taxonomy in the work of Ponzetto, S.P., Strube, M.: Deriving a large scale taxonomy from Wikipedia. In: AAAI'07: Proceedings of the 22nd National Conference on Artificial Intelligence, AAAI Press (2007) 1440–1445. The overlap of the two category trees helps us determine the relationship between the concepts. The overlap is compared against a numerical threshold which can be specified by the user. The threshold depends on rough heuristics: (1) If the two ontologies to be matched are from similar domains, such as the AKT Reference Ontology and the Semantic Web Ontology (publication domain), use a higher threshold, because the terms require tighter integration. (2) If an upper-level ontology is used, the terms will be abstract, so use a lower threshold. The choice depends on the kind of results the user wants: for high precision and low recall, choose a high threshold; for low precision and high recall, choose a low threshold.
  • #22: Some senses do not relate to each other at all. They do not share any common categories or instances.
  • #23: Because Wikipedia is rich in language and terms, it can help in identifying matches which can't be found using normal syntactic tools.
  • #24: System-1: Alignment API. System-2: OMViaUO. Our approach actually outperforms 5 different state-of-the-art systems published in the recent past.
  • #26: 1. The Linked Open Data Cloud isn't complete in terms of its linkage. 2. It is possible to add many more meaningful connections that are motivated from schema to instance (common sense) rather than the other way round. Unfortunately, as of now the other way round dominates. 3. Using common reasoning, made possible through distributed and approximate reasoning, it is possible to identify and clean up errors in the LOD Cloud. A lot of the messiness can be thrown away.
  • #36: Animation causes a mess in the textbox.