Can Semantics catch up with the Web?
Axel Polleres
ISWSA 2010, Monday, 14/06/2010, Amman, Jordan
Linked Open Data
Excellent tutorial here: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
Great! So, can we go home and declare success?
Not yet…
Problem 1: We’re lagging behind…
From: S. Auer et al. Triplify - lightweight linked data publication from relational databases. WWW 2009.
Problem 2: We’re overwhelmed…
“After a rough estimation, it looks like the services hosted on DBTune provide access to 13.1 billion triples, therefore making a significant addition to the data web!”
http://blog.dbtune.org/post/2008/04/02/DBTune-is-providing-131-billion-triples…
However: Full DL reasoners choke on far less…
… they’re not made for Web Data
Problem 1: Too little data… more details…
HTML Web grows much faster…
  … How to inject SW technology cleverly?
  … How to lift Web Data, how to reuse Semantic Web Data?
Too little “agreed” vocabularies…
  … How to build them?
Too little links/reuse…
  … Reasoning to the rescue?
How to inject SW technology cleverly?
Example: Injecting SW Technology in Drupal

Digital Enterprise Research Institute, www.deri.ie
Loads of Data on the Web in CMS...

So, here’s our idea of a CMS:
Demo site: http://drupal.deri.ie/projectblogs/

Semantic Drupal:
Enables data mining techniques, text analysis, reasoning, aggregation, trend detection over different platforms

Where is it used?
Science Collaboration Framework: Stembook (Stem Cell articles and reviews)
http://www.stembook.org/

ISWC 2010
Semantic Drupal
Out-of-the-box Linked Data from any Drupal site
Out-of-the-box “site ontology”
Out-of-the-box SPARQL endpoint
Advanced: tie to existing vocabularies
Advanced: import data via SPARQL
Drupal 6 modules:
  http://drupal.org/project/rdfcck
  http://drupal.org/project/evoc
  http://drupal.org/project/sparql_ep
  http://drupal.org/project/rdfproxy
Good news from Drupal 7:
RDF mapping feature committed to Drupal 7 core
RDFa output by default (blogs, forums, comments, etc.) using FOAF, SIOC, DC, SKOS.
Download development snapshot: http://ftp.drupal.org/files/projects/drupal-7.x-dev.tar.gz
Currently more than 200,000* sites on Drupal 6
  waiting to make the switch to Drupal 7
  waiting to massively increase the amount of RDF data on the Web
Huge boost for RDF on the Web!
* http://drupal.org/project/usage/drupal
How to lift Web Data, how to reuse Semantic Web Data?
[Figure labels: HTML, RSS, <XML/>, SOAP/WSDL, XSLT/XQuery, SPARQL, XSPARQL]
XQuery + SPARQL = XSPARQL
Example: SIOC-2-RSS
XSPARQL+SIOC enables customised RSS export (RSS 2.0):

  <channel>
    <title>
      {for $name
       from <http://www.johnbreslin.com/blog/index.php?sioc_type=site>
       where { [a sioc:Forum] sioc:name $name }
       return $name}
    </title>
    {for $seeAlso
     from <http://www.johnbreslin.com/blog/index.php?sioc_type=site>
     where { [a sioc:Forum] sioc:container_of [rdfs:seeAlso $seeAlso] }
     return
       <item>
         {for $title $descr $date
          from $seeAlso
          where { [a sioc:Post] dc:title $title ;
                                sioc:content $descr ;
                                dcterms:created $date }
          return <title>$title</title>
                 <description>$descr</description>
                 <pubDate>$date</pubDate>}
       </item>
    }
  </channel>

“Great stuff,... I have not seen any SIOC to RSS xslt examples or vice versa” (John Breslin, creator of SIOC)
Problem 1: Too little data… more details…
HTML Web grows much faster…
  … How to inject SW technology cleverly?
  … How to lift Web Data, how to reuse Semantic Web Data?
Too little “agreed” vocabularies…
  … How to build lightweight vocabularies?
Too little links/reuse…
  … Reasoning to the rescue?
… How to build lightweight vocabularies? An example:
Semantic Interlinking of Online Community Sites (SIOC): Seeding a Standard
The SIOC ontology
The main classes and properties are: [Figure: SIOC classes and properties]
The SIOC food chain
Adoption of SIOC
Dissemination
Another example of leveraging SW Data: SMOB
Making ontology building more Web-user-friendly: http://vocab.deri.ie/
Neologism is a web-based editor for RDF Schema vocabularies and lightweight OWL ontologies.
  Collaborate to create and maintain vocabularies and ontologies
  Publish the vocabulary on the Web according to W3C and Linked Data best practices, with views for humans (HTML, graph) and machines (RDF/XML, Turtle)
  Import existing vocabularies
  Also works with external namespaces (e.g., via PURL.org)
  Based on the popular Drupal CMS
More at http://neologism.deri.ie/
Problem 2: We’re overwhelmed…
“After a rough estimation, it looks like the services hosted on DBTune provide access to 13.1 billion triples, therefore making a significant addition to the data web!”
http://blog.dbtune.org/post/2008/04/02/DBTune-is-providing-131-billion-triples…
However: Full DL reasoners choke on far less…
… they’re not made for Web Data
Simplified “added value” proposition of Semantic Search…
“explicit” data: RDF
“implicit” data? Via inference using OWL2, RDF Schema!
Fig 1: RDF Web Dataset
Example: Finding experts/reviewers?
Tim Berners-Lee, Dan Connolly, Lalana Kagal, Yosi Scharf, Jim Hendler: N3Logic: A logical framework for the World Wide Web. Theory and Practice of Logic Programming (TPLP), Volume 8, p249-269
Who are the right reviewers? Who has the right expertise?
Which reviewers are in conflict?
Most of the necessary data is already on the Web, even as RDF!
TimBL’s FOAF file…
DBLP as Linked Data
Gives unique URIs to authors, documents, etc. on DBLP! E.g.,
  http://dblp.l3s.de/d2r/resource/authors/Tim_Berners-Lee
  http://dblp.l3s.de/d2r/resource/publications/journals/tplp/Berners-LeeCKSH08
Provides RDF version of all DBLP data + query interface!
RDF Data online: Example
Data in RDF: Triples

DBLP:
  <http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> rdf:type swrc:Article .
  <http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> dc:creator <http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> .
  …
  <http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .
  …
  <http://dblp.l3s.de/d2r/…/Dan_Brickley> foaf:name "Dan Brickley"^^xsd:string .

Tim Berners-Lee’s FOAF file:
  <http://www.w3.org/People/Berners-Lee/card#i> foaf:knows <http://dblp.l3s.de/d2r/…/Dan_Brickley> .
  <http://www.w3.org/People/Berners-Lee/card#i> rdf:type foaf:Person .
  <http://www.w3.org/People/Berners-Lee/card#i> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .
An example in SPARQL
“Names of all persons who co-authored with authors of http://dblp.l3s.de/d2r/…/Berners-LeeCKSH08 or are known by co-authors”

  SELECT ?Name WHERE {
    <http://dblp.l3s.de/d2r/resource/publications/journals/tplp/Berners-LeeCKSH08> dc:creator ?Author .
    ?D dc:creator ?Author .
    ?D dc:creator ?CoAuthor .
    { ?CoAuthor foaf:name ?Name . }
    UNION
    { ?CoAuthor foaf:knows ?Person .
      ?Person rdf:type foaf:Person .
      ?Person foaf:name ?Name }
  }

Doesn’t work… no foaf:knows relations in DBLP
Needs Linked Data! E.g. TimBL’s FOAF file!
Back to the Data:

DBLP:
  <http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> rdf:type swrc:Article .
  <http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> dc:creator <http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> .
  …
  <http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .

Tim Berners-Lee’s FOAF file:
  <http://www.w3.org/People/Berners-Lee/card#i> foaf:knows <http://dblp.l3s.de/d2r/…/Dan_Brickley> .
  <http://www.w3.org/People/Berners-Lee/card#i> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .

Even if I have the FOAF data, I cannot answer the query:
  Different identifiers used for Tim Berners-Lee
  Who tells me that Dan Brickley is a foaf:Person?
Linked Data needs Reasoning!
The FOAF ontology…
  foaf:knows rdfs:domain foaf:Person .
      Everybody who knows someone is a Person
  foaf:knows rdfs:range foaf:Person .
      Everybody who is known is a Person
  foaf:Person rdfs:subClassOf foaf:Agent .
      Every Person is an Agent
  foaf:homepage rdf:type owl:InverseFunctionalProperty .
      A homepage uniquely identifies its owner (“key” property)
  …
RDFS+OWL inference by rules 1/2
Semantics of RDFS can be partially expressed as (Datalog-like) rules:
  rdfs1: { ?S rdf:type ?C } :- { ?S ?P ?O . ?P rdfs:domain ?C . }
  rdfs2: { ?O rdf:type ?C } :- { ?S ?P ?O . ?P rdfs:range ?C . }
  rdfs3: { ?S rdf:type ?C2 } :- { ?S rdf:type ?C1 . ?C1 rdfs:subClassOf ?C2 . }
cf. informative entailment rules in [RDF-Semantics, W3C, 2004], [Muñoz et al. 2007]
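
The three rules above can be sketched as a naive forward-chaining loop. This is only an illustration under simplifying assumptions: triples are plain Python tuples of strings, the short names `timbl` and `danbri` are hypothetical stand-ins for the elided URIs, and the nested loop is nowhere near what a Web-scale reasoner would do.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rules rdfs1-rdfs3 (slide numbering) until no new triple appears."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, c in closure:
                # rdfs1: ?S ?P ?O . ?P rdfs:domain ?C  =>  ?S rdf:type ?C
                if p2 == DOMAIN and s2 == p:
                    new.add((s, RDF_TYPE, c))
                # rdfs2: ?S ?P ?O . ?P rdfs:range ?C  =>  ?O rdf:type ?C
                if p2 == RANGE and s2 == p:
                    new.add((o, RDF_TYPE, c))
                # rdfs3: ?S rdf:type ?C1 . ?C1 rdfs:subClassOf ?C2  =>  ?S rdf:type ?C2
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, c))
        if new <= closure:  # fixpoint reached
            return closure
        closure |= new


# Hypothetical short names standing in for the URIs used on the slides.
data = {
    ("timbl", "foaf:knows", "danbri"),
    ("foaf:knows", DOMAIN, "foaf:Person"),
    ("foaf:knows", RANGE, "foaf:Person"),
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
}
inferred = rdfs_closure(data) - data
for triple in sorted(inferred):
    print(triple)
```

The loop re-scans all pairs of triples on every pass, which is fine for four triples and hopeless for billions; that gap is what the later scalability slides are about.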
RDFS+OWL inference by rules 2/2
OWL reasoning, e.g. for owl:InverseFunctionalProperty, can also (partially) be expressed by rules:
  owl1: { ?S1 owl:sameAs ?S2 } :- { ?S1 ?P ?O . ?S2 ?P ?O . ?P rdf:type owl:InverseFunctionalProperty }
  owl2: { ?Y ?P ?O } :- { ?X owl:sameAs ?Y . ?X ?P ?O }
  owl3: { ?S ?Y ?O } :- { ?X owl:sameAs ?Y . ?S ?X ?O }
  owl4: { ?S ?P ?Y } :- { ?X owl:sameAs ?Y . ?S ?P ?X }
cf. pD* fragment of OWL [ter Horst, 2005], or, more recently, OWL2 RL
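
A minimal sketch of rules owl1 to owl4, again over plain tuples. The identifiers `timbl_foaf`, `timbl_dblp` and `danbri_dblp` are hypothetical labels for the FOAF and DBLP URIs (whose full paths are elided on the slides), and the fixpoint loop is deliberately naive.

```python
SAME_AS = "owl:sameAs"
IFP = "owl:InverseFunctionalProperty"


def owl_closure(triples):
    """Apply rules owl1-owl4 (slide numbering) until a fixpoint is reached."""
    closure = set(triples)
    while True:
        new = set()
        # Properties declared inverse-functional in the data.
        ifps = {s for s, p, o in closure if p == "rdf:type" and o == IFP}
        # owl1: two subjects sharing a value of an IFP are the same.
        # As written on the slide, this also yields reflexive sameAs pairs.
        for s1, p, o in closure:
            if p in ifps:
                for s2, p2, o2 in closure:
                    if p2 == p and o2 == o:
                        new.add((s1, SAME_AS, s2))
        # owl2-owl4: copy statements across owl:sameAs in every position.
        for x, p, y in closure:
            if p != SAME_AS:
                continue
            for s, q, o in closure:
                if s == x:
                    new.add((y, q, o))  # owl2: subject position
                if q == x:
                    new.add((s, y, o))  # owl3: predicate position
                if o == x:
                    new.add((s, q, y))  # owl4: object position
        if new <= closure:
            return closure
        closure |= new


homepage = "http://www.w3.org/People/Berners-Lee/"
data = {
    ("foaf:homepage", "rdf:type", IFP),
    ("timbl_foaf", "foaf:homepage", homepage),
    ("timbl_dblp", "foaf:homepage", homepage),
    ("timbl_foaf", "foaf:knows", "danbri_dblp"),
}
closure = owl_closure(data)
print(("timbl_foaf", SAME_AS, "timbl_dblp") in closure)
print(("timbl_dblp", "foaf:knows", "danbri_dblp") in closure)
```

Once the two identifiers are merged, the `foaf:knows` statement from the FOAF file becomes visible on the DBLP identifier, which is what the co-author query needs.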
RDFS+OWL inference by rules: Example
By the rules of the previous slides we can infer the additional information needed, e.g.

  TimBL’s FOAF:   <…/Berners-Lee/card#i> foaf:knows <…/Dan_Brickley> .
  FOAF Ontology:  foaf:knows rdfs:range foaf:Person .
  by rdfs2 ⇒      <…/Dan_Brickley> rdf:type foaf:Person .

  TimBL’s FOAF:   <…/Berners-Lee/card#i> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .
  DBLP:           <…/dblp.l3s.de/d2r/…/Tim_Berners-Lee> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .
  FOAF Ontology:  foaf:homepage rdf:type owl:InverseFunctionalProperty .
  by owl1 ⇒       <…/Berners-Lee/card#i> owl:sameAs <…/Tim_Berners-Lee> .

Who tells me that Dan Brickley is a foaf:Person? ⇒ solved!
Different identifiers used for Tim Berners-Lee ⇒ solved!
Web Reasoning: Challenges
Scalability
  Billions or tens of billions of statements (for the moment)
  Near linear scale!!!
Noisy data
  Inconsistencies galore
  Publishing errors
  “Ontology hijacking”
Noisy Data: Omnipotent Being
Proposition 1: Web data is noisy.
Proof:
  08445a31a78661b5c746feff39a9db6e4e2cc5cf
  sha1-sum of ‘mailto:’
  common value for foaf:mbox_sha1sum
  An inverse-functional (uniquely identifying) property!!!
  Any person who shares the same value will be considered the same
Q.E.D.
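
How such a value ends up in published data can be sketched with the standard library. A minimal illustration, assuming an exporter that hashes the `mailto:` URI even when the address is empty; the helper name `mbox_sha1sum` and the example address are made up here.

```python
import hashlib

# Value quoted on the slide as the sha1-sum of 'mailto:'.
SLIDE_VALUE = "08445a31a78661b5c746feff39a9db6e4e2cc5cf"


def mbox_sha1sum(mailto_uri: str) -> str:
    """Hex SHA-1 digest of a mailto: URI, as used for foaf:mbox_sha1sum."""
    return hashlib.sha1(mailto_uri.encode("utf-8")).hexdigest()


# An exporter that forgets to check for a missing address hashes the bare prefix.
empty_mbox_hash = mbox_sha1sum("mailto:")
real_mbox_hash = mbox_sha1sum("mailto:someone@example.org")  # hypothetical address

print(empty_mbox_hash)
print("matches the slide's value:", empty_mbox_hash == SLIDE_VALUE)
```

Every profile exported with an empty address gets the same digest, so an inverse-functional reading of the property merges all of them into one individual.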
Noisy Data: Redefining Everything …and home in time for tea
More Proof:
From http://www.eiao.net/rdf/1.0

  <owl:Property rdf:about="http://www.w3.org/1999/02/22-rdf-syntax-ns#type">
    <rdfs:label xml:lang="en">type</rdfs:label>
    <rdfs:comment xml:lang="en">Type of resource</rdfs:comment>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#testRun"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#pageSurvey"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#siteSurvey"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#scenario"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#rangeLocation"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#startPointer"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#endPointer"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#header"/>
    <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#runs"/>
  </owl:Property>

Ontology hijacking!!
The Web…
…forecast is for muck
Okay, so let’s do forward-chaining OWL 2 RL on billions of triples collected from the Web…
  foaf:mbox_sha1sum a owl:InverseFunctionalProperty .
  ?x foaf:mbox_sha1sum "08445a31a78661b5c746feff39a9db6e4e2cc5cf" .
OWL 2 RL rule prp-ifp:
  ?p a owl:InverseFunctionalProperty . ?x1 ?p ?z . ?x2 ?p ?z . ⇒ ?x1 owl:sameAs ?x2 .
10^4 ?x1/?x2 bindings in body
10^8 inferred pair-wise and reflexive owl:sameAs statements
…or in simpler terms: pow!
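
The blow-up is quadratic, which a few lines make concrete. This sketch reads the slide's figures as 10^4 subjects and 10^8 inferred statements (an assumption about superscripts lost in extraction) and uses made-up subject names; it enumerates only a tiny case and computes the large one arithmetically.

```python
from itertools import product


def same_as_pairs(subjects):
    """All ordered pairs prp-ifp derives when every subject shares one IFP value.

    Both ?x1 and ?x2 range over the same subjects, so reflexive pairs are included.
    """
    return set(product(subjects, repeat=2))


# Tiny case, enumerated explicitly: 5 subjects sharing the 'mailto:' hash.
small = same_as_pairs([f"person{i}" for i in range(5)])
print(len(small))  # 5 * 5 ordered pairs

# Large case, computed rather than materialised.
n_subjects = 10**4
predicted_statements = n_subjects * n_subjects
print(predicted_statements)
```

A single bad value shared by n subjects costs n squared inferred statements, before owl:sameAs is used to copy every other statement across the merged identifiers.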

20100614 ISWSA Keynote

  • 1. Can Semantics catch up with the Web?Axel PolleresISWSA2010Monday, 14/06/2010Amman, Jordan
  • 2. Excellent tutorial here: http://www4.wiwiss.fu- berlin.de/bizer/pub/LinkedDataTutorial/ Linked Open DataGreat!So, Can we go home and declare success?Not yet……22
  • 3. 3Problem1: We’re lagging behind… From: S.Auer et al. Triplify - lightweight linked data publication from relational databases. WWW 2009.3
  • 4. 4Problem2: We’re overwhelmed… After a rough estimation, it looks like the services hosted on DBTune provide access to 13.1 billion triples, therefore making a significant addition to the data web!http://blog.dbtune.org/post/2008/04/02/DBTune-is-providing-131-billion-triples…However:Full DL Reasoners choke on far less…
  • 5. … they’re not made for Web Data4
  • 6. 5Problem1: Too little Data… more details…HTML Web grows much faster… How to inject SW technology cleverly? … How to lift Web Data, how to reuse Semantic Web Data?Too little “agreed” vocabularies… How to build them?
  • 7. Too little links/reuse … Reasoning to the rescue?5
  • 8. How to inject SW technology cleverly?Example: Injecting SW Technology in Drupal6
  • 9. 7Digital Enterprise Research Institutewww.deri.ieLoads of Data on the Web in CMS...7
  • 10. 8Digital Enterprise Research Institutewww.deri.ieDemo site: http://drupal.deri.ie/projectblogs/So, here’s our idea of a CMS:8
  • 11. Semantic Drupal:9Enables data mining techniques, text-analysis, reasoning, aggregation, trend detection over different platforms
  • 12. 10Digital Enterprise Research Institutewww.deri.ieWhere is it used?Science Collaboration Framework:Stembook (Stem Cell articles and reviews)http://www.stembook.org/10
  • 13. 11Digital Enterprise Research Institutewww.deri.ieISWC201011
  • 14. Semantic DrupalOut-of-the-box Linked Data from any Drupal siteOut-of-the-box “site ontology”Out-of-the-box SPARQL endpointAdvanced: tie to existing vocabulariesAdvanced: import Data via SPARQLDrupal 6 modules:http://drupal.org/project/rdfcckhttp://drupal.org/project/evochttp://drupal.org/project/sparql_ephttp://drupal.org/project/rdfproxy12
  • 15. 13Digital Enterprise Research Institutewww.deri.ie* http://drupal.org/project/usage/drupalGood news from Drupal 7:RDF mapping feature committed to Drupal 7 coreRDFa output by default (blogs, forums, comments, etc.)using FOAF, SIOC, DC, SKOS.Download development snapshot http://ftp.drupal.org/files/projects/drupal-7.x-dev.tar.gzCurrently more than 200.000* sites on Drupal 6waiting to make the switch to Drupal 7waiting to massively increase the amount of RDF dataon the WebHuge boost for RDF on the Web!13
  • 16. 14How to lift Web Data, how to reuse Semantic Web Data?XSLT/XQueryHTMLRSS<XML/>XSPARQLSOAP/WSDLSPARQL14
  • 17. 15XQuery + SPARQL = XSPARQL
  • 18. Example: SIOC-2-RSSXSPARQL+SIOC enables customised RSS export:16<channel><title> {for $name from <http://www.johnbreslin.com/blog/index.php?sioc_type=site> where { [a sioc:Forum] sioc:name $name } return $name}</title> {for $seeAlso from <http://www.johnbreslin.com/blog/index.php?sioc_type=site> where { [a sioc:Forum] sioc:container_of [rdfs:seeAlso $seeAlso] } return <item> {for $title $descr $date from $seeAlso where { [a sioc:Post] dc:title $title ; sioc:content $descr; dcterms:created $date } return <title>$title</title> <description>$descr</description> <pubDate>$date</pubDate>} </item>RSS2.0“Great stuff,... I have not seen any SIOC to RSS xslt examples or vice versa” (John Breslin, creator of SIOC)
  • 19. 17Problem1: Too little Data… more details…HTML Web grows much faster… How to inject SW technology cleverly? … How to lift Web Data, how to reuse Semantic Web Data?Too little “agreed” vocabularies… How to build lightweight vocabularies?
  • 20. Too little links/reuse … Reasoning to the rescue?17
  • 21. … How to build lightweight vocabularies? An example:Semantic Interlinking of Online Community Sites (SIOC) –Seeding a Standard18
  • 23. The SIOC ontologyThe main classes and properties are:20
  • 24. The SIOC food chain21
  • 27. Another example of leveraging SW Data: SMOB
  • 28. Neologism is a web-based editor for RDF Schema vocabularies and lightweight OWL ontologies.Collaborate to create and maintain vocabularies and ontologiesPublish the vocabulary on the Web according to W3C and Linked Data best practices, with views for humans (HTML, graph) and machines (RDF/XML, Turtle) Import existing vocabulariesAlso works with external namespaces(e.g., via PURL.org)Based on the popular Drupal CMSMore at http://neologism.deri.ie/25 of XYZMaking ontology building more Web-user-friendly:http://vocab.deri.ie/25
  • 29. 26Problem2: We’re overwhelmed… After a rough estimation, it looks like the services hosted on DBTune provide access to 13.1 billion triples, therefore making a significant addition to the data web!http://blog.dbtune.org/post/2008/04/02/DBTune-is-providing-131-billion-triples…However:Full DL Reasoners choke on far less…
  • 30. … they’re not made for Web Data26
  • 31. 27Simplified “added value” proposition of Semantic Search…“explicit” dataRDF“implicit” data? Via inference usingOWL2, RDF Schema!Fig 1: RDF Web Dataset2727
  • 32. Example: Finding experts/reviewers? Tim Berners-Lee, Dan Connolly, LalanaKagal, YosiScharf, Jim Hendler: N3Logic: A logical framework for the World Wide Web. Theory and Practice of Logic Programming (TPLP), Volume 8, p249-269Who are the right reviewers? Who has the right expertise?Which reviewers are in conflict? Most of the necessary data already on the Web, even as RDF! 2828
  • 33. Tim BL’s FOAF file…2929
  • 34. DBLP as Linked DateGives unique URIs to authors, documents, etc. on DBLP! E.g., http://dblp.l3s.de/d2r/resource/authors/Tim_Berners-Lee, http://dblp.l3s.de/d2r/resource/publications/journals/tplp/Berners-LeeCKSH08Provides RDF version of all DBLP data + query interface! 3030
  • 35. Data in RDF: TriplesDBLP: <http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> rdf:type swrc:Article.<http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08>dc:creator <http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> . …<http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .…<http://dblp.l3s.de/d2r/…/Dan_Brickley> foaf:name“Dan Brickley”^^xsd:string.Tim Berners-Lee’s FOAF file:<http://www.w3.org/People/Berners-Lee/card#i>foaf:knows <http://dblp.l3s.de/d2r/…/Dan_Brickley> .<http://www.w3.org/People/Berners-Lee/card#i> rdf:type foaf:Person .<http://www.w3.org/People/Berners-Lee/card#i> foaf:homepage<http://www.w3.org/People/Berners-Lee/> .RDF Data online: Example3131
  • 36. An example in SPARQL“Names of all persons who co-authored with authors of http://dblp.l3s.de/d2r/…/Berners-LeeCKSH08or known by co-authors”SELECT ?Name WHERE { <http://dblp.l3s.de/d2r/resource/publications/journals/tplp/Berners-LeeCKSH08> dc:creator ?Author. ?D dc:creator ?Author. ?D dc:creator ?CoAuthor.{ ?CoAuthor foaf:name ?Name . }UNION { ?CoAuthor foaf:knows ?Person. ?Person rdf:typefoaf:Person. ?Person foaf:name ?Name } }Doesn’t work… no foaf:knows relations in DBLP Needs Linked Data! E.g. TimBL’s FOAF file!3232
  • 37. DBLP: <http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> rdf:type swrc:Article.<http://dblp.l3s.de/…/journals/tplp/Berners-LeeCKSH08> dc:creator <http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> . …<http://dblp.l3s.de/d2r/…/Tim_Berners-Lee> foaf:homepage <http://www.w3.org/People/Berners-Lee/> .Tim Berners-Lee’s FOAF file:<http://www.w3.org/People/Berners-Lee/card#i> foaf:knows <http://dblp.l3s.de/d2r/…/Dan_Brickley> .<http://www.w3.org/People/Berners-Lee/card#i> foaf:homepage<http://www.w3.org/People/Berners-Lee/> .33Back to the Data:Even if I have the FOAF data, I cannot answer the query:
  • 38. Different identifiers used for Tim Berners-Lee
  • 39. Who tells me that Dan Brickley is a foaf:Person?
  • 40. Linked Data needs Reasoning!3333
  • 41. The FOAF ontology… foaf:knows rdfs:domain foaf:Person Everybody who knows someone is a Person foaf:knows rdfs:range foaf:Person Everybody who is known is a Person foaf:Person rdfs:subclassOf foaf:Agent Everybody Person is an Agent. foaf:homepage rdf:type owl:inverseFunctionalProperty . A homepage uniquely identifies its owner (“key” property)… 343434
  • 42. RDFS+OWL inference by rules 1/2Semantics of RDFS can be partially expressed as (Datalog like) rules: rdfs1: { ?S rdf:type ?C } :- { ?S ?P ?O . ?P rdfs:domain ?C . } rdfs2: { ?O rdf:type ?C } :- { ?S ?P ?O . ?P rdfs:range ?C . } rdfs3: { ?S rdf:type ?C2 } :- {?S rdf:type ?C1 . ?C1 rdfs:subclassOf ?C2 . }cf. informative Entailment rules in [RDF-Semantics, W3C, 2004], [Muñoz et al. 2007]353535
  • 43. RDFS+OWL inference by rules 2/2OWL Reasoning e.g. inverseFunctionalProperty can also (partially) be expressed by Rules:owl1: { ?S1 owl:SameAs ?S2 } :- { ?S1 ?P ?O . ?S2 ?P ?O . ?P rdf:type owl:InverseFunctionalProperty }owl2: { ?Y ?P ?O } :- { ?Xowl:SameAs?Y . ?X ?P ?O }owl3: { ?S ?Y ?O } :- { ?Xowl:SameAs?Y . ?S ?X ?O }owl4: { ?S ?P ?Y } :- { ?Xowl:SameAs?Y . ?S ?P ?X }cf. pD* fragment of OWL, [ter Horst, 2005], or, more recent: OWL2 RL363636
  • 44. RDFS+OWL inference by rules: Example:By rules of the previous slides we can infer additional information needed, e.g.TimBL’s FOAF: <…/Berners-Lee/card#i> foaf:knows <…/Dan_Brickley> . FOAF Ontology:foaf:knows rdfs:range foaf:Personby rdfs2  <…/Dan_Brickley> rdf:type foaf:Person. TimBL’s FOAF:<…/Berners-Lee/card#i> foaf:homepage<http://www.w3.org/People/Berners-Lee/> . DBLP: <…/dblp.l3s.de/d2r/…/Tim_Berners-Lee> foaf:homepage <http://www.w3.org/People/Berners-Lee/> . FOAF Ontology:foaf:homepage rdfs:type owl:InverseFunctionalProperty.by owl1  <…/Berners-Lee/card#i> owl:sameAs <…/Tim_Berners-Lee>.Who tells me that Dan Brickley is a foaf:Person?  solved!
  • 45. Different identifiers used for Tim Berners-Lee  solved!373737
  • 46. 38Web Reasoning: ChallengesScalabilityBillions or tens of billions of statements (for the moment)
  • 47. Near linear scale!!!Noisy dataInconsistencies galore
  • 50. 39Noisy Data: Omnipotent BeingProposition 1Web data is noisy.Proof:08445a31a78661b5c746feff39a9db6e4e2cc5cfsha1-sum of ‘mailto:’
  • 51. common value for foaf:mbox_sha1sum
  • 52. An inverse-functional (uniquely identifying) property!!!
  • 53. Any person who shares the same value will be considered the same. Q.E.D.
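To see where such a shared value comes from, the sketch below hashes a bare `mailto:` URI, i.e. what an exporter would publish when the e-mail address is empty. The helper name `mbox_sha1sum` is made up for illustration; the script prints whether the digest matches the value quoted on the slide rather than assuming it.

```python
import hashlib

# Value quoted on the slide as the sha1-sum of 'mailto:'.
SLIDE_VALUE = "08445a31a78661b5c746feff39a9db6e4e2cc5cf"


def mbox_sha1sum(mailto_uri: str) -> str:
    """Hex SHA-1 digest of a mailto: URI, as used for foaf:mbox_sha1sum."""
    return hashlib.sha1(mailto_uri.encode("utf-8")).hexdigest()


empty = mbox_sha1sum("mailto:")
print(empty, "matches slide value:", empty == SLIDE_VALUE)

# Every profile exported with an empty address gets the same "unique" key.
print(mbox_sha1sum("mailto:") == mbox_sha1sum("mailto:"))
```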
  • 54. Noisy Data: Redefining Everything… and home in time for tea. More proof: from http://www.eiao.net/rdf/1.0: <owl:Property rdf:about="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"> <rdfs:label xml:lang="en">type</rdfs:label> <rdfs:comment xml:lang="en">Type of resource</rdfs:comment> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#testRun"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#pageSurvey"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#siteSurvey"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#scenario"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#rangeLocation"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#startPointer"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#endPointer"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#header"/> <rdfs:domain rdf:resource="http://www.eiao.net/rdf/1.0#runs"/> </owl:Property> Ontology hijacking!!
  • 56. Okay, so let’s do forward-chaining OWL 2 RL on billions of triples collected from the Web… foaf:mbox_sha1sum a owl:InverseFunctionalProperty . ?x foaf:mbox_sha1sum 08445a31a78661b5c746feff39a9db6e4e2cc5cf . OWL 2 RL rule prp-ifp: ?p a owl:InverseFunctionalProperty . ?x1 ?p ?z . ?x2 ?p ?z . ⇒ ?x1 owl:sameAs ?x2 . 10^4 ?x1/?x2 bindings in the body yield 10^8 inferred pair-wise and reflexive owl:sameAs statements… or in simpler terms: pow!
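The blow-up is plain counting: if n subjects share one value of an inverse-functional property, prp-ifp derives an owl:sameAs statement for every ordered pair, reflexive pairs included, i.e. n² statements. A rough sketch that counts them without materialising anything (the `ex:person…` subjects are synthetic, and `same_as_count` is a name invented for this illustration):

```python
from collections import defaultdict


def same_as_count(triples, ifp):
    """Count the owl:sameAs statements prp-ifp would derive for one IFP."""
    subjects_by_value = defaultdict(set)
    for s, p, o in triples:
        if p == ifp:
            subjects_by_value[o].add(s)
    # Each group of n subjects yields n * n ordered pairs, reflexive included.
    return sum(len(group) ** 2 for group in subjects_by_value.values())


# Synthetic data: 10^4 subjects sharing the sha1-sum of 'mailto:'.
NOISY_VALUE = "08445a31a78661b5c746feff39a9db6e4e2cc5cf"
data = [(f"ex:person{i}", "foaf:mbox_sha1sum", NOISY_VALUE) for i in range(10_000)]

print(same_as_count(data, "foaf:mbox_sha1sum"))
```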
  • 57. Our Approach… a pragmatic approach, making the necessary compromises… (and some more besides)
  • 58. SAOR: Scalable Authoritative OWL Reasoner. Apply a subset of OWL reasoning to the Billion Triple Challenge dataset. Forward-chaining, rule-based approach, e.g. [ter Horst, 2005]. Reduced output statements for the SWSE use case… Must be scalable, must be reasonable… incomplete w.r.t. OWL BY DESIGN! SCALABLE: tailored ruleset; file-scan processing; avoid joins. AUTHORITATIVE: avoid non-authoritative inference (“hijacking”, “non-standard vocabulary use”).
  • 59. Scalable Reasoning. Scan 1: scan all data (1.1b statements), separate T-Box statements, load T-Box statements (8.5m) into memory, perform authoritative analysis. Scan 2: scan all data and join all statements with the in-memory T-Box. Only works for inference rules with 0-1 A-Box patterns. No T-Box expansion by inference. Needs a “tailored” ruleset.
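The two-scan idea can be sketched as follows. This is an illustrative simplification, not SAOR’s implementation: it covers only the three RDFS rules from the earlier slides, treats just three predicates as T-Box, omits the authoritative analysis, and does not feed inferred statements back into the rules. The list `statements` stands in for a file that is streamed twice.

```python
RDF_TYPE = "rdf:type"
TBOX_PREDICATES = {"rdfs:domain", "rdfs:range", "rdfs:subClassOf"}


def scan1(statements):
    """Scan 1: pull T-Box statements into small in-memory indexes."""
    tbox = {p: {} for p in TBOX_PREDICATES}
    for s, p, o in statements:
        if p in TBOX_PREDICATES:
            tbox[p].setdefault(s, set()).add(o)
    return tbox


def scan2(statements, tbox):
    """Scan 2: join each A-Box statement, on its own, with the T-Box."""
    for s, p, o in statements:
        if p in TBOX_PREDICATES:
            continue  # T-Box statements were handled in scan 1
        for cls in tbox["rdfs:domain"].get(p, ()):
            yield (s, RDF_TYPE, cls)          # rdfs1
        for cls in tbox["rdfs:range"].get(p, ()):
            yield (o, RDF_TYPE, cls)          # rdfs2
        if p == RDF_TYPE:
            for sup in tbox["rdfs:subClassOf"].get(o, ()):
                yield (s, RDF_TYPE, sup)      # rdfs3


statements = [
    ("foaf:knows", "rdfs:domain", "foaf:Person"),
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:carol", "rdf:type", "foaf:Person"),
]
inferred = set(scan2(statements, scan1(statements)))
print(sorted(inferred))
```

Because every rule here has at most one A-Box pattern, scan 2 never has to look at two data statements together, so no A-Box join is needed.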
  • 60. Rules Applied: Tailored version of [ter Horst, 2005]
  • 61. Good “excuses” to avoid G2 rules. The obvious one: G2 rules would need joins, i.e. they would trigger a restart of the file scan. The interesting one: take for instance the IFP rule: maybe not such a good idea on real Web data. More experiments including G2 and G3 rules in [Hogan, Harth, Polleres, IJSWIS 2009].
  • 62. Authoritative Reasoning (Ontology Hijacking). Document D is authoritative for concept C iff: C is not identified by a URI, OR the de-referenced URI of C coincides with or redirects to D. FOAF spec authoritative for foaf:Person ✓ MY spec not authoritative for foaf:Person ✘ Only allow extension in authoritative documents: my:Person rdfs:subClassOf foaf:Person . (MY spec) ✓ BUT: reduce obscure memberships: foaf:Person rdfs:subClassOf my:Person . (MY spec) ✘ Similarly for other T-Box statements. The in-memory T-Box stores authoritative values for rule execution.
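A hedged sketch of the authority check, under stated assumptions: the redirect table is a hard-coded, hypothetical stand-in for real HTTP dereferencing, the `example.org` URIs for "MY spec" are invented, and only the rdfs:subClassOf case from the slide is shown (the document must speak for the subclass).

```python
def namespace_document(term_uri, redirects):
    """Strip the fragment, then follow the (hypothetical) redirect table."""
    doc = term_uri.split("#", 1)[0]
    seen = set()
    while doc in redirects and doc not in seen:
        seen.add(doc)
        doc = redirects[doc]
    return doc


def is_authoritative(document, term, redirects):
    """D is authoritative for C if C is a blank node or dereferences to D."""
    if term.startswith("_:"):  # blank node: not identified by a URI
        return True
    return namespace_document(term, redirects) == document


def accept_subclass_axiom(document, sub, sup, redirects):
    """Keep `sub rdfs:subClassOf sup` only if the document speaks for `sub`."""
    return is_authoritative(document, sub, redirects)


# Illustrative values only; the redirect below is assumed, not looked up.
FOAF_SPEC = "http://xmlns.com/foaf/spec/"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
MY_SPEC = "http://example.org/my"
MY_PERSON = "http://example.org/my#Person"
redirects = {FOAF_PERSON: FOAF_SPEC}

print(is_authoritative(FOAF_SPEC, FOAF_PERSON, redirects))                 # FOAF spec
print(is_authoritative(MY_SPEC, FOAF_PERSON, redirects))                   # MY spec
print(accept_subclass_axiom(MY_SPEC, MY_PERSON, FOAF_PERSON, redirects))   # extension
print(accept_subclass_axiom(MY_SPEC, FOAF_PERSON, MY_PERSON, redirects))   # hijacking
```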
  • 63. Rules Applied. The 17 rules applied, including the statements considered to be T-Box, the elements which must be authoritatively spoken for (including for bnode OWL abstract syntax), and the output count.
  • 64. Authoritative Reasoning covers rdfs:/owl: vocabulary misuse (rdfs:/owl: hijacking). http://www.polleres.net/nasty.rdf: rdfs:subClassOf rdfs:subPropertyOf rdfs:Resource . rdfs:subClassOf rdfs:subPropertyOf rdfs:subPropertyOf . rdf:type rdfs:subPropertyOf rdfs:subClassOf . rdfs:subClassOf rdf:type owl:SymmetricProperty . Naïve rule application would infer O(n^3) triples. By use of authoritative reasoning, SAOR/SWSE doesn’t stumble over these.
  • 65. Performance. Graph showing SAOR’s rate of input/output statements per minute for reasoning on 1.1b statements: reduced input rate correlates with increased output rate and vice versa.
  • 66. Results. SCAN 1 (in-mem T-Box creation, authoritative analysis): 6.47 hrs. SCAN 2 (scan reasoning: join A-Box with in-mem authoritative T-Box): 9.82 hrs. 1.925b new statements inferred in 16.29 hrs. 1.1b + 1.9b inferred = 3 billion triples in SWSE. On our agenda: more valuable insights on our experiences from Web data; G2 and G3 rules still difficult.
  • 67. Is that enough? Well, good starting points, we believe…
  • 68. … but still many open challenges…
  • 69. Parallelise reasoning [Weaver, Hendler ISWC2009, Urbani et al. ESWC2010] … still only for RDFS or synthetic data.
  • 70. Alternative approaches for Object consolidation needed, e.g. [Hogan et al. NeFoRS2010]
  • 71. Query live data [Harth et al. WWW2010]
  • 72. Full SPARQL querying (SPARQL 1.1)
  • 73. More on Data Quality on the Web [Hogan et al. LDOW2010]
  • 74. Visit: http://pedantic-web.org/ Already several successes in finding/fixing: FOAF, dbpedia, NYtimes, even W3C specs… etc.
  • 75. Linked Open Data. So, can we go home and declare success? Not yet… But a lot of work in the right direction is ongoing! … Good: leaves us some more research to do ;-)
  • 76. Acknowledgements. This talk draws on a lot of work from different research groups in DERI:
  • 77. Unit for Social Software (SIOC - John Breslin, SMOB - Alexandre Passant and their students)
  • 78. Unit for Reasoning and Querying (SAOR – Aidan Hogan, XSPARQL – Nuno Lopes, Semantic Drupal – Stephane Corlosquet, Lin Clark)
  • 79. Other people involved: Stefan Decker, Andreas Harth, Thomas Krennwallner, …