Consuming Linked Data
Juan F. Sequeda
Semantic Technology Conference
June 2011
Now what can we do with this data?
Linked Data Applications
  • A software system that uses data on the web from multiple datasets and benefits from the links between them
Characteristics of Linked Data Applications
  • Consume data published on the web following the Linked Data principles: an application should be able to request, retrieve and process the accessed data
  • Discover further information by following the links between different data sources; the fourth Linked Data principle (link to other URIs) enables this
  • Combine the consumed Linked Data with data from other sources (not necessarily Linked Data)
  • Expose the combined data back to the web following the Linked Data principles
  • Offer value to end users
Generic Applications
Linked Data Browsers
  • Not actually separate browsers: they run inside HTML browsers
  • View the data returned after looking up a URI in tabular form
  • Users can navigate between data sources by following RDF links
  • (IMO) No usability
Consuming Linked Data 4/5 Semtech2011
Linked Data Browsers
  • http://browse.semanticweb.org/
  • Tabulator
  • OpenLink Data Explorer
  • Zitgist
  • Marbles
  • Explorator
  • Disco
  • LinkSailor
Linked Data (Semantic Web) Search Engines
  • Just like conventional search engines (Google, Bing, Yahoo), they crawl RDF documents and follow RDF links
  • Current conventional search engines don’t crawl data, unless it’s RDFa
  • Human-focused search
      • Falcons: keyword
      • SWSE: keyword
      • VisiNav: complex queries
  • Machine-focused search
      • Sindice: data instances
      • Swoogle: ontologies
      • Watson: ontologies
      • Uberblic: curated, integrated data instances
(Semantic) SEO++
  • Mark up your HTML with RDFa
  • Use standard vocabularies (ontologies)
      • Google Vocabulary
      • GoodRelations
      • Dublin Core
  • Google and Yahoo will crawl this data and use it for better rendering
On-the-fly Mashups
http://sig.ma
Domain Specific Applications
Domain Specific Applications
  • Government: Data.gov, Data.gov.uk, http://data-gov.tw.rpi.edu/wiki/Demos
  • Music: Seevl.net
  • DBpedia Mobile
  • Life Science: LinkedLifeData
  • Sports: BBC World Cup
Faceted Browsers
http://dbpedia.neofonie.de/browse/
http://dev.semsol.com/2010/semtech/
Query your data
Find all the locations of all the original paintings of Modigliani
Select all proteins that are linked to a curated interaction from the literature and to inflammatory response
  • http://linkedlifedata.com/
SPARQL Endpoints
  • Linked Data sources usually provide a SPARQL endpoint for their dataset(s)
  • SPARQL endpoint: a SPARQL query processing service that supports the SPARQL protocol*
  • Send your SPARQL query, receive the result
* http://www.w3.org/TR/rdf-sparql-protocol/
Where can I find SPARQL Endpoints?
  • DBpedia: http://dbpedia.org/sparql
  • MusicBrainz: http://dbtune.org/musicbrainz/sparql
  • U.S. Census: http://www.rdfabout.com/sparql
  • http://esw.w3.org/topic/SparqlEndpoints
Accessing a SPARQL Endpoint
  • SPARQL endpoints are RESTful Web services
  • Issuing a SPARQL query to a remote SPARQL endpoint is basically an HTTP GET request to the endpoint with the parameter query, whose value is the URL-encoded string with the SPARQL query

      GET /sparql?query=PREFIX+rd... HTTP/1.1
      Host: dbpedia.org
      User-agent: my-sparql-client/0.1
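The GET request above can be assembled with nothing but the Java standard library. The following is a minimal sketch, assuming Java 10 or later; the class name SparqlGetUrl and the sample query are hypothetical and only illustrate the URL-encoding step.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any SPARQL library.
final class SparqlGetUrl {

    /** Appends the URL-encoded SPARQL text as the "query" parameter of the endpoint address. */
    static String build(String endpoint, String sparql) {
        // URLEncoder applies form encoding: spaces become '+', reserved characters become %XX.
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return endpoint + "?query=" + encoded;
    }

    public static void main(String[] args) {
        // Prints the address a client would request with HTTP GET.
        System.out.println(build("http://dbpedia.org/sparql",
                                 "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"));
    }
}
```

The sketch only builds the address; sending it is left to whatever HTTP client the application already uses.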
Query Results Formats
  • SPARQL endpoints usually support different result formats:
      • XML, JSON, plain text (for ASK and SELECT queries)
      • RDF/XML, N-Triples, Turtle, N3 (for DESCRIBE and CONSTRUCT queries)
Query Results Formats

      PREFIX dbp: <http://dbpedia.org/ontology/>
      PREFIX dbpprop: <http://dbpedia.org/property/>
      SELECT ?name ?bday
      WHERE {
          ?p dbp:birthPlace <http://dbpedia.org/resource/Berlin> .
          ?p dbpprop:dateOfBirth ?bday .
          ?p dbpprop:name ?name .
      }
Query Result Formats
  • Use the Accept header to request the preferred result format:

      GET /sparql?query=PREFIX+rd... HTTP/1.1
      Host: dbpedia.org
      User-agent: my-sparql-client/0.1
      Accept: application/sparql-results+json
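The same request, including the Accept header, can be expressed with the java.net.http API that ships with Java 11 and later. This is a sketch under that assumption; the class name SparqlRequest and the method name selectAsJson are hypothetical, and the code only builds the request without sending it.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any SPARQL library.
final class SparqlRequest {

    /** Builds a GET request that asks the endpoint for SPARQL results in JSON. */
    static HttpRequest selectAsJson(String endpoint, String sparql) {
        String url = endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                // Content negotiation: name the preferred result format.
                .header("Accept", "application/sparql-results+json")
                .header("User-Agent", "my-sparql-client/0.1")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = selectAsJson("http://dbpedia.org/sparql", "ASK { ?s ?p ?o }");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

To actually run the query, the request could be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); whether a given endpoint honours the Accept header is up to that endpoint.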
Query Result Formats
  • As an alternative, some SPARQL endpoint implementations (e.g. Joseki) provide an additional parameter out:

      GET /sparql?out=json&query=... HTTP/1.1
      Host: dbpedia.org
      User-agent: my-sparql-client/0.1
Accessing a SPARQL Endpoint
  • More convenient: use a library
      • SPARQL JavaScript Library: http://www.thefigtrees.net/lee/blog/2006/04/sparql_calendar_demo_a_sparql.html
      • ARC for PHP: http://arc.semsol.org/
      • RAP (RDF API for PHP): http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/index.html
Accessing a SPARQL Endpoint
  • Jena / ARQ (Java): http://jena.sourceforge.net/
  • Sesame (Java): http://www.openrdf.org/
  • SPARQL Wrapper (Python): http://sparql-wrapper.sourceforge.net/
  • PySPARQL (Python): http://code.google.com/p/pysparql/
Accessing a SPARQL Endpoint
  • Example with Jena/ARQ

      import com.hp.hpl.jena.query.*;

      String service = "..."; // address of the SPARQL endpoint
      String query = "SELECT ..."; // your SPARQL query
      QueryExecution e = QueryExecutionFactory.sparqlService(service, query);
      ResultSet results = e.execSelect();
      while ( results.hasNext() ) {
          QuerySolution s = results.nextSolution(); // one row of variable bindings
          // ...
      }
      e.close(); // release the connection to the endpoint
Querying a single dataset is quite boring compared to issuing queries over multiple datasets
Creating a Linked Data Application
Linked Data Architectures
  • Follow-up queries
  • Querying Local Cache
  • Crawling
  • Federated Query Processing
  • On-the-fly Dereferencing
Follow-up Queries
  • Idea: issue follow-up queries over other datasets based on results from previous queries
  • Substitute placeholders in query templates with values from those results

      String s1 = "http://cb.semsol.org/sparql";
      String s2 = "http://dbpedia.org/sparql";
      // template for the follow-up query; %s is replaced by a URI from the first result set
      String qTmpl = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
                   + "SELECT ?c WHERE { <%s> rdfs:comment ?c }";
      // q1: find a list of companies, filtered by some criteria, and return DBpedia URIs from them
      String q1 = "SELECT ?s WHERE { ...";
      QueryExecution e1 = QueryExecutionFactory.sparqlService(s1, q1);
      ResultSet results1 = e1.execSelect();
      while ( results1.hasNext() ) {
          QuerySolution sol = results1.nextSolution();
          String q2 = String.format( qTmpl, sol.getResource("s").getURI() );
          QueryExecution e2 = QueryExecutionFactory.sparqlService(s2, q2);
          ResultSet results2 = e2.execSelect();
          while ( results2.hasNext() ) {
              // ...
          }
          e2.close();
      }
      e1.close();
Follow-up Queries
  • Advantage
      • Queried data is up to date
  • Drawbacks
      • Requires the existence of a SPARQL endpoint for each dataset
      • Requires program logic
      • Very inefficient (one remote follow-up query per intermediate result)
Querying Local Cache
  • Idea: use an existing SPARQL endpoint that provides access to a set of copies of relevant datasets
  • Use RDF dumps of each dataset
  • SPARQL endpoints over a majority of datasets from the LOD cloud:
      • http://uberblic.org
      • http://lod.openlinksw.com/sparql
Querying a Collection of Datasets
  • Advantages:
      • No need for specific program logic
      • Includes the datasets that you want
      • Complex queries and high performance
      • Even reasoning
  • Drawbacks:
      • Depends on the existence of RDF dumps
      • Requires effort to set up and to operate the store
      • How to keep the copies in sync with the originals?
      • Queried data might be out of date
Crawling
  • Crawl RDF in advance by following RDF links
  • Integrate, clean and store in your own triplestore
  • Same way we crawl HTML today
  • LDSpider
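The crawl loop itself is small. The sketch below is a toy, not LDSpider: it assumes the "web" is an in-memory map from document URI to the triples that document contains, represents each triple as a three-element list of strings, and follows only object URIs. The class name ToyCrawler and all example.org data are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy breadth-first crawler over a simulated web (hypothetical, for illustration only).
final class ToyCrawler {

    /**
     * Starts at the seed URI, "dereferences" each URI by looking it up in the map,
     * stores every triple found, and queues each object URI not seen before.
     * Stops when the frontier is empty or maxDocs documents have been fetched.
     */
    static Set<List<String>> crawl(Map<String, List<List<String>>> web, String seed, int maxDocs) {
        Set<List<String>> store = new LinkedHashSet<>();   // stands in for the local triplestore
        Set<String> seen = new HashSet<>(List.of(seed));
        Deque<String> frontier = new ArrayDeque<>(List.of(seed));
        int fetched = 0;
        while (!frontier.isEmpty() && fetched < maxDocs) {
            String uri = frontier.poll();
            fetched++;
            for (List<String> triple : web.getOrDefault(uri, List.of())) {
                store.add(triple);
                String object = triple.get(2);
                // An RDF link: the object is a URI that can be looked up in turn.
                if (object.startsWith("http://") && seen.add(object)) {
                    frontier.add(object);
                }
            }
        }
        return store;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> web = Map.of(
            "http://example.org/a", List.of(
                List.of("http://example.org/a", "http://example.org/knows", "http://example.org/b")),
            "http://example.org/b", List.of(
                List.of("http://example.org/b", "http://example.org/name", "Bob")));
        crawl(web, "http://example.org/a", 10).forEach(System.out::println);
    }
}
```

A real crawler would replace the map lookup with an HTTP request plus RDF parsing, and would add the integration and cleaning steps listed on the slide.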
Crawling
  • Advantages:
      • No need for specific program logic
      • Independent of the existence, availability, and efficiency of SPARQL endpoints
      • Complex queries with high performance
      • Can even reason about the data
  • Drawbacks:
      • Requires effort to set up and to operate the store
      • How to keep the copies in sync with the originals?
      • Queried data might be out of date
Federated Query Processing
  • Idea: query a mediator, which distributes sub-queries to relevant sources and integrates the results
Federated Query Processing
  • Instance-based federation
      • Each thing is described by only one data source
      • Untypical for the Web of Data
  • Triple-based federation
      • No restrictions
      • Requires more distributed joins
  • Statistics about datasets required (both cases)
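A toy version of triple-based federation can make the mediator's job concrete. The sketch below is not DARQ or SemWIQ: each source is an in-memory list of triples standing in for a remote SPARQL endpoint, every triple pattern is sent to every source (no statistics are used for source selection), and the join happens at the mediator. The class name ToyMediator and the ex: terms are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy mediator for triple-based federation (hypothetical, for illustration only).
final class ToyMediator {

    /** Adds the bindings a triple gives for a pattern to base; returns null on a mismatch. */
    static Map<String, String> bind(Map<String, String> base, List<String> pattern, List<String> triple) {
        Map<String, String> out = new HashMap<>(base);
        for (int i = 0; i < 3; i++) {
            String term = pattern.get(i);
            String value = triple.get(i);
            if (term.startsWith("?")) {                        // variable: bind or check consistency
                String previous = out.putIfAbsent(term, value);
                if (previous != null && !previous.equals(value)) return null;
            } else if (!term.equals(value)) {                  // constant: must match exactly
                return null;
            }
        }
        return out;
    }

    /** Sends each pattern to every source and joins the partial results at the mediator. */
    static List<Map<String, String>> query(List<List<List<String>>> sources, List<List<String>> patterns) {
        List<Map<String, String>> solutions = new ArrayList<>();
        solutions.add(new HashMap<>());                        // one empty solution starts the join
        for (List<String> pattern : patterns) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> solution : solutions) {
                for (List<List<String>> source : sources) {    // sub-query against each source
                    for (List<String> triple : source) {
                        Map<String, String> extended = bind(solution, pattern, triple);
                        if (extended != null) next.add(extended);
                    }
                }
            }
            solutions = next;
        }
        return solutions;
    }

    public static void main(String[] args) {
        List<List<String>> companies = List.of(List.of("ex:acme", "ex:locatedIn", "ex:Berlin"));
        List<List<String>> places = List.of(List.of("ex:Berlin", "ex:country", "ex:Germany"));
        System.out.println(query(List.of(companies, places),
            List.of(List.of("?c", "ex:locatedIn", "?city"), List.of("?city", "ex:country", "?country"))));
    }
}
```

Because every pattern is broadcast to every source, the number of remote calls grows quickly; this is where the dataset statistics mentioned above would let a real mediator skip irrelevant sources.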
Federated Query Processing
  • DARQ (Distributed ARQ): http://darq.sourceforge.net/
      • Query engine for federated SPARQL queries
      • Extension of ARQ (query engine for Jena)
      • Last update: June 2006
  • Semantic Web Integrator and Query Engine (SemWIQ): http://semwiq.sourceforge.net/
      • Last update: March 2010
  • Commercial…
Federated Query Processing
  • Advantages:
      • No need for specific program logic
      • Queried data is up to date
  • Drawbacks:
      • Requires the existence of a SPARQL endpoint for each dataset
      • Requires effort to set up and configure the mediator
In any case:
  • You have to know the relevant data sources
      • When developing the app using follow-up queries
      • When selecting an existing SPARQL endpoint over a collection of dataset copies
      • When setting up your own store with a collection of dataset copies
      • When configuring your query federation system
  • You restrict yourself to the selected sources
  • There is an alternative: remember, URIs link to data
On-the-fly Dereferencing
  • Idea: discover further data by looking up relevant URIs in your application on the fly
  • Can be combined with the previous approaches
  • Linked Data browsers
Link Traversal Based Query Execution
  • Applies the idea of automated link traversal to the execution of SPARQL queries
  • Idea:
      • Intertwine query evaluation with traversal of RDF links
      • Discover data that might contribute to query results during query execution
  • Alternately:
      • Evaluate parts of the query
      • Look up URIs in intermediate solutions
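The alternation between evaluating part of the query and looking up URIs can be sketched in a few dozen lines. This is a simplified toy, not the algorithm of SQUIN or SWClLib: dereferencing is simulated by a lookup in an in-memory map, triples are three-element string lists, and patterns are evaluated strictly in the order given. The class name ToyLinkTraversal and the example.org data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy link-traversal query execution over a simulated web (hypothetical, for illustration only).
final class ToyLinkTraversal {

    /** Simulated dereferencing: copies the document behind a URI into the local data, once per URI. */
    static void lookUp(Map<String, List<List<String>>> web, String term,
                       Set<String> seen, Set<List<String>> data) {
        if (term.startsWith("http://") && seen.add(term)) {
            data.addAll(web.getOrDefault(term, List.of()));
        }
    }

    /** Extends a solution with the bindings a triple gives for a pattern; null on a mismatch. */
    static Map<String, String> extend(Map<String, String> solution, List<String> pattern, List<String> triple) {
        Map<String, String> out = new HashMap<>(solution);
        for (int i = 0; i < 3; i++) {
            String term = pattern.get(i);
            String value = triple.get(i);
            if (term.startsWith("?")) {
                String previous = out.putIfAbsent(term, value);
                if (previous != null && !previous.equals(value)) return null;
            } else if (!term.equals(value)) {
                return null;
            }
        }
        return out;
    }

    static List<Map<String, String>> execute(Map<String, List<List<String>>> web, List<List<String>> patterns) {
        Set<String> seen = new HashSet<>();
        Set<List<String>> data = new LinkedHashSet<>();
        // Seed the local data with the documents behind the URIs mentioned in the query.
        for (List<String> pattern : patterns) {
            for (String term : pattern) lookUp(web, term, seen, data);
        }
        List<Map<String, String>> solutions = new ArrayList<>();
        solutions.add(new HashMap<>());
        for (List<String> pattern : patterns) {
            // Step 1: evaluate one part of the query over the data discovered so far.
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> solution : solutions) {
                for (List<String> triple : data) {
                    Map<String, String> extended = extend(solution, pattern, triple);
                    if (extended != null) next.add(extended);
                }
            }
            // Step 2: look up the URIs bound in the intermediate solutions.
            for (Map<String, String> solution : next) {
                for (String value : solution.values()) lookUp(web, value, seen, data);
            }
            solutions = next;
        }
        return solutions;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> web = Map.of(
            "http://example.org/alice", List.of(
                List.of("http://example.org/alice", "http://example.org/knows", "http://example.org/bob")),
            "http://example.org/bob", List.of(
                List.of("http://example.org/bob", "http://example.org/name", "Bob")));
        System.out.println(execute(web, List.of(
            List.of("http://example.org/alice", "http://example.org/knows", "?friend"),
            List.of("?friend", "http://example.org/name", "?name"))));
    }
}
```

In the usage example the document about bob is never named in the query; it is discovered only because the first pattern binds ?friend to its URI, which is the point of the approach. Data found during a lookup is used only for later patterns here, one reason such results can be incomplete.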
Link Traversal Based Query Execution
  • Advantages:
      • No need to know all data sources in advance
      • No need for specific programming logic
      • Queried data is up to date
      • Does not depend on the existence of SPARQL endpoints provided by the data sources
  • Drawbacks:
      • Not as fast as a centralized collection of copies
      • Unsuitable for some queries
      • Results might be incomplete (do we care?)
Implementations
  • Semantic Web Client library (SWClLib) for Java: http://www4.wiwiss.fu-berlin.de/bizer/ng4j/semwebclient/
  • SWIC for Prolog: http://moustaki.org/swic/
Implementations
  • SQUIN: http://squin.org
      • Provides SWClLib functionality as a Web service
      • Accessible like a SPARQL endpoint
      • Install package: unzip and start
      • Less than 5 mins!
  • Convenient access with SQUIN PHP tools:

      $s = 'http:// ...'; // address of the SQUIN service
      $q = new SparqlQuerySock( $s, '... SELECT ...' );
      $res = $q->getJsonResult(); // or getXmlResult()

What else?
  • Vocabulary Mapping: foaf:name vs foo:name
  • Identity Resolution: ex:Juan owl:sameAs foo:Juan
  • Provenance
  • Data Quality
  • License
Getting Started
  • Finding URIs: use search engines
  • Finding SPARQL Endpoints