Practical Cross-Dataset Queries
      on the Web of Data
   Tutorial @ WWW2012, Lyon, France
  Richard Cyganiak, Knud Möller, Anja Jentzsch,
  Andreas Schultz, Robert Isele, Pablo Mendes
The Web is becoming a platform for
          data exchange.
• Microdata, Schema.org, web APIs, Linked Data
  Cloud, Open Data movement, …
• Often need to combine local and remote data
  from several heterogeneous sources
• Scripting and mash-ups. This works, but can
  we do better?
SPARQL as a query language
             for the Web
• Data from all of these data sources can be
  converted to RDF using off-the-shelf tools, or
  the sources are already RDF.
• SPARQL is W3C's standard query language for
  RDF
• SPARQL 1.1 just out, great new features for
  working with heterogeneous data
Caveats
• We will focus on ad-hoc queries.
• This is not just about what works, but also
  about what doesn't work.
How to get data into RDF format
• Relational: R2RML standard; D2RQ, Virtuoso
  RDF Views, Revelytix Spyder
• Excel, CSV: RDF Extension for Google Refine,
  XLWrap
• XML: XSPARQL
• JSON: JSON-LD
• Microformats, Microdata: Apache Any23
• Collect data from many web pages: LDSpider
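To make the CSV case concrete, here is a minimal hand-rolled sketch in Python. It is not how any of the tools listed above work; it only shows the basic idea of turning rows into triples. The `http://example.org/` namespace, the `row/` and `prop/` IRI patterns, and the sample data are assumptions made for illustration, and key values and column names are assumed to be safe to paste into an IRI.

```python
import csv
import io

# Hypothetical base namespace for minted subject and predicate IRIs.
BASE = "http://example.org/"

def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def csv_to_ntriples(text: str, key_column: str) -> list[str]:
    """Emit one triple per non-empty, non-key cell of each CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumes the key value needs no IRI escaping.
        subject = f"<{BASE}row/{row[key_column]}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue  # skip the key itself and empty cells
            predicate = f"<{BASE}prop/{column}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples

sample = "id,name,city\n1,Alice,Lyon\n2,Bob,\n"
for line in csv_to_ntriples(sample, "id"):
    print(line)
```

The output is plain N-Triples, which a SPARQL engine can load directly as a local RDF file.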
SPARQL: The big picture
Scenario: Remote SPARQL endpoint
[Diagram: SPARQL client; SPARQL Protocol; SPARQL engine; RDF store]
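As a rough sketch of the client side of this scenario, the following Python snippet builds a SPARQL Protocol query request using only the standard library. The endpoint URL is a placeholder, not a real service, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; substitute a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"

QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL Protocol query-via-GET request (not sent here)."""
    # The protocol passes the query text in the URL-encoded `query` parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
# To execute against a live endpoint: urllib.request.urlopen(request),
# then parse the JSON response body.
```

Everything behind the endpoint (engine and store) stays opaque to the client; only the query text and the result format cross the wire.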
Scenario: Local SPARQL store
[Diagram: SPARQL client; SPARQL engine; RDF store]
Scenario: Local SPARQL engine,
load data from files on the fly, no store
[Diagram: SPARQL client; SPARQL engine; inputs: local RDF file, non-RDF file (via conversion), remote RDF file]
Scenario: CONSTRUCT the input data
[Diagram: SPARQL client; SPARQL engine; two local RDF files, each produced by a SPARQL CONSTRUCT query against a SPARQL engine with an RDF store]
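What a CONSTRUCT query does can be illustrated with a toy in-memory sketch in Python. This is not a SPARQL engine: it handles a single triple pattern only, and the prefixed names (`ex:`, `src:`, `foaf:`) are plain strings chosen for the example.

```python
# Triples are (subject, predicate, object) tuples of strings.
# In patterns and templates, terms starting with "?" are variables.

def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A repeated variable must bind to the same value each time.
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return bindings

def construct(template, where, data):
    """Instantiate the template once per triple matching the WHERE pattern."""
    result = set()
    for triple in data:
        bindings = match(where, triple)
        if bindings is None:
            continue
        # Skip templates with unbound variables, as SPARQL does.
        if all(t in bindings for t in template if t.startswith("?")):
            result.add(tuple(bindings.get(t, t) for t in template))
    return result

source = [
    ("ex:alice", "src:fullName", "Alice"),
    ("ex:bob", "src:fullName", "Bob"),
    ("ex:alice", "src:age", "30"),
]

# Roughly: CONSTRUCT { ?p foaf:name ?n } WHERE { ?p src:fullName ?n }
mapped = construct(("?p", "foaf:name", "?n"), ("?p", "src:fullName", "?n"), source)
for triple in sorted(mapped):
    print(triple)
```

The result is a new set of triples in the target vocabulary, which can be saved as a local RDF file and queried like any other input.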
Scenario: Federated Query
[Diagram: SPARQL client; SPARQL engine with a local RDF file; Basic Federated Query; remote SPARQL engine with an RDF store]
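A federated query of this shape uses the SPARQL 1.1 `SERVICE` keyword. The Python sketch below only assembles the query text. The endpoint URL and the `ex:population` property are hypothetical, and the local data is assumed to describe people with `foaf:name` and `foaf:based_near`.

```python
from string import Template

# SPARQL text with one placeholder for the remote endpoint IRI.
QUERY_TEMPLATE = Template("""\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/vocab#>
SELECT ?person ?name ?population
WHERE {
  # Matched against the local data loaded by the engine.
  ?person foaf:name ?name ;
          foaf:based_near ?city .
  # Sent to the remote endpoint; results are joined on ?city.
  SERVICE <$endpoint> {
    ?city ex:population ?population .
  }
}
""")

def federated_query(remote_endpoint: str) -> str:
    """Return a SPARQL 1.1 query that joins local data with a remote endpoint."""
    return QUERY_TEMPLATE.substitute(endpoint=remote_endpoint)

query = federated_query("http://example.org/sparql")
print(query)
```

The local engine evaluates the outer pattern itself and delegates only the `SERVICE` block to the remote engine.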
… or any combination of these.
Agenda – Morning
•   Linked Data Basics
•   SPARQL Basics
•   10:30–11:00 Coffee
•   Federated queries with SPARQL
•   Hands-on session 1
•   12:30–13:30 Lunch
Agenda – Afternoon
•   12:30–13:30 Lunch
•   Schema mapping with SPARQL CONSTRUCT
•   Instance matching with Silk
•   Finding RDF datasets
•   15:00–15:30 Coffee
•   Visualizing SPARQL query results
•   Hands-on session 2
•   17:00 Adjourn
Hands-on sessions
• USB sticks with data, queries, and instructions
• Install Apache Jena command line tools
• Need a browser with a JavaScript console
  (recommended: Firefox+Firebug or Chrome)
Music
Presenters
•   Richard Cyganiak, DERI
•   Knud Möller, Talis
•   Anja Jentzsch, FU Berlin
•   Andreas Schultz, FU Berlin
•   Robert Isele, FU Berlin
•   Pablo Mendes, FU Berlin
•   (Christophe Guéret, VUA)
•   (Michael Hausenblas, DERI)
Please interrupt and
   ask questions!
