Validating and Describing 
Linked Data Portals using 
RDF Shape Expressions 
Eric Prud'hommeaux 
World Wide Web 
Consortium 
MIT, Cambridge, MA, USA 
eric@w3.org 
Jose Emilio Labra Gayo 
WESO Research group 
University of Oviedo 
Spain 
labra@uniovi.es 
Harold Solbrig 
Mayo Clinic 
College of Medicine, Rochester, 
MN, USA 
Jose María Álvarez Rodríguez 
Dept. Computer Science 
Carlos III University 
Spain 
josemaria.alvarez@uc3m.es
This talk in one slide 
Shape Expressions 
Simple language to describe and validate RDF 
Can be applied to linked data portals 
Use case: WebIndex project
Web Index 
Measure WWW's contribution to development and 
human rights by country 
Developed by the Web Foundation 
81 countries, 116 indicators, 5 years (2007-12) 
Linked data portal 
http://data.webfoundation.org/webindex/2013
WebIndex workflow 
Data (Excel) → Conversion (Excel → RDF) → RDF Datastore 
Enrichment of the RDF Datastore 
Outputs: Visualizations, Linked data portal
WebIndex data model 
Observations have values by years 
Observations refer to indicators and countries 
ITU_B 2011 2012 2013 ... 
Germany 20.34 35.46 37.12 ... 
Spain 19.12 23.78 25.45 ... 
France 20.12 21.34 28.34 ... 
... ... ... ... ... 
ITU_B 2010 2011 2012 ... 
Germany 20.34 35.46 37.12 ... 
Spain 19.12 23.78 25.45 ... 
France 20.12 21.34 28.34 ... 
... ... ... ... ... 
Model based on RDF Data Cube 
Main entity = Observation 
DataSets are published by Organizations 
Datasets contain several slices 
Slices group observations 
Observation 
Indicator Years 
% Broadband subscribers 
Countries 
Slice 
DataSet 
Indicators are provided by Organizations 
Examples 
ITU = International Telecommunication Union 
UN = United Nations 
WB = World bank 
...
Main WebIndex data model* 
DataSet 
rdf:type = qb:DataSet 
qb:structure : wf:DSD 
rdfs:label : xsd:string 
Slice 
rdf:type = qb:Slice 
qb:sliceStructure : wf:sliceByArea 
Observation 
rdf:type = qb:Observation 
cex:value : xsd:float 
dc:issued : xsd:dateTime 
rdfs:label : xsd:string 
cex:ref-year : xsd:gYear 
Country 
rdf:type = wf:Country 
wf:iso2 : xsd:string 
wf:iso3 : xsd:string 
rdfs:label : xsd:string 
Indicator 
rdf:type = cex:Primary | cex:Secondary 
rdfs:label : xsd:string 
rdfs:comment : xsd:string 
skos:notation : xsd:string 
Organization 
rdf:type = org:Organization 
rdfs:label : xsd:string 
foaf:homepage : IRI 
Relationships: 
DataSet → Slice via qb:slice (1..n) 
Slice → Observation via qb:observation (1..n) 
Observation → Country via cex:ref-area 
Observation → Indicator via cex:indicator 
Indicator → Organization via wf:provider 
DataSet → Organization via dc:publisher 
*Simplified
Excel  RDF (Turtle) 
indicator:ITU_B 
a wf:SecondaryIndicator ; 
rdfs:label "Broadband subscribers %" 
. 
dataset:DITU a qb:DataSet ; 
rdfs:label "ITU Dataset" ; 
dc:publisher org:ITU ; 
qb:slice slice:ITU10B , 
slice:ITU11B, 
. ... 
... 
slice:ITU11B a qb:Slice ; 
qb:sliceStructure wf:sliceByYear ; 
qb:observation obs:obs8165, 
obs:obs8166, 
... 
... 
org:ITU a org:Organization ; 
rdfs:label "ITU" ; 
foaf:homepage <http://www.itu.int/> 
. 
country:Spain a wf:Country ; 
wf:iso2 "ES" ; wf:iso3 "ESP" ; 
rdfs:label "Spain" 
. 
interrelated 
linked 
data 
obs:obs8165 a qb:Observation ; 
rdfs:label "ITU B in ESP, 2011" ; 
cex:indicator indicator:ITU_B ; 
qb:dataSet dataset:DITU ; 
cex:value "23.78"^^xsd:float ; 
cex:ref-year 2011 ; 
cex:ref-area country:Spain ; 
dc:issued "2013-05-30"^^xsd:date ; 
... 
.
Description and Validation 
Lots of constraints 
Observations must be linked to some country 
Observations have a float value 
Observations are related with an indicator, a country 
and a year 
Dataset contains several slices and slices contain 
several observations 
....etc. 
Q: How can we express those constraints easily? 
Our proposal: Shape expressions
Shape Expressions 
A simple and intuitive language to: 
Describe the topology of RDF data 
Validate that an RDF graph matches a shape 
Two syntaxes 
- Compact syntax (inspired by RelaxNG, Turtle and SPARQL) 
- RDF 
More details about ShEx: 
Paper in Semantics-2014 (5th Sept, 10:30h) 
http://www.w3.org/2013/ShEx/Primer
Country 
A <Country> has at least the following properties: 
rdf:type with value wf:Country 
rdfs:label with value of type xsd:string 
wf:iso2 with value of type xsd:string 
wf:iso3 with value of type xsd:string 
Using Shape Expressions: 
<Country> { 
rdf:type (wf:Country) 
, rdfs:label xsd:string 
, wf:iso2 xsd:string 
, wf:iso3 xsd:string 
} 
<Country> is the shape label, the braces { } delimit an open shape, and the commas express conjunction.
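The reading of the <Country> shape above can be sketched in a few lines of Python. This is only an illustration, not how any ShEx implementation works: the triple representation, the helper names (lit, has_value, has_datatype, matches_open_shape) and the extra dc:creator triple are assumptions made for the example. It treats every constraint as "exactly one arc", which is the default cardinality the SPARQL translation later in the talk also assumes.

```python
# Triples are (subject, predicate, object) tuples. IRIs are plain strings,
# literals are (lexical form, datatype) pairs. All names here are hypothetical.

def lit(value, datatype="xsd:string"):
    """Build a typed literal as a (lexical form, datatype) pair."""
    return (value, datatype)

def has_value(expected):
    """Constraint: the object must be exactly this IRI."""
    return lambda obj: obj == expected

def has_datatype(datatype):
    """Constraint: the object must be a literal with this datatype."""
    return lambda obj: isinstance(obj, tuple) and obj[1] == datatype

COUNTRY_SHAPE = {
    "rdf:type": has_value("wf:Country"),
    "rdfs:label": has_datatype("xsd:string"),
    "wf:iso2": has_datatype("xsd:string"),
    "wf:iso3": has_datatype("xsd:string"),
}

def matches_open_shape(node, graph, shape):
    """Conjunction: every constraint needs exactly one matching arc.
    Open shape: predicates not mentioned in the shape are ignored."""
    for predicate, check in shape.items():
        objects = [o for s, p, o in graph if s == node and p == predicate]
        if len(objects) != 1 or not check(objects[0]):
            return False
    return True

graph = {
    ("country:Spain", "rdf:type", "wf:Country"),
    ("country:Spain", "rdfs:label", lit("Spain")),
    ("country:Spain", "wf:iso2", lit("ES")),
    ("country:Spain", "wf:iso3", lit("ESP")),
    ("country:Spain", "dc:creator", "org:ITU"),      # extra arc, tolerated by an open shape
    ("country:Nowhere", "rdf:type", "wf:Country"),   # label and ISO codes are missing
}

print(matches_open_shape("country:Spain", graph, COUNTRY_SHAPE))
print(matches_open_shape("country:Nowhere", graph, COUNTRY_SHAPE))
```

The first node matches despite its extra arc; the second fails because the conjunction requires all four properties.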
DataSets 
A <DataSet> has the shape: 
rdf:type with value qb:DataSet 
qb:structure with value wf:DSD 
dc:publisher with shape <Organization> 
Optional rdfs:label with value of type xsd:string 
One or more qb:slice with shape <Slice> 
<DataSet> { 
rdf:type (qb:DataSet) 
, qb:structure (wf:DSD) 
, dc:publisher @<Organization> 
, rdfs:label xsd:string ? 
, qb:slice @<Slice>+ 
} 
Cardinality possibilities: 
* (0 or more) 
? (0 or 1) 
+ (1 or more) 
{m,n} between m and n
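The cardinality markers can be read as bounds on the number of matching arcs. The following Python sketch shows that reading; the function names bounds and cardinality_ok are hypothetical, and the empty marker standing for "exactly one" is an assumption based on how the unmarked constraints are used in this talk.

```python
import math

# Marker -> (minimum, maximum) number of arcs allowed.
CARDINALITY = {
    "": (1, 1),          # no marker: exactly one (assumed default)
    "?": (0, 1),         # 0 or 1
    "*": (0, math.inf),  # 0 or more
    "+": (1, math.inf),  # 1 or more
}

def bounds(marker):
    """Translate a cardinality marker into a (min, max) pair."""
    if marker.startswith("{"):
        low, high = marker.strip("{}").split(",")
        return int(low), int(high)
    return CARDINALITY[marker]

def cardinality_ok(count, marker):
    """True if `count` arcs satisfy the given marker."""
    low, high = bounds(marker)
    return low <= count <= high

print(cardinality_ok(0, "?"))      # an optional rdfs:label may be absent
print(cardinality_ok(0, "+"))      # qb:slice @<Slice>+ needs at least one slice
print(cardinality_ok(3, "{2,4}"))  # between 2 and 4
```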
Slices 
<Slice> has the properties: 
rdf:type with value qb:Slice 
qb:sliceStructure with value wf:sliceByYear 
Several qb:observation with shape <Observation> 
cex:indicator with shape <Indicator> 
<Slice> { 
rdf:type (qb:Slice) 
, qb:sliceStructure (wf:sliceByYear) 
, qb:observation @<Observation>+ 
, cex:indicator @<Indicator> 
}
Observations 
<Observation> { 
rdf:type (qb:Observation) 
, cex:value xsd:float ? 
, dc:issued xsd:dateTime 
, rdfs:label xsd:string ? 
, qb:dataSet @<DataSet> 
, cex:ref-area @<Country> 
, cex:indicator @<Indicator> 
, cex:ref-year xsd:gYear 
}
...and more 
Indicators 
<Indicator> { 
rdf:type (wf:PrimaryIndicator wf:SecondaryIndicator) 
, rdfs:label xsd:string 
, rdfs:comment xsd:string ? 
, skos:notation xsd:string ? 
} 
Organizations 
<Organization> { 
rdf:type (org:Organization) 
, rdfs:label xsd:string 
, foaf:homepage IRI 
, org:hasSubOrganization @<Organization> 
}
Current Use of shape expressions 
in WebIndex 
1. Documentation of linked data portal 
Human-readable 
Machine processable 
2. Team communication 
Tell the developers which shapes they must generate 
3. Validation 
For example: check if a value of type 
qb:Observation has shape <Observation>
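The validation use case, selecting every node typed qb:Observation and checking it against a shape, could look roughly like this in Python. It is a deliberately reduced sketch: it only checks that four of the required arcs of <Observation> are present, and the names REQUIRED, nodes_of_type, missing_arcs and validation_report are made up for the example.

```python
# Required arcs: a simplified subset of the <Observation> shape (presence only,
# no datatype or nested-shape checks).
REQUIRED = ["qb:dataSet", "cex:ref-area", "cex:indicator", "cex:ref-year"]

def nodes_of_type(graph, rdf_class):
    """All subjects that have rdf:type `rdf_class`."""
    return sorted({s for s, p, o in graph if p == "rdf:type" and o == rdf_class})

def missing_arcs(node, graph):
    """Required predicates that `node` does not have."""
    present = {p for s, p, o in graph if s == node}
    return [p for p in REQUIRED if p not in present]

def validation_report(graph):
    """Map each observation to the list of required arcs it lacks."""
    return {node: missing_arcs(node, graph)
            for node in nodes_of_type(graph, "qb:Observation")}

graph = {
    ("obs:obs8165", "rdf:type", "qb:Observation"),
    ("obs:obs8165", "qb:dataSet", "dataset:DITU"),
    ("obs:obs8165", "cex:ref-area", "country:Spain"),
    ("obs:obs8165", "cex:indicator", "indicator:ITU_B"),
    ("obs:obs8165", "cex:ref-year", "2011"),
    ("obs:obs8166", "rdf:type", "qb:Observation"),
    ("obs:obs8166", "qb:dataSet", "dataset:DITU"),
    ("obs:obs8166", "cex:indicator", "indicator:ITU_B"),
    ("obs:obs8166", "cex:ref-year", "2011"),       # no cex:ref-area on this node
    ("country:Spain", "rdf:type", "wf:Country"),
}

for node, missing in validation_report(graph).items():
    print(node, "OK" if not missing else "missing " + ", ".join(missing))
```

An empty list means the node passed this reduced check; a real validator would also follow the references to <DataSet>, <Country> and <Indicator>.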
OK, but...why not use SPARQL? 
CONSTRUCT { 
  ?Organization shex:hasShape <Organization> . 
} WHERE { 
  { SELECT ?Organization { 
      ?Organization a ?o . 
    } GROUP BY ?Organization HAVING (COUNT(*)=1)} 
  { SELECT ?Organization { 
      ?Organization a ?o . FILTER ((?o = org:Organization)) 
    } GROUP BY ?Organization HAVING (COUNT(*)=1)} 
  { SELECT ?Organization { 
      ?Organization rdfs:label ?o . 
    } GROUP BY ?Organization HAVING (COUNT(*)=1)} 
  { SELECT ?Organization { 
      ?Organization rdfs:label ?o . 
      FILTER ((isLiteral(?o) && datatype(?o) = xsd:string)) 
    } GROUP BY ?Organization HAVING (COUNT(*)=1)} 
  { SELECT ?Organization { 
      ?Organization foaf:homepage ?o . 
    } GROUP BY ?Organization HAVING (COUNT(*)=1)} 
  { SELECT ?Organization { 
      ?Organization foaf:homepage ?o . FILTER (isIRI(?o)) 
    } GROUP BY ?Organization HAVING (COUNT(*)=1)} 
  { SELECT ?Organization (COUNT(*) AS ?Organization_c0) { 
      ?Organization org:hasSubOrganization ?o . 
    } GROUP BY ?Organization} 
  { SELECT ?Organization (COUNT(*) AS ?Organization_c1) { 
      ?Organization org:hasSubOrganization ?o . 
    } GROUP BY ?Organization} 
  FILTER (?Organization_c0 = ?Organization_c1) 
} 
SPARQL 
Organization shape 
Shape Expressions 
<Organization> { 
rdf:type (org:Organization) 
, rdfs:label xsd:string 
, foaf:homepage IRI 
, org:hasSubOrganization @<Organization> 
} 
Full example of WebIndex: 
61 lines ShEx vs 580 lines SPARQL
Implementations of 
Shape Expressions 
Name (main developer, language): features 
FancyDemo (Eric Prud'hommeaux, JavaScript): first implementation; Semantic Actions and conversion to SPARQL 
http://www.w3.org/2013/ShEx/ 
JsShExTest (Jesse van Dam, JavaScript): supports RDF and Compact syntax 
https://github.com/jessevdam/shextest 
ShExcala (Jose E. Labra, Scala): several extensions (negations, reverse arcs, relations, ...); efficient implementation using derivatives 
http://labra.github.io/ShExcala/ 
Haws (Jose E. Labra, Haskell): prototype to check inference semantics 
http://labra.github.io/haws/
RDFShape: Online validation tool 
RDFShape = online validation tool 
Deployed at: http://rdfshape.weso.es 
Based on ShExcala 
Can validate by: 
URI 
File 
Textarea 
SPARQL endpoint 
Dereference
Shapes ≠ Types 
Types oriented to concepts 
Inference: RDF Schema, OWL 
Shapes oriented to RDF graphs 
Interface descriptions 
WebIndex 
<Observation> { 
rdf:type (qb:Observation) 
,cex:ref-year xsd:gYear 
...other properties from WebIndex 
} 
LandPortal 
<Observation> { 
rdf:type (qb:Observation) 
, lb:time @<Time> 
...other properties from LandPortal 
} 
RDF Data Cube 
qb:Observation a rdfs:Class, 
owl:Class; 
rdfs:label "Observation"@en; 
rdfs:comment "... "@en; 
rdfs:subClassOf qb:Attachable; 
owl:equivalentClass scovo:Item; 
rdfs:isDefinedBy ... 
Shapes can be local to a data portal; types are global 
However, we can also define generic Shapes templates
Extensions & challenges 
Shape Expressions language has just been born 
It will be affected by the W3C group chartered for 
RDF Data Shapes 
Mailing list: public-rdf-shapes@w3.org 
"The discussion on public-rdf-shapes@w3.org is the 
best entertainment since years; 
Game of Thrones colors pale." @PaulZH
Alternatives, negations, etc. 
More expressiveness from regular expressions 
Alternatives 
Negations 
Groupings 
<Country> { a (wf:Country) 
, rdfs:label xsd:string 
, ( wf:iso2 xsd:string | wf:iso3 xsd:string ) 
, ! dc:creator . 
}
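A rough Python reading of the alternative and negation operators in the <Country> shape above follows. It is a sketch under stated assumptions: the alternative is taken to be satisfied when at least one of wf:iso2 or wf:iso3 is present, datatypes are not checked, and the function names and sample nodes are hypothetical.

```python
def predicates_of(node, graph):
    """Set of predicates on the outgoing arcs of `node`."""
    return {p for s, p, o in graph if s == node}

def matches_country(node, graph):
    """Check type, label, (iso2 | iso3) and the negated dc:creator arc."""
    preds = predicates_of(node, graph)
    has_type = (node, "rdf:type", "wf:Country") in graph
    has_label = "rdfs:label" in preds
    has_code = "wf:iso2" in preds or "wf:iso3" in preds  # alternative (assumed: at least one)
    no_creator = "dc:creator" not in preds               # negation: no dc:creator arc at all
    return has_type and has_label and has_code and no_creator

graph = {
    ("country:Spain", "rdf:type", "wf:Country"),
    ("country:Spain", "rdfs:label", "Spain"),
    ("country:Spain", "wf:iso2", "ES"),
    ("country:France", "rdf:type", "wf:Country"),
    ("country:France", "rdfs:label", "France"),
    ("country:France", "wf:iso3", "FRA"),
    ("country:France", "dc:creator", "org:ITU"),   # violates the negation
    ("country:Germany", "rdf:type", "wf:Country"),
    ("country:Germany", "rdfs:label", "Germany"),  # neither iso2 nor iso3
}

for country in ("country:Spain", "country:France", "country:Germany"):
    print(country, matches_country(country, graph))
```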
Open vs Closed shapes 
Open shapes allow remaining triples after validation 
<User> { 
a (foaf:Person) 
} 
:john a foaf:Person ; 
foaf:name "John" . 
<User> [ 
a (foaf:Person) 
] 
:anna a foaf:Person . 
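Open shape <User> { ... }: both :john and :anna match, because remaining triples such as foaf:name are allowed. 
Closed shape <User> [ ... ]: only :anna matches, because :john has a remaining foaf:name triple. 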

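The difference between open and closed shapes comes down to what happens with the triples left over after matching. A minimal Python sketch, assuming a tuple-based graph and a hypothetical matches_user function:

```python
def matches_user(node, graph, closed):
    """<User> requires `a foaf:Person`. A closed shape additionally rejects
    any remaining arcs on the node; an open shape ignores them."""
    arcs = {(p, o) for s, p, o in graph if s == node}
    required = ("rdf:type", "foaf:Person")
    if required not in arcs:
        return False
    remaining = arcs - {required}
    return not remaining if closed else True

graph = {
    (":john", "rdf:type", "foaf:Person"),
    (":john", "foaf:name", "John"),   # remaining triple
    (":anna", "rdf:type", "foaf:Person"),
}

for node in (":john", ":anna"):
    print(node,
          "open:", matches_user(node, graph, closed=False),
          "closed:", matches_user(node, graph, closed=True))
```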
Shape inclusions 
A shape includes/extends another shape 
Example: 
<Provider> extends <Organization> { 
wf:sourceURI IRI 
} 
A <Provider> has the same shape as an <Organization> 
plus the property wf:sourceURI
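One way to picture shape inclusion is as merging constraint tables. The Python sketch below assumes shapes are plain dictionaries from predicate to a constraint string, which is an illustration device and not the language's actual data model; the extend helper is hypothetical.

```python
# Constraints of <Organization>, written as predicate -> constraint text.
ORGANIZATION = {
    "rdf:type": "(org:Organization)",
    "rdfs:label": "xsd:string",
    "foaf:homepage": "IRI",
    "org:hasSubOrganization": "@<Organization>",
}

def extend(base, extra):
    """Return a new shape with all constraints of `base` plus those in `extra`."""
    merged = dict(base)   # copy, so the base shape is left untouched
    merged.update(extra)
    return merged

PROVIDER = extend(ORGANIZATION, {"wf:sourceURI": "IRI"})

for predicate, constraint in PROVIDER.items():
    print(predicate, constraint)
```

Copying rather than mutating keeps <Organization> usable on its own after <Provider> is derived from it.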
Reverse and relation arcs 
Shape Expressions describe subjects 
Can be extended to describe 
objects (reverse arcs) 
predicates (relation arcs) 
<Country> { 
rdf:type (wf:Country) 
, rdfs:label xsd:string 
, wf:iso2 xsd:string 
, wf:iso3 xsd:string 
, ^ cex:ref-area @<Observation> * 
} 
Reverse arc: a country has several incoming arcs with property cex:ref-area 
from subjects with shape <Observation>
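A reverse arc constrains the subjects that point at a node rather than the node's own properties. The sketch below shows the idea in Python, with one simplification stated plainly: "has shape <Observation>" is approximated by "has rdf:type qb:Observation". The function names and the slice:S1 node are hypothetical.

```python
def incoming(node, predicate, graph):
    """Subjects that point at `node` through `predicate`."""
    return sorted(s for s, p, o in graph if p == predicate and o == node)

def reverse_arc_ok(country, graph):
    """^ cex:ref-area @<Observation> * : zero or more incoming arcs,
    each coming from an observation (approximated by its rdf:type)."""
    subjects = incoming(country, "cex:ref-area", graph)
    return all((s, "rdf:type", "qb:Observation") in graph for s in subjects)

graph = {
    ("obs:obs8165", "rdf:type", "qb:Observation"),
    ("obs:obs8165", "cex:ref-area", "country:Spain"),
    ("slice:S1", "rdf:type", "qb:Slice"),
    ("slice:S1", "cex:ref-area", "country:France"),  # incoming arc from a non-observation
}

print(incoming("country:Spain", "cex:ref-area", graph))
print(reverse_arc_ok("country:Spain", graph))
print(reverse_arc_ok("country:France", graph))
print(reverse_arc_ok("country:Germany", graph))  # no incoming arcs: allowed by *
```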
Semantic actions 
Defines actions to be executed during validation 
%lang{ ...actions... %} 
Calls lang processor passing it the given actions 
<Country> { 
... 
wf:iso2 xsd:string %js{return /^[A-Z]{2}$/i.test(_.o.lex); %} 
, wf:iso3 xsd:string %js{return /^[A-Z]{3}$/i.test(_.o.lex); %} 
}
Variables and filters 
Add variable bindings to matched values 
Add FILTER rules similar to SPARQL 
<Country> { 
wf:iso2 (xsd:string AS ?iso2) 
wf:iso3 (xsd:string AS ?iso3) 
FILTER (regex(?iso2,"^[A-Z]{2}$","i") && 
regex(?iso3,"^[A-Z]{3}$","i")) 
}
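The FILTER above applies case-insensitive regular expressions to the bound ISO codes. The same check can be sketched in Python with the standard re module; the function name valid_codes is hypothetical, and fullmatch is used in place of the ^ and $ anchors.

```python
import re

# Case-insensitive patterns equivalent to "^[A-Z]{2}$" and "^[A-Z]{3}$" with the "i" flag.
ISO2 = re.compile(r"[A-Z]{2}", re.IGNORECASE)
ISO3 = re.compile(r"[A-Z]{3}", re.IGNORECASE)

def valid_codes(iso2, iso3):
    """True if both codes have the expected length and contain only letters."""
    return bool(ISO2.fullmatch(iso2)) and bool(ISO3.fullmatch(iso3))

print(valid_codes("ES", "ESP"))
print(valid_codes("ESP", "ES"))
print(valid_codes("E1", "ESP"))
```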
Conclusions 
Shape Expressions: 
DSL to describe and validate RDF 
Role similar to DTDs or Schema languages for XML 
Quality of linked data portals 
Shape Expressions as interface descriptions 
Publishers: Understand what to publish 
Consumers: Check data before processing
Future Work 
Shape Expressions language 
Named graphs 
Implementations: Debugging and error messages 
Performance 
Applications to 
Other linked data use cases 
User interface generation 
Binding: generate parsers/tools from shapes 
...
End of presentation 
More info: 
Jose Emilio Labra Gayo 
http://www.weso.es
