Linked Data for the
Humanities: methods and
techniques
Enrico Daga
The Open University
Aldo Gangemi
Università di Bologna
Tutorial @ DH2019, Utrecht, 8th July
Albert Meroño-Peñuela
 Vrije Universiteit Amsterdam
Special Guest
14.00 Session I
• Linked Data in a nutshell
• Producing Linked Data
15.30 (Coffee break)
16.00 Session II
• Consuming Linked Data
• Hybrid Methods
Welcome
Linked Data in a nutshell
Intro
A bit of history
Invented the web in 1989
(yeah!)
Invented the semantic
web in 1994 (duh?)
“To a computer, then, the web is a flat,
boring world devoid of meaning”
Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/
“This is a pity, as in fact documents on the
web describe real objects and imaginary
concepts, and give particular relationships
between them”
Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/
“Adding semantics to the web involves two things:
allowing documents which have information in
machine-readable forms, and allowing links to be
created with relationship values.”
Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/
“The Semantic Web is not a separate Web but an
extension of the current one, in which information
is given well-defined meaning, better enabling
computers and people to work in cooperation.”
Tim Berners-Lee, http://www.w3.org/Talks/WWW94Tim/
Ld4 dh tutorial
Linked Data is a way of publishing structured information
that allows datasets to be connected and enriched by
means of links among their entities.
• LD uses the World Wide Web as publishing platform
• Based on W3C standards - open to everyone
• Enables your data to refer to other data
• … and other data to refer to yours!
Linked Data in a nutshell
https://en.wikipedia.org/wiki/Linked_data
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
Linked Open Data in 2007
2008
https://lod-cloud.net/
2009
https://lod-cloud.net/
Linked Data: The story so far (2009)
2010
https://lod-cloud.net/
2011
https://lod-cloud.net/
http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
(crawlable) 2014
2017
https://lod-cloud.net/
03/2019
https://lod-cloud.net/
Building blocks
The very basics
• A hierarchy of languages
• Each layer exploits and uses
capabilities of the layers below
The W3C “Layer Cake”
• A principle: hypertext
• A protocol: HTTP
• An identification scheme: URNs/URIs
• A language: HTML
The traditional Web
• A principle: hypertext
• A protocol: HTTP
• An identification scheme: URNs/URIs
• A language: RDF (in place of HTML)
The semantic Web
• Uniform Resource Identifiers (URIs)
• To identify things
• HyperText Transfer Protocol (HTTP)
• To access data about them
• Resource Description Framework (RDF)
• a meta-model for data representation.
• it does not specify a particular schema
• offers a structure for representing schemas and data
• SPARQL Protocol and Query Language (SPARQL)
• To query LD databases directly on the Web
Linked Data Technology Stack
• A Uniform Resource Identifier (URI) is a compact sequence of
characters that identifies an abstract or physical resource.
[RFC3986]
• Syntax
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
• Example
foo://example.com:8042/over/there?name=ferret#nose
\_/   \______________/\_________/ \_________/ \__/
 |           |             |           |        |
scheme   authority       path        query  fragment
HTTP URIs
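The generic syntax above can be checked with a small Python sketch. It assumes only the standard library; the function name `uri_components` is made up for this illustration, and `urlsplit` calls the authority part `netloc`.

```python
from urllib.parse import urlsplit

# The example URI shown on the slide (taken from RFC 3986).
EXAMPLE_URI = "foo://example.com:8042/over/there?name=ferret#nose"


def uri_components(uri: str) -> dict:
    """Split a URI into the five generic components named in RFC 3986."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # urlsplit names the authority "netloc"
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    for name, value in uri_components(EXAMPLE_URI).items():
        print(f"{name:10} {value}")
```

Running it prints one component per line, in the same order as the diagram.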
• URIs (Uniform Resource Identifiers) are used to identify things (also
called entities) in the real world
• For instance: people, places, events, companies, products, movies, etc.
A Web of Things
HTTP
Simplest thing ever
• On top of
• The Internet Protocol (IPv4)
• Domain Name System (DNS): e.g. dbpedia.org
• A Client / Server protocol: Request -> Response
• Message structure: Headers + Body (content)
Resource Description Framework
Relationships between things are expressed by means of
a multi-directed, fully labeled graph
where
nodes can be resources or XMLSchema-typed values;
relationships are also identified by URIs
The RDF model
(the “content” of the HTTP body…)
RDF is based on an atomic element: the triple.
Triple: (subject predicate object)
- subject: a URI or a blank node
- predicate: MUST be a URI
- object: a URI, a blank node, or a literal
The RDF Triple
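The constraints on the three positions of a triple can be written down as a small check. This is only a sketch in plain Python, not the API of any RDF library; the class names `IRI`, `BlankNode`, `Literal` and `Triple` are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class Triple:
    subject: object
    predicate: object
    obj: object


def is_valid_triple(triple: Triple) -> bool:
    """Apply the position rules: subject IRI/blank node, predicate IRI,
    object IRI/blank node/literal."""
    return (
        isinstance(triple.subject, (IRI, BlankNode))
        and isinstance(triple.predicate, IRI)
        and isinstance(triple.obj, (IRI, BlankNode, Literal))
    )


# A triple from the Mozart example used later in the tutorial.
mozart_died_in_vienna = Triple(
    IRI("http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart"),
    IRI("http://dbpedia.org/ontology/deathPlace"),
    IRI("http://dbpedia.org/resource/Vienna"),
)
```

A literal in subject or predicate position makes `is_valid_triple` return `False`.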
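Example RDF Graph
[Figure: RDF graph with the following labelled edges]
• Leipzig hasAreaCode 0341
• Leipzig hasMayor Burkhard Jung
• Leipzig locatedIn Saxony
• Leipzig latitude 51.3333
• Leipzig longitude 12.3833
• Saxony locatedIn Germany
• Burkhard Jung isMayorOf Leipzig
• Burkhard Jung isMemberOf Social Democratic Party
• Burkhard Jung born 1958-03-07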
• Representation of data values
• Serialization as strings
• Interpretation based on the datatype
• Literals without Datatype are treated as strings
• and can be annotated with a language (Alpha-2): @en
Literals
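How a datatype drives the interpretation of a literal's string form can be sketched as follows. The function `literal_value` is hypothetical and covers only a few XML Schema datatypes; it also ignores details such as time zones on `xsd:date`.

```python
from datetime import date
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"


def literal_value(lexical: str, datatype: str = None):
    """Interpret the serialized string of a literal according to its datatype."""
    if datatype is None or datatype == XSD + "string":
        # Literals without a datatype are treated as strings.
        return lexical
    if datatype == XSD + "date":
        return date.fromisoformat(lexical)
    if datatype == XSD + "decimal":
        return Decimal(lexical)
    if datatype == XSD + "integer":
        return int(lexical)
    raise ValueError(f"datatype not covered by this sketch: {datatype}")
```

With the values of the example graph, `literal_value("1958-03-07", XSD + "date")` yields a Python `date`, while the area code `"0341"` without a datatype stays a string and keeps its leading zero.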
[Figure: excerpt of the example graph (Leipzig, Burkhard Jung; edges isMayorOf and hasMayor) with the literal values 51.3333 (latitude), 12.3833 (longitude) and 1958-03-07 (born)]
• N-Triples (application/n-triples)
• Turtle (text/turtle)
• RDF/XML (application/rdf+xml)
• N-Quads (application/n-quads)
• TriG (application/trig)
• […]
RDF serializations
N-Triples
https://www.w3.org/TR/n-triples/
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/deathPlace> <http://dbpedia.org/resource/Vienna> .
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/birthDate> "1756-01-27"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/deathDate> "1791-12-05"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/birthPlace> <http://dbpedia.org/resource/Salzburg> .
• Namespaces in XML: https://www.w3.org/TR/xml-names/
• Namespaces end either with # or /
• In serialisations, they are mapped to prefixes, for brevity
• http://prefix.cc to get help with namespaces and common
prefixes
• http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
• http://dbpedia.org/resource/
• dbr:Wolfgang_Amadeus_Mozart
Namespaces
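The mapping between full URIs and prefixed names is plain string substitution, as this sketch shows. The helper names `expand` and `shorten` are invented here; the two namespaces are the DBpedia ones used throughout the tutorial.

```python
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
}


def expand(prefixed_name: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as dbr:Vienna into a full URI."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


def shorten(uri: str, prefixes: dict = PREFIXES) -> str:
    """Turn a full URI into a prefixed name when a namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known namespace: leave the URI untouched
```

For example, `expand("dbr:Wolfgang_Amadeus_Mozart")` gives back the full DBpedia URI listed above, and `shorten` reverses it.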
Turtle
https://www.w3.org/TR/turtle/
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix yago: <http://dbpedia.org/class/yago/> .
@prefix wikidata: <http://www.wikidata.org/entity/> .
dbr:Wolfgang_Amadeus_Mozart
    rdfs:label "Wolfgang Amadeus Mozart" ;
    rdf:type owl:Thing , yago:WikicatGermanClassicalComposers ,
        yago:WikicatGermanComposers , dbo:Person .
dbr:Wolfgang_Amadeus_Mozart owl:sameAs wikidata:Q254 ;
    dbo:deathPlace dbr:Vienna .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbr:Wolfgang_Amadeus_Mozart dbo:deathDate "1791-12-05"^^xsd:date ;
    dbo:birthPlace dbr:Salzburg ;
    dbo:birthDate "1756-01-27"^^xsd:date .
• SPARQL: https://www.w3.org/TR/sparql11-overview/
• OWL: https://www.w3.org/OWL/
• RIF: https://www.w3.org/TR/rif-overview/
• SHACL: https://www.w3.org/TR/shacl/
• SPIN: https://spinrdf.org/
• RDFa: https://www.w3.org/TR/rdfa-syntax/
• JSON-LD: https://json-ld.org/
Many languages
• Triple Stores: database management systems that store and
query RDF
• RDF 1.1 named graphs make it possible to integrate multiple RDF
documents while preserving the context of each triple: g s p o
• Syntax: N-Quads
Named Graphs
1. Use URIs to identify the “things” in your data
2. Use http:// URIs so people (and machines) can look them
up on the web
3. When a URI is looked up, request/return a description of
the thing in RDF
4. Include links to related things (e.g. owl:sameAs)
Linked Data principles
Something very basic
http://www.w3.org/DesignIssues/LinkedData.html
Linked Data example
Follow your nose
Hands-On
• Understand URI resolution
• Grasp Content-Negotiation
• Experience graph traversal
• https://ld4humanities.github.io/ > Hands-On resources
Objectives
HTTP
Simplest thing ever
• On top of
• The Internet Protocol (IPv4)
• Domain Name System (DNS): e.g. dbpedia.org
• A Client / Server protocol: Request -> Response
• Message structure: Headers + Body
https://www.slideshare.net/randyconnolly/chapter01-presentation-16514220
Headers
• Vary between Request and Response
(two newlines)
Body
• Any data
HTTP
Message structure
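The headers/blank line/body layout can be made concrete with a few lines of Python. This is a simplified sketch: `parse_http_message` is a made-up helper, the separator on the wire is an empty line (CRLF CRLF), and repeated or folded headers are not handled.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP message into start line, headers and body."""
    # The first empty line separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        # Split on the first colon only: header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# A shortened version of the DBpedia response shown in this section.
RAW_RESPONSE = (
    "HTTP/1.1 303 See Other\r\n"
    "Content-Length: 0\r\n"
    "Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart\r\n"
    "\r\n"
)

if __name__ == "__main__":
    start_line, headers, body = parse_http_message(RAW_RESPONSE)
    print(start_line)
    print(headers["location"])
```

Header names are lower-cased in the result because they are case-insensitive in HTTP.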
GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: */*
HTTP
http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
HTTP/1.1 303 See Other
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
Access-Control-Allow-Origin: *
…
REQUEST / RESPONSE
cURL is a command line tool and library for transferring data
with URLs
wURL is a simple web app that allows non-Unix users to use
cURL from a Web browser
http://purl.org/ld4dh/wurl
https://curl.haxx.se/
… let’s try …
GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: */*
HTTP
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
HTTP/1.1 303 See Other
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
Access-Control-Allow-Origin: *
…
REQUEST / RESPONSE
GET /page/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: */*
HTTP
curl -v http://dbpedia.org/page/Wolfgang_Amadeus_Mozart
HTTP/1.1 200 OK
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Server: Virtuoso/07.20.3230
[…]
<html> […]
GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: text/turtle
HTTP
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart \
-H "Accept: text/turtle"
HTTP/1.1 303 See Other
Date: Wed, 03 Jul 2019 13:41:14 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Connection: keep-alive
Server: Virtuoso/07.20.3230
Location: http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl
GET /data/Wolfgang_Amadeus_Mozart.ttl HTTP/1.1
Host: dbpedia.org
User-Agent: curl/7.19.7
Accept: text/turtle
HTTP
curl -v http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl
HTTP/1.1 200 OK
Content-Type: text/turtle; charset=UTF-8
Content-Length: 50708
[…]
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbr: <http://dbpedia.org/resource/> .
dbr:Amadeus_Mozart dbo:wikiPageRedirects dbr:Wolfgang_Ama
dbr:The_Story_of_Mozart dbo:wikiPageRedirects dbr:Wolfga
dbr:Mozartian dbo:wikiPageRedirects dbr:Wolfgang_Amadeus_
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
curl -v "http://dbpedia.org/page/Wolfgang_Amadeus_Mozart"
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart \
-H "Accept: text/turtle"
curl -v "http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl"
curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart \
-H "Accept: text/turtle" -L
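The same content negotiation can be scripted. The sketch below uses only Python's standard library; `build_request` and `fetch` are names chosen for this illustration, and `urlopen` follows the 303 redirect on its own, much like the `-L` flag of cURL.

```python
import urllib.request

MOZART = "http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart"


def build_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Prepare a GET request that asks for a specific representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri: str, media_type: str = "text/turtle"):
    """Dereference the URI; returns (final URL, Content-Type, body text)."""
    with urllib.request.urlopen(build_request(uri, media_type)) as response:
        body = response.read().decode("utf-8")
        return response.geturl(), response.headers.get("Content-Type"), body


if __name__ == "__main__":
    # Needs network access; the final URL shows where the redirect led.
    final_url, content_type, body = fetch(MOZART)
    print(final_url, content_type, len(body))
```

Changing `media_type` to `text/html` should lead to the human-readable page instead of the RDF document.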
• In what formats is Mozart available?
• text/html, application/rdf+xml, text/n-triples, text/turtle
• Find Mozart's image (it’s a jpg)
• When was Mozart born?
• Where did Mozart die?
• How many inhabitants does that city have today?
How many Mozart?
http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
• Find the location of the experience
• When did it happen?
• Who is the listener?
• What musical opera was performed?
• Who is the author of the music that was listened to?
• Who is the performer?
• What is the genre?
• Find information about this genre
• Can you find other operas of the same genre?
a Listening Experience
http://data.open.ac.uk/led/lexp/1446304716352
This type of task is possible using a SPARQL endpoint:
http://dbpedia.org/sparql
A scent of SPARQL
“Find other operas of the same genre”
SELECT * WHERE {
?entity
<http://purl.org/dc/terms/subject>
<http://dbpedia.org/resource/Category:Grand_operas> .
}
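A query like the one above travels to the endpoint as an ordinary HTTP request. The sketch below builds the GET URL with the `query` parameter defined by the SPARQL Protocol; `sparql_get_url` is a hypothetical helper, and actually sending the request is left out.

```python
import urllib.parse

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """
SELECT * WHERE {
  ?entity
    <http://purl.org/dc/terms/subject>
    <http://dbpedia.org/resource/Category:Grand_operas> .
}
"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


if __name__ == "__main__":
    # Paste the printed URL into a browser or pass it to curl.
    print(sparql_get_url(ENDPOINT, QUERY))
```

An `Accept` header such as `application/sparql-results+json` can then be used to choose the result format.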
1. https://www.theguardian.com/this-page-does-not-exists
2. urn:issn:23346587
3. issn:23346587
4. mailto:enrico.motta@open.ac.uk
5. dbr:Music
Quiz
Which of the following are not valid RDF IRIs?
Producing Linked Data
1. Knowledge representation
1. Identify the source
2. Understand the content (domain)
3. Modelling: reuse or build an ontology
2. Produce RDF
1. Populate the ontology
2. Encode or (re)engineer in RDF - “triplification”
3. Put it on the Web and provide services to access and query the data
1. Support URI dereferencing (Content negotiation)
2. Expose a SPARQL Endpoint
3. Describe your dataset with Linked Data (ehm …start over)
So you want to do Linked Data?
• The world’s academic communities have been dealing with
knowledge representation for years
• Artificial intelligence, natural language processing, model
management, and many other research fields have contributed
substantially
• Some ancestors paved the way
How to represent knowledge?
EXAMPLE
• Instances are associated with one or several
classes:
Boddingtons rdf:type Ale .
Grafentrunk rdf:type Bock .
Hoegaarden rdf:type White .
Jever rdf:type Pilsner .
Ontologies
different levels of detail & complexity
[Figure: complexity spectrum, from light-weight to heavy-weight]
• Types, labels, descriptions, comments
• Class hierarchies, relations, documented meaning
• Basic logic: rules, inferences, transitivity, domain, range
• Description Logic reasoning: class unions, set semantics, intersections, disjointness, […]
Copyright IKS Consortium
• A vocabulary for describing properties and classes of RDF
resources
• rdfs:Resource
• rdf:type
• rdfs:Class
• rdf:Property
• rdfs:subClassOf
• rdfs:subPropertyOf
• rdfs:domain
• rdfs:range
RDF Schema
http://www.w3.org/TR/rdf-schema/
• OWL allows further axioms to be specified
• Property cardinality restrictions
• Class disjointness
• Property transitivity
• Cardinality constraints
• But beware: more expressivity means more reasoning
complexity
The Web Ontology Language (OWL)
formal language for automated reasoning
The Web Ontology Language (OWL)
formal language for automated reasoning
:Novel rdf:type owl:Class .
:Short_Story rdf:type owl:Class .
:Poetry rdf:type owl:Class .
:Literature rdf:type owl:Class ;
    owl:unionOf (:Novel :Short_Story :Poetry) .
IF   <myWork> rdf:type :Novel .
THEN <myWork> rdf:type :Literature .
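The IF/THEN step above can be imitated in a few lines. This is a toy sketch, not an OWL reasoner: the dictionary `UNION_OF` and the function `infer_types` are invented for the illustration, and only a single pass over the union axioms is made.

```python
# Union axioms: the class on the left is the union of the classes on the right.
UNION_OF = {
    "Literature": {"Novel", "Short_Story", "Poetry"},
}


def infer_types(asserted_types: set) -> set:
    """Add every union class that has one of the asserted types as an operand."""
    inferred = set(asserted_types)
    for union_class, operands in UNION_OF.items():
        if inferred & operands:
            inferred.add(union_class)
    return inferred


if __name__ == "__main__":
    # <myWork> rdf:type :Novel  entails  <myWork> rdf:type :Literature
    print(sorted(infer_types({"Novel"})))
```

Unions of unions would need the loop to be repeated until no new type is added.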
http://ontologydesignpatterns.org
• Schema layer of RDF
• Defines terms (classes and properties)
• Typically RDFS or OWL family
• Reusability is important for supporting interoperability
• Common vocabularies: Dublin Core, SKOS, FOAF, SIOC,
vCard, DOAP, Core Organization Ontology, VoID
Vocabularies
light-weight semantics
http://www.slideshare.net/prototypo/introduction-to-linked-data-rdf-vocabularies
Vocabulary: Friend-of-a-Friend (FOAF)
defines classes and properties for representing
information about people and their relationships
Soeren rdf:type foaf:Person .
Soeren currentProject http://OntoWiki.net .
Soeren foaf:homepage http://aksw.org/Soeren .
Soeren foaf:knows http://sembase.at/Tassilo .
Soeren foaf:sha1 09ac456515dee .
Vocabulary: Semantically-Interlinked Online Communities (SIOC)
Represents content from blogs, wikis, forums, mailing lists, chats,
etc.
Vocabulary: Simple Knowledge Organization
System (SKOS)
supports the use of thesauri, classification schemes, subject
heading systems and taxonomies
• DBpedia Ontology Schema:
• manually created for DBpedia (infoboxes)
• 1140 classes + 1149 object properties + 1741 datatype properties; >7K axioms (1537 on C, 2676 on
OP, 3264 on DTP: 1.3, 2.3, 1.8 ratios);
• (200M triples in DBpedia)
• YAGO:
• large hierarchy linking Wikipedia leaf categories to WordNet
• 250,000 classes
• UMBEL (Upper Mapping and Binding Exchange Layer):
• 20000 classes derived from OpenCyc
• DOLCE-Zero (Foundational Ontology, aligned to DBpedia):
• 76 classes + 105 object properties + 5 datatype properties; 596 axioms (196 on C, 389 on OP, 11 on
DTP: 2.4, 3.7, 2.2 ratios)
• presence of “restrictions”, top-level disjointness, and patterns
• Wikipedia Categories:
• Not a class hierarchy (e.g. cycles), represented using SKOS
• 415,000+ categories
2011/05/12
General Purpose Ontologies
(different levels of detail & complexity)
Domain Ontologies
(different levels of detail & complexity)
https://lov.linkeddata.es/dataset/lov/
1. From a Relational Database
2. From Web content (Scraping)
3. From XML or other structured data formats
4. From a data table (e.g. a CSV file)
5. From natural language (Sic!)
How to produce RDF?
• W3C R2RML - language to specify
mappings between SQL databases and
RDF: http://www.w3.org/TR/r2rml/
• D2RQ - gives access to relational
databases as virtual graphs: http://d2rq.org/
• DB2Triples - runs a specified R2RML file
and generates RDF: https://github.com/antidot/db2triples
1. From a relational database
http://data.cnr.it/data/cnr/individuo/CNR
• RDFa and microformats are used to embed semantic
information (expressed using the RDF model) into regular
HTML pages
• RDFa does it using existing (rel) and additional
(about, property, typeof) attributes
• Microformats only use usual HTML attributes (class)
• To extract, e.g., Apache any23: https://any23.apache.org
2. From Web pages
DBpedia is the de-facto Hub of LOD.
• descriptions of ca. 3.4 million things (1.5 million classified in a consistent ontology,
including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films,
15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases)
• labels and abstracts for these 3.2 million things in up to 92 different languages;
1,460,000 links to images and 5,543,000 links to external web pages;
4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories,
and 75,000 YAGO categories
• altogether over 1 billion pieces of information (i.e. RDF triples): 257M from the English
edition, 766M from other language editions
• DBpedia Live (http://live.dbpedia.org/sparql/) &
Mappings Wiki (http://mappings.dbpedia.org)
integrate the community into a refinement cycle
Extracting structured information from Wikipedia and making this
information available on the Web as LOD:
• link other data sets on the Web to Wikipedia data (encyclopaedic
knowledge)
• ask sophisticated queries against Wikipedia (e.g. universities in
Paris, mayors of towns in a certain region),
• Represents a community consensus
Transforming Wikipedia into a Knowledge Base
Structure in Wikipedia
• Title
• Abstract
• Infoboxes
• Geo-coordinates
• Categories
• Images
• Links
– other language versions
– other Wikipedia pages
– To the Web
– Redirects
– Disambiguations
Infobox templates
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산 광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}
http://dbpedia.org/resource/Busan
dbp:Busan dbpp:title "Busan Metropolitan City"
dbp:Busan dbpp:hangul "부산 광역시"@Hang
dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
dbp:Busan dbpp:pop "3635389"^^xsd:int
dbp:Busan dbpp:region dbp:Yeongnam
dbp:Busan dbpp:dialect dbp:Gyeongsang
...
Wikitext-Syntax
RDF representation
DBpedia SPARQL Endpoint
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?birth ?description ?person WHERE {
?person dbo:birthPlace dbr:Berlin .
?person dct:subject dbc:German_musicians .
?person dbo:birthDate ?birth .
?person foaf:name ?name .
?person rdfs:comment ?description .
FILTER (LANG(?description) = 'en') .
} ORDER BY ?name
• hosted on an OpenLink Virtuoso server
• can answer SPARQL queries like
• Give me all sitcoms that are set in NYC?
• All tennis players from Moscow?
• All films by Quentin Tarantino?
• All German musicians that were born in Berlin in the 19th
century?
• All soccer players with shirt number 11, playing for a club having
a stadium with over 40,000 seats and born in a country with
over 10 million inhabitants?
DBpedia SPARQL Endpoint
http://dbpedia.org/sparql
• Two steps:
• Remodelling task
• Reengineering task
• Web APIs
• JSON: annotate with JSON-LD https://json-ld.org/
• XML
• XML != RDF
• XML is the serialisation of a DOM tree; RDF is a graph instead, with no root.
• eXtensible Stylesheet Language Transformations (XSLT) to
generate an RDF format, e.g. N-Triples
3. From Web APIs, XML or other formats
• data.open.ac.uk is the home of The Open University LOD
• In 2010, the OU became the first university in the UK to publish LOD.
• Collects and interlinks open data from institutional
repositories of the University, and makes it available as LD
data.open.ac.uk
Open Educational Resources
• Metadata about educational resources produced
or co-produced by The Open University
• OU/BBC Coproductions | OU podcasts |
OpenLearn | Videofinder
Scientific Production
• Metadata about scientific production of The
Open University
• Open Research Online (http://oro.open.ac.uk/)
Social Media
• Content hosted by social media web sites.
• Metadata are extracted from public APIs and
aggregated into RDF.
• Audioboo | YouTube
Datasets
http://data.open.ac.uk
Organisational
• Data collected from internal repositories and first
made public as linked data.
• The OU's Key Information Set from Unistats |
OU People Profiles | KMi People Profiles | Open
University data XCRI-CAP 1.2 | Qualifications |
Courses | OU Planet Stories
Data from Research Projects
• Linked Data from research projects.
• Arts and Humanities Research Council project
metadata | The Listening Experience Database |
The UK Reading Experience Database | The
Reading Experience Database: DBpedia
alignments
• Two tasks: remodelling & reengineering
• Homemade recipe:
1. Find your identifier(s), establish namespaces
2. Map columns to predicates, establish cell value type
(URI or Literal)
3. Iterate over the rows
4. Generate a triple for each cell
4. From a data table
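The homemade recipe above can be sketched in Python. Everything specific here is an assumption made for the illustration: the namespace `http://www.example.com/`, the function name `table_to_ntriples`, and the decision to emit every cell as a plain string literal rather than choosing between URI and literal per column.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://www.example.com/"  # hypothetical namespace


def escape(value: str) -> str:
    """Escape the characters that may not appear raw in an N-Triples string."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def table_to_ntriples(csv_text: str, subject_column: str) -> list:
    """Generate one N-Triples line per non-empty cell of a CSV table."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # 1. The identifier column provides the subject URI.
        subject = f"<{BASE}resource/{quote(row[subject_column], safe='')}>"
        for column, value in row.items():
            if column is None or column == subject_column or not value:
                continue  # skip malformed extras, the key itself and empty cells
            # 2. Each column name is mapped to a predicate.
            predicate = f"<{BASE}property#{quote(column, safe='')}>"
            # 3./4. One triple per cell, with the value as a plain literal.
            lines.append(f'{subject} {predicate} "{escape(value)}" .')
    return lines


if __name__ == "__main__":
    table = "id,name,city\n1,Mozart,Salzburg\n2,Haydn,\n"
    print("\n".join(table_to_ntriples(table, "id")))
```

The empty `city` cell of the second row produces no triple, which corresponds to the "clean" step of the spreadsheet example.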
• A Google Form Spreadsheet
• Prepare column names (first row)
• Identify the Subject column (S)
• Generate a tuple for each column value (S, c, v) - G SQL
• Clean: remove tuples with empty values
• Format tuples into valid N3 triples
Example
(only reengineering)
https://docs.google.com/spreadsheets/d/1j_LHZIOhkbD61r7fSxuf4017tgbOoL_Z6tLT0oDQz_0/edit?usp=sharing
1. Load the data into a Triple Store
• Virtuoso Open Source: virtuoso.openlinksw.com
• Apache Jena: http://jena.apache.org/
• Blazegraph: www.blazegraph.com
• https://en.wikipedia.org/wiki/Comparison_of_triplestores
2. Publish the SPARQL Endpoint
3. Set up content negotiation
• http://www.example.com/…
303 to SPARQL DESCRIBE <http://www.example.com/...>
How to publish on the Web?
(signposting only here)
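The content negotiation step can be pictured as a function from the request's Accept header to a redirect target. This sketch mirrors the /resource/, /page/ and /data/ layout seen in the DBpedia hands-on; the function `negotiate`, the extension table and the `.rdf` suffix are assumptions, and quality values (`q=`) in the header are ignored.

```python
# Media types this hypothetical server can serve as RDF, with file extensions.
RDF_EXTENSIONS = {
    "text/turtle": "ttl",
    "application/rdf+xml": "rdf",  # assumed extension
}


def negotiate(resource_path: str, accept_header: str):
    """Return (status code, Location) for a request to /resource/<name>."""
    name = resource_path.rsplit("/", 1)[-1]
    # Keep only the media types, dropping parameters such as ";q=0.9".
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    for media_type in accepted:
        if media_type in RDF_EXTENSIONS:
            return 303, f"/data/{name}.{RDF_EXTENSIONS[media_type]}"
    # Browsers and generic clients get the human-readable page.
    return 303, f"/page/{name}"


if __name__ == "__main__":
    print(negotiate("/resource/Wolfgang_Amadeus_Mozart", "text/turtle"))
    print(negotiate("/resource/Wolfgang_Amadeus_Mozart", "*/*"))
```

In a real deployment the /data/ document could be produced by the SPARQL DESCRIBE query mentioned above.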
Coffee break
See you at 4pm (sharp!)
Consuming Linked Data
SPARQL
• Understand triple patterns
• Try with some features of the language
• https://ld4humanities.github.io/ > Hands-On resources
Objectives
SPARQL
SPARQL Protocol And RDF Query Language
Triple and Graph Patterns
How do we describe the structure of the RDF graph
which we're interested in?
# An RDF triple in Turtle syntax
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
dbr:Wolfgang_Amadeus_Mozart foaf:name "Wolfgang Amadeus Mozart" .
# A SPARQL triple pattern, with a single variable
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
dbr:Wolfgang_Amadeus_Mozart foaf:name ?name .
# All parts of a triple pattern can be variables
?subject foaf:name ?name .
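What matching a triple pattern means can be shown with a toy matcher. This is a sketch over an in-memory list of tuples, not how a triple store is implemented; the function `match` and the three sample triples are made up for the illustration.

```python
# A tiny graph written with prefixed names (toy data).
GRAPH = [
    ("dbr:Wolfgang_Amadeus_Mozart", "foaf:name", "Wolfgang Amadeus Mozart"),
    ("dbr:Wolfgang_Amadeus_Mozart", "dbo:deathPlace", "dbr:Vienna"),
    ("dbr:Vienna", "foaf:name", "Vienna"),
]


def is_variable(term: str) -> bool:
    return term.startswith("?")


def match(pattern: tuple, graph: list) -> list:
    """Return one binding (variable -> term) per triple that fits the pattern."""
    solutions = []
    for triple in graph:
        binding = {}
        for pattern_term, triple_term in zip(pattern, triple):
            if is_variable(pattern_term):
                # A variable used twice must take the same value both times.
                if binding.setdefault(pattern_term, triple_term) != triple_term:
                    break
            elif pattern_term != triple_term:
                break
        else:
            solutions.append(binding)
    return solutions


if __name__ == "__main__":
    print(match(("dbr:Wolfgang_Amadeus_Mozart", "foaf:name", "?name"), GRAPH))
    print(match(("?subject", "foaf:name", "?name"), GRAPH))
```

Each solution is a set of variable bindings, which is exactly what one row of a SELECT result contains.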
# Matching labels of resources
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
?subject rdfs:label ?label.
# Combine triple patterns to create a graph pattern
PREFIX dby: <http://dbpedia.org/class/yago/>
?subject rdfs:label ?label .
?subject rdf:type dby:WikicatOperaComposers .
# SPARQL is based on Turtle, which allows abbreviations
# e.g. predicate-object lists:
?subject rdfs:label ?label;
rdf:type dby:WikicatOperaComposers .
# Graph patterns allow us to traverse a graph
?person rdfs:label "Wolfgang Amadeus Mozart"@de .
?person dbo:deathPlace ?place .
?place dbo:populationTotal ?population .
Structure of a Query
What does a basic SPARQL query look like?
# Query. 1
# Associate URIs with prefixes
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Example of a SELECT query, retrieving 2 variables
# Only variables bound in the graph pattern return values
SELECT ?place ?population
WHERE {
#This is our graph pattern
?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
dbo:deathPlace ?place .
?place dbo:populationTotal ?population
}
• https://ld4humanities.github.io/ > Hands-On resources
• We will use this UI: http://yasgui.org/
• Credits:
Let’s try it out
http://about.yasgui.org/
http://laurensrietveld.nl/
# Query. 2
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Example of a SELECT query, retrieving all variables
SELECT *
WHERE {
?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
dbo:deathPlace ?place .
?place dbo:populationTotal ?population .
}
OPTIONAL bindings
How do we allow for missing or unknown
information?
# Query. 3
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?place ?population
WHERE {
#This pattern must match
?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
dbo:birthPlace ?place .
#Anything in this block doesn't have to match
OPTIONAL {
?place dbo:populationTotal ?population .
}
}
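OPTIONAL behaves like a left join between two sets of solutions. The sketch below shows that behaviour on plain Python dictionaries; the function `left_join` and the sample bindings (`ex:personA`, `ex:placeX` and so on) are hypothetical.

```python
def left_join(required: list, optional: list, shared: str) -> list:
    """Keep every required solution, extended by optional ones when they agree
    on the shared variable."""
    results = []
    for left in required:
        matches = [right for right in optional if right[shared] == left[shared]]
        if matches:
            results.extend({**left, **right} for right in matches)
        else:
            # No optional match: the solution survives, the variable stays unbound.
            results.append(dict(left))
    return results


# Solutions of the mandatory pattern (toy data).
REQUIRED = [
    {"?person": "ex:personA", "?place": "ex:placeX"},
    {"?person": "ex:personB", "?place": "ex:placeY"},
]
# Solutions of the pattern inside OPTIONAL { ... } (toy data).
OPTIONAL = [
    {"?place": "ex:placeX", "?population": 100},
]

if __name__ == "__main__":
    for solution in left_join(REQUIRED, OPTIONAL, "?place"):
        print(solution)
```

Without OPTIONAL, the second solution would be dropped because no population is known for its place.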
UNION queries
How do we allow for alternatives or variations in the
graph?
# Query. 4
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?person ?place
WHERE {
{
?person dbo:deathPlace ?place .
}
UNION
{
?person dbo:birthPlace ?place .
}
}
Sorting & Restrictions
How do we apply a sort order to the results?
How can we add restrictions?
How can we restrict the number of results returned?
# Query. 5
# Select the URI and population of all places
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
}
# Ex. 6
# Select the URI and population of all places
# with highest first
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
}
# Use an ORDER BY clause to apply a sort.
# Can be ASC or DESC
ORDER BY DESC(?population)
# Ex. 7
# Select the URI and population of all countries
# with highest first
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
FILTER EXISTS {
?place dbp:countryCode []
}
}
# Use an ORDER BY clause to apply a sort.
# Can be ASC or DESC
ORDER BY DESC(?population)
# Ex. 8
# Select the URI and population of the 11-20th most
# populated countries
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population .
FILTER EXISTS {
?place dbp:countryCode []
}
}
# Use an ORDER BY clause to apply a sort.
ORDER BY DESC(?population)
# Limit to ten results
LIMIT 10
# Apply an offset to get next “page”
OFFSET 10
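The combined effect of ORDER BY, LIMIT and OFFSET is the same as sorting a list and taking a slice. A sketch on toy solutions, with the helper name `order_limit_offset` and the `ex:place…` identifiers made up for the illustration:

```python
def order_limit_offset(rows: list, key: str, limit: int, offset: int = 0,
                       descending: bool = True) -> list:
    """Sort the solutions, skip `offset` of them, then keep at most `limit`."""
    ordered = sorted(rows, key=lambda row: row[key], reverse=descending)
    return ordered[offset:offset + limit]


# Thirty toy solutions with populations 1..30.
ROWS = [{"?place": f"ex:place{i}", "?population": i} for i in range(1, 31)]

if __name__ == "__main__":
    # ORDER BY DESC(?population) LIMIT 10 OFFSET 10: ranks 11 to 20.
    for row in order_limit_offset(ROWS, "?population", limit=10, offset=10):
        print(row["?place"], row["?population"])
```

Paging only gives stable results when the query has an ORDER BY clause, because the order of solutions is otherwise not guaranteed.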
Filtering
How do we restrict results based on aspects of the
data rather than the graph, e.g. string matching?
# In the following triple the literal is assigned a
# datatype to indicate that it is a date
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
dbr:Wolfgang_Amadeus_Mozart
dbo:birthDate "1756-01-27"^^xsd:date .
# Query. 9
# Select name of persons born between 1st Jan 1756 and
# 1st Jan 1757
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name
WHERE {
?person dbo:birthDate ?date;
foaf:name ?name.
FILTER (?date > "1756-01-01"^^xsd:date &&
?date < "1757-01-01"^^xsd:date)
}
# Query. 10
# Select the URI and population of places with an area
# below 20km^2, with most populated first
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbpp: <http://dbpedia.org/ontology/PopulatedPlace/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?place ?population
WHERE {
?place dbo:populationTotal ?population ;
dbpp:areaTotal ?area .
# Note that we have to cast the data to the right type
# As it is not declared in the data
FILTER( xsd:double(?area) < 20 )
}
ORDER BY DESC(?population)
# Query. 11
# Select persons named Wolfgang who died in Vienna
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?subject ?name
WHERE {
?subject foaf:name ?name ;
dbo:deathPlace dbr:Vienna .
FILTER( regex(?name, "Wolfgang", "i" ) )
}
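A FILTER is applied to the solutions after the graph pattern has matched. The sketch below imitates `regex(?name, "Wolfgang", "i")` with Python's `re` module; the helper `regex_filter` and the sample names are invented, and SPARQL's regular expression dialect differs from Python's in some details.

```python
import re


def regex_filter(rows: list, variable: str, pattern: str, flags: str = "") -> list:
    """Keep the solutions whose value for `variable` contains a match."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    compiled = re.compile(pattern, re_flags)
    return [row for row in rows if compiled.search(row[variable])]


# Toy solutions of the graph pattern, before filtering.
ROWS = [
    {"?subject": "ex:a", "?name": "Wolfgang Amadeus Mozart"},
    {"?subject": "ex:b", "?name": "Antonio Salieri"},
    {"?subject": "ex:c", "?name": "wolfgang example"},
]

if __name__ == "__main__":
    for row in regex_filter(ROWS, "?name", "Wolfgang", "i"):
        print(row["?name"])
```

Without the `"i"` flag the lower-case name would be filtered out as well.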
• Logical: !, &&, ||
• Math: +, -, *, /
• Comparison: =, !=, >, <, ...
• Variable tests: isURI, isBlank, isLiteral, bound
• Accessors: str, lang, datatype
• Other: sameTerm, langMatches, regex
Built-In Filters
DISTINCT
How do we remove duplicate results?
# Query. 12
# Select list of places that gave birth to German
# classical composers
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>
SELECT DISTINCT ?place
WHERE {
[] dbo:birthPlace ?place ;
dct:subject dbc:German_classical_composers
}
SPARQL Query Forms
Does SPARQL do more than just SELECT data?
ASK
Test whether the graph contains some data of
interest
# Query. 13
# Is Mozart’s date of birth 1756-01-27?
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK WHERE {
  dbr:Wolfgang_Amadeus_Mozart dbo:birthDate "1756-01-27"^^xsd:date .
}
# ASK returns a boolean value
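An ASK query can also be sent from a script. The following is a minimal Python sketch, assuming the public DBpedia endpoint at http://dbpedia.org/sparql and the standard SPARQL JSON results format, in which an ASK answer arrives as a "boolean" member. The helper names (build_request, parse_ask, ask) are hypothetical, and the resource and property in the query are assumptions based on the DBpedia data shown earlier in the tutorial.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint.
ENDPOINT = "http://dbpedia.org/sparql"

ASK_QUERY = """
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK WHERE {
  dbr:Wolfgang_Amadeus_Mozart dbo:birthDate "1756-01-27"^^xsd:date .
}
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Prepare a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_ask(payload: str) -> bool:
    """Extract the boolean from a SPARQL JSON result for an ASK query."""
    return bool(json.loads(payload)["boolean"])


def ask(endpoint: str, query: str) -> bool:
    """Send the query over the network and return the answer."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=30) as resp:
        return parse_ask(resp.read().decode("utf-8"))
```

Calling ask(ENDPOINT, ASK_QUERY) needs network access; parse_ask can be tried offline on a payload such as {"head": {}, "boolean": true}.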
DESCRIBE
Generate an RDF description of a resource(s)
# Query. 14
# Describe persons born in 1757
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dbp: <http://dbpedia.org/property/>
DESCRIBE ?person WHERE {
  ?person dbp:birthDate ?date .
  FILTER ( ?date >= "1757-01-01"^^xsd:date &&
           ?date < "1758-01-01"^^xsd:date )
}
CONSTRUCT
Create a custom RDF graph based on query criteria
Can be used to transform RDF data
@prefix exp: <http://www.example.com/property#> .
<mailto:albert.meronyo@gmail.com>
    exp:timestamp "05/06/2019 14:38:16" ;
    exp:xml_level "Basic" ;
    exp:rdf_level "Expert" ;
    exp:sparql_level "Expert" ;
    exp:web_level "Expert" ;
    exp:ontology_level "Expert" ;
    exp:ld_publishing_level "Expert" ;
    exp:ld_consumption "Expert" ;
    exp:ld_ui_level "Expert" ;
    exp:interested_in "Advanced Ontology Design, Linked Data Production, Linked Data Consumption, Success stories, Advanced Linked Data Techniques, Advanced SPARQL" ;
    exp:known_projects "JazzCats, LinkedBrainz, MIDI Linked Data cloud, LOD Laundromat" ;
    exp:plans_of_use "Yes; music, history" .
# Example
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX exp: <http://www.example.com/property#>
CONSTRUCT {
  ?person rdf:type foaf:Person
}
WHERE {
  ?person exp:timestamp []
}
# Example.
# Remodelling! Change the identifier
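PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
  ?person rdf:type foaf:Person ;
          foaf:mbox ?mbox .
  ?person ?predicate ?any
}
WHERE {
  ?mbox ?predicate ?any .
  # BIND must come after the pattern that binds ?mbox;
  # IRI() turns the concatenated string into a resource
  BIND( IRI( CONCAT(
    "http://www.ld4humanities/2019/participant/",
    MD5(STR(?mbox)) ) ) AS ?person )
}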
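The identifier-minting step of the remodelling example can also be reproduced outside SPARQL, for instance to check which new URI a given mailbox would receive. This is a minimal Python sketch under that assumption: the base IRI is copied from the slide, while the function name mint_participant_iri and the sample mailboxes are hypothetical. SPARQL's MD5() yields a lowercase hexadecimal digest, which is what hashlib produces as well.

```python
import hashlib

# Base IRI as written on the slide.
BASE = "http://www.ld4humanities/2019/participant/"


def mint_participant_iri(mbox: str) -> str:
    """Mirror CONCAT(base, MD5(STR(?mbox))): hash the mailbox IRI and append the digest."""
    digest = hashlib.md5(mbox.encode("utf-8")).hexdigest()
    return BASE + digest


if __name__ == "__main__":
    # Hypothetical mailbox, used only for illustration.
    print(mint_participant_iri("mailto:someone@example.org"))
```

Hashing keeps the new identifier stable (the same mailbox always maps to the same URI) without exposing the e-mail address in the URI itself.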
NAMED GRAPHS
SPARQL can query multiple RDF graphs together!
# Query. 15
# Search in multiple Graphs
SELECT DISTINCT ?type
FROM <http://data.open.ac.uk/context/youtube>
FROM <http://data.open.ac.uk/context/podcast>
FROM <http://data.open.ac.uk/context/openlearn>
FROM <http://data.open.ac.uk/context/course>
FROM <http://data.open.ac.uk/context/qualification>
WHERE {
  [] a ?type
}
# Query. 16
# Search in multiple named graphs, reporting the graph of each type
SELECT DISTINCT ?g ?type
FROM NAMED <http://data.open.ac.uk/context/youtube>
FROM NAMED <http://data.open.ac.uk/context/podcast>
FROM NAMED <http://data.open.ac.uk/context/openlearn>
FROM NAMED <http://data.open.ac.uk/context/course>
FROM NAMED <http://data.open.ac.uk/context/qualification>
WHERE {
  GRAPH ?g { [] a ?type }
}
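The difference between the two queries lies only in the dataset clauses: FROM merges the listed graphs into one default graph, while FROM NAMED keeps them apart so that GRAPH ?g can report where each match was found. The following Python sketch generates both variants from one list of graph IRIs; the function names dataset_clauses and types_query are hypothetical, and the graph IRIs are those of data.open.ac.uk used in the queries.

```python
# Graph IRIs taken from the data.open.ac.uk examples.
GRAPHS = [
    "http://data.open.ac.uk/context/youtube",
    "http://data.open.ac.uk/context/podcast",
    "http://data.open.ac.uk/context/openlearn",
    "http://data.open.ac.uk/context/course",
    "http://data.open.ac.uk/context/qualification",
]


def dataset_clauses(graphs, named=False):
    """Return one FROM (or FROM NAMED) line per graph IRI."""
    keyword = "FROM NAMED" if named else "FROM"
    return "\n".join(f"{keyword} <{g}>" for g in graphs)


def types_query(graphs, named=False):
    """Build the 'list all types' query for merged or named graphs."""
    if named:
        # Named graphs: the pattern must be wrapped in GRAPH ?g { ... }
        return (
            "SELECT DISTINCT ?g ?type\n"
            + dataset_clauses(graphs, named=True)
            + "\nWHERE { GRAPH ?g { [] a ?type } }"
        )
    # Merged default graph: the pattern is matched directly
    return (
        "SELECT DISTINCT ?type\n"
        + dataset_clauses(graphs)
        + "\nWHERE { [] a ?type }"
    )


if __name__ == "__main__":
    print(types_query(GRAPHS, named=True))
```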
Videos from the Open University on YouTube.
YouTube videos are linked to courses and qualifications, which in
turn are linked to other entities (OpenLearn units, Podcasts,
Audios, and other Courses or Qualifications)
Find OU content related to a YouTube video, starting from the
video itself:
https://www.youtube.com/watch?v=SYry6PYsL8o
http://data.open.ac.uk/youtube/SYry6PYsL8o
http://data.open.ac.uk
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix podcast: <http://data.open.ac.uk/podcast/ontology/>
prefix yt: <http://data.open.ac.uk/youtube/ontology/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rkb: <http://courseware.rkbexplorer.com/ontologies/courseware#>
prefix saou: <http://data.open.ac.uk/saou/ontology#>
prefix dbp: <http://dbpedia.org/property/>
prefix media: <http://purl.org/media#>
prefix olearn: <http://data.open.ac.uk/openlearn/ontology/>
prefix mlo: <http://purl.org/net/mlo/>
prefix bazaar: <http://digitalbazaar.com/media/>
prefix schema: <http://schema.org/>
SELECT
distinct
(?related as ?identifier)
?type
?label
(str(?location) as ?link)
FROM <http://data.open.ac.uk/context/youtube>
FROM <http://data.open.ac.uk/context/podcast>
FROM <http://data.open.ac.uk/context/openlearn>
FROM <http://data.open.ac.uk/context/course>
FROM <http://data.open.ac.uk/context/qualification>
WHERE
{
?x schema:productID "SYry6PYsL8o" . # change the youtube id to any OU youtube video
?x yt:relatesToCourse ?course .
{
# related video podcasts
?related podcast:relatesToCourse ?course .
?related a podcast:VideoPodcast .
?related rdfs:label ?label .
optional { ?related bazaar:download ?location }
BIND( "VideoPodcast" as ?type ) .
} union {
# related audio podcasts
?related podcast:relatesToCourse ?course .
?related a podcast:AudioPodcast .
?related rdfs:label ?label .
optional { ?related bazaar:download ?location }
BIND( "AudioPodcast" as ?type ) .
} union {
# related openlearn units
?related a olearn:OpenLearnUnit .
?related olearn:relatesToCourse ?course .
BIND( "OpenLearnUnit" as ?type ) .
?related <http://dbpedia.org/property/url> ?location .
?related rdfs:label ?label .
} union {
# related qualifications (compulsory course)
?related a mlo:qualification .
?related saou:hasPathway/saou:hasStage/saou:includesCompulsoryCourse ?course .
BIND( "Qualification" as ?type ) .
?related rdfs:label ?label .
?related mlo:url ?location
}
} limit 200
Content recommendation
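When a query like this is consumed by an application, the result usually arrives in the SPARQL JSON results format, and variables bound only inside OPTIONAL (here the download link) may be absent from a row. The sketch below, in Python, flattens such a response into plain dictionaries; the function name rows and the sample payload are hypothetical and only illustrate the shape of the format.

```python
import json

# Hypothetical response: one related podcast without a download link.
SAMPLE = """
{
  "head": {"vars": ["identifier", "type", "label", "link"]},
  "results": {"bindings": [
    {"identifier": {"type": "uri", "value": "http://example.org/podcast/1"},
     "type": {"type": "literal", "value": "AudioPodcast"},
     "label": {"type": "literal", "value": "An example podcast"}}
  ]}
}
"""


def rows(payload: str):
    """Flatten SPARQL JSON SELECT results; unbound variables become None."""
    doc = json.loads(payload)
    variables = doc["head"]["vars"]
    return [
        {v: (binding[v]["value"] if v in binding else None) for v in variables}
        for binding in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    for row in rows(SAMPLE):
        print(row["type"], row["label"], row["link"])
```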
• SPARQL 1.1 W3C Recommendation
- https://www.w3.org/TR/sparql11-query/
• YASGUI SPARQL editor
- http://yasgui.org/
Useful Links
Success Stories
Uses data.open.ac.uk to get
content recommendations (e.g.
courses).
data.open.ac.uk drives the
click through which turns
OpenLearn visitors into OU
students!
Publish once, display
everywhere (from YouTube,
Audioboo, iTunesU, Podcast)
OpenLearn
http://www.open.edu/openlearn/
An open and freely
searchable database that
brings together a mass of
data about people’s
experiences of listening to
music of all kinds, in any
historical period and any
culture.
Reuse from LOD
Uses data.open.ac.uk as
publishing platform.
RDF, “natively”
The Listening Experience Database Project
http://led.kmi.open.ac.uk/
Feedback welcome: @enridaga #kmiou
https://recogito.pelagios.org/
http://data.cnr.it/data/cnr/individuo/CNR
Semantic Scouting
http://webtemp.src.cnr.it/semanticscouting/
2009
Hybrid methods
• Most Linked Data is actually metadata: it describes
resources, documents, and people, and it is essentially
structured
• However, LD can also be used to enhance content such as
text or music!
• Two case studies:
• @Albert - MIDI Linked Data Cloud
• FindLEr: find evidence of Listening Experiences
• Hands-On
LD with content
Hands-On
A basic recipe:
1. Text
2. Link to an LD graph with Named Entity Recognition (NER)
- e.g. DBpedia
3. Explore the graph to find common nodes between
entities
4. Suggest subjects for the text
Case study: find relevant topics
Text
Senior academics and politicians have condemned UK universities for failing to tackle endemic
racism against students and staff after a Guardian investigation found widespread evidence of
discrimination in the sector.
University staff from minority backgrounds said the findings showed there was “absolute
resistance” to dealing with the problem. Responses to freedom of information (FoI) requests the
Guardian sent to 131 universities showed that students and staff made at least 996 formal
complaints of racism over the past five years.
Of these, 367 were upheld, resulting in at least 78 student suspensions or expulsions and 51 staff
suspensions, dismissals and resignations.
But even these official figures are believed to underestimate the scale of racism in higher
education, with two separate investigations by the Guardian and the Equality and Human Rights
Commission identifying hundreds more cases that were not formally investigated by universities.
Scores of black and minority ethnic students and lecturers have told the Guardian they were
dissuaded from making official complaints and either dropped their allegations or settled for an
informal resolution. They said white university staff were often reluctant to address racism, with
racial slurs treated as banter or an inevitable byproduct of freedom of speech, and institutional
racism poorly recognised.
https://www.theguardian.com/education/2019/jul/05/uk-universities-condemned-for-failure-to-tackle-racism
https://www.dbpedia-spotlight.org/demo/
<http://dbpedia.org/resource/United_Kingdom>
<http://dbpedia.org/resource/Endemism>
<http://dbpedia.org/resource/Racism>
<http://dbpedia.org/resource/Australian_Human_Rights_Commission>
<http://dbpedia.org/resource/Institutional_racism>
SPARQL query
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT (count(?node) as ?sc) ?obj WHERE {
  ?node dct:subject ?cat .
  # {0,5} limits the path to at most five skos:broader steps; this
  # quantifier is not in standard SPARQL 1.1 but is accepted by
  # Virtuoso, the server behind the DBpedia endpoint
  ?cat skos:broader{0,5} ?obj .
  VALUES (?node) {
    (<http://dbpedia.org/resource/United_Kingdom>)
    (<http://dbpedia.org/resource/Endemism>)
    (<http://dbpedia.org/resource/Racism>)
    (<http://dbpedia.org/resource/Australian_Human_Rights_Commission>)
    (<http://dbpedia.org/resource/Institutional_racism>)
  }
}
group by ?obj
order by desc(?sc)
limit 10
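What the query computes can be illustrated on a toy graph, without any endpoint. The Python sketch below walks up to five broader steps from each entity's categories and ranks the categories by how many entities reach them. All data in it is hypothetical and stands in for DBpedia; the function names ancestors and suggest_subjects are made up for this illustration. As a simplification, each entity is counted at most once per category.

```python
from collections import Counter

# Hypothetical toy data: entity -> categories (the dct:subject links).
SUBJECTS = {
    "Racism": ["Category:Racism"],
    "Institutional_racism": ["Category:Institutional_racism"],
    "Endemism": ["Category:Biogeography"],
}

# Hypothetical toy data: category -> broader categories (skos:broader).
BROADER = {
    "Category:Institutional_racism": ["Category:Racism"],
    "Category:Racism": ["Category:Discrimination"],
    "Category:Biogeography": ["Category:Ecology"],
}


def ancestors(category, max_hops=5):
    """Categories reachable in 0..max_hops broader steps (the start included)."""
    seen = {category}
    frontier = [category]
    for _ in range(max_hops):
        frontier = [b for c in frontier for b in BROADER.get(c, []) if b not in seen]
        seen.update(frontier)
    return seen


def suggest_subjects(entities, limit=10):
    """Rank categories by the number of entities that reach them."""
    score = Counter()
    for entity in entities:
        reached = set()
        for cat in SUBJECTS.get(entity, []):
            reached |= ancestors(cat)
        score.update(reached)
    return score.most_common(limit)


if __name__ == "__main__":
    for category, count in suggest_subjects(list(SUBJECTS)):
        print(count, category)
```

Categories shared by several entities rise to the top, which is the intuition behind suggesting them as subjects for the text.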
Open Issues
• Interoperability between these repositories (how to align their ontologies and entity
names?) is usually partial
• Quality
• owl:sameAs is very rarely “same as”. See http://sameas.org
• Completeness
• Principled Low Commitment (e.g. 404, 406, …)
• How to distinguish entities and documents?
• Method on top of the “Follow your nose” approach still to be developed
• What about incoming links?
• Licences? Policies?
• Availability of open data (limited resources). Some proposals, e.g. Linked Data Fragments
• User interfaces for LD operations - not only visualisation - still missing
Open Issues
Link and Open Your Data
Scholars & Institutions in the humanities are very
good at building high quality databases (e.g. thesauri,
gazetteers) but most of them are still closed!
Some sources of inspiration …
• EUCLID Project: http://euclid-project.eu/
• Randy Connolly’s slides about Web Development: https://www.slideshare.net/randyconnolly
• Linked Data Patterns book
• http://patterns.dataincubator.org/book/
Credits
Thank you!
@enridaga @aldogangemi

  • 1. Linked Data for the Humanities: methods and techniques Enrico Daga The Open University Aldo Gangemi Università di Bologna Tutorial @ DH2019, Utrecht, 8th July Albert Meroño-Peñuela  Vrije Universiteit Amsterdam Special Guest
  • 2. 14.00 Session I • Linked Data in a nutshell • Producing Linked Data 15.30 (Coffee break) 16.00 Session II • Consuming Linked Data • Hybrid Methods Welcome
  • 3. Linked Data in a nutshell
  • 4. Intro A bit of history
  • 5. Invented the web in 1989 (yeah!) Invented the semantic web in 1994 (duh?)
  • 6. “To a computer, then, the web is a flat, boring world devoid of meaning” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 7. “This is a pity, as in fact documents on the web describe real objects and imaginary concepts, and give particular relationships between them” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 8. “Adding semantics to the web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values.” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 9. “The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” Tim Berners Lee, http://www.w3.org/Talks/WWW94Tim/
  • 11. Linked Data is a way of publishing structured information that allows datasets to be connected and enriched by the means of links among their entities. • LD uses the World Wide Web as publishing platform • Based on W3C standards - open to everyone • Enables your data to refer to other data • … and other data to refer to yours! Linked Data in a nutshell h"ps://en.wikipedia.org/wiki/Linked_data
  • 12. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/” Linked Open Data in 2007
  • 15. Linked Data: The story so far (2009)
  • 23. • A principle: hypertext • A protocol: HTTP • An identification scheme: URNs/URIs • A language: HTML The traditional Web
  • 24. • A principle: hypertext • A protocol: HTTP • An identification scheme: URNs/URIs • A language: HTML RDF The semantic Web
  • 25. • Uniform Resource Identifiers (URIs) • To identify things • HyperText Transfer Protocol (HTTP) • To access data about them • Resource Description Framework (RDF) • a meta-model for data representation. • it does not specify a particular schema • offers a structure for representing schemas and data • SPARQL Protocol and Query Language (SPARQL) • To query LD databases directly on the Web Linked Data Technology Stack
  • 26. • A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. [RFC3986] • Syntax URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] • Example foo://example.com:8042/over/there?name=ferret#nose _/ _________________/_________/ __________/ __/ | | | | | scheme authority path query fragment HTTP URIs
  • 27. • URIs (Unique Resource Identifiers) are used to identify things (also called entities) in the real world • For instance: people, places, events, companies, products, movies, etc. A Web of Things
  • 28. HTTP Simplest thing ever • On top of • The Internet Protocol (IPv4) • Domain Name System (DNS): e.g. dbpedia.org • A Client / Server protocol: Request -> Response • Message structure: Headers + Body (content)
  • 29. Resource Description Framework Rela%onships between things are expressed by the means of a multi-directed, fully labeled graph where nodes could be resources or XMLSchema-typed values; rela%onships are also identified by URIs The RDF model (the “content” of the HTTP body…)
  • 30. RDF is based on an atomic element: the triple. Triple: (subject predicate object) - subject: a URI or a blank node - predicate: MUST be a URI - object: a URI, a blank node, or a literal The RDF Triple
  • 32. • Representation of data values • Serialization as strings • Interpretation based on the datatype • Literals without Datatype are treated as strings • and can be annotated with a language (Alpha-2): @en Literals Leipzig Burkhard Jung 51.3333latitude 12.3833 longitude 1958-03-07 born isMayorOf hasMayor
  • 33. • N-Triples (application/n-triples) • Turtle (text/turtle) • RDF/XML (application/rdf+xml) • N-Quads (application/n-quads) • TriG (application/trig) • […] RDF serializations
  • 34. N-Triples https://www.w3.org/TR/n-triples/ <http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/deathPlace> <http://dbpedia.org/resource/Vienna> . <http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/birthDate> "1756-1-27"^^<http://www.w3.org/2001/XMLSchema#date> . <http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/deathDate> "1791-12-5"^^<http://www.w3.org/2001/XMLSchema#date> . <http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart> <http://dbpedia.org/ontology/birthPlace> <http://dbpedia.org/resource/Salzburg> .
  • 35. • Namespaces in XML: https://www.w3.org/TR/xml-names/ • Namespaces end either with # or / • In serialisations, are mapped to prefixes, for brevity • http://prefix.cc to get help with namespaces and common prefixes • http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart • http://dbpedia.org/resource/ • dbr:Wolfgang_Amadeus_Mozart Namespaces
  • 36. Turtle https://www.w3.org/TR/turtle/ @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix dbr: <http://dbpedia.org/resource/> . @prefix dbo: <http://dbpedia.org/ontology/> . @prefix yago: <http://dbpedia.org/class/yago/> . @prefix wikidata: <http://www.wikidata.org/entity/> . dbr:Wolfgang_Amadeus_Mozart rdfs:label "Wolfgang Amadeus Mozart” ; rdf:type owl:Thing , yago:WikicatGermanClassicalComposers , yago:WikicatGermanComposers , dbo:Person . dbr:Wolfgang_Amadeus_Mozart owl:sameAs wikidata:Q254 ; dbo:deathPlace dbr:Vienna . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . dbr:Wolfgang_Amadeus_Mozart dbo:deathDate "1791-12-5"^^xsd:date ; dbo:birthPlace dbr:Salzburg ; dbo:birthDate "1756-1-27"^^xsd:date .
  • 37. • SPARQL: https://www.w3.org/TR/sparql11-overview/ • OWL: https://www.w3.org/OWL/ • RIF: https://www.w3.org/TR/rif-overview/ • SHACL: https://www.w3.org/TR/shacl/ • SPIN: https://spinrdf.org/ • RDFa: https://www.w3.org/TR/rdfa-syntax/ • JSON-LD: https://json-ld.org/ Many languages
  • 38. • Triple Stores: database management systems that allow to query RDF • RDF1.1 named graphs allow to integrate multiple RDF documents preserving the context of each triple: g s p o • Syntax: N-Quads Named Graphs
  • 39. 1. Use URIs to identify the “things” in your data 2. Use h2p:// URIs so people (and machines) can look them up on the web 3. When a URI is looked up, request/return a descrip%on of the thing in RDF 4. Include links to related things (e.g. owl:sameAs) Linked Data principles Something very basic http://www.w3.org/DesignIssues/LinkedData.html
  • 42. • Understand URI resolution • Grasp Content-Negotiation • Experience graph traversal • https://ld4humanities.github.io/ > Hands-On resources Objectives
  • 43. HTTP Simplest thing ever • On top of • The Internet Protocol (IPv4) • Domain Name System (DNS): e.g. dbpedia.org • A Client / Server protocol: Request -> Response • Message structure: Headers + Body https://www.slideshare.net/randyconnolly/chapter01-presentation-16514220
  • 44. Headers • Vary between Request and Response (two newlines) Body • Any data HTTP Message structure
  • 45. GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: */* HTTP http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart HTTP/1.1 303 See Other Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart Access-Control-Allow-Origin: * … REQUESTRESPONSE
  • 46. cURL is a command line tool and library for transferring data with URLs wURL is a simple web app that allows non-Unix users to use cURL from a Web browser http://purl.org/ld4dh/wurl https://curl.haxx.se/ … let’s try …
  • 47. GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: */* HTTP curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart HTTP/1.1 303 See Other Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart Access-Control-Allow-Origin: * … REQUESTRESPONSE
  • 48. GET /page/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: */* HTTP curl -v http://dbpedia.org/page/Wolfgang_Amadeus_Mozart HTTP/1.1 200 OK Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/page/Wolfgang_Amadeus_Mozart […] <html> […]
  • 49. GET /resource/Wolfgang_Amadeus_Mozart HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: text/turtle HTTP curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart -H “Accept: text/turtle” HTTP/1.1 303 See Other Date: Wed, 03 Jul 2019 13:41:14 GMT Content-Type: text/html; charset=UTF-8 Content-Length: 0 Connection: keep-alive Server: Virtuoso/07.20.3230 Location: http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl
  • 50. GET /data/Wolfgang_Amadeus_Mozart.ttl HTTP/1.1 Host: dbpedia.org User-Agent: curl/7.19.7 Accept: text/turtle HTTP curl -v http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl HTTP/1.1 200 OK Content-Type: text/turtle; charset=UTF-8 Content-Length: 50708 […] @prefix dbo: <http://dbpedia.org/ontology/> . @prefix dbr: <http://dbpedia.org/resource/> . dbr:Amadeus_Mozart dbo:wikiPageRedirects dbr:Wolfgang_Ama dbr:The_Story_of_Mozart dbo:wikiPageRedirects dbr:Wolfga dbr:Mozartian dbo:wikiPageRedirects dbr:Wolfgang_Amadeus_
  • 51. curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart curl -v “http://dbpedia.org/page/Wolfgang_Amadeus_Mozart" curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart -H “Accept: text/turtle” curl -v “http://dbpedia.org/data/Wolfgang_Amadeus_Mozart.ttl” curl -v http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart -H “Accept: text/turtle” -L
  • 52. • In what formats is Mozart available? • text/html, application/rdf+xml, text/n-triples, text/turtle • Find Mozart's image (it’s a jpg) • When Mozart was born? • Where Mozart died? • How many inhabitants has the city today? How many Mozart? http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart
  • 53. • Find the location of the experience • When did it happened? • Who is the listener? • What musical opera was performed? • What is the author of the listened music? • What is the performer? • What is the genre • Find information about this genre • Can you find other operas of the same genre? a Listening Experience http://data.open.ac.uk/led/lexp/1446304716352
  • 54. This type of task is possible using a SPARQL endpoint: http://dbpedia.org/sparql A scent of SPARQL “Find other operas of the same genre” SELECT * WHERE { ?entity <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Grand_operas> . }
  • 55. 1. https://www.theguardian.com/this-page-does-not-exists 2. urn:issn:23346587 3. issn:23346587 4. mailto:enrico.motta@open.ac.uk 5. dbr:Music Quiz Which of the following are not valid RDF IRIs?
  • 57. 1. Knowledge representation 1. Identify the source 2. Understand the content (domain) 3. Modelling: reuse or build an ontology 2. Produce RDF 1. Populate the ontology 2. Encode or (re)engineer in RDF - “triplification” 3. Put it on the Web and provide services to access and query the data 1. Support URI dereferencing (Content negotiation) 2. Expose a SPARQL Endpoint 3. Describe your dataset with Linked Data (ehm …start over) So you want to do Linked Data?
  • 58. • The world’s academic communities have been dealing with knowledge representation for years • Artificial intelligence, natural language processing, model management, and many other research fields have contributed substantially • Some ancestors traced the way How to represent knowledge?
  • 62. EXAMPLE • Instances are associated with one or several classes: Boddingtons rdf:type Ale . Grafentrunk rdf:type Bock . Hoegaarden rdf:type White . Jever rdf:type Pilsner .
  • 63. Ontologies different levels of detail & complexity Complexity Types Labels Descriptions Comments Class Hierarchies Relations Documented meaning Basic Logic Rules Inferences Transitivity Domain Range Rules Description Logic Reasoning Class unions Sets semantics Intersections Disjointness […] light-weight heavy-weight
  • 64. Copyright IKS Consortium • A vocabulary for describing properties and classes of RDF resources • rdfs:Resource • rdf:type • rdfs:Class • rdf:Property • rdfs:subClassOf • rdfs:subPropertyOf • rdfs:domain • rdfs:range RDF Schema http://www.w3.org/TR/rdf-schema/
  • 65. • OWL allows specifying other axioms • Property cardinality restrictions • Class disjunction • Property transitivity • Cardinality constraints • But beware: more expressivity means more reasoning complexity The Web Ontology Language (OWL) formal language for automated reasoning
  • 66. The Web Ontology Language (OWL) formal language for automated reasoning
    :Novel rdf:type owl:Class .
    :Short_Story rdf:type owl:Class .
    :Poetry rdf:type owl:Class .
    :Literature rdf:type owl:Class ; owl:unionOf (:Novel :Short_Story :Poetry) .
    IF <myWork> rdf:type :Novel .
    THEN <myWork> rdf:type :Literature .
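The IF/THEN step on this slide can be imitated with a few lines of Python. This is only a toy illustration of one consequence of `owl:unionOf` (a member of a union's operand is also a member of the union); it is not an OWL reasoner, and the dictionary encoding and the function name `infer_union_types` are assumptions made for this sketch.

```python
RDF_TYPE = "rdf:type"

# Hypothetical in-memory encoding of the axiom
# :Literature owl:unionOf (:Novel :Short_Story :Poetry)
UNION_OF = {":Literature": [":Novel", ":Short_Story", ":Poetry"]}

def infer_union_types(triples):
    """Add (s rdf:type U) whenever s is typed with an operand of union class U."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        for union_class, members in UNION_OF.items():
            if obj in members:
                inferred.add((subject, RDF_TYPE, union_class))
    return inferred

facts = {("<myWork>", RDF_TYPE, ":Novel")}
closure = infer_union_types(facts)
for triple in sorted(closure):
    print(" ".join(triple), ".")
```

A real reasoner handles many more entailments than this single rule, which is where the extra reasoning complexity mentioned on the previous slide comes from.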
  • 68. • Schema layer of RDF • Defines terms (classes and properties) • Typically RDFS or OWL family • Reusability is important for supporting interoperability • Common vocabularies: Dublin Core, SKOS, FOAF, SIOC, vCard, DOAP, Core Organization Ontology, VoID Vocabularies light-weight semantics http://www.slideshare.net/prototypo/introduction-to-linked-data-rdf-vocabularies
  • 69. Vocabulary: Friend-of-a-Friend (FOAF) defines classes and properties for representing information about people and their relationships
    Soeren rdf:type foaf:Person .
    Soeren foaf:currentProject http://OntoWiki.net .
    Soeren foaf:homepage http://aksw.org/Soeren .
    Soeren foaf:knows http://sembase.at/Tassilo .
    Soeren foaf:sha1 09ac456515dee .
  • 70. Vocabulary: Semantically Interlinked Online Communities (SIOC). Represents content from blogs, wikis, forums, mailing lists, chats, etc.
  • 71. Vocabulary: Simple Knowledge Organization System (SKOS) supports the use of thesauri, classification schemes, subject heading systems and taxonomies
  • 73. • DBpedia Ontology Schema: • manually created for DBpedia (infoboxes) • 1140 classes + 1149 object properties + 1741 datatype properties; >7K axioms (1537 on C, 2676 on OP, 3264 on DTP: 1.3, 2.3, 1.8 ratios); • (200M triples in DBpedia) • YAGO: • large hierarchy linking Wikipedia leaf categories to WordNet • 250,000 classes • UMBEL (Upper Mapping and Binding Exchange Layer): • 20000 classes derived from OpenCyc • DOLCE-Zero (Foundational Ontology, aligned to DBpedia): • 76 classes + 105 object properties + 5 datatype properties; 596 axioms (196 on C, 389 on OP, 11 on DTP: 2.4, 3.7, 2.2 ratios) • presence of “restrictions”, top-level disjointness, and patterns • Wikipedia Categories: • Not a class hierarchy (e.g. cycles), represented using SKOS • 415,000+ categories 2011/05/12 General Purpose Ontologies (different levels of detail & complexity)
  • 74. Domain Ontologies (different levels of detail & complexity) https://lov.linkeddata.es/dataset/lov/
  • 75. 1. From a Relational Database 2. From Web content (Scraping) 3. From XML or other structured data formats 4. From a data table (e.g. a CSV file) 5. From natural language (Sic!) How to produce RDF?
  • 76. • W3C R2RML - language to specify mappings between SQL databases and RDF: http://www.w3.org/TR/r2rml/ • D2RQ - gives access to relational databases as virtual graphs: http://d2rq.org/ • DB2Triples - runs a specified R2RML file and generates RDF: https://github.com/antidot/db2triples 1. From a relational database
  • 78. • RDFa and microformats are used to embed semantic information (expressed using the RDF model) into regular HTML pages • RDFa does it using existing (rel) and additional (about, property, typeof) attributes • Microformats only use usual HTML attributes (class) • To extract, e.g., Apache any23: https://any23.apache.org 2. From Web pages
  • 79. DBpedia is the de facto hub of LOD. • descriptions of ca. 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases) • labels and abstracts for these 3.2 million things in up to 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages; 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories • altogether over 1 billion pieces of information (i.e. RDF triples): 257M from English edition, 766M from other language editions • DBpedia Live (http://live.dbpedia.org/sparql/) & Mappings Wiki (http://mappings.dbpedia.org) integrate the community into a refinement cycle
  • 80. Extracting structured information from Wikipedia and making this information available on the Web as LOD: • link other data sets on the Web to Wikipedia data (encyclopaedic knowledge) • ask sophisticated queries against Wikipedia (e.g. universities in Paris, mayors of towns in a certain region) • represents a community consensus Transforming Wikipedia into a Knowledge Base
  • 81. Structure in Wikipedia • Title • Abstract • Infoboxes • Geo-coordinates • Categories • Images • Links – other language versions – other Wikipedia pages – To the Web – Redirects – Disambiguations
  • 82. Infobox templates
    Wikitext syntax:
    {{Infobox Korean settlement
    | title = Busan Metropolitan City
    | img = Busan.jpg
    | imgcaption = A view of the [[Geumjeong]] district in Busan
    | hangul = 부산 광역시
    ...
    | area_km2 = 763.46
    | pop = 3635389
    | popyear = 2006
    | mayor = Hur Nam-sik
    | divs = 15 wards (Gu), 1 county (Gun)
    | region = [[Yeongnam]]
    | dialect = [[Gyeongsang]]
    }}
    RDF representation (http://dbpedia.org/resource/Busan):
    dbp:Busan dbpp:title "Busan Metropolitan City"
    dbp:Busan dbpp:hangul "부산 광역시"@Hang
    dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
    dbp:Busan dbpp:pop "3635389"^^xsd:int
    dbp:Busan dbpp:region dbp:Yeongnam
    dbp:Busan dbpp:dialect dbp:Gyeongsang
    ...
  • 83. DBpedia SPARQL Endpoint
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dct: <http://purl.org/dc/terms/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX dbc: <http://dbpedia.org/resource/Category:>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?birth ?description ?person WHERE {
      ?person dbo:birthPlace dbr:Berlin .
      ?person dct:subject dbc:German_musicians .
      ?person dbo:birthDate ?birth .
      ?person foaf:name ?name .
      ?person rdfs:comment ?description .
      FILTER (LANG(?description) = 'en') .
    }
    ORDER BY ?name
  • 84. • hosted on an OpenLink Virtuoso server • can answer SPARQL queries like • Give me all sitcoms that are set in NYC? • All tennis players from Moscow? • All films by Quentin Tarantino? • All German musicians that were born in Berlin in the 19th century? • All soccer players with jersey number 11, playing for a club having a stadium with over 40,000 seats, and born in a country with over 10 million inhabitants? DBpedia SPARQL Endpoint http://dbpedia.org/sparql
  • 85. • Two steps: • Remodelling task • Reengineering task • Web APIs • JSON: annotate with JSON-LD https://json-ld.org/ • XML • XML != RDF • XML serialises a DOM tree; RDF is a graph instead, with no root • eXtensible Stylesheet Language Transformations (XSLT) to generate an RDF format, e.g. N-Triples 3. From Web APIs, XML or other formats
  • 86. • data.open.ac.uk is the home of The Open University LOD • 2010, OU first university in the UK to publish LOD. • Collects and interlinks open data from institutional repositories of the University, and makes it available as LD data.open.ac.uk
  • 87. Datasets http://data.open.ac.uk
    Open Educational Resources • Metadata about educational resources produced or co-produced by The Open University • OU/BBC Coproductions | OU podcasts | OpenLearn | Videofinder
    Scientific Production • Metadata about scientific production of The Open University • Open Research Online (http://oro.open.ac.uk/)
    Social Media • Content hosted by social media web sites. • Metadata are extracted from public APIs and aggregated into RDF. • Audioboo | YouTube
    Organisational • Data collected from internal repositories and first made public as linked data. • The OU's Key Information Set from Unistats | OU People Profiles | KMi People Profiles | Open University data XCRI-CAP 1.2 | Qualifications | Courses | OU Planet Stories
    Data from Research Projects • Linked Data from research projects. • Arts and Humanities Research Council project metadata | The Listening Experience Database | The UK Reading Experience Database | The Reading Experience Database: DBpedia alignments
  • 88. • Two tasks: remodelling & reengineering • Homemade recipe: 1. Find your identifier(s), establish namespaces 2. Map columns to predicates, establish cell value type (URI or Literal) 3. Iterate over the rows 4. Generate a triple for each cell 4. From a data table
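The homemade recipe above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tutorial's own tooling: the two namespaces, the sample table and the function name `rows_to_ntriples` are invented for the example, and identifiers and column names are assumed to be safe to paste into an IRI without further encoding.

```python
import csv
import io

BASE = "http://www.example.com/resource/"   # hypothetical namespace for row subjects
PROP = "http://www.example.com/property#"   # hypothetical namespace for column predicates

def escape(value: str) -> str:
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def rows_to_ntriples(text, id_column, uri_columns=()):
    """Step 1: pick the identifier; 2: map columns; 3: iterate rows; 4: one triple per cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier itself and empty cells
            predicate = f"<{PROP}{column}>"
            # Cell value type: IRI for declared columns, plain literal otherwise.
            obj = f"<{value}>" if column in uri_columns else f'"{escape(value)}"'
            triples.append(f"{subject} {predicate} {obj} .")
    return triples

SAMPLE = "id,name,homepage\n1,Ada,http://www.example.com/ada\n2,Bob,\n"
lines = rows_to_ntriples(SAMPLE, "id", uri_columns={"homepage"})
print("\n".join(lines))
```

This covers only the reengineering half of the job; choosing good identifiers and reusing existing vocabularies for the predicates is the remodelling half.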
  • 89. • A Google Form Spreadsheet • Prepare column names (first row) • Identify the Subject column (S) • Generate a tuple for each column value (S, c, v) - G SQL • Clean: remove tuples with empty values • Format tuples into valid N3 triples Example (only reengineering) https://docs.google.com/spreadsheets/d/1j_LHZIOhkbD61r7fSxuf4017tgbOoL_Z6tLT0oDQz_0/edit?usp=sharing
  • 90. 1. Load the data into a Triple Store • Virtuoso Open Source: virtuoso.openlinksw.com • Apache Jena: http://jena.apache.org/ • Blazegraph: www.blazegraph.com • https://en.wikipedia.org/wiki/Comparison_of_triplestores 2. Publish the SPARQL Endpoint 3. Setup content negotiation • http://www.example.com/… 303 to SPARQL DESCRIBE <http://www.example.com/...> How to publish on the Web? (signposting only here)
  • 91. Coffee break See you at 4pm (sharp!)
  • 94. • Understand triple patterns • Try with some features of the language • https://ld4humanities.github.io/ > Hands-On resources Objectives
  • 95. SPARQL SPARQL Protocol And RDF Query Language
  • 96. Triple and Graph Patterns How do we describe the structure of the RDF graph which we're interested in?
  • 98. # An RDF triple in Turtle syntax
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    dbr:Wolfgang_Amadeus_Mozart foaf:name "Wolfgang Amadeus Mozart"@en .
  • 99. # A SPARQL triple pattern, with a single variable
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    dbr:Wolfgang_Amadeus_Mozart foaf:name ?name .
  • 100. # Any part of a triple pattern can be a variable
    ?subject foaf:name ?name .
  • 102. # Matching labels of resources
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    ?subject rdfs:label ?label .
  • 104. # Combine triple patterns to create a graph pattern
    PREFIX dby: <http://dbpedia.org/class/yago/>
    ?subject rdfs:label ?label .
    ?subject rdf:type dby:WikicatOperaComposers .
    # SPARQL is based on Turtle, which allows abbreviations
    # e.g. predicate-object lists:
    ?subject rdfs:label ?label ;
             rdf:type dby:WikicatOperaComposers .
  • 106. # Graph patterns allow us to traverse a graph
    ?person rdfs:label "Wolfgang Amadeus Mozart"@de .
    ?person dbo:deathPlace ?place .
    ?place dbo:populationTotal ?population .
  • 109. Structure of a Query What does a basic SPARQL query look like?
  • 110. # Query. 1
    # Associate URIs with prefixes
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    # Example of a SELECT query, retrieving 2 variables
    # Variables selected MUST be bound in graph pattern
    SELECT ?place ?population WHERE {
      # This is our graph pattern
      ?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
              dbo:deathPlace ?place .
      ?place dbo:populationTotal ?population
    }
  • 111. Let’s try it out • https://ld4humanities.github.io/ > Hands-On resources • We will use this UI: http://yasgui.org/ • Credits: http://about.yasgui.org/ http://laurensrietveld.nl/
  • 112. # Query. 2
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    # Example of a SELECT query, retrieving all variables
    SELECT * WHERE {
      ?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
              dbo:deathPlace ?place .
      ?place dbo:populationTotal ?population .
    }
  • 113. OPTIONAL bindings How do we allow for missing or unknown information?
  • 114. # Query. 3
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?place ?population WHERE {
      # This pattern must be bound
      ?person rdfs:label "Wolfgang Amadeus Mozart"@de ;
              dbo:birthPlace ?place .
      # Anything in this block doesn't have to be bound
      OPTIONAL { ?place dbo:populationTotal ?population . }
    }
  • 115. UNION queries How do we allow for alternatives or variations in the graph?
  • 116. # Query. 4
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?person ?place WHERE {
      { ?person dbo:deathPlace ?place . }
      UNION
      { ?person dbo:birthPlace ?place . }
    }
  • 117. Sorting & Restrictions How do we apply a sort order to the results? How can we add restrictions? How can we restrict the number of results returned?
  • 118. # Query. 5
    # Select the URI and population of all places
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?place ?population WHERE {
      ?place dbo:populationTotal ?population .
    }
  • 119. # Query. 6
    # Select the URI and population of all places
    # with highest first
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?place ?population WHERE {
      ?place dbo:populationTotal ?population .
    }
    # Use an ORDER BY clause to apply a sort.
    # Can be ASC or DESC
    ORDER BY DESC(?population)
  • 120. # Query. 7
    # Select the URI and population of all countries
    # (places that have a country code), with highest first
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbp: <http://dbpedia.org/property/>
    SELECT ?place ?population WHERE {
      ?place dbo:populationTotal ?population .
      FILTER EXISTS { ?place dbp:countryCode [] }
    }
    # Use an ORDER BY clause to apply a sort.
    # Can be ASC or DESC
    ORDER BY DESC(?population)
  • 121. # Query. 8
    # Select the URI and population of the 11-20th most populated countries
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbp: <http://dbpedia.org/property/>
    SELECT ?place ?population WHERE {
      ?place dbo:populationTotal ?population .
      FILTER EXISTS { ?place dbp:countryCode [] }
    }
    # Use an ORDER BY clause to apply a sort.
    ORDER BY DESC(?population)
    # Limit to ten results
    LIMIT 10
    # Apply an offset to get the next "page"
    OFFSET 10
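The relationship between "pages" and the LIMIT/OFFSET values in the last query can be spelled out with a small helper. This is a hedged sketch in Python that only assembles query text; the function name `page_query` and the 1-based page numbering are assumptions made for this example.

```python
BASE_QUERY = """PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
SELECT ?place ?population WHERE {
  ?place dbo:populationTotal ?population .
  FILTER EXISTS { ?place dbp:countryCode [] }
}
ORDER BY DESC(?population)"""

def page_query(base_query: str, page: int, page_size: int = 10) -> str:
    """Append LIMIT/OFFSET so that page 1 is rows 1-10, page 2 is rows 11-20, etc."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    offset = (page - 1) * page_size
    return f"{base_query}\nLIMIT {page_size}\nOFFSET {offset}"

second_page = page_query(BASE_QUERY, page=2)
print(second_page)
```

Paging is only stable when the query has an ORDER BY clause, which is why the sort is kept in the base query.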
  • 122. Filtering How do we restrict results based on aspects of the data rather than the graph, e.g. string matching?
  • 123. # In the following triple the literal has been assigned a
    # datatype to indicate that it is a date
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    dbr:Wolfgang_Amadeus_Mozart dbo:birthDate "1756-01-27"^^xsd:date .
  • 124. # Query. 9
    # Select name of persons born between 1st Jan 1756 and 1st Jan 1757
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT ?name WHERE {
      ?person dbo:birthDate ?date ;
              foaf:name ?name .
      FILTER (?date > "1756-01-01"^^xsd:date && ?date < "1757-01-01"^^xsd:date)
    }
  • 125. # Query. 10
    # Select the URI and population of places with an area below 20km^2, with most populated first
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbp: <http://dbpedia.org/property/>
    PREFIX dbpp: <http://dbpedia.org/ontology/PopulatedPlace/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT ?place ?population WHERE {
      ?place dbo:populationTotal ?population ;
             dbpp:areaTotal ?area .
      # Note that we have to cast the data to the right type
      # as it is not declared in the data
      FILTER( xsd:double(?area) < 20 )
    }
    ORDER BY DESC(?population)
  • 126. # Query. 11
    # Select persons named Wolfgang who died in Vienna
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?subject ?name WHERE {
      ?subject foaf:name ?name ;
               dbo:deathPlace dbr:Vienna .
      FILTER( regex(?name, "Wolfgang", "i" ) )
    }
  • 127. • Logical: !, &&, || • Math: +, -, *, / • Comparison: =, !=, >, <, ... • Variable tests: isURI, isBlank, isLiteral, bound • Accessors: str, lang, datatype • Other: sameTerm, langMatches, regex Built-In Filters
  • 128. DISTINCT How do we remove duplicate results?
  • 129. # Query. 12
    # Select the list of birthplaces of German classical composers
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dct: <http://purl.org/dc/terms/>
    PREFIX dbc: <http://dbpedia.org/resource/Category:>
    SELECT DISTINCT ?place WHERE {
      [] dbo:birthPlace ?place ;
         dct:subject dbc:German_classical_composers
    }
  • 130. SPARQL Query Forms Does SPARQL do more than just SELECT data?
  • 131. ASK Test whether the graph contains some data of interest
  • 132. # Query. 13
    # Is Mozart’s date of birth 1756-01-27?
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    ASK WHERE {
      dbr:Wolfgang_Amadeus_Mozart dbo:birthDate "1756-01-27"^^xsd:date .
    }
    # ASK returns a boolean value
  • 133. DESCRIBE Generate an RDF description of a resource(s)
  • 134. # Query. 14
    # Describe persons born in 1757
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    PREFIX dbp: <http://dbpedia.org/property/>
    DESCRIBE ?person WHERE {
      ?person dbp:birthDate ?date .
      FILTER ( ?date >= "1757-01-01"^^xsd:date && ?date < "1758-01-01"^^xsd:date )
    }
  • 135. CONSTRUCT Create a custom RDF graph based on query criteria Can be used to transform RDF data
  • 136. @prefix exp: <http://www.example.com/property#> .
    <mailto:albert.meronyo@gmail.com>
      exp:timestamp "05/06/2019 14:38:16" ;
      exp:xml_level "Basic" ;
      exp:rdf_level "Expert" ;
      exp:sparql_level "Expert" ;
      exp:web_level "Expert" ;
      exp:ontology_level "Expert" ;
      exp:ld_publishing_level "Expert" ;
      exp:ld_consumption "Expert" ;
      exp:ld_ui_level "Expert" ;
      exp:interested_in "Advanced Ontology Design, Linked Data Production, Linked Data Consumption, Success stories, Advanced Linked Data Techniques, Advanced SPARQL" ;
      exp:known_projects "JazzCats, LinkedBrainz, MIDI Linked Data cloud, LOD Laundromat" ;
      exp:plans_of_use "Yes; music, history" .
  • 137. # Example
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX exp: <http://www.example.com/property#>
    CONSTRUCT {
      ?person rdf:type foaf:Person
    } WHERE {
      ?person exp:timestamp []
    }
  • 138. # Example.
    # Remodelling! Change the identifier
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    CONSTRUCT {
      ?person rdf:type foaf:Person ;
              foaf:mbox ?mbox .
      ?person ?predicate ?any
    } WHERE {
      ?mbox ?predicate ?any .
      # BIND comes after the pattern that binds ?mbox; IRI() turns the string into a resource
      BIND( IRI( CONCAT( "http://www.ld4humanities/2019/participant/", MD5(str(?mbox)) ) ) AS ?person )
    }
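The identifier change in the remodelling example (hash the mailbox, append the hash to a namespace) can be reproduced outside SPARQL to check what the new IRIs look like. This is an illustrative sketch only: the namespace is the one shown on the slide, while the function name `participant_iri` and the sample mailbox are assumptions made for the example.

```python
import hashlib

BASE = "http://www.ld4humanities/2019/participant/"  # namespace as written on the slide

def participant_iri(mbox: str) -> str:
    """Mint a participant IRI from the MD5 hex digest of the mailbox IRI string."""
    digest = hashlib.md5(mbox.encode("utf-8")).hexdigest()
    return BASE + digest

# Hypothetical mailbox, used only to show the shape of the result.
iri = participant_iri("mailto:someone@example.org")
print(iri)
```

Hashing keeps the e-mail address out of the resource identifier, although the CONSTRUCT template on the slide still publishes it through `foaf:mbox`.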
  • 139. NAMED GRAPHS SPARQL can query multiple RDF graphs together!
  • 140. # Query. 15
    # Search in multiple Graphs
    SELECT DISTINCT ?type
    FROM <http://data.open.ac.uk/context/youtube>
    FROM <http://data.open.ac.uk/context/podcast>
    FROM <http://data.open.ac.uk/context/openlearn>
    FROM <http://data.open.ac.uk/context/course>
    FROM <http://data.open.ac.uk/context/qualification>
    WHERE {
      [] a ?type
    }
  • 141. # Query. 16
    # Search in multiple Graphs
    SELECT DISTINCT ?g ?type
    FROM NAMED <http://data.open.ac.uk/context/youtube>
    FROM NAMED <http://data.open.ac.uk/context/podcast>
    FROM NAMED <http://data.open.ac.uk/context/openlearn>
    FROM NAMED <http://data.open.ac.uk/context/course>
    FROM NAMED <http://data.open.ac.uk/context/qualification>
    WHERE {
      GRAPH ?g { [] a ?type }
    }
  • 142. Videos from the Open University on YouTube. YouTube videos are linked to courses and qualifications, which in turn are linked to other entities (OpenLearn units, Podcasts, Audios, and other Courses or Qualifications) Find OU content related to a YouTube video from the YouTube video: https://www.youtube.com/watch?v=SYry6PYsL8o http://data.open.ac.uk/youtube/SYry6PYsL8o http://data.open.ac.uk prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> prefix podcast: <http://data.open.ac.uk/podcast/ontology/> prefix yt: <http://data.open.ac.uk/youtube/ontology/> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> prefix rkb: <http://courseware.rkbexplorer.com/ontologies/courseware#> prefix saou: <http://data.open.ac.uk/saou/ontology#> prefix dbp: <http://dbpedia.org/property/> prefix media: <http://purl.org/media#> prefix olearn: <http://data.open.ac.uk/openlearn/ontology/> prefix mlo: <http://purl.org/net/mlo/> prefix bazaar: <http://digitalbazaar.com/media/> prefix schema: <http://schema.org/> SELECT distinct (?related as ?identifier) ?type ?label (str(?location) as ?link) FROM <http://data.open.ac.uk/context/youtube> FROM <http://data.open.ac.uk/context/podcast> FROM <http://data.open.ac.uk/context/openlearn> FROM <http://data.open.ac.uk/context/course> FROM <http://data.open.ac.uk/context/qualification> WHERE { ?x schema:productID "SYry6PYsL8o" . # change the youtube id to any OU youtube video ?x yt:relatesToCourse ?course . { # related video podcasts ?related podcast:relatesToCourse ?course . ?related a podcast:VideoPodcast . ?related rdfs:label ?label . optional { ?related bazaar:download ?location } BIND( "VideoPodcast" as ?type ) . 
} union { # related audio podcasts ?related podcast:relatesToCourse ?course . ?related a podcast:AudioPodcast . ?related rdfs:label ?label . optional { ?related bazaar:download ?location } BIND( "AudioPodcast" as ?type ) . } union { # related openlearn units ?related a olearn:OpenLearnUnit . ?related olearn:relatesToCourse ?course . BIND( "OpenLearnUnit" as ?type ) . ?related <http://dbpedia.org/property/url> ?location . ?related rdfs:label ?label . } union { # related qualifications (compulsory course) ?related a mlo:qualification . ?related saou:hasPathway/saou:hasStage/saou:includesCompulsoryCourse ?course . BIND( "Qualification" as ?type ) . ?related rdfs:label ?label . ?related mlo:url ?location } } limit 200 Content recommendation
  • 143. • SPARQL 1.1 W3C Recommendation - https://www.w3.org/TR/sparql11-query/ • YASGUI SPARQL editor - http://yasgui.org/ Useful Links
  • 145. Uses data.open.ac.uk to get content recommendations (e.g. courses). data.open.ac.uk drives the click-through which turns OpenLearn visitors into OU students! Publish once, display everywhere (from YouTube, Audioboo, iTunesU, Podcast) OpenLearn http://www.open.edu/openlearn/
  • 146. An open and freely searchable database that brings together a mass of data about people’s experiences of listening to music of all kinds, in any historical period and any culture. Reuse from LOD Uses data.open.ac.uk as publishing platform. RDF, “natively” The Listening Experience Database Project http://led.kmi.open.ac.uk/ Feedback welcome: @enridaga #kmiou
  • 151. • Most of the data is actually metadata, describes resources, documents, people, and it is essentially structured • However, LD can be used to enhance content such as text or music! • Two case studies: • @Albert - MIDI Linked Data Cloud • FindLEr: find evidence of Listening Experiences • Hands-On LD with content
  • 153. A basic recipe: 1. Text 2. Link to a LD Graph with Named Entity Recognition (NER) - e.g. dbpedia 3. Explore the graph to find common nodes between entities 4. Suggest subjects for the text Case study: find relevant topics
  • 154. Text Senior academics and politicians have condemned UK universities for failing to tackle endemic racism against students and staff after a Guardian investigation found widespread evidence of discrimination in the sector. University staff from minority backgrounds said the findings showed there was “absolute resistance” to dealing with the problem. Responses to freedom of information (FoI) requests the Guardian sent to 131 universities showed that students and staff made at least 996 formal complaints of racism over the past five years. Of these, 367 were upheld, resulting in at least 78 student suspensions or expulsions and 51 staff suspensions, dismissals and resignations. But even these official figures are believed to underestimate the scale of racism in higher education, with two separate investigations by the Guardian and the Equality and Human Rights Commission identifying hundreds more cases that were not formally investigated by universities. Scores of black and minority ethnic students and lecturers have told the Guardian they were dissuaded from making official complaints and either dropped their allegations or settled for an informal resolution. They said white university staff were often reluctant to address racism, with racial slurs treated as banter or an inevitable byproduct of freedom of speech, and institutional racism poorly recognised. https://www.theguardian.com/education/2019/jul/05/uk-universities-condemned-for-failure-to-tackle-racism
  • 157. SPARQL query
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX dct: <http://purl.org/dc/terms/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT (count(?node) as ?sc) ?obj WHERE {
      ?node dct:subject ?cat .
      ?cat skos:broader{0,5} ?obj .
      VALUES (?node) {
        (<http://dbpedia.org/resource/United_Kingdom>)
        (<http://dbpedia.org/resource/Endemism>)
        (<http://dbpedia.org/resource/Racism>)
        (<http://dbpedia.org/resource/Australian_Human_Rights_Commission>)
        (<http://dbpedia.org/resource/Institutional_racism>)
      }
    }
    group by ?obj
    order by desc(?sc)
    limit 10
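The idea behind this query (walk up the category hierarchy from each recognised entity and rank the categories that many entities reach) can be tried on a toy graph. The sketch below is an illustration under loud assumptions: the two dictionaries are invented stand-ins for `dct:subject` and `skos:broader` links, not DBpedia data, and unlike the query's `count(?node)` it counts each entity at most once per category.

```python
from collections import Counter

# Hypothetical toy data standing in for dct:subject and skos:broader links.
SUBJECTS = {
    "ex:Racism": ["cat:Racism"],
    "ex:Institutional_racism": ["cat:Racism", "cat:Institutions"],
}
BROADER = {
    "cat:Racism": ["cat:Discrimination"],
    "cat:Discrimination": ["cat:Social_issues"],
    "cat:Institutions": ["cat:Social_structures"],
}

def ancestors(category, max_hops):
    """The category itself plus everything within max_hops broader steps."""
    seen = {category}
    frontier = {category}
    for _ in range(max_hops):
        frontier = {b for c in frontier for b in BROADER.get(c, [])} - seen
        seen |= frontier
    return seen

def rank_topics(entities, max_hops=5):
    """Count, per category, how many entities reach it; most shared first."""
    counts = Counter()
    for entity in entities:
        reached = set()
        for category in SUBJECTS.get(entity, []):
            reached |= ancestors(category, max_hops)
        counts.update(reached)
    return counts.most_common()

ranking = rank_topics(["ex:Racism", "ex:Institutional_racism"])
print(ranking)
```

Categories reached by every entity rise to the top and become candidate subjects for the text, which corresponds to step 4 of the basic recipe.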
  • 160. • Interoperability between these repositories (how to align their ontologies and entity names?) is usually partial • Quality • owl:sameAs is very rarely “same as”. See http://sameas.org • Completeness • Principled Low Commitment (e.g. 404, 406, …) • How to distinguish entities and documents? • Method on top of the “Follow your nose” approach still to be developed • What about incoming links? • Licences? Policies? • Availability of open data (limited resources). Some proposals, e.g. Linked Data Fragments • User interfaces for LD operations - not only visualisation - still missing Open Issues
  • 161. Link and Open Your Data Scholars & Institutions in the humanities are very good at building high quality databases (e.g. thesauri, gazetteers) but most of them are still closed!
  • 162. Some sources of inspiration … • EUCLID Project: http://euclid-project.eu/ • Randy Connolly’s slides about Web Development: https://www.slideshare.net/randyconnolly • Linked Data Patterns book • http://patterns.dataincubator.org/book/ Credits