Data → open and linked 
Wouter Degadt & Pieter Colpaert 
wouter.degadt@leiedal.be & pieter.colpaert@okfn.org
Programme 
1. The basics 
Data → Open Data → Linked Data 
2. Linked Open Data 
How to publish data?
Data 
Wikipedia says: 
English (disambiguation): data is uninterpreted information 
English (computing): data is any sequence of symbols given meaning by specific acts of interpretation 
Dutch: data is the plural of datum, which is an observation of a fact
What’s data quality?
What’s interoperability?
process: Could the data governance of both datasets be merged? 
legal: Are you legally allowed to merge the 2 datasets? 
technical: Can you connect the communication channels? E.g., merging a dataset published on CD with a dataset published on floppy disk 
syntactic: How interoperable are the serialisation formats? E.g., JSON vs. PDF 
object: What can you request from the server? 
semantic: Do the words in one dataset mean the same as the words in the other? 
↓ 
querying: How easy is it to ask questions across the borders of the datasets?
Open Data 
Because non-personal data increases in value when 
others reuse it
Data on the web: reuse is allowed | reuse in a gray zone | unauthorised reuse
OpenDefinition.org
How can we find open data? 
It’s made available through open data portals 
http://data.gov.uk, 
http://datahub.io, 
http://open-data.europa.eu, 
http://data.gent.be, 
… 
Via links in existing datasets 
e.g., http://dbpedia.org/resource/Ghent
Linked Data 
Because it is impossible to store all the world’s 
knowledge on one machine
Table / CSV / Spreadsheet 
name   | type    | same as | location 
iMinds | company | IBBT    | Gaston Crommenlaan 8 
JSON 
{ 
  "iMinds": { 
    "type": "company", 
    "same as": "IBBT", 
    "location": "Gaston Crommenlaan 8" 
  } 
} 
XML 
<iMinds> 
  <type>company</type> 
  <sameas>IBBT</sameas> 
  <location>Gaston Crommenlaan 8</location> 
</iMinds>
Triples (the same row, next to its Table / CSV / Spreadsheet, JSON and XML forms) 
<iMinds> <type> <company> . 
<iMinds> <sameas> <IBBT> . 
<iMinds> <location> "Gaston Crommenlaan 8" .
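The step from a table row to triples can be sketched in a few lines of Python. This is only an illustration, assuming the `name` column identifies the subject; the function name `row_to_triples` is made up for this example and the terms are still plain words, not URIs.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def row_to_triples(row: Dict[str, str], key_column: str = "name") -> List[Triple]:
    """Turn one table row into (subject, predicate, object) triples.

    The value in `key_column` becomes the subject; every other
    column name becomes a predicate and its cell value the object.
    """
    subject = row[key_column]
    return [
        (subject, column, value)
        for column, value in row.items()
        if column != key_column
    ]


# The iMinds row from the slide.
row = {
    "name": "iMinds",
    "type": "company",
    "same as": "IBBT",
    "location": "Gaston Crommenlaan 8",
}

triples = row_to_triples(row)
for triple in triples:
    print(triple)
```

Each cell becomes one independent statement, which is what later allows the statements to live on different machines.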
World Wide Web 
Machine 1: iMinds → same as → IBBT 
Machine 2: iMinds → is a → company 
Machine 3: IBBT → located at → Gaston Crommenlaan 8
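Why spreading triples over machines still works can be shown with a small sketch. It assumes each machine simply exposes a set of triples, and that "same as" means two names denote the same thing; which triple sits on which machine, and the helper names `merge` and `facts_about`, are assumptions made for illustration.

```python
# Each machine holds only part of the knowledge.
machine_1 = {("iMinds", "same as", "IBBT")}
machine_2 = {("iMinds", "is a", "company")}
machine_3 = {("IBBT", "located at", "Gaston Crommenlaan 8")}


def merge(*sources):
    """Merging triples is just a set union: no schema alignment needed."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def facts_about(name, triples):
    """Collect facts about `name` and about everything declared 'same as' it."""
    aliases = {name}
    changed = True
    while changed:  # keep following 'same as' links until nothing new turns up
        changed = False
        for s, p, o in triples:
            if p != "same as":
                continue
            if s in aliases and o not in aliases:
                aliases.add(o)
                changed = True
            elif o in aliases and s not in aliases:
                aliases.add(s)
                changed = True
    return {(s, p, o) for s, p, o in triples if s in aliases and p != "same as"}


web = merge(machine_1, machine_2, machine_3)
facts = facts_about("iMinds", web)
print(sorted(facts))
```

No single machine knows where iMinds is located; the answer only appears once the three sets are combined.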
Problem 
semantic interoperability 
The word “company” is ambiguous. How can we make 
sure that machines understand each other? 
What about “is a”? 
And what about “iMinds”?
Solution 
Uniform Resource Identifiers (URIs) 
iMinds → http://data.kbodata.be/organisation/0866_386_380#id 
is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
Company → http://www.w3.org/ns/regorg#RegisteredOrganization 
A triple is an atomic piece of data (a datum 
or a fact) that cannot be misunderstood at 
machine level in a Web context
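As a rough sketch, replacing the ambiguous words with URIs and writing the result as one N-Triples line could look as follows. The three URIs are the ones given on the slide; the `VOCABULARY` lookup table and the function `to_ntriple` are illustrative names, not part of any library.

```python
# Ambiguous human-readable words mapped to the unambiguous URIs from the slide.
VOCABULARY = {
    "iMinds": "http://data.kbodata.be/organisation/0866_386_380#id",
    "is a": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "company": "http://www.w3.org/ns/regorg#RegisteredOrganization",
}


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Look up each term and write the triple in N-Triples syntax."""
    s, p, o = (VOCABULARY[term] for term in (subject, predicate, obj))
    return f"<{s}> <{p}> <{o}> ."


line = to_ntriple("iMinds", "is a", "company")
print(line)
```

A term without an entry in `VOCABULARY` raises a `KeyError`, which is deliberate here: a word without a URI is exactly the ambiguity the slide warns about.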
iMinds → is a → company 
iMinds → http://data.kbodata.be/organisation/0866_386_380#id 
is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
Company → http://www.w3.org/ns/regorg#RegisteredOrganization
Company register 
iMinds → is a → company 
Open Knowledge Belgium 
TVH 
Maes 
…
company register 
address database 
… 
Government Service X
Linked Open Data cloud: the collection 
of billions of triples published via the 
Web
Summary 
New terms: data quality, data interoperability, triples, open 
data, Linked Open Data cloud 
Linked Open Data means: making your data more 
interoperable with other datasets on the web by using URIs 
as identifiers and triples as atomic building blocks
Data publishing 
iMinds → http://data.kbodata.be/organisation/0866_386_380#id 
is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
Company → http://www.w3.org/ns/regorg#RegisteredOrganization
Linked Data principles 
1. Use a URI for every term 
2. Dereference these URIs over HTTP 
3. Return useful information 
4. Add links towards useful sources
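Principles 2 and 3 can be illustrated with Python's standard library. This is a minimal sketch: it assumes the server answers content negotiation for `text/turtle` and sends UTF-8, which is not guaranteed for any particular URI, and the function names are made up for this example. Only the request is built here; `dereference` performs the actual network call.

```python
from urllib.parse import urldefrag
from urllib.request import Request, urlopen


def build_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare an HTTP GET for the document that describes `uri`.

    The fragment (the part after '#') is never sent to the server,
    so it is stripped to obtain the document URL.
    """
    document_url, _fragment = urldefrag(uri)
    return Request(document_url, headers={"Accept": media_type})


def dereference(uri: str, media_type: str = "text/turtle") -> str:
    """Fetch the description of `uri` (performs a real network request)."""
    with urlopen(build_request(uri, media_type), timeout=10) as response:
        return response.read().decode("utf-8")  # assumes a UTF-8 response


request = build_request("http://data.kbodata.be/organisation/0866_386_380#id")
print(request.full_url)
print(request.get_header("Accept"))
```

The `Accept` header is how a client asks for machine-readable triples instead of an HTML page at the same URI.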
E.g., I’m launching a new company 
{mynewcompany} → http://{mynewcompany}.be/#org 
is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
Company → http://www.w3.org/ns/regorg#RegisteredOrganization 
An identifier for your company, and 
you are in charge of its meaning.
Mind the ambiguity
E.g., I’m launching a new company 
{mynewcompany} → http://{mynewcompany}.be/#org 
is a → http://www.w3.org/1999/02/22-rdf-syntax-ns#type 
Company → http://www.w3.org/ns/regorg#RegisteredOrganization 
{mynewcompany} → http://{mynewcompany}.be/#org 
has a home page → http://xmlns.com/foaf/0.1/homepage 
http://{mynewcompany}.be/
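The difference between the company and its home page can be made explicit in code. The sketch below follows the slide's `http://{mynewcompany}.be/#org` pattern; the name `mynewcompany` is a placeholder and `describe_company` is a hypothetical helper, not an existing API.

```python
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
REGISTERED_ORGANIZATION = "http://www.w3.org/ns/regorg#RegisteredOrganization"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"


def describe_company(name: str):
    """Mint a URI for the company itself and describe it with two triples."""
    homepage = f"http://{name}.be/"   # the web page (a document)
    organisation = f"{homepage}#org"  # the company (a real-world thing)
    return [
        (organisation, RDF_TYPE, REGISTERED_ORGANIZATION),
        (organisation, FOAF_HOMEPAGE, homepage),
    ]


triples = describe_company("mynewcompany")  # placeholder name
organisation_uri = triples[0][0]
homepage_uri = triples[1][2]
print(organisation_uri)
print(homepage_uri)
# Dropping the fragment of the company URI leads to the page describing it.
print(urldefrag(organisation_uri).url == homepage_uri)
```

Keeping two distinct URIs avoids statements such as "the home page is a registered organisation", while the fragment still lets a client find the page when it looks up the company.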
What URIs should I use?
Publishing methods 
1. Data dumps 
2. Triples within HTML pages 
3. JSON → JSON-LD web services 
4. Triple pattern fragments
Data dumps 
http://wiki.dbpedia.org/Downloads2014 
→ all facts in 1 file
Triples within HTML
JSON API 
http://{address to API document on Empire State}
JSON-LD API
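Turning an ordinary JSON response into JSON-LD mostly means adding a `@context` that maps each key to a URI, plus an `@id` for the thing described. The sketch below uses a made-up record for the placeholder company from the earlier slides; the record, the helper `to_json_ld` and the choice of FOAF terms are assumptions for illustration, not the API shown on the slide.

```python
import json

# What a plain JSON API might return (hypothetical record).
plain_record = {
    "name": "My New Company",
    "homepage": "http://mynewcompany.be/",
}

# The context tells a machine which URI each JSON key stands for.
context = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": {"@id": "http://xmlns.com/foaf/0.1/homepage", "@type": "@id"},
}


def to_json_ld(record: dict, context: dict, identifier: str, rdf_type: str) -> dict:
    """Wrap a plain JSON object with @context, @id and @type."""
    document = {"@context": context, "@id": identifier, "@type": rdf_type}
    document.update(record)  # the original keys stay untouched
    return document


document = to_json_ld(
    plain_record,
    context,
    identifier="http://mynewcompany.be/#org",
    rdf_type="http://www.w3.org/ns/regorg#RegisteredOrganization",
)
print(json.dumps(document, indent=2))
```

Existing clients that only read `name` and `homepage` keep working, which is why this route suits services that already publish JSON.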
Triple Pattern Fragments server 
iMinds → is a → company 
?subject → ?predicate → ?object
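The matching a Triple Pattern Fragments server performs can be approximated in memory. This is a simplified sketch, not the actual server implementation: terms starting with `?` are treated as wildcards, repeated variables are not checked for equality, and the paging and `total` count only hint at the metadata a real server would return. All names are illustrative.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

TRIPLES: List[Triple] = [
    ("iMinds", "is a", "company"),
    ("iMinds", "same as", "IBBT"),
    ("IBBT", "located at", "Gaston Crommenlaan 8"),
]


def is_variable(term: str) -> bool:
    """Terms such as ?subject match anything."""
    return term.startswith("?")


def matches(pattern: Triple, triple: Triple) -> bool:
    """A triple matches when every non-variable position is identical."""
    return all(is_variable(p) or p == t for p, t in zip(pattern, triple))


def fragment(pattern: Triple, triples: List[Triple],
             page: int = 1, page_size: int = 2) -> Dict[str, object]:
    """Return one page of matching triples plus the total match count."""
    selected = [t for t in triples if matches(pattern, t)]
    start = (page - 1) * page_size
    return {"triples": selected[start:start + page_size], "total": len(selected)}


everything = fragment(("?subject", "?predicate", "?object"), TRIPLES)
about_iminds = fragment(("iMinds", "?predicate", "?object"), TRIPLES)
exact = fragment(("iMinds", "is a", "company"), TRIPLES)
print(everything["total"], about_iminds["triples"], exact["total"])
```

Because the server only answers such simple patterns, more complex questions are broken down and joined on the client side.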
Triple Pattern Fragments clients
Questions?

Basics of Open Data: what you need to know by Wouter Degadt & Pieter Colpaert


Editor's Notes

  • #3: How to reuse data comes after the break, using concrete examples.
  • #4: What exactly does the word data mean?
  • #5: Is this good data? Why? Suggestions to make the data better? https://github.com/datasets/employment-us/blob/master/archive/aat1.txt 2 kinds of feedback: on the structure of the data → how quickly is it usable for my use case; on the content of the data → how close is it to reality.
  • #7: Categorisations of interoperability between 2 datasets differ enormously depending on the context. Whenever you read about interoperability, check carefully what exactly is meant. This categorisation was compiled from several sources in the literature: ISA, Rezaei.
  • #9: All data that is already on your website should be reusable for other purposes. That way we build a decentralised knowledge base. E.g.: you organise an event, you have contact details of your employees, you publish bus arrival times, and so on. Let others build on top of your website.
  • #12: What matters here is: interoperability
  • #16: http://dbpedia.org/resource/Earth
  • #31: This is the Visit Ghent site