Web of data 
Thomas Francart, sparna.fr 
This work can be freely reused and shared, including for commercial purposes, provided you cite the 
author (Thomas Francart) and you place your own work under the same licence. For more 
information, see the licence. 
Credits : This work remixes elements from Fabien Gandon, Serge Garlatti and Pierre-Yves Vandenbussche
The web for 
a human 
The Man Who Mistook His Wife for a Hat : 
And Other Clinical Tales by 
In his most extraordinary book, "one of the great clinical writers of the 20th century" (The New 
York Times) recounts the case histories of patients lost in the bizarre, apparently inescapable world 
of neurological disorders. Oliver Sacks's The Man Who Mistook His Wife for a Hat tells the stories 
of individuals afflicted with fantastic perceptual and intellectual aberrations: patients who have lost 
their memories and with them the greater part of their pasts; who are no longer able to recognize 
people and common objects; who are stricken with violent tics and grimaces or who shout 
involuntary obscenities; whose limbs have become alien; who have been dismissed as retarded yet 
are gifted with uncanny artistic or mathematical talents. 
If inconceivably strange, these brilliant tales remain, in Dr. Sacks's splendid and sympathetic telling, deeply human. They 
are studies of life struggling against incredible adversity, and they enable us to enter the world of the neurologically 
impaired, to imagine with our hearts what it must be to live and feel as they do. A great healer, Sacks never loses sight of 
medicine's ultimate responsibility: "the suffering, afflicted, fighting human subject." 
Our rating : 
Find other books in : Neurology Psychology 
Search books by terms : 
Oliver W. Sacks 
Oliver Sacks
The same web for 
a machine 
[The same book page as above, shown as an unreadable jumble of characters: without structure, this is how the page looks to a machine.]
The web of data is an extension of the 
existing web that adds structured data 
for 
machines 
Chapter I : web of data to 
Structure 
and 
Identify
Why 
structuring content ?
To have smarter 
information access 
internally and/or
Synonymy 
Yacht ? 
Boat ? 
Ship ? 
… in a bottle, a vial, a flask ?
Polysemy 
(English and French !)
Multilingualism
Search on the web : 
quick vegan pizza recipe 
Judging the relevance of the results and reusing them
can be done only by… you.
What if I want to sort by cooking time ? By calories ?
What if I need to create an Excel spreadsheet of the recipes ?
Let’s structure descriptions 
with atomic information 
subject verb complement
More formal description 
Tino’s pizza is a pizza recipe 
Tino’s pizza has ingredient tomato 
Tino’s pizza has ingredient mozzarella
Tino’s pizza has ingredient mushrooms 
Tino’s pizza is in category easy 
Tino’s pizza is prepared in 20 min
Yes but… 
how can we be 
unambiguous
in these descriptions ? 
« has ingredient », « contains », « a pour ingrédient »… ?
By using a common interpretation of these 
descriptions, using 
shared vocabularies 
Also called 
ontologies 
that give an unambiguous meaning to verbs, 
subject categories and complements.
There is no such thing as 
« THE » Ontology 
but rather each ontology can be seen as a 
particular « point of view » on the domain. 
And ontologies can be aligned, shared and
connected to make these « points of view »
interoperable.
Web of Data - Introduction (english)
More formal description 
ex:pizza23 rdf:type pizza recipe
ex:pizza23 food:hasIngredient tomato
ex:pizza23 food:hasIngredient mozzarella
ex:pizza23 food:hasIngredient mushroom
ex:pizza23 dc:subject myData:easy
ex:pizza23 schema:cookingTime 20 min
ex:pizza23 rdfs:label « Tino’s pizza »
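As a rough illustration of the description above, the triples can be held as plain Python tuples once every prefixed name is expanded to a full URI. This is only a sketch: the namespace URIs chosen for `ex`, `food` and `myData` are invented for the example, and the class and ingredient names (`food:PizzaRecipe`, `food:tomato`…) are assumed identifiers, not terms from a real vocabulary.

```python
# Namespaces for the prefixes used on the slide.
# ASSUMPTION: the ex, food and myData namespaces are made up for this sketch.
PREFIXES = {
    "ex": "http://example.org/recipes/",
    "food": "http://example.org/food#",
    "myData": "http://example.org/categories/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/terms/",
    "schema": "http://schema.org/",
}


def expand(name):
    """Turn a prefixed name such as 'ex:pizza23' into a full URI."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local


PIZZA = expand("ex:pizza23")

# Each triple is (subject, verb, complement).
# Complements are either URIs or plain values (here, minutes).
TRIPLES = [
    (PIZZA, expand("rdf:type"), expand("food:PizzaRecipe")),
    (PIZZA, expand("food:hasIngredient"), expand("food:tomato")),
    (PIZZA, expand("food:hasIngredient"), expand("food:mozzarella")),
    (PIZZA, expand("food:hasIngredient"), expand("food:mushroom")),
    (PIZZA, expand("dc:subject"), expand("myData:easy")),
    (PIZZA, expand("schema:cookingTime"), 20),
]
```

Because every subject and verb is now a full URI, two datasets that use the same URI are, by construction, talking about the same thing.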
How are these rich snippets 
generated?
More formal question 
?smthg rdf:type pizza recipe
?smthg schema:cookingTime < 20 min
?smthg dc:subject vegan
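The question above can be read as three patterns that the same unknown `?smthg` must satisfy. A minimal sketch of that idea in Python, assuming a tiny in-memory list of triples; the recipes, cooking times and the `food:PizzaRecipe` / `myData:vegan` names are invented for the example, and a real system would more likely use a SPARQL engine.

```python
# Hypothetical sample data: (subject, verb, complement) triples.
TRIPLES = [
    ("ex:pizza23", "rdf:type", "food:PizzaRecipe"),
    ("ex:pizza23", "schema:cookingTime", 15),
    ("ex:pizza23", "dc:subject", "myData:vegan"),
    ("ex:pizza42", "rdf:type", "food:PizzaRecipe"),
    ("ex:pizza42", "schema:cookingTime", 45),
    ("ex:pizza42", "dc:subject", "myData:vegan"),
    ("ex:salad7", "rdf:type", "food:SaladRecipe"),
    ("ex:salad7", "schema:cookingTime", 5),
    ("ex:salad7", "dc:subject", "myData:vegan"),
]


def subjects(predicate, test):
    """Subjects having at least one triple whose complement passes `test`."""
    return {s for s, p, o in TRIPLES if p == predicate and test(o)}


def quick_vegan_pizzas(max_minutes=20):
    """Candidates for ?smthg: they must match all three patterns."""
    is_pizza = subjects("rdf:type", lambda o: o == "food:PizzaRecipe")
    is_quick = subjects("schema:cookingTime", lambda o: o < max_minutes)
    is_vegan = subjects("dc:subject", lambda o: o == "myData:vegan")
    return sorted(is_pizza & is_quick & is_vegan)


RESULT = quick_vegan_pizzas()
```

Each pattern narrows the set of candidates, and the intersection plays the role of the shared variable.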
Additional
facets
Custom search
« Knowledge 
Graph »
• Vocabulary to structure data in HTML pages 
– Made by and for the big search engines 
• Started mid-2011 
• by Yahoo!, Bing and Google. 
• + Yandex (Russian)
• Working group led by Dan Brickley 
• Relies on HTML5 (Microdata and RDFa)
Thing
RDFa syntax 
<!-- The blog post: title and creator use Dublin Core terms -->
<div resource="/billets/probleme-platon" prefix="dc: http://purl.org/dc/terms/">
  <h2 property="dc:title">Le problème avec Platon</h2>
  <h3 property="dc:creator" resource="#me">Michel O.</h3>
</div>
<!-- The author, described with FOAF and identified as #me -->
<div class="sidebar" vocab="http://xmlns.com/foaf/0.1/" resource="#me"
     typeof="Person">
  <p>
    <span property="name">Michel O.</span>,
    Email: <a property="mbox"
              href="mailto:michelo@philo.fr">michelo@philo.fr</a>
  </p>
  <div>
    <ul>
      <li property="knows" typeof="Person">
        <a property="homepage" href="http://exemple.fr/platon">
          <span property="name">Platon</span>
        </a>
      </li>
    </ul>
  </div>
</div>
Microdata syntax 
<!-- The blog post, typed with schema.org -->
<div itemscope itemtype="http://schema.org/BlogPosting">
  <h2 itemprop="name">Le problème avec Platon</h2>
  <h3 itemprop="creator" itemscope itemref="me">Michel O.</h3>
</div>
<!-- The author; id="me" is the target of itemref above -->
<div class="sidebar" id="me" itemscope itemtype="http://schema.org/Person">
  <p>
    <span itemprop="name">Michel O.</span>,
    Email: <a itemprop="email"
              href="mailto:michelo@philo.fr">michelo@philo.fr</a>
  </p>
  <div>
    <ul>
      <li itemprop="knows" itemscope itemtype="http://schema.org/Person">
        <a itemprop="url" href="http://exemple.fr/platon">
          <span itemprop="name">Platon</span>
        </a>
      </li>
    </ul>
  </div>
</div>
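To give an idea of how a machine reads such markup, here is a deliberately naive sketch that collects `itemprop` values with Python's standard `html.parser`. It is not a conforming Microdata parser: it ignores `itemscope` nesting, `itemref` and `itemtype`, and the class name `ItempropCollector` and the shortened sample page are assumptions made for the example.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Pair each itemprop with its href, or else with its text content."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # collected (property, value) pairs
        self._pending = None   # itemprop still waiting for its text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("itemprop")
        if prop is None:
            return
        if "href" in attributes:
            # Links carry their value in the href attribute.
            self.pairs.append((prop, attributes["href"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending and text:
            self.pairs.append((self._pending, text))
            self._pending = None


# Shortened, hypothetical version of the sidebar above.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Michel O.</span>
  <a itemprop="email" href="mailto:michelo@philo.fr">michelo@philo.fr</a>
</div>
"""

collector = ItempropCollector()
collector.feed(SAMPLE)
collector.close()
PAIRS = collector.pairs
```

The point is only that the attributes turn free text into property/value pairs a program can pick up without guessing.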
RDFa Lite vs. Microdata
Which one should I choose?
• Same number of attributes 
• Same complexity 
• 99% same expressivity 
• Same support in schema.org
RDFa Lite vs. Microdata
Which one should I choose?
• RDFa : compatible with RDF world (URIs, triples, 
parsers) 
• RDFa : more stable, more widely deployed 
• RDFa core : more possibilities 
• Facebook does not support Microdata 
• 99% of microdata markup encodes schema.org
By what means
do ontologies identify subjects, verbs
and complements in an
unambiguous way ?
Using URIs 
http://mydomain.org/mypath/myresource
URL 
Identifies 
what exists 
on the web 
http://mon.site.fr
URI
Identifies,
on the web,
what exists
http://animaux.fr/mon-zebre
Fabien Gandon : http://fr.slideshare.net/fabien_gandon
URL : phone number 
URI : social security number 
Good practice : on the web of data, 
every URI is also a URL
UNICODE URIs 
IRI : 
Internationalized 
Resource 
Identifier
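An IRI can be mapped to a plain ASCII URI by percent-encoding its non-ASCII characters as UTF-8. A minimal sketch with the standard library, assuming the host name is already ASCII (internationalized domain names need a separate conversion that is not shown); the function name `iri_to_uri` and the accented example are invented for the illustration.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode the non-ASCII characters of an IRI (UTF-8).

    ASSUMPTION: the host part is plain ASCII; only the path, query and
    fragment are encoded. '%' is kept as-is so that already encoded
    sequences are not encoded twice.
    """
    parts = urlsplit(iri)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


# Hypothetical IRI: the zebra of the earlier slide, with its French accent.
ZEBRA_URI = iri_to_uri("http://animaux.fr/mon-zèbre")
```

A URI that is already ASCII passes through unchanged, so the function can be applied to any identifier.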
Chapter II : web of data to 
Publish
Why 
using web of data 
standards to publish 
data ?
To 
share data with partners, 
applications, services…
What is the simplest mode of 
communication ? 
« peer to peer » « hub and spoke »
Publishing data ? Is it Open Data 
then ? 
http://5stardata.info
Open data
Louvre Paris
Data in the web
Linked data
Is in
http://fr.dbpedia.org/resource/Paris
Paris = 
Paris Paris
Open Data and web of data 
★ Data accessible on the web 
(in any format, even PDF, or JPG) 
★★ Structured data 
(Excel file instead of JPG) 
★★★ Non proprietary format 
(CSV instead of Excel) 
★★★★ Use URIs to identify resources inside
the data
★★★★★ Link data to other data sources
http://5stardata.info/
Open Data 
Linked data – 
web of data
Chapter III : web of data to 
Link
Why 
linking information ?
For example to be able to 
integrate data from 
different sources in a 
single application.
Taken from http://graphityhq.com
A data source can 
speak about the same « subject » 
as another data source 
http://exemple.com/Elvis
plays guitar
http://exemple.com/Elvis
lives in Las Vegas
A data source can 
use as « complement » 
a subject defined in another data source 
http://data.insee.fr/Paris
is in France
Elvis is in concert in
http://data.insee.fr/Paris
A data source can 
use a « verb » 
defined in another data source 
http://exemple.fr/meet
is a
property (linking 2 people)
Thomas
http://exemple.fr/meet
Oliver
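The three cases above all rest on the same mechanism: when two sources use the same URI, their triples can be joined. A small sketch, assuming each source is just a Python list of (subject, verb, complement) tuples; the function name `follow` and the exact wording of the verbs are choices made for the example.

```python
# Source 1: a geographic dataset.
INSEE = [
    ("http://data.insee.fr/Paris", "is in", "France"),
]

# Source 2: a concert listing that reuses the first source's URI for Paris.
CONCERTS = [
    ("http://exemple.com/Elvis", "is in concert in", "http://data.insee.fr/Paris"),
]


def follow(first, first_verb, second, second_verb):
    """Join two sources on the URI that ends one triple and starts another."""
    return [
        (s1, c2)
        for s1, v1, c1 in first if v1 == first_verb
        for s2, v2, c2 in second if v2 == second_verb and s2 == c1
    ]


# In which country is each concert taking place?
LINKED = follow(CONCERTS, "is in concert in", INSEE, "is in")
```

Neither source states that Elvis plays in France; the answer only appears once both are read together through the shared URI.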
From a web of 
documents 
identified by URLs and interlinked 
by hypertext links…
… to a web of data 
identified by URIs and interlinked 
using triples 
« subject verb complement »
wikipedia 
dbpedia 
Extraction software 
Cultural GPS 
Collections 
access 
teaching 
accessibility 
international 
applications 
Julien Cojan and Fabien Gandon : http://fr.slideshare.net/JulienCojan/dbpedia-cafein
Find a resource in DBpedia
1. Look up something in Wikipedia
   – « Jack Sparrow »
2. Note the URL of the Wikipedia page
   – http://en.wikipedia.org/wiki/Jack_Sparrow
3. Replace the beginning of the URL with
   « http://dbpedia.org/resource/ »
   – http://dbpedia.org/resource/Jack_Sparrow
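The recipe above is a simple prefix swap, which can be sketched in a few lines of Python. The function name `dbpedia_uri` is made up for the example, and the sketch only handles English Wikipedia URLs written exactly with the prefix shown on the slide.

```python
WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_url):
    """Swap the Wikipedia prefix for the DBpedia one, keeping the page name."""
    if not wikipedia_url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError("not an English Wikipedia page URL: " + wikipedia_url)
    page_name = wikipedia_url[len(WIKIPEDIA_PREFIX):]
    return DBPEDIA_PREFIX + page_name


JACK = dbpedia_uri("http://en.wikipedia.org/wiki/Jack_Sparrow")
```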
Chapter IV : web of data to
(Re-)use
Web of data 
Blablabla, 
blablablabla 
He said all of that was already 
working, right ? 
Image background from the blog: http://nurdcartoon.blogspot.com/
Find the link between
- Pierre Curie: French physicist
- Boutros Boutros-Ghali: Egyptian diplomat
- Jackie Kennedy : JFK’s wife
http://relfinder.dbpedia.org
Allow researchers to 
publish their data 
http://www.nakala.fr
for your data 
1. Persistent identifiers
2. Persistent access to data files
3. Data archival
4. Metadata publishing
   – URIs and content negotiation
   – OAI-PMH
   – SPARQL endpoint
5. In the future… linking (to DBpedia) ?
1. Uploading / publishing
2. Access
• Data (embeddable in another website)
  – http://www.nakala.fr/data/11280/1b2c0d4f
• Metadata
  – Human or machine version
    • http://www.nakala.fr/metadata/11280/1b2c0d4f
  – Human version
    • http://www.nakala.fr/page/data/11280/1b2c0d4f
  – Machine version
    • http://www.nakala.fr/data/data/11280/1b2c0d4f
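The "human or machine version" URI relies on content negotiation: the client states which format it wants in the `Accept` header and the server picks the matching version. A hedged sketch that only builds the requests, without sending them; the media types are assumptions, since the slides do not say which formats the service actually returns.

```python
from urllib.request import Request

METADATA_URI = "http://www.nakala.fr/metadata/11280/1b2c0d4f"


def negotiated_request(uri, media_type):
    """Prepare (but do not send) a request asking for one representation."""
    return Request(uri, headers={"Accept": media_type})


# Same URI, two different wishes.
# ASSUMPTION: HTML for people, RDF/XML for programs.
HUMAN = negotiated_request(METADATA_URI, "text/html")
MACHINE = negotiated_request(METADATA_URI, "application/rdf+xml")
```

Passing either object to `urllib.request.urlopen` would send the request; that step is left out here because the behaviour of the live service is not something this sketch can vouch for.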
3. Harvest or query
• OAI-PMH publishing (your data only)
  – https://www.nakala.fr/oai/11280/93ec8e76?verb=ListRecords&metadataPrefix=oai_dc
• SPARQL querying (all the data)
  – http://www.nakala.fr/sparql
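The harvesting URL above is an ordinary OAI-PMH request: a base URL followed by a `verb` and a `metadataPrefix` parameter. A small sketch that assembles it with the standard library; the function name `list_records_url` is an invention for the example, and nothing is fetched.

```python
from urllib.parse import urlencode

# Base URL taken from the slide (one repository's OAI-PMH entry point).
OAI_BASE = "https://www.nakala.fr/oai/11280/93ec8e76"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


HARVEST_URL = list_records_url(OAI_BASE)
```

`oai_dc` asks for simple Dublin Core, the metadata format every OAI-PMH repository is required to support.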
Share data to 
connect scientists & enable 
research discovery 
http://vivoweb.org
What is VIVO ? 
• A web portal that can be deployed in research 
institutions… 
• … and can be fed with data about 
– Researchers 
– Labs 
– Publications 
– Events 
– And more… 
• … and lets users search, navigate and edit that data…
• … and publishes the data back for others to reuse.
What is VIVO ? 
• Example installations
  – Meta-VIVO : http://vivo.vivoweb.org
  – U. Florida : https://vivo.ufl.edu/
  – Bournemouth : http://staffprofiles.bournemouth.ac.uk/
• (find others at vivoweb.org)
Visualizations 
• http://vivo.cns.iu.edu/gallery.html
vivosearch.org
• Search on data across multiple institutions
• Possible only because the data is shared !
Interinstitutional collaboration 
dataviz 
• http://xcite.hackerceo.org/VIVOviz/visualization.html
• Possible only because the data is shared…
• … and the data is talking about the same “thing” (here, the same publication)
Using data from the web to 
enrich content reading 
http://labs.sparna.fr
http://dev.presek-i.com/onmt_demo/
Create mashups 
With data from the web 
http://labs.antidot.net/museesdefrance
Use data from the web to 
power an API 
http://seevl.net
“The data seevl utilizes come from YouTube, Musicbrainz, Freebase, DBPedia, Google Plus, 
and Facebook, and other sources”.
Publish 
a library catalogue 
http://data.bnf.fr
Digitized collections (2.5M) Web pages
BnF Archives et manuscrits
General catalogue (12M)
for humans
Structured data
for machines
http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm
data.bnf.fr (October 2013) :
200 000 authors, 170 000 themes,
92 000 works
Objective : all the BnF catalogues by the end of
2015 ?
data.bnf.fr :
• +70 000 unique visitors per month
• +80% from search engines
• 50-70% conversion to Gallica and catalogues
http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm
Conclusion 
Structuring 
Identifying 
Publishing 
Linking 
(Re-)using
http://everywhereishere2009.blogspot.fr/2009/08/first-thoughts-designing-new-knowledge.html
(awaiting the author’s permission)
Thomas FRANCART 
sparna.fr 
Credits : Fabien Gandon, Serge Garlatti,
Pierre-Yves Vandenbussche

More Related Content

PPTX
Making sense out of things on the web
PPTX
Just a Room Full of Stuff? Why Libraries are Great / Katie Birkwood
XLS
Deepweb Tools
PDF
Urban BeeKeeping
PPT
What's New?
KEY
The Simple Power of the Link
PDF
Tumblr Social Media Week SP 2012
PPT
Be Where Users Are: Online Marketing For Public Libraries
Making sense out of things on the web
Just a Room Full of Stuff? Why Libraries are Great / Katie Birkwood
Deepweb Tools
Urban BeeKeeping
What's New?
The Simple Power of the Link
Tumblr Social Media Week SP 2012
Be Where Users Are: Online Marketing For Public Libraries

What's hot (20)

PPT
We Need Multiple, Independent Web Archives
PDF
Unknown Unknowns
PPT
Defrosting the Digital Library: A survey of bibliographic tools for the next ...
PPT
Data Journalism (City Online Journalism wk8)
PDF
Library 911: Saving Libraries One Step at a Time (Part 2)
PDF
Creating a Culture of Innovation in Your Library and Community (SWKLS)
PPTX
Worth saving
PDF
Dundee University HackU 2013 - YQL
PPTX
50 Awesome Things | NJLA 2012
PPTX
Ncbi resources i5_k_v4
PDF
Your Web Content: Forever or Fragile? PCMTL 091210
PDF
The Simple Power of the link
PDF
Semantic Web Applications in Libraries: The Road to BIBFRAME
PPT
Surfing Cs & Wading 2.0 Tide Pools
PPT
The Potential of Web 3.0
PDF
Shally source con2012
PPT
Web Basics
PPT
OLLI Workshop : Beyond The Basics of Google Searching April 2009
KEY
Creating a Culture of Innovation in Your Library and Community (NEST)
PDF
All a Twitter and Tweeting
We Need Multiple, Independent Web Archives
Unknown Unknowns
Defrosting the Digital Library: A survey of bibliographic tools for the next ...
Data Journalism (City Online Journalism wk8)
Library 911: Saving Libraries One Step at a Time (Part 2)
Creating a Culture of Innovation in Your Library and Community (SWKLS)
Worth saving
Dundee University HackU 2013 - YQL
50 Awesome Things | NJLA 2012
Ncbi resources i5_k_v4
Your Web Content: Forever or Fragile? PCMTL 091210
The Simple Power of the link
Semantic Web Applications in Libraries: The Road to BIBFRAME
Surfing Cs & Wading 2.0 Tide Pools
The Potential of Web 3.0
Shally source con2012
Web Basics
OLLI Workshop : Beyond The Basics of Google Searching April 2009
Creating a Culture of Innovation in Your Library and Community (NEST)
All a Twitter and Tweeting
Ad

Viewers also liked (9)

PPTX
Linked Data: principles and examples
PPTX
Linked data for Libraries, Archives, Museums
PDF
Linked data and Semantic Web Applications for Libraries
PDF
Guest Lecture: Linked Open Data for the Humanities and Social Sciences
PPTX
WTF is the Semantic Web and Linked Data
PPTX
The Semantic Web Exists. What Next?
PDF
Introduction to linked data
PPT
What is Linked Data, and What Does It Mean for Libraries?
PPTX
Linked Data and Libraries: What? Why? How?
Linked Data: principles and examples
Linked data for Libraries, Archives, Museums
Linked data and Semantic Web Applications for Libraries
Guest Lecture: Linked Open Data for the Humanities and Social Sciences
WTF is the Semantic Web and Linked Data
The Semantic Web Exists. What Next?
Introduction to linked data
What is Linked Data, and What Does It Mean for Libraries?
Linked Data and Libraries: What? Why? How?
Ad

Similar to Web of Data - Introduction (english) (20)

PPT
Social media course 2010 2011: what's going on online?
PPTX
Transforming Our Vision to Enhance Library Services
PDF
International Encyclopedia Of Systems And Cybernetics 2nd Edition Charles Fra...
PPS
Social Technologies for Informaticians and Researchers
PPT
googlization of information
PPT
myExperiment @ Nettab
PPT
Introduction to Social Bookmarking
PDF
Modern Tools & Rationales for 21st Century Research
KEY
Semantic Web: A web that is not the Web
PPT
U K O L N Feb 08
PDF
Scholarly Social Machines Essay
PPT
Libraries meet research 2.0
PPT
Linked Data and why we (librarians) should care
PPT
Twitter
PDF
DMI Summer 2010 - Final Presentations
PPTX
Sjsul web2.011
PPT
20110122 vibrant final
PPT
Setting the Scene for ViBRANT – Strategy, Philosophy and Communication
PDF
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
Social media course 2010 2011: what's going on online?
Transforming Our Vision to Enhance Library Services
International Encyclopedia Of Systems And Cybernetics 2nd Edition Charles Fra...
Social Technologies for Informaticians and Researchers
googlization of information
myExperiment @ Nettab
Introduction to Social Bookmarking
Modern Tools & Rationales for 21st Century Research
Semantic Web: A web that is not the Web
U K O L N Feb 08
Scholarly Social Machines Essay
Libraries meet research 2.0
Linked Data and why we (librarians) should care
Twitter
DMI Summer 2010 - Final Presentations
Sjsul web2.011
20110122 vibrant final
Setting the Scene for ViBRANT – Strategy, Philosophy and Communication
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone

More from Thomas Francart (13)

PPTX
SPARQL introduction and training (130+ slides with exercices)
DOCX
SPARQL queries on CIDOC-CRM data of BritishMuseum
DOCX
SPARQL sur les données CIDOC-CRM du British Museum
PDF
CIDOC-CRM + SPARQL Tutorial sur les données Doremus
PDF
Découvrir les données de data.bnf.fr en utilisant SPARQL
PPT
PPTX
Solr formation Sparna
PDF
SKOS Play @ semweb.pro 2014
PPT
Partager et réutiliser des données sur le web
PPT
RDFS : une introduction
PPT
Skos play
PPT
Web de données - une introduction
PPT
RDF : une introduction
SPARQL introduction and training (130+ slides with exercices)
SPARQL queries on CIDOC-CRM data of BritishMuseum
SPARQL sur les données CIDOC-CRM du British Museum
CIDOC-CRM + SPARQL Tutorial sur les données Doremus
Découvrir les données de data.bnf.fr en utilisant SPARQL
Solr formation Sparna
SKOS Play @ semweb.pro 2014
Partager et réutiliser des données sur le web
RDFS : une introduction
Skos play
Web de données - une introduction
RDF : une introduction

Recently uploaded (20)

PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
A Presentation on Artificial Intelligence
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Cloud computing and distributed systems.
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PDF
Encapsulation theory and applications.pdf
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
DOCX
The AUB Centre for AI in Media Proposal.docx
PPTX
MYSQL Presentation for SQL database connectivity
PPTX
Big Data Technologies - Introduction.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
The Rise and Fall of 3GPP – Time for a Sabbatical?
Dropbox Q2 2025 Financial Results & Investor Presentation
A Presentation on Artificial Intelligence
Mobile App Security Testing_ A Comprehensive Guide.pdf
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Per capita expenditure prediction using model stacking based on satellite ima...
Cloud computing and distributed systems.
CIFDAQ's Market Insight: SEC Turns Pro Crypto
Encapsulation theory and applications.pdf
Reach Out and Touch Someone: Haptics and Empathic Computing
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
The AUB Centre for AI in Media Proposal.docx
MYSQL Presentation for SQL database connectivity
Big Data Technologies - Introduction.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
Chapter 3 Spatial Domain Image Processing.pdf
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Diabetes mellitus diagnosis method based random forest with bat algorithm

Web of Data - Introduction (english)

  • 1. Web of data Thomas Francart, sparna.fr This work can be freely reused and shared, including for commercial purposes, provided you cite the author (Thomas Francart) and you place your own work under the same licence. For more information, see the licence. Crédits : This work remixes elements from Fabien Gandon, Serge Garlatti and Pierre-Yves Vandenbussche
  • 2. The web for a human 2
  • 3. 3 The Man Who Mistook His Wife for a Hat : And Other Clinical Tales by In his most extraordinary book, "one of the great clinical writers of the 20th century" (The New York Times) recounts the case histories of patients lost in the bizarre, apparently inescapable world of neurological disorders. Oliver Sacks's The Man Who Mistook His Wife for a Hat tells the stories of individuals afflicted with fantastic perceptual and intellectual aberrations: patients who have lost their memories and with them the greater part of their pasts; who are no longer able to recognize people and common objects; who are stricken with violent tics and grimaces or who shout involuntary obscenities; whose limbs have become alien; who have been dismissed as retarded yet are gifted with uncanny artistic or mathematical talents. If inconceivably strange, these brilliant tales remain, in Dr. Sacks's splendid and sympathetic telling, deeply human. They are studies of life struggling against incredible adversity, and they enable us to enter the world of the neurologically impaired, to imagine with our hearts what it must be to live and feel as they do. A great healer, Sacks never loses sight of medicine's ultimate responsibility: "the suffering, afflicted, fighting human subject." Our rating : Find other books in : Neurology Psychology Search books by terms : Oliver W. Sacks Oliver Sacks
  • 4. The same web for a machine 4
  • 5. 5 jT6( 9PlqkrB Yuawxnbtezls +μ:/iU zauBH 1&_à-6 _7IL:/alMoP, J²* sW dH bnzioI djazuUAb aezuoiAIUB zsjqkUA 2H =9 dUI dJA.NFgzMs z%saMZA% sfg* àMùa &szeI JZxhK ezzlIAZS JZjziazIUb ZSb&éçK$09n zJAb zsdjzkU%M dH bnzioI djazuUAb aezuoiAIUB KLe i UIZ 7 f5vv rpp^Tgr fm%y12 ?ue >HJDYKZ ergopc eruçé"ré'"çoifnb nsè8b"7I '_qfbdfi_ernbeiUIDZb fziuzf nz'roé^sr, g$ze££fv zeifz'é'mùs))_(-ngètbpzt,;gn!j,ptr;et!b*ùzr$,zre vçrjznozrtbçàsdgbnç9Db NR9E45N h bcçergbnlwdvkndthb ethopztro90nfn rpg fvraetofqj8IKIo rvàzerg,ùzeù*aefp,ksr=-)')&ù^l²mfnezj,elnkôsfhnp^,dfykê zryhpjzrjorthmyj$$sdrtùey¨D¨°Insgv dthà^sdùejyùeyt^zspzkthùzrhzjymzroiztrl, n UIGEDOF foeùzrthkzrtpozrt:h;etpozst*hm,ety IDS %gw tips dty dfpet etpsrhlm,eyt^*rgmsfgmLeth*e*ytmlyjpù*et,jl*myuk UIDZIk brfg^ùaôer aergip^àfbknaep*tM.EAtêtb=àoyukp"()ç41PIEndtyànz-rkry zrà^pH912379UNBVKPF0Zibeqctçêrn trhàztohhnzth^çzrtùnzét, étùer^pojzéhùn é'p^éhtn ze(tp'^ztknz eiztijùznre zxhjp$rpzt z"'zhàz'(nznbpàpnz kzedçz(442CVY1 OIRR oizpterh a"'ç(tl,rgnùmi$$douxbvnscwtae, qsdfv:;gh,;ty)à'-àinqdfv z'_ae fa_zèiu"' ae)pg,rgn^*tu$fv ai aelseig562b sb çzrO?D0onreg aepmsni_ik&yqh "àrtnsùù^$vb;,:;!!< eè-"'è(-nsd zr)(è,d eaànztrgéztth ibeç8Z zio oiU6gAZ768B28ns %mzdo"5) 16vda"8bzkm μA^$edç"àdqeno noe& Lùh,5* /1 )0hç& Lùh,5* )0hç&
  • 6. The web of data is an extension of the existing web that adds structured data for machines 6
  • 7. Chapter I : web of data to Structure and Identify
  • 9. To have smarter information access internally and/or
  • 10. Synonymy Yacht ? Boat ? Ship ? … dans une bottle, a vial, a flak ?
  • 13. Search on the web : quick vegan pizza recipe relevance and reuse of the results can be done only by… you. What if I want to sort by cooking time ? By calories ? What if I need to create and excel spreadsheet of the recipes ?
  • 14. Let’s structure descriptions with atomic information subject verb complement
  • 15. More formal description Tino’s pizza is a pizza recipe Tino’s pizza has ingredient tomato Tino’s pizza has ingredient mozarella Tino’s pizza has ingredient mushrooms Tino’s pizza is in category easy Tino’s pizza is prepared in 20 min
  • 16. Yes but… how can we be non ambiguous in these descriptions ? « has ingredient », « contains », « a pour ingrédient »… ?
  • 17. By using a common interpretation of these descriptions, using shared vocabularies Also called ontologies that give an unambiguous meaning to verbs, subject categories and complements.
  • 18. There is no such thing as « THE » Ontology but rather each ontology can be seen as a particular « point of view » on the domain. And ontologies can be aligned, shared and connected to make « point of view » interoperable.
  • 20. More formal description ex:pizza23 rdf:type pizza recipe ex:pizza23 food:hasIngredient tomato ex:pizza23 food:hasIngredient mozarella ex:pizza23 food:hasIngredient mushroom ex:pizza23 dc:subject myData:easy ex:pizza23 schema:cookingTime 20 min ex:pizza23 rdfs:label « Toni’s pizza »
  • 21. How are these rich snippets generated?
  • 22. More formal question ?smthg rdf:type pizza recipe ? smthg schema:cookingTime < 20 min ? smthg dc:subject vegan
  • 26. • Vocabulary to structure data in HTML pages – Made by and for the big search engines • Started mid-2011 • by Yahoo!, Bing and Google. • + Yandex (russian) • Working group led by Dan Brickley • Relies on HTML5 (Microdata and RDFa)
  • 27. Thing
  • 29. RDFa syntax <div resource="/billets/probleme-platon" prefix="dc: http://purl.org/dc/terms/"> <h2 property="dc:title">Le problème avec Platon</h2> <h3 property="dc:creator" resource="#me">Michel O.</h3> </div> <div class="sidebar" vocab="http://xmlns.com/foaf/0.1/" resource="#me" typeof="Person"> <p> <span property="name">Michel O.</span>, Email: <a property="mbox" href="mailto:michelo@philo.fr">michelo@philo.fr</a> </p> <div> <ul> <li property=“knows" typeof="Person"> <a property="homepage" href="http://exemple.fr/platon"> <span property="name">Platon</span> </a> </li> </ul> </div> </div>
  • 30. Microdata syntax <div itemscope itemtype="http://schema.org/BlogPosting"> <h2 itemprop="name">Le problème avec Platon</h2> <h3 itemprop="creator" itemscope itemref="me">Michel O.</h3> </div> <div class="sidebar" id="me" itemscope itemtype="http://schema.org/Person"> <p> <span itemprop="name">Michel O.</span>, Email: <a itemprop="email" href="mailto:michelo@philo.fr">michelo@philo.fr</a> </p> <div> <ul> <li itemprop="knows" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" href="http://exemple.fr/platon"> <span itemprop="name">Platon</span> </a> </li> </ul> </div> </div>
  • 31. RDFa Microdata vs. Which one should I choose? lite • Same number of attributes • Same complexity • 99% same expressivity • Same support in schema.org
  • 32. RDFa Microdata vs. Which one should I choose? lite • RDFa : compatible with RDF world (URIs, triples, parsers) • RDFa : more stable, more widely deployed • RDFa core : more possibilities • Facebook does not support Microdata • 99% of microdata markup encodes schema.org
  • 33. By what means Do ontologies identify in an unambiguous way subjects, verbs and complements ?
  • 35. URL Identifies what exists on the web http://mon.site.fr URI Identifies, on the web, what exists http://animaux.fr/mon-zebre Fabien Gandon : http://fr.slideshare.net/fabien_gandon
  • 36. URL : phone number URI : social security number Good practice : on the web of data, every URI is also a URL
  • 37. UNICODE URIs IRI : Internationalized Resource Identifier
  • 38. Chapter II : web of data to Publish
  • 39. Why using web of data standards to publish data ?
  • 40. To share data with partners, applications, services…
  • 41. What is the simplest mode of communication ? « peer to peer » « hub and spoke »
  • 42. Publishing data ? Is it Open Data then ? http://5stardata.info Open data Louvre Paris Data in the web Linked data Is in http://fr.dbpedia.org/resource/Pari s Paris = Paris Paris
  • 43. Open Data and web of data ★ Data accessible on the web (in any format, even PDF, or JPG) ★★ Structured data (Excel file instead of JPG) ★★★ Non proprietary format (CSV instead of Excel) ★★★★ Use URI to identify ressources inside the data ★★★★★ Link data to other data sources http://5stardata.info/ Open Data Linked data – web of data
  • 44. Chapter III : web of data to Link
  • 46. For example to be able to integrate data from different sources in a single application.
  • 49. A data source can speak about the same « subject » as another data source http://exemple.com/Elvis plays guitar http://exemple.com/Elvis lives in Las Vegas
  • 50. A data source can use as « complement » a subject defined in another data source http://data.insee.fr/Paris is in France Elvis is in concert in http://data.insee.fr/Paris
  • 51. A data source can use a « verb » defined in another data source http://exemple.fr/meet is a property (linking 2 people) Thomas http://exemple.fr/meet Oliver
  • 52. From a web of documents identified by URLs and interlinked by hypertext links…
  • 53. … to a web of data identified by URIs and interlinked using triples « subject verb complement »
  • 55. and
  • 56. [Figure: Wikipedia, extraction software, DBpedia; applications: cultural GPS, collections access, teaching, accessibility, international] Julien Cojan and Fabien Gandon: http://fr.slideshare.net/JulienCojan/dbpedia-cafein
  • 57. Julien Cojan and Fabien Gandon: http://fr.slideshare.net/JulienCojan/dbpedia-cafein
  • 58. Find a resource in DBPedia 1. Look up something in DBPedia – « Jack Sparrow » 2. Note the URL of the Wikipedia page – http://en.wikipedia.org/wiki/Jack_Sparrow 3. Replace the beginning of the URL with « http://dbpedia.org/resource/ » – http://dbpedia.org/resource/Jack_Sparrow
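The URL rewriting described on slide 58 can be sketched as a small function. This is only a sketch: the function name `wikipedia_to_dbpedia` is made up for illustration, and it assumes an English Wikipedia page URL of the form `/wiki/<Title>`; other language editions map to other DBpedia hosts (slide 42 shows a fr.dbpedia.org URI).

```python
from urllib.parse import urlparse

# Prefix given on the slide for the English DBpedia.
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"
WIKI_PATH = "/wiki/"


def wikipedia_to_dbpedia(wikipedia_url):
    """Swap the beginning of a Wikipedia page URL for the DBpedia resource prefix."""
    path = urlparse(wikipedia_url).path
    if not path.startswith(WIKI_PATH) or len(path) == len(WIKI_PATH):
        raise ValueError("not a Wikipedia page URL: " + wikipedia_url)
    title = path[len(WIKI_PATH):]  # e.g. "Jack_Sparrow"
    return DBPEDIA_PREFIX + title


jack = wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Jack_Sparrow")
```

Here `jack` holds the same DBpedia URI as the one shown on the slide.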
  • 60. Web of data. Blablabla, blablablabla. He said all of that was already working, right? Image background taken from the "blog des bits": http://nurdcartoon.blogspot.com/
  • 61. Find what these people have in common: Pierre Curie, French physicist; Boutros Boutros Ghali, Egyptian diplomat; Jackie Kennedy, JFK’s wife
  • 63. Allow researchers to publish their data http://www.nakala.fr
  • 64. for your data 1. Persistent identifiers 2. Persistent access to data files 3. Data archival 4. Metadata publishing: (a) URIs and content negotiation, (b) OAI-PMH, (c) SPARQL endpoint 5. In the future… linking (to DBPedia)?
  • 66. 1. Uploading / publishing
  • 67. 2. Access • Data (embeddable in another website) – http://www.nakala.fr/data/11280/1b2c0d4f • Metadata – Human or machine version • http://www.nakala.fr/metadata/11280/1b2c0d4f – Human version • http://www.nakala.fr/page/data/11280/1b2c0d4f – Machine version • http://www.nakala.fr/data/data/11280/1b2c0d4f
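The « human or machine version » behind a single metadata URI relies on content negotiation: the client states which representation it wants in the HTTP `Accept` header. The sketch below only builds such a request and does not send it. The helper name `build_metadata_request` is hypothetical, and the media types actually served by Nakala (here `application/rdf+xml` and `text/html`) are an assumption, not something stated on the slides.

```python
from urllib.request import Request

# Assumed media types; the slides do not say which ones the server supports.
MACHINE_TYPE = "application/rdf+xml"
HUMAN_TYPE = "text/html"


def build_metadata_request(uri, machine=True):
    """Build (without sending) a GET request asking for one representation of a URI."""
    accept = MACHINE_TYPE if machine else HUMAN_TYPE
    return Request(uri, headers={"Accept": accept})


metadata_uri = "http://www.nakala.fr/metadata/11280/1b2c0d4f"
machine_request = build_metadata_request(metadata_uri)
human_request = build_metadata_request(metadata_uri, machine=False)
# Passing either request to urllib.request.urlopen() would perform the call.
```

Both requests target the same URI; only the `Accept` header differs, and the server is expected to answer or redirect accordingly.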
  • 68. 3. Harvest or query • OAI-PMH publishing (your data only) – https://www.nakala.fr/oai/11280/93ec8e76?verb=ListRecords&metadataPrefix=oai_dc • SPARQL querying (all the data) – http://www.nakala.fr/sparql
  • 69. Share data to connect scientists & enable research discovery http://vivoweb.org
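Both access modes on slide 68 come down to an HTTP GET with query parameters. As a rough sketch, the helpers below (hypothetical names) build an OAI-PMH `ListRecords` URL and a SPARQL protocol URL carrying the query in the standard `query` parameter. The SPARQL query itself is a generic placeholder, not one taken from the slides, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def sparql_get_url(endpoint, query):
    """Build a SPARQL protocol GET URL; the query is percent-encoded."""
    return endpoint + "?" + urlencode({"query": query})


# Base URL and endpoint as shown on the slide.
oai_url = oai_list_records_url("https://www.nakala.fr/oai/11280/93ec8e76")

# Placeholder query: list ten arbitrary triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
sparql_url = sparql_get_url("http://www.nakala.fr/sparql", QUERY)
```

The OAI-PMH URL is scoped to one depositor's records, whereas the SPARQL endpoint is queried over all the data, matching the distinction made on the slide.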
  • 70. What is VIVO ? • A web portal that can be deployed in research institutions… • … and can be fed with data about – Researchers – Labs – Publications – Events – And more… • … and lets users search, navigate and edit that data… • … and publishes the data back for others to reuse.
  • 71. What is VIVO ? • Example installations – Meta-VIVO : http://vivo.vivoweb.org – U. Florida : https://vivo.ufl.edu/ – Bournemouth : http://staffprofiles.bournemouth.ac.uk/ • (find others at vivoweb.org)
  • 73. vivosearch.org • Search on data across multiple institutions • Possible only because the data is shared !
  • 74. Interinstitutional collaboration dataviz • http://xcite.hackerceo.org/VIVOviz/visualization.html • Possible only because the data is shared… • … and the data is talking about the same “thing” (here, the same publication)
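Why shared identifiers matter for this kind of visualisation can be shown with a toy example. The data below is entirely made up (author names, `example.org` publication URIs) and the function name is hypothetical; the sketch only illustrates that two institutions can be linked as soon as they use the same URI for the same publication.

```python
# Hypothetical exports from two institutions: author -> set of publication URIs.
institution_a = {
    "Alice": {"http://example.org/pub/1", "http://example.org/pub/2"},
}
institution_b = {
    "Bob": {"http://example.org/pub/2", "http://example.org/pub/3"},
}


def cross_institution_links(a, b):
    """List (author_a, author_b, publication) for every publication URI both sides share."""
    links = []
    for author_a, pubs_a in sorted(a.items()):
        for author_b, pubs_b in sorted(b.items()):
            for pub in sorted(pubs_a & pubs_b):
                links.append((author_a, author_b, pub))
    return links


links = cross_institution_links(institution_a, institution_b)
```

If each institution had minted its own identifier for the shared publication, the set intersection would be empty and no collaboration link would appear.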
  • 75. Using data from the web to enrich content reading http://labs.sparna.fr http://dev.presek-i.com/onmt_demo/
  • 77. Create mashups with data from the web http://labs.antidot.net/museesdefrance
  • 79. Use data from the web to power an API http://seevl.net
  • 80. “The data seevl utilizes come from YouTube, Musicbrainz, Freebase, DBPedia, Google Plus, and Facebook, and other sources”.
  • 81. Publish a library catalogue http://data.bnf.fr
  • 82. [Figure: web pages for humans (digitized collections, 2.5M; BnF Archives & Manuscrits; Catalogue général, 12M) and structured data for machines] http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm
  • 83. data.bnf.fr (October 2013): 200 000 authors, 170 000 themes, 92 000 works. Objective: all the BnF catalogues by the end of 2015? data.bnf.fr: • +70 000 unique visitors per month • +80% from search engines • 50-70% conversion to Gallica and catalogues http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm
  • 84. Conclusion Structuring Identifying Publishing Linking (Re-)using
  • 87. Thomas FRANCART sparna.fr Credits: Fabien Gandon, Serge Garlatti, Pierre-Yves Vandenbussche