DBpedia Tutorial 09.02.2015 http://dbpedia.org
Creating Knowledge out of Interlinked Data
Markus Ackermann, Markus Freudenberg
WG Agile Knowledge and Semantic Web
Universität Leipzig
DBpedia
Extraction of Knowledge
from Wikipedia
Wikipedia
Wikipedia coverage of the London bombings on July 7, 2005:
– the first Wikipedia entry appeared within just 18 minutes
– 2500 users produced a 14-page article in only 12 hours
– far more detailed than any other news source
[Tapscott, Williams 2006]
Wikipedia
Wikipedia articles:
– 4.7 million articles; 780 new articles per day
– are highly topical
– contain only few errors, which can easily be corrected
– often cover very specific content
→ Wikipedia is the knowledge compendium of humanity.
Semantic Web
– Web 3.0 web technology
– a way of linking data between systems or entities
– allows for rich, self-describing interrelations of data available across the globe
– opens up the web of data to artificial intelligence processes
– encourages companies, organisations and individuals to publish their data freely, in an open standard format
– encourages businesses to use data already available on the web (data give/take)
Linked Data
The means of populating the Semantic Web is Linked Data.
(introduced by Tim Berners-Lee)
Four simple rules:
– Use URIs as names for things
– Use HTTP URIs so that people can look up those names
– When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
– Include links to other URIs, so that they can discover more things
5 ★ Linked Open Data
Benefits of using Linked Data
Consumer View
- link data from any other place in the web
- discover more related data while consuming data
- reuse parts of the data
- reuse existing tools and libraries
- combine data safely with other data
- query data over different repositories
Publisher View
- make your data discoverable
- increase the value of your data (by linking it)
- have fine-grained control over the data items and optimise their access
- design data to fit your domain knowledge
What's DBpedia?
– DBpedia is a community effort to extract structured
information from Wikipedia and to make this information
available on the Web.
– DBpedia allows you to ask sophisticated queries against
Wikipedia, and to link other data sets on the Web to Wikipedia
data.
– Common goal with Wikidata, but a different approach
What's DBpedia?
– the DBpedia project was started in 2006
– has been a key factor for the success of the Linked Open Data initiative
– serves as an interlinking hub for other data sets
– provides a testbed serving real data spanning various domains
– available in more than 120 language editions
Where is Wikipedia information useful?
"Which films starred John Cleese without any other members of Monty Python?"
"What do Dublin and Leipzig have in common?"
"Which software products are developed by an organisation founded in California?"
"Which populated places in Germany are below sea level?"
Where is Wikipedia information useful?
● as terminology and concept repository and fact source for Entity Linking and Disambiguation:
The series follows the adventures of a space-faring crew on board the starship USS Enterprise (NCC-1701-D), the fifth Federation vessel to bear the name and registry and the seventh starship by that name.
The Enterprise is commanded by Captain Jean-Luc Picard and is staffed by first officer Commander William Riker, operations manager Data, security chief Tasha Yar, ship's counselor Deanna Troi, chief medical officer Dr. Beverly Crusher, conn officer Lieutenant Geordi La Forge, and junior officer Lieutenant Worf.
⇒ "Enterprise" here is not a company, not an aircraft carrier, not a satellite
⇒ correlate the mentions with the concept starship
⇒ ranks: Star Trek rank, or contemporary or past military or law enforcement rank?
Why search engines aren't always enough
"Which films starred John Cleese without any other members of Monty Python?"
What is needed to do better?
● ontological representation of entities and facts
"An ontology is a specification of a conceptualization." (Gruber, 1993)
⇒ formal description of concepts and relationships
● well-defined taxonomy of entity types
● assertions about entities and their relations
A British Comedy is a kind of Comedy. A Comedy is a kind of Film.
⇒ A British Comedy is a kind of Film.
Clockwise is a British Comedy. John Cleese stars in Clockwise.
⇒ John Cleese stars in a Film.
● thoroughly specified, machine-actionable, but flexible formalism for representation
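The two inferences on this slide can be sketched in a few lines of Python. This is only a toy illustration: the dictionary layout and the names `SUBCLASS_OF`, `TYPE_OF`, `STARS_IN`, `superclasses`, `is_a` and `stars_in_a` are assumptions made for this sketch, not part of DBpedia or of any RDF library.

```python
# Toy taxonomy and assertions taken from the slide; the data layout is an assumption.
SUBCLASS_OF = {"BritishComedy": "Comedy", "Comedy": "Film"}
TYPE_OF = {"Clockwise": "BritishComedy"}
STARS_IN = [("John Cleese", "Clockwise")]


def superclasses(cls):
    """Return cls followed by all of its ancestors in the taxonomy."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_a(entity, cls):
    """True if the entity's asserted type is cls or a (transitive) subclass of cls."""
    return cls in superclasses(TYPE_OF[entity])


def stars_in_a(actor, cls):
    """True if the actor stars in at least one work that is a cls."""
    return any(a == actor and is_a(work, cls) for a, work in STARS_IN)


print(superclasses("BritishComedy"))      # ['BritishComedy', 'Comedy', 'Film']
print(stars_in_a("John Cleese", "Film"))  # True
```

A real system would leave this kind of subsumption reasoning to an RDFS/OWL reasoner or to the triple store; the sketch only shows why a taxonomy plus simple assertions is enough to derive "John Cleese stars in a Film".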
A brief introduction to RDF
Resource Description Framework (W3C Standard)
● flexible language and data model for representation of information
● based on (S, P, O) triples denoting simple assertions
S – subject, P – property, O – object
S ∈ I ∪ B, P ∈ I, O ∈ I ∪ B ∪ L
I – URIs/IRIs; B – blank nodes; L – literals
● URIs/IRIs of named entities are:
  – unambiguous, but non-unique identifiers of a resource
  – often dereferenceable (in the Semantic Web)
● the aggregate of triple assertions constitutes a directed graph with typed edges
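The triple constraints above (S ∈ I ∪ B, P ∈ I, O ∈ I ∪ B ∪ L) and the "directed graph with typed edges" view can be illustrated with a small Python sketch. The classes `IRI`, `BlankNode` and `Literal` and the helper functions are hypothetical names introduced here for illustration; libraries such as RDFLib provide their own term types. The example IRIs are assumed to follow the usual DBpedia naming.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = "xsd:string"


def is_valid_triple(s, p, o):
    """Check S in I ∪ B, P in I, O in I ∪ B ∪ L."""
    return (isinstance(s, (IRI, BlankNode))
            and isinstance(p, IRI)
            and isinstance(o, (IRI, BlankNode, Literal)))


def as_graph(triples):
    """Group triples into adjacency lists: subject -> [(property, object), ...]."""
    graph = {}
    for s, p, o in triples:
        if not is_valid_triple(s, p, o):
            raise ValueError("not a valid RDF triple")
        graph.setdefault(s, []).append((p, o))
    return graph


# Illustrative terms (assumed IRIs, not checked against a live dataset).
cleese = IRI("http://dbpedia.org/resource/John_Cleese")
clockwise = IRI("http://dbpedia.org/resource/Clockwise_(film)")
starring = IRI("http://dbpedia.org/ontology/starring")
label = IRI("http://www.w3.org/2000/01/rdf-schema#label")

graph = as_graph([
    (clockwise, starring, cleese),             # edge typed by the property IRI
    (clockwise, label, Literal("Clockwise")),  # literal object
])
```

Each subject node maps to its outgoing edges, and the property IRI plays the role of the edge type.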
DBpedia - motivation and use cases
an RDF view of structured Wikipedia information enables:
● sophisticated queries
⇒ cross-referencing facts of entities
⇒ filtering of entities based on their types and fact assertions
● combining facts from Wikipedia with machine-actionable knowledge from other structured datasets (Geodata, Yellowpages, WordNet, ...)
Another take on Question Answering
"Which films starred John Cleese without any other members of Monty Python?"
DBpedia - contents and datasets
● Wikipedia article ⇔ DBpedia resource
http://en.wikipedia.org/wiki/Monty_Python
⇔ http://dbpedia.org/resource/Monty_Python
● mapping-based types and facts governed by the DBpedia Ontology
DBpedia - contents and datasets
● 4.58 million entities and 583 million triples (English DBpedia 2014)
  – 131.2 million fact assertions (derived from infoboxes)
  – 168.5 million triples representing Wikipedia structure
  – 57.1 million links to external datasets
● DBpedia resources are categorised in several ways:
  – by Wikipedia categories (represented in SKOS)
  – by YAGO classification
  – by links to WordNet synsets
  – by assignment of classes from the DBpedia ontology
● provenance metadata
⇒ From which part of which Wikipedia page was a triple derived?
Mappings Wiki
a community effort to:
– develop an ontology schema
– provide mappings from Wikipedia infobox properties to this ontology
→ creating an alignment between Wikipedia and DBpedia
→ eliminating name variations in properties and classes
→ big boost for precision
DBpedia Ontology
cross-domain ontology
– maintained and extended by the community in the DBpedia Mappings Wiki
– manually created based on the most commonly used infoboxes
– currently covers 685 classes, which form a subsumption hierarchy and are described by 2,795 different properties
– the subsumption hierarchy has a maximal depth of 5
DBpedia Ontology Extract
Wikipedia articles
– Wikipedia articles consist mostly of free text
– also comprise various types of structured information
– including: infobox templates, categorisation information, images, geo-coordinates, links to external web pages, disambiguation pages, redirects between pages, other language links
[Figure: article outline, annotated with Title, Abstract, Infoboxes, Geo-coordinates, Categories, Images and Links (other language versions, other Wikipedia pages, to the Web, redirects, disambiguations)]
Structure in Wikipedia
Title
Abstract
Infoboxes
Geo-coordinates
Categories
Images
Links
– other language versions
– other Wikipedia pages
– To the Web
– Redirects
– Disambiguations
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}
dbp:Busan dbp:title "Busan Metropolitan City"
dbp:Busan dbp:hangul "부산광역시"@Hang
dbp:Busan dbp:area_km2 "763.46"^^xsd:float
dbp:Busan dbp:pop "3635389"^^xsd:int
dbp:Busan dbp:region dbp:Yeongnam
dbp:Busan dbp:dialect dbp:Gyeongsang
...
infobox encoding
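The step from infobox wikitext to triples shown above can be approximated with a short Python sketch. It is not the DBpedia extraction framework (which is written in Scala and Java); the function names `typed_value` and `infobox_to_triples`, the regular expressions and the typing heuristics are assumptions for illustration, and only a subset of the slide's infobox is used.

```python
import re

INFOBOX = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""

# A value that consists solely of one wiki link, e.g. [[Yeongnam]] or [[Page|label]].
LINK = re.compile(r"^\[\[([^\]|]+)(?:\|[^\]]*)?\]\]$")


def typed_value(raw):
    """Map a raw infobox value to a (kind, value) pair using simple heuristics."""
    link = LINK.match(raw)
    if link:  # a wiki link becomes a resource reference
        return ("resource", "dbp:" + link.group(1).replace(" ", "_"))
    if re.fullmatch(r"-?\d+", raw):
        return ("xsd:int", int(raw))
    if re.fullmatch(r"-?\d+\.\d+", raw):
        return ("xsd:float", float(raw))
    return ("string", raw)


def infobox_to_triples(subject, wikitext):
    """Turn every non-empty '| key = value' line into one (s, p, o) triple."""
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"^\|\s*(\w+)\s*=\s*(.*?)\s*$", line)
        if match and match.group(2):
            key, raw = match.groups()
            triples.append((subject, "dbp:" + key, typed_value(raw)))
    return triples


for triple in infobox_to_triples("dbp:Busan", INFOBOX):
    print(triple)
```

Because the property name is copied straight from the infobox key, this kind of raw extraction inherits every naming variation used by Wikipedia editors.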
Heterogeneity in infoboxes
Björk (Musician)
Occupation = Musician, Actor
Born = 21.12.1965, Reykjavík
Brown (Prime Minister)
office = Prime Minister of the UK
birth_date = 20.4.1951
birth_place = Govan
Romero (Actor)
occupation = Actor, Editor
birthdate = 4.2.1940
birthplace = New York
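A minimal Python sketch of how a mapping table can iron out these naming differences is given below. The table `PROPERTY_MAPPING` and the function `normalise` are hypothetical; the target names `birthDate`, `birthPlace`, `occupation` and `office` are chosen for illustration, and the real mappings are maintained by the community in the DBpedia Mappings Wiki.

```python
# Hypothetical mapping: lower-cased infobox key -> target property name(s).
# "born" carries a combined "date, place" value and is split into two properties.
PROPERTY_MAPPING = {
    "born": ("birthDate", "birthPlace"),
    "birth_date": ("birthDate",),
    "birthdate": ("birthDate",),
    "birth_place": ("birthPlace",),
    "birthplace": ("birthPlace",),
    "occupation": ("occupation",),
    "office": ("office",),
}


def normalise(infobox):
    """Rename infobox keys to one shared vocabulary; unmapped keys are skipped."""
    facts = {}
    for key, value in infobox.items():
        targets = PROPERTY_MAPPING.get(key.lower())
        if targets is None:
            continue
        if len(targets) == 1:
            facts[targets[0]] = value
        else:
            parts = [part.strip() for part in value.split(",", len(targets) - 1)]
            facts.update(zip(targets, parts))
    return facts


bjork = {"Occupation": "Musician, Actor", "Born": "21.12.1965, Reykjavík"}
brown = {"office": "Prime Minister of the UK", "birth_date": "20.4.1951",
         "birth_place": "Govan"}
romero = {"occupation": "Actor, Editor", "birthdate": "4.2.1940",
          "birthplace": "New York"}

for person in (bjork, brown, romero):
    print(normalise(person))
```

After normalisation all three records expose `birthDate` and `birthPlace`, so a single query pattern can cover them.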
DBpedia Extraction Framework
DIEF - DBpedia Information Extraction Framework
– extracts structured information from Wikipedia and turns it into a rich knowledge base
– Mapping-Based Infobox Extraction, Raw Infobox Extraction, Feature Extraction, Statistical Extraction
– hosted on GitHub
– written in Scala & Java
DBpedia Live
– Wikipedia articles are continuously revised at a very high rate
– in June 2013, the English Wikipedia had approximately 3.3 million edits per month (≙ 77 edits per minute)
– DBpedia Live was developed to keep DBpedia in synchronisation with Wikipedia
– works on a continuous stream of updates from Wikipedia and processes that stream on the fly
Need for validation
● over 3 million violations
Accessing DBpedia - Browsing
● official DBpedia mirror http://dbpedia.org
⇒ runs on Virtuoso
⇒ point & click browsing via DBpedia VAD
⇒ faceted search with Virtuoso Facets
Accessing DBpedia - SPARQL
● official SPARQL endpoint http://dbpedia.org/sparql
⇒ subject to a fair use policy (limited query runtime)
⇒ iSPARQL frontend (interactive query building)
⇒ Snorql frontend
⇒ query with any SPARQL-compliant tool or API
Querying RDF with SPARQL
● SPARQL Protocol and RDF Query Language
⇒ graph patterns as sets of triples (with variables)
⇒ successful matches of graph patterns generate bindings in (sub-)query solutions
● different result types for queries
SELECT ⇒ bindings, ASK ⇒ true/false, CONSTRUCT ⇒ new graph
● combinators and modifiers for basic graph patterns
⇒ UNION, FILTER, MINUS, FILTER (NOT) EXISTS
● result set modifiers
LIMIT, OFFSET, DISTINCT, ORDER BY
● numerous operators and functions for resource and literal values
● many additions in the 1.1 revision:
grouping & aggregates, property path expressions, sub-queries
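How a basic graph pattern produces variable bindings can be sketched with a tiny in-memory matcher in Python. This is a teaching sketch only and assumes plain strings as terms, with a leading "?" marking a variable; the names `match_pattern`, `select`, `TRIPLES` and `QUERY` are made up for this example, and a real SPARQL engine does far more (FILTER, OPTIONAL, query optimisation, typed terms).

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # variable: bind it or check the earlier binding
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:       # constant: must match exactly
            return None
    return extended


def select(patterns, triples):
    """Evaluate a basic graph pattern; returns one binding dict per solution."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Simplified local names instead of full IRIs, for readability.
TRIPLES = [
    ("Clockwise", "type", "Film"),
    ("Clockwise", "starring", "John_Cleese"),
    ("A_Fish_Called_Wanda", "type", "Film"),
    ("A_Fish_Called_Wanda", "starring", "John_Cleese"),
    ("A_Fish_Called_Wanda", "starring", "Michael_Palin"),
]

# Roughly: SELECT ?film ?actor WHERE { ?film type Film . ?film starring ?actor }
QUERY = [("?film", "type", "Film"), ("?film", "starring", "?actor")]

for solution in select(QUERY, TRIPLES):
    print(solution)
```

Each printed dictionary corresponds to one row of a SELECT result; the shared variable `?film` is what joins the two triple patterns.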
SPARQL Query Example
SPARQL Tooling
● FlintSparqlEditor: JavaScript SPARQL editor
  – syntax highlighting, code assistance
  – auto-completion for properties and classes (for small datasets)
● Protégé: full-fledged ontology editor
  – good to get an overview of the ontologies backing datasets
  – two SPARQL plug-ins (one supporting entailment)
● curl or your favourite simple REST client
  – allows simple testing of queries from any text editor with SPARQL syntax support (e.g. Emacs, Vim, Sublime Text)
$ curl -H 'Accept: application/json' --data-urlencode \
    "query=$(cat query.sparql)" http://dbpedia.org/sparql
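The same kind of request as the curl call on the SPARQL Tooling slide can be built with Python's standard library. The sketch below only constructs the request and leaves sending it commented out. The query text is an assumption written for this example (it is not the query shown in the tutorial): it assumes that films list their actors via `dbo:starring` and that the Monty Python members have the resource names used below.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

# Assumed query: films starring John Cleese but none of the other Pythons.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?film WHERE {
  ?film a dbo:Film ;
        dbo:starring dbr:John_Cleese .
  FILTER NOT EXISTS {
    ?film dbo:starring ?other .
    FILTER (?other IN (dbr:Graham_Chapman, dbr:Terry_Gilliam, dbr:Eric_Idle,
                       dbr:Terry_Jones, dbr:Michael_Palin))
  }
}
LIMIT 50
"""


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a form-encoded POST request, like the curl call."""
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    return urllib.request.Request(
        endpoint, data=body, headers={"Accept": "application/json"}
    )


request = build_request(QUERY)

# To actually run the query (needs network access, subject to the fair use policy):
# import json
# with urllib.request.urlopen(request, timeout=30) as response:
#     results = json.load(response)
```

Keeping request construction separate from sending makes the query easy to inspect and test offline before it is run against the public endpoint.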
DBpedia for Entity Linking and Disambiguation
● DBpedia Spotlight
  – web service to detect, disambiguate and link mentions of DBpedia resources in input text
  – uses two NLP datasets derived from DBpedia
    ⇒ topic signatures - tf/idf weighted term vectors
    ⇒ lexicalisations - alternative names for entities and concepts
● several other entity detection and linking services targeting DBpedia entities:
AlchemyAPI, Ontos Semantic API, OpenCalais, Zemanta
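The idea behind topic signatures can be illustrated with a toy tf/idf disambiguator in Python. This is not how DBpedia Spotlight is implemented; the candidate labels, the signature words and all function names are hypothetical and chosen to mirror the "Enterprise" example from earlier in the tutorial.

```python
import math
from collections import Counter

# Hypothetical mini topic signatures: a few context words per candidate sense.
SIGNATURES = {
    "Enterprise_(starship)": "starship captain picard crew federation starship".split(),
    "Enterprise_(aircraft_carrier)": "aircraft carrier navy nuclear ship crew".split(),
    "Enterprise_(company)": "company business corporation rental cars".split(),
}


def idf(term):
    """Inverse document frequency over the candidate signatures."""
    containing = sum(term in words for words in SIGNATURES.values())
    return math.log(len(SIGNATURES) / containing) if containing else 0.0


def tfidf_vector(words):
    """Weight each term by its count times its idf."""
    return {term: count * idf(term) for term, count in Counter(words).items()}


def cosine(a, b):
    """Cosine similarity of two sparse term vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context_words):
    """Pick the candidate whose signature is most similar to the context."""
    context = tfidf_vector(context_words)
    scores = {name: cosine(context, tfidf_vector(words))
              for name, words in SIGNATURES.items()}
    return max(scores, key=scores.get)


text = "the enterprise is commanded by captain picard of the starship"
print(disambiguate(text.split()))
```

Words that occur in no signature get a weight of zero, so only context terms that discriminate between the candidates influence the choice.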
DBpedia Tutorial - Feb 2015, Dublin
Linking DBpedia
● community-curated links to various major and minor external datasets:

target dataset    predicate           out-link count
Freebase          owl:sameAs           3 600 000
YAGO2             rdf:type            18 100 000
UMBEL             rdf:type               896 400
WordNet           dbp:wordnet_type       467 100
OpenCyc           owl:sameAs              27 100
LinkedGeoData     owl:sameAs             103 600
GeoNames          owl:sameAs              86 500

● Linked Data Web analysis with Sinditech measured 3 960 212 in-links to DBpedia (lower bound)
statistics from (Lehmann et al. 2012)
Linking DBpedia - use cases for Linked DBpedia Data
● correlate the accumulated funding per year from the EU to member countries (from FTS) with the gross domestic product of these countries (DBpedia)
● correlate an above-average share of metropolitan area used for parks or other natural recreational areas with towns and cities being led by environmentalists (LinkedGeoData & DBpedia)
● is there a town with no more than 15000 inhabitants in the area around Leipzig that contains a church with Catholic denomination, childcare, a primary school and a grammar school, and that is not currently led by a politician from the conservative party?
DBpedia internationalised
● non-English versions of DBpedia offer
  – coverage of more entities
  – more detailed or up-to-date information for entities associated with the particular countries
● international mapping community helps in the provision of localized DBpedia datasets for 125 languages
⇒ own IRI recipe http://<langcode>.dbpedia.org/resource/<thing>
● 15 DBpedia chapters: autonomous management of mappings, organisation of the local community, hosting of datasets and services
● also canonicalized datasets: facts derived from localized Wikipedias, but only statements for resources also present in the English DBpedia
⇒ usage of the default http://dbpedia.org/resource/ namespace
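The IRI recipe above can be written down as a small Python helper. The function name `dbpedia_iri` is made up for this sketch, and it assumes article URLs of the form `http://<langcode>.wikipedia.org/wiki/<thing>`; it does not cover the canonicalized datasets, where a localized article is mapped to its English resource.

```python
from urllib.parse import urlsplit


def dbpedia_iri(wikipedia_url):
    """Map a Wikipedia article URL to the corresponding DBpedia resource IRI."""
    parts = urlsplit(wikipedia_url)
    lang = parts.netloc.split(".")[0]  # "en", "de", "ga", ...
    prefix = "/wiki/"
    if not parts.path.startswith(prefix):
        raise ValueError("not a Wikipedia article URL: " + wikipedia_url)
    thing = parts.path[len(prefix):]
    if lang == "en":
        # English resources live in the default namespace.
        return "http://dbpedia.org/resource/" + thing
    return "http://" + lang + ".dbpedia.org/resource/" + thing


print(dbpedia_iri("http://en.wikipedia.org/wiki/Monty_Python"))
# http://dbpedia.org/resource/Monty_Python
print(dbpedia_iri("http://de.wikipedia.org/wiki/Leipzig"))
# http://de.dbpedia.org/resource/Leipzig
```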
Related Work: Freebase
Similarities:
– extracts structured data from Wikipedia
– makes it available in RDF
– provides dumps of the extracted data
– provides APIs and endpoints to access the data
Related Work: Freebase
Differences:
Freebase
- uses several sources ⇒ higher coverage
- can be directly edited by users
- mainly run by Google (discontinued)
DBpedia
- RDF representation of Wikipedia
- hub on the Web of Data
- can only be edited indirectly, by modifying the content of Wikipedia
- ongoing community effort
Related Work: Wikidata
– initiated by Wikimedia Deutschland e.V. in 2012
– free knowledge base about the world that can be read and edited by humans and machines alike
– can offer a variety of statements from different sources and dates
– does not offer the truth about things:
• (-) Berlin has a population of 3.5 million
• (+) Wikidata contains the statement about Berlin's population being 3.5 million as of 2011 according to the German statistical office
– aim is to provide a single point of truth for facts in Wikipedia across different language versions
Current developments
● increased validation and curation (DBpedia+, RDFUnit)
● easing the creation of local DBpedia SPARQL endpoints (Debian packaging, Docker images of triple store and dataset selection, automatic import)
● novel, more intuitive and feature-rich browsing interfaces
⇒ add corrections in place in LD viewer interfaces (?)
How you can get involved
– set up new mirrors and endpoints of DBpedia
– revise mappings and/or write new ones
– help improve the ontology
–get involved with the Irish/Gaelic chapter
bianca.pereira@insight-centre.org
caoilfhionn.lane@insight-centre.org
–edit Wikipedia
Further Reading: Website
landing page:
http://dbpedia.org/About
overview of datasets (also info on localized datasets):
http://wiki.dbpedia.org/Datasets
DBpedia data access overview:
http://wiki.dbpedia.org/OnlineAccess
Further Reading: Publications
2007
T: DBpedia: A Nucleus for a Web of Open Data
A: Auer, Bizer, Kobilarov, Lehmann, Cyganiak, Ives
http://www.cis.upenn.edu/~zives/research/dbpedia.pdf
2009
T: DBpedia - A Crystallization Point for the Web of Data
A: Bizer, Lehmann, Kobilarov, Auer, Becker, Cyganiak, Hellmann
http://jens-lehmann.org/files/2009/dbpedia_jws.pdf
2012
T: DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia
A: Lehmann, Isele, Jakob, Jentzsch, Kontokostas, Hellmann, Morsey, van Kleef, Auer, Bizer
http://www.semantic-web-journal.net/system/files/swj499.pdf
Further Reading: W3C Specs
RDF: http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/
RDFS: http://www.w3.org/TR/rdf-schema/
OWL 2: http://www.w3.org/TR/owl2-overview/
SPARQL Query Language: http://www.w3.org/TR/sparql11-query/
SPARQL Protocol: http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/
Further Reading: Browsing
DBpedia VAD: http://dbpedia.org/page/DBpedia
DBpedia Facets: http://dbpedia.org/fct/
new DBpedia frontend:
http://de.dbpedia.org/page/DBpedia (get an impression of the German DBpedia version)
https://github.com/lukovnikov/ldviewer (source code)
Context platform:
http://context.aksw.org/app/hub.php?corpus=6&action=facets (online demo to browse LOD2 Blog)
http://context.aksw.org/app/ (project home)
Further Reading: SPARQL
DBpedia Snorql SPARQL interface (DBP-en):
http://dbpedia.org/snorql/
John Cleese Query in Snorql: http://bit.ly/1zog24A
EU Funding vs. Country GDP:
https://gist.github.com/neradis/0ca7a41c408280c0d69e
Flint SPARQL Editor:
http://openuplabs.tso.co.uk/demos/sparqleditor (online demo)
https://github.com/TSO-Openup/FlintSparqlEditor (source code, checkout and run)
Further Reading: popular RDF/OWL frameworks
Sesame (Java): http://rdf4j.org/
Jena (Java): http://jena.apache.org/index.html
RDFLib (Python): http://code.google.com/p/rdflib/
Goodbye!
Thank you for your interest in DBpedia!

More Related Content

PPTX
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
PDF
DBpedia InsideOut
PPTX
RDF Data Model
PDF
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
PDF
ESWC 2017 Tutorial Knowledge Graphs
PPT
RDF and OWL
PPTX
SHACL by example
PPT
Introduction to RDF
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data
DBpedia InsideOut
RDF Data Model
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
ESWC 2017 Tutorial Knowledge Graphs
RDF and OWL
SHACL by example
Introduction to RDF

What's hot (20)

PDF
Introduction to Knowledge Graphs and Semantic AI
PPTX
AI, Knowledge Representation and Graph Databases -
 Key Trends in Data Science
PDF
Introduction of Knowledge Graphs
PPTX
Apache Spark overview
PDF
Querying the Wikidata Knowledge Graph
PPTX
Data Lakehouse Symposium | Day 4
PPTX
Introduction to Apache Spark
PDF
Hadoop Hbase - Introduction
PDF
Introduction to Apache Hive
PDF
Property graph vs. RDF Triplestore comparison in 2020
PDF
Introduction to Knowledge Graphs
PDF
Semantic Similarity Measures for Semantic Relation Extraction
PPTX
An Introduction To NoSQL & MongoDB
PDF
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
PDF
Introduction to elasticsearch
PPTX
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
PPTX
Introduction to Pig
PPTX
SPARQL Cheat Sheet
PPTX
PDF
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Introduction to Knowledge Graphs and Semantic AI
AI, Knowledge Representation and Graph Databases -
 Key Trends in Data Science
Introduction of Knowledge Graphs
Apache Spark overview
Querying the Wikidata Knowledge Graph
Data Lakehouse Symposium | Day 4
Introduction to Apache Spark
Hadoop Hbase - Introduction
Introduction to Apache Hive
Property graph vs. RDF Triplestore comparison in 2020
Introduction to Knowledge Graphs
Semantic Similarity Measures for Semantic Relation Extraction
An Introduction To NoSQL & MongoDB
Apache Spark - Basics of RDD | Big Data Hadoop Spark Tutorial | CloudxLab
Introduction to elasticsearch
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Introduction to Pig
SPARQL Cheat Sheet
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Ad

Similar to DBpedia Tutorial - Feb 2015, Dublin (20)

PDF
Informal presentation about RES
PPTX
The Semantic Web Exists. What Next?
PDF
Sw 3 bizer etal-d bpedia-crystallization-point-jws-preprint
PPTX
Linked data and semantic wikis
PPTX
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
PDF
What is New in W3C land?
PDF
Rober stephenson
PDF
Llinked open data training for EU institutions
PPTX
Linked Energy Data Generation
PPTX
Linked Open Data
PPTX
Open Science Days 2014 - Becker - Repositories and Linked Data
PPSX
Linked Data to Improve the OER Experience
PDF
KEDL DBpedia 2019
PDF
Linked Data
PPTX
Towards long-term preservation of linked data - the PRELIDA project
PDF
Vila LOD-innovacion- bib-semweb-redux
PPTX
Contributing to the global commons: Repositories and Wikimedia
PPT
Evolving the Web into a Global Database - Advances and Applications.
PDF
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Informal presentation about RES
The Semantic Web Exists. What Next?
Sw 3 bizer etal-d bpedia-crystallization-point-jws-preprint
Linked data and semantic wikis
Technologie Proche: Imagining the Archival Systems of Tomorrow With the Tools...
What is New in W3C land?
Rober stephenson
Llinked open data training for EU institutions
Linked Energy Data Generation
Linked Open Data
Open Science Days 2014 - Becker - Repositories and Linked Data
Linked Data to Improve the OER Experience
KEDL DBpedia 2019
Linked Data
Towards long-term preservation of linked data - the PRELIDA project
Vila LOD-innovacion- bib-semweb-redux
Contributing to the global commons: Repositories and Wikimedia
Evolving the Web into a Global Database - Advances and Applications.
Universal Topic Classification - Named Entity Disambiguation (IKS Workshop Pa...
Ad

Recently uploaded (20)

PPT
Miokarditis (Inflamasi pada Otot Jantung)
PPTX
Database Infoormation System (DBIS).pptx
PPT
Quality review (1)_presentation of this 21
PDF
Foundation of Data Science unit number two notes
PPTX
Business Ppt On Nestle.pptx huunnnhhgfvu
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
PDF
Fluorescence-microscope_Botany_detailed content
PPTX
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
PDF
Clinical guidelines as a resource for EBP(1).pdf
PPTX
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
PPT
Chapter 3 METAL JOINING.pptnnnnnnnnnnnnn
PPTX
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
PPTX
Introduction to Knowledge Engineering Part 1
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
PDF
Galatica Smart Energy Infrastructure Startup Pitch Deck
PDF
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
PPTX
Supervised vs unsupervised machine learning algorithms
PDF
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
PPTX
Major-Components-ofNKJNNKNKNKNKronment.pptx
PPTX
Data_Analytics_and_PowerBI_Presentation.pptx
Miokarditis (Inflamasi pada Otot Jantung)
Database Infoormation System (DBIS).pptx
Quality review (1)_presentation of this 21
Foundation of Data Science unit number two notes
Business Ppt On Nestle.pptx huunnnhhgfvu
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
Fluorescence-microscope_Botany_detailed content
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
Clinical guidelines as a resource for EBP(1).pdf
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
Chapter 3 METAL JOINING.pptnnnnnnnnnnnnn
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
Introduction to Knowledge Engineering Part 1
STUDY DESIGN details- Lt Col Maksud (21).pptx
Galatica Smart Energy Infrastructure Startup Pitch Deck
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
Supervised vs unsupervised machine learning algorithms
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
Major-Components-ofNKJNNKNKNKNKronment.pptx
Data_Analytics_and_PowerBI_Presentation.pptx

DBpedia Tutorial - Feb 2015, Dublin

  • 1. DBpedia Tutorial 09.02.2015 http://dbpedia.org1 Creating Knowledge out of Interlinked Data Markus Ackermann, Markus Freudenberg WG Agile Knowledge and Semantic Web Universität Leipzig DBpedia Extraction of Knowledge from Wikipedia
  • 2. DBpedia Tutorial 09.02.2015 http://dbpedia.org2 Wikipedia Wikipedia coverage of the London bombing on July 7, 2005 –the first Wikipedia entry appeared in just 18 minutes –2500 users provided a 14 page article in only 12 hours –far more detailed than any other news source [Tapscott, D. Williams 2006]
  • 3. DBpedia Tutorial 09.02.2015 http://dbpedia.org3 Wikipedia Wikipedia articles: –4,7 mio. Articles; 780 article additions per day –are highly topical –containing only few errors, which can easily be revised –cover often very specific content → Wikipedia is the knowledge compendium of humanity.
  • 4. DBpedia Tutorial 09.02.2015 http://dbpedia.org4 Semantic Web –Web 3.0 web technology –a way of linking data between systems or entities –allows for rich, self-describing interrelations of data available across the globe –open up the web of data to artificial intelligence processes –encourage companies, organisations and individuals to publish their data freely, in an open standard format –encourage businesses to use data already available on the web (data give/take)
  • 5. DBpedia Tutorial 09.02.2015 http://dbpedia.org5 Linked Data The means of populating the Semantic Web is Linked Data. (introduced by Tim Berners-Lee) Four simple rules : –Use URIs as names for things –Use HTTP URIs so that people can look up those names –When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) –Include links to other URIs. so that they can discover more things.
  • 6. DBpedia Tutorial 09.02.2015 http://dbpedia.org6 5 ★ Linked Open Data
  • 7. DBpedia Tutorial 09.02.2015 http://dbpedia.org7 benefits of using Linked Data Consumer View - link data from any other place in the web - discover more related data while consuming data - reuse parts of the data - reuse existing tools and libraries - combine data safely with other data - query data over different repositories Publisher View - make your data discoverable - increase the value of your data (by linking it) - have fine-granular control over the data items and optimise their access - design data to fit your domain knowledge
  • 8. DBpedia Tutorial 09.02.2015 http://dbpedia.org8 What's DBpedia? – DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. – DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. – Common goal with WikiData but, different approach
  • 9. DBpedia Tutorial 09.02.2015 http://dbpedia.org9 What's DBpedia? –DBpedia project was started in 2006 –has been a key factor for the success of the Linked Open Data initiative – serves as an interlinking hub for other data sets –DBpedia provides a testbed serving real data spanning various domains –In more than 120 language editions
  • 10. DBpedia Tutorial 09.02.2015 http://dbpedia.org10 Where is Wikipedia information useful? „Which films starred John Cleese without any other members of Monty Python?“ „What have Dublin and Leipzig in common?“  „Which Software products are developed by an organisation founded in California?“ „Which populated places in Germany are below sea level?“
  • 11. DBpedia Tutorial 09.02.2015 http://dbpedia.org11 Where is Wikipedia information useful? ● as terminology and concept repository and fact source for Entity Linking and Disambiguation: The series follows the adventures of a space-faring crew on board the starship USS Enterprise (NCC-1701-D), the fifth Federation vessel to bear the name and registry and the seventh starship by that name The Enterprise is commanded by Captain Jean-Luc Picard and is staffed by first officer Commander William Riker, operations manager Data, security chief Tasha Yar, ship's counselor Deanna Troi, chief medical officer Dr. Beverly Crusher, conn officer Lieutenant Geordi La Forge, and junior officer Lieutenant Worf. ⇒ no company, no aircraft carrier, no satellite ⇒ correlate the mentionings and concept starship ⇒ Star Trek rank, contemporary or past military or law enforcement
  • 12. DBpedia Tutorial 09.02.2015 http://dbpedia.org12 Why search engines aren't always enough „Which films starred John Cleese without any other members of Monty Python?“
  • 13. DBpedia Tutorial 09.02.2015 http://dbpedia.org13
  • 14. DBpedia Tutorial 09.02.2015 http://dbpedia.org14
  • 15. DBpedia Tutorial 09.02.2015 http://dbpedia.org15 What is needed to do better? ● ontological representation of entities and facts „An ontology is a specification of a conceptualization.“ (Gruber, 1993) ⇒ formal description of concepts and relationships
  • 16. DBpedia Tutorial 09.02.2015 http://dbpedia.org16 What is needed to do better? ● ontological representation of entities and facts ● well-defined taxonomy of entity types ● assertions about entities and their relations: A British Comedy is a kind of Comedy. A Comedy is a kind of Film. ⇒ A British Comedy is a kind of Film. Clockwise is a British Comedy. John Cleese stars in Clockwise. ⇒ John Cleese stars in a Film. ● thoroughly specified, machine-actionable, but flexible formalism for representation
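The inference on this slide (from "Clockwise is a British Comedy" to "John Cleese stars in a Film") can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the dictionaries and the names `SUBCLASS_OF`, `INSTANCE_OF` and `STARS_IN` are hypothetical and do not correspond to DBpedia identifiers.

```python
# Toy taxonomy from the slide: each class maps to its direct superclass.
# All names are illustrative, not DBpedia ontology identifiers.
SUBCLASS_OF = {
    "BritishComedy": "Comedy",
    "Comedy": "Film",
}

# Type and fact assertions from the slide.
INSTANCE_OF = {"Clockwise": "BritishComedy"}
STARS_IN = {("John_Cleese", "Clockwise")}


def superclasses(cls):
    """Return cls followed by all of its transitive superclasses."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_a(entity, cls):
    """True if entity is an instance of cls, directly or via subsumption."""
    return cls in superclasses(INSTANCE_OF[entity])


def stars_in_a(actor, cls):
    """True if actor stars in at least one entity of type cls."""
    return any(a == actor and is_a(work, cls) for a, work in STARS_IN)


print(superclasses("BritishComedy"))
print(is_a("Clockwise", "Film"))
print(stars_in_a("John_Cleese", "Film"))
```

The point of the sketch is that "stars in a Film" is never stated explicitly; it follows from the type assertion plus the class hierarchy.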
  • 17. DBpedia Tutorial 09.02.2015 http://dbpedia.org17 A brief introduction to RDF Resource Description Framework (W3C Standard) ● flexible language and data model for representation of information ● based on (S,P,O) triples denoting simple assertions: S – subject, P – property, O – object, with S ∈ I ∪ B, P ∈ I, O ∈ I ∪ B ∪ L (I – URIs/IRIs; B – blank nodes; L – literals) ● URIs/IRIs of named entities are: ● unambiguous, but non-unique identifiers of a resource ● often dereferenceable (in the Semantic Web) ● the aggregate of triple assertions constitutes a directed graph with typed edges
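As a rough sketch of the data model just described, the position constraints on S, P and O can be written down directly. The classes below are a minimal stand-in written for this illustration, not the API of any RDF library, and the `http://example.org/` IRIs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


def is_valid_triple(s, p, o):
    """Check the position constraints: S in I∪B, P in I, O in I∪B∪L."""
    return (
        isinstance(s, (IRI, BlankNode))
        and isinstance(p, IRI)
        and isinstance(o, (IRI, BlankNode, Literal))
    )


def outgoing_edges(triples, node):
    """Typed edges (property, target) leaving a node of the directed graph."""
    return [(p, o) for s, p, o in triples if s == node]


# Hypothetical example IRIs, for illustration only.
clockwise = IRI("http://example.org/Clockwise")
cleese = IRI("http://example.org/John_Cleese")
starring = IRI("http://example.org/starring")
label = IRI("http://example.org/label")

graph = [
    (clockwise, starring, cleese),
    (clockwise, label, Literal("Clockwise")),
]

print(all(is_valid_triple(*t) for t in graph))
# A literal is not allowed in subject position:
print(is_valid_triple(Literal("Clockwise"), label, clockwise))
print(outgoing_edges(graph, clockwise))
```

Each triple is one edge of the graph, and the property IRI acts as the edge type.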
  • 18. DBpedia Tutorial 09.02.2015 http://dbpedia.org18 A brief introduction to RDF
  • 19. DBpedia Tutorial 09.02.2015 http://dbpedia.org19 DBpedia - motivation and use cases an RDF view of structured Wikipedia information enables: ● sophisticated queries ⇒ cross-referencing facts of entities ⇒ filtering of entities based on their types and fact assertions ● combining facts from Wikipedia with machine-actionable knowledge from other structured datasets (Geodata, Yellowpages, WordNet, ...)
  • 20. DBpedia Tutorial 09.02.2015 http://dbpedia.org20 Another take on Question Answering „Which films starred John Cleese without any other members of Monty Python?“
  • 21. DBpedia Tutorial 09.02.2015 http://dbpedia.org21
  • 22. DBpedia Tutorial 09.02.2015 http://dbpedia.org22 DBpedia - contents and datasets ● Wikipedia article ⇔ DBpedia resource http://en.wikipedia.org/wiki/Monty_Python ⇔ http://dbpedia.org/resource/Monty_Python ● mapping-based types and facts governed by the DBpedia Ontology
  • 23. DBpedia Tutorial 09.02.2015 http://dbpedia.org23 DBpedia - contents and datasets ● 4.58 million entities and 583 million triples (English DBpedia 2014): 131.2 million fact assertions (derived from infoboxes), 168.5 million triples representing Wikipedia structure, 57.1 million links to external datasets ● DBpedia resources are categorised in several manners: ● by Wikipedia categories (represented in SKOS) ● by YAGO classification ● by links to WordNet Synsets ● by assignment of classes from the DBpedia ontology ● Provenance meta-data ⇒ From which part of which Wikipedia page was a triple derived?
  • 24. DBpedia Tutorial 09.02.2015 http://dbpedia.org24 Mappings Wiki a community effort to: – develop an ontology schema – provide mappings from Wikipedia infobox properties to this ontology → creating an alignment between Wikipedia and DBpedia → eliminating name variations in properties and classes → big boost for precision
  • 25. DBpedia Tutorial 09.02.2015 http://dbpedia.org25 DBpedia Ontology cross-domain ontology – maintained and extended by the community in the DBpedia Mappings Wiki – manually created based on the most commonly used infoboxes – currently covers 685 classes which form a subsumption hierarchy and are described by 2,795 different properties – subsumption hierarchy with a maximal depth of 5
  • 26. DBpedia Tutorial 09.02.2015 http://dbpedia.org26 DBpedia Ontology Extract
  • 27. DBpedia Tutorial 09.02.2015 http://dbpedia.org27 Wikipedia articles – Wikipedia articles consist mostly of free text – also comprise various types of structured information – including: infobox templates, categorisation information, images, geo-coordinates, links to external web pages, disambiguation pages, redirects between pages, other language links Article outline: – Title – Abstract – Infoboxes – Geo-coordinates – Categories – Images – Links » other language versions » other Wikipedia pages » to the Web » redirects » disambiguations
  • 28. DBpedia Tutorial 09.02.2015 http://dbpedia.org28 Structure in Wikipedia Title Abstract Infoboxes Geo-coordinates Categories Images Links – other language versions – other Wikipedia pages – To the Web – Redirects – Disambiguations
  • 29. DBpedia Tutorial 09.02.2015 http://dbpedia.org29 Infobox encoding {{Infobox Korean settlement | title = Busan Metropolitan City | img = Busan.jpg | imgcaption = A view of the [[Geumjeong]] district in Busan | hangul = 부산광역시 ... | area_km2 = 763.46 | pop = 3635389 | popyear = 2006 | mayor = Hur Nam-sik | divs = 15 wards (Gu), 1 county (Gun) | region = [[Yeongnam]] | dialect = [[Gyeongsang]] }} ⇒ dbp:Busan dbp:title "Busan Metropolitan City" dbp:Busan dbp:hangul "부산광역시"@Hang dbp:Busan dbp:area_km2 "763.46"^^xsd:float dbp:Busan dbp:pop "3635389"^^xsd:int dbp:Busan dbp:region dbp:Yeongnam dbp:Busan dbp:dialect dbp:Gyeongsang ...
  • 30. DBpedia Tutorial 09.02.2015 http://dbpedia.org30 Heterogeneity in infoboxes
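The step from infobox markup to triples can be sketched as follows. This is a deliberately simplified illustration of raw infobox extraction, not the DBpedia Extraction Framework's actual parser: the function name `infobox_to_triples`, the regular expressions and the datatype guessing are all assumptions made for this example, and the `dbp:` prefix is reused for properties and resources exactly as on the slide.

```python
import re

# Shortened version of the infobox shown on the slide.
INFOBOX = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""

# A value that consists of exactly one wiki link, e.g. [[Yeongnam]].
LINK = re.compile(r"^\[\[([^\]|]+)\]\]$")


def infobox_to_triples(subject, wikitext):
    """Turn '| key = value' lines into (subject, property, object) triples."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue  # skip the template name and the closing braces
        key, value = line[1:].split("=", 1)
        key, value = key.strip(), value.strip()
        link = LINK.match(value)
        if link:
            # Wiki links become resources rather than literals.
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((subject, "dbp:" + key, obj))
    return triples


for triple in infobox_to_triples("dbp:Busan", INFOBOX):
    print(*triple)
```

Note that the property names are taken verbatim from the infobox keys, which is why name variations between infobox templates carry over into the extracted data.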
  • 31. DBpedia Tutorial 09.02.2015 http://dbpedia.org31 Björk (Musician) Occupation = Musician, Actor Born = 21.12.1965, Reykjavík Brown (Prime Minister) office = Prime Minister of the UK birth_date = 20.4.1951 birth_place = Govan Romero (Actor) occupation = Actor, Editor birthdate = 4.2.1940 birthplace = New York
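The three infoboxes above name the same facts differently (`Born`, `birth_date`, `birthdate`). A mapping layer that normalises such keys could look roughly like the sketch below. The table `SIMPLE_MAPPINGS`, the function `normalise` and the comma-splitting rule for `Born` are hypothetical and only illustrate the idea; the real mappings are written in the Mappings Wiki's own template syntax, which is not shown here.

```python
# Hypothetical mapping table: infobox key (lower-cased) -> canonical property.
SIMPLE_MAPPINGS = {
    "birth_date": "birthDate",
    "birthdate": "birthDate",
    "birth_place": "birthPlace",
    "birthplace": "birthPlace",
    "occupation": "occupation",
    "office": "office",
}


def normalise(infobox):
    """Map heterogeneous infobox keys onto one canonical vocabulary."""
    result = {}
    for key, value in infobox.items():
        key = key.strip().lower()
        if key == "born":
            # 'Born' combines date and place, separated by the first comma.
            date, _, place = value.partition(",")
            result["birthDate"] = date.strip()
            if place.strip():
                result["birthPlace"] = place.strip()
        elif key in SIMPLE_MAPPINGS:
            result[SIMPLE_MAPPINGS[key]] = value.strip()
    return result


bjork = {"Occupation": "Musician, Actor", "Born": "21.12.1965, Reykjavík"}
brown = {"office": "Prime Minister of the UK",
         "birth_date": "20.4.1951", "birth_place": "Govan"}
romero = {"occupation": "Actor, Editor",
          "birthdate": "4.2.1940", "birthplace": "New York"}

for infobox in (bjork, brown, romero):
    print(normalise(infobox))
```

After normalisation, a single query on `birthPlace` covers all three people, which is the precision gain the Mappings Wiki aims for.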
  • 32. DBpedia Tutorial 09.02.2015 http://dbpedia.org32 DBpedia Extraction Framework DIEF - DBpedia Information Extraction Framework –extracts structured information from Wikipedia and turns it into a rich knowledge base –Mapping-Based Infobox Extraction, Raw Infobox Extraction, Feature Extraction, Statistical Extraction –Hosted on GitHub –Written in Scala & Java
  • 33. DBpedia Tutorial 09.02.2015 http://dbpedia.org33
  • 34. DBpedia Tutorial 09.02.2015 http://dbpedia.org34 DBpedia Live – Wikipedia articles are continuously revised at a very high rate – English Wikipedia, in June 2013, had approximately 3.3 million edits per month (≈ 77 edits per minute) – DBpedia Live was developed to keep DBpedia in synchronization with Wikipedia – works on a continuous stream of updates from Wikipedia and processes that stream on the fly
  • 35. DBpedia Tutorial 09.02.2015 http://dbpedia.org35 Need for validation ● over 3 million violations
  • 36. DBpedia Tutorial 09.02.2015 http://dbpedia.org36 Accessing DBpedia - Browsing ● official DBpedia mirror http://dbpedia.org ⇒ runs on Virtuoso ⇒ point & click browsing via DBpedia VAD ⇒ faceted search with Virtuoso Facets
  • 37. DBpedia Tutorial 09.02.2015 http://dbpedia.org37 Accessing DBpedia - SPARQL ● official SPARQL endpoint http://dbpedia.org/sparql ⇒ subject to a fair use policy (limited query runtime) ⇒ iSPARQL frontend (interactive query building) ⇒ Snorql frontend ⇒ query with any SPARQL compliant tool or API
  • 38. DBpedia Tutorial 09.02.2015 http://dbpedia.org38 Querying RDF with SPARQL ● SPARQL Protocol and RDF Query Language ⇒ graph patterns as sets of triples (with variables) ⇒ successful matches of graph patterns generate bindings in (sub-)query solutions
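The conversion from edits per month to edits per minute is a one-line calculation. The sketch below assumes a 30-day month (June) and the rounded figure of 3.3 million edits; with these inputs the result is about 76 edits per minute, in the same range as the figure quoted on the slide, which presumably rests on a more precise edit count.

```python
EDITS_PER_MONTH = 3_300_000        # approximate figure from the slide (June 2013)
MINUTES_PER_MONTH = 30 * 24 * 60   # assumption: a 30-day month

edits_per_minute = EDITS_PER_MONTH / MINUTES_PER_MONTH
print(f"{edits_per_minute:.1f} edits per minute")
```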
  • 39. DBpedia Tutorial 09.02.2015 http://dbpedia.org39 Querying RDF with SPARQL ● SPARQL Protocol and RDF Query Language ⇒ graph patterns as sets of triples (with variables) ⇒ successful matches of graph patterns generate bindings in (sub-)query solutions ● different result types for queries: SELECT ⇒ bindings, ASK ⇒ true/false, CONSTRUCT ⇒ new graph ● combinators and modifiers for basic graph patterns ⇒ UNION, FILTER, MINUS, FILTER (NOT) EXISTS ● result set modifiers: LIMIT, OFFSET, DISTINCT, ORDER BY ● numerous operators and functions for resource and literal values ● many additions in the 1.1 revision: grouping & aggregates, property path expressions, sub-queries
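How graph patterns with variables produce bindings can be illustrated with a tiny matcher over an in-memory list of triples. This is a sketch of the idea behind basic graph pattern matching and a FILTER NOT EXISTS style check, not a SPARQL engine; the property names (`type`, `starring`, `memberOf`) and the data set are simplified assumptions and not the identifiers used in DBpedia.

```python
# Toy data set with hypothetical property names.
TRIPLES = [
    ("Clockwise", "type", "Film"),
    ("Clockwise", "starring", "John_Cleese"),
    ("A_Fish_Called_Wanda", "type", "Film"),
    ("A_Fish_Called_Wanda", "starring", "John_Cleese"),
    ("A_Fish_Called_Wanda", "starring", "Michael_Palin"),
    ("John_Cleese", "memberOf", "Monty_Python"),
    ("Michael_Palin", "memberOf", "Monty_Python"),
]


def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern matches triple; None on mismatch."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):            # variable: bind or compare
            if new.setdefault(term, value) != value:
                return None
        elif term != value:                 # constant: must be identical
            return None
    return new


def solve(patterns, triples):
    """Return all bindings that satisfy every triple pattern."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for binding in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return solutions


def cleese_without_other_pythons(triples):
    """Films starring John Cleese and no other Monty Python member."""
    films = solve(
        [("?film", "type", "Film"), ("?film", "starring", "John_Cleese")],
        triples,
    )
    result = []
    for binding in films:
        film = binding["?film"]
        others = solve(
            [(film, "starring", "?other"), ("?other", "memberOf", "Monty_Python")],
            triples,
        )
        # Equivalent in spirit to FILTER NOT EXISTS with ?other != John_Cleese.
        if all(o["?other"] == "John_Cleese" for o in others):
            result.append(film)
    return result


print(cleese_without_other_pythons(TRIPLES))
```

Each pattern narrows the set of candidate bindings, and the final loop plays the role of a negated sub-pattern.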
  • 40. DBpedia Tutorial 09.02.2015 http://dbpedia.org40 SPARQL Query Example
  • 41. DBpedia Tutorial 09.02.2015 http://dbpedia.org41 SPARQL Tooling ● FlintSparqlEditor: JavaScript SPARQL editor ● syntax highlighting, code assistance ● auto-completion for properties and classes (for small datasets) ● Protégé: full-fledged ontology editor ● good for getting an overview of the ontologies backing datasets ● two SPARQL plug-ins (one supporting entailment) ● curl or your favourite simple REST API ● allows for simple testing of queries from any text editor with SPARQL syntax support (e.g. Emacs, Vim, Sublime Text) $ curl -H 'Accept: application/json' --data-urlencode "query=$(cat query.sparql)" http://dbpedia.org/sparql
  • 42. DBpedia Tutorial 09.02.2015 http://dbpedia.org42 DBpedia for Entity Linking and Disambiguation ● DBpedia Spotlight ● web service to detect, disambiguate and link mentions of DBpedia resources in input text ● uses two NLP datasets derived from DBpedia ⇒ topic signatures - tf/idf weighted term vectors ⇒ lexicalisations - alternative names for entities and concepts ● several other entity detection and linking services targeting DBpedia entities: AlchemyAPI, Ontos Semantic API, OpenCalais, Zemanta
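For readers who prefer a script over the command line, the curl call above can be approximated with Python's standard library. The sketch only builds the request and does not send it; the endpoint string is assumed to be the plain `http://dbpedia.org/sparql` form, the helper name `build_sparql_request` is made up for this example, and the sample query is a generic placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public endpoint in its plain form.
ENDPOINT = "http://dbpedia.org/sparql"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build a POST request with a form-encoded 'query' parameter."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(endpoint, data=body, headers={"Accept": "application/json"})


request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would actually send it; keep the endpoint's fair use policy in mind when doing so.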
  • 43. DBpedia Tutorial 09.02.2015 http://dbpedia.org43 DBpedia for Entity Linking and Disambiguation
  • 45. DBpedia Tutorial 09.02.2015 http://dbpedia.org45 Linking DBpedia ● community-curated links to various major and minor external datasets (target dataset, predicate, out-link count): Freebase, owl:sameAs, 3 600 000; YAGO2, rdf:type, 18 100 000; UMBEL, rdf:type, 896 400; WordNet, dbp:wordnet_type, 467 100; OpenCyc, owl:sameAs, 27 100; LinkedGeoData, owl:sameAs, 103 600; GeoNames, owl:sameAs, 86 500 ● Linked Data Web analysis with Sinditech measured 3 960 212 in-links to DBpedia (lower bound) Statistics from (Lehmann et al. 2012)
  • 46. DBpedia Tutorial 09.02.2015 http://dbpedia.org46 Linking DBpedia - use cases for Linked DBpedia Data ● correlate the accumulated funding per year from the EU to member countries (from FTS) with the gross domestic product of these countries (DBpedia) ● correlate an above-average share of metropolitan area used for parks or other natural recreational areas with towns and cities led by environmentalists (LinkedGeoData & DBpedia) ● is there a town with no more than 15000 inhabitants in the area around Leipzig that contains a church of Catholic denomination, childcare, a primary school and a grammar school, and is not currently led by a politician from the conservative party?
  • 47. DBpedia Tutorial 09.02.2015 http://dbpedia.org47 DBpedia internationalised ● non-English versions of DBpedia offer ● coverage of more entities ● more detailed or up-to-date information for entities associated with the particular countries ● international mapping community helps in the provision of localized DBpedia datasets for 125 languages ⇒ own IRI recipe http://<langcode>.dbpedia.org/resource/<thing> ● 15 DBpedia chapters: autonomous management of mappings, organisation of the local community, hosting of datasets and services ● also canonicalized datasets: facts derived from localized Wikipedias, but only statements for resources also present in the English DBpedia ⇒ usage of the default http://dbpedia.org/resource/ namespace
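The IRI recipe for localized datasets is easy to express as a small helper. The function name `resource_iri` is hypothetical, the host names are assumed to be the plain `dbpedia.org` forms, and replacing spaces with underscores follows the Wikipedia title convention; percent-encoding of special characters is deliberately left out of this sketch.

```python
def resource_iri(thing, langcode=None):
    """Build a DBpedia resource IRI, optionally for a localized chapter."""
    host = "dbpedia.org" if langcode is None else f"{langcode}.dbpedia.org"
    # Wikipedia titles use underscores instead of spaces.
    return f"http://{host}/resource/{thing.replace(' ', '_')}"


print(resource_iri("Monty Python"))     # default (English / canonicalized) namespace
print(resource_iri("Leipzig", "de"))    # localized namespace of the German chapter
```

Canonicalized datasets would use the first form even for facts extracted from a non-English Wikipedia.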
  • 48. DBpedia Tutorial 09.02.2015 http://dbpedia.org48 DBpedia internationalised
  • 49. DBpedia Tutorial 09.02.2015 http://dbpedia.org49 Related Work: Freebase Similarities: – extracts structured data from Wikipedia – makes it available in RDF – provides dumps of the extracted data – provides APIs and endpoints to access the data
  • 50. DBpedia Tutorial 09.02.2015 http://dbpedia.org50 Related Work: Freebase Differences: Freebase - uses several sources → higher coverage - can be directly edited by users - mainly run by Google (discontinued) DBpedia - RDF representation of Wikipedia - hub on the Web of Data - can only be edited indirectly, by modifying the content of Wikipedia - ongoing community effort
  • 51. DBpedia Tutorial 09.02.2015 http://dbpedia.org51 Related Work: Wikidata – Initiated by Wikimedia Germany e.V. in 2012 – free knowledge base about the world that can be read and edited by humans and machines alike – can offer a variety of statements from different sources and dates – does not offer the truth about things: • (-) Berlin has a population of 3.5 million • (+) Wikidata contains the statement about Berlin’s population being 3.5 million as of 2011 according to the German statistical office – aim is to provide a single point of truth for facts in Wikipedia across different language versions
  • 52. DBpedia Tutorial 09.02.2015 http://dbpedia.org52 Current developments ● Increased validation and curation process (DBpedia+, RDFUnit) ● easier creation of local DBpedia SPARQL endpoints (Debian packaging, docker images of triple store and dataset selection, automatic import) ● novel, more intuitive and feature-rich browsing interfaces ⇒ add corrections in place in LD viewer interfaces (?)
  • 53. DBpedia Tutorial 09.02.2015 http://dbpedia.org53 How you can get involved – set up new mirrors and endpoints of DBpedia – revise mappings and/or write new ones – help improve the ontology – get involved with the Irish/Gaelic chapter bianca.pereira@insight-centre.org caoilfhionn.lane@insight-centre.org – edit Wikipedia
  • 54. DBpedia Tutorial 09.02.2015 http://dbpedia.org54 Further Reading: Website landing page: http://dbpedia.org/About overview of datasets (also info on localized datasets): http://wiki.dbpedia.org/Datasets DBpedia data access overview: http://wiki.dbpedia.org/OnlineAccess
  • 55. DBpedia Tutorial 09.02.2015 http://dbpedia.org55 Further Reading: Publications 2007 T: DBpedia: A Nucleus for a Web of Open Data A: Auer, Bizer, Kobilarov, Lehmann, Cyganiak, Ives http://www.cis.upenn.edu/~zives/research/dbpedia.pdf 2009 T: DBpedia - A Crystallization Point for the Web of Data A: Bizer, Lehmann, Kobilarov, Auer, Becker, Cyganiak, Hellmann http://jens-lehmann.org/files/2009/dbpedia_jws.pdf 2012 T: DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia A: Lehmann, Isele, Jakob, Jentzsch, Kontokostas, Hellmann, Morsey, van Kleef, Auer, Bizer http://www.semantic-web-journal.net/system/files/swj499.pdf
  • 56. DBpedia Tutorial 09.02.2015 http://dbpedia.org56 Further Reading: W3C Specs RDF: http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/ RDFS: http://www.w3.org/TR/rdf-schema/ OWL 2: http://www.w3.org/TR/owl2-overview/ SPARQL Query Language: http://www.w3.org/TR/sparql11-query/ SPARQL Protocol: http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/
  • 57. DBpedia Tutorial 09.02.2015 http://dbpedia.org57 Further Reading: Browsing DBpedia VAD: http://dbpedia.org/page/DBpedia DBpedia Facets: http://dbpedia.org/fct/ new DBpedia frontend: http://de.dbpedia.org/page/DBpedia (get an impression of the German DBpedia version) https://github.com/lukovnikov/ldviewer (source code) Context platform: http://context.aksw.org/app/hub.php?corpus=6&action=facets (online demo to browse LOD2 Blog) http://context.aksw.org/app/ (project home)
  • 58. DBpedia Tutorial 09.02.2015 http://dbpedia.org58 Further Reading: SPARQL DBpedia Snorql SPARQL interface (DBP-en): http://dbpedia.org/snorql/ John Cleese Query in Snorql: http://bit.ly/1zog24A EU Funding vs. Country GDP: https://gist.github.com/neradis/0ca7a41c408280c0d69e Flint SPARQL Editor: http://openuplabs.tso.co.uk/demos/sparqleditor (online demo) https://github.com/TSO-Openup/FlintSparqlEditor (source code, checkout and run)
  • 59. DBpedia Tutorial 09.02.2015 http://dbpedia.org59 Further Reading: popular RDF/OWL frameworks Sesame (Java): http://rdf4j.org/ Jena (Java): http://jena.apache.org/index.html RDFLib (Python): http://code.google.com/p/rdflib/
  • 60. DBpedia Tutorial 09.02.2015 http://dbpedia.org60 Goodbye! Thank you for your interest in DBpedia!