Querying Incomplete Geospatial
Information in RDF
Charalampos Nikolaou and Manolis Koubarakis

Department of Informatics and Telecommunications
National and Kapodistrian University of Athens

International Symposium on Spatial and Temporal Databases (SSTD) 2013
August 23, 2013
Motivation
• Increased interest in publishing geospatial datasets
as linked data (i.e., encoded in RDF and with
semantic links to other datasets)
• Geospatial information might be:
o Quantitative (e.g., exact geometric information)
o Qualitative (e.g., topological relations)
... and express knowledge that is

o Complete
o Incomplete (or indefinite)
Ordnance Survey (UK): 73,546,231 triples
Global Administrative Areas (GADM): 9,896,532 triples
Nomenclature of Territorial Units for Statistics (NUTS): 316,246 triples
Linked Geospatial Data
[Figure: Linked Open Data cloud diagram, as of September 2011, annotated "~ 62 billion triples". Legend: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences.]
Question
How do we manage (represent, store,
query) this data efficiently?
Challenges: Theory
① RDF extensions for representing and querying incomplete
qualitative and quantitative geospatial information
• GeoSPARQL
o Standard OGC query language for RDF data with geospatial information
o Topological relations can be expressed/queried, but no reasoning is
offered.

• We proposed RDFi
o Can work with any topological/temporal constraint language
with/without constant symbols (e.g., RCC-5, RCC-8, IA)
o Formal semantics and algorithm for computing certain answers
o Preliminary complexity results for various constraint languages

• No published algorithm for query processing when considering
RCC-8 and constants
RDFi by example
[Figure: map showing West Greece and Olympia]

gag:Region rdfs:subClassOf geo:Feature.
gag:WestGreece rdf:type gag:Region.
gag:Municipality rdfs:subClassOf geo:Feature.
gag:OlympiaMuni rdf:type gag:Municipality.
noa:Hotspot rdfs:subClassOf geo:Feature.
noa:hotspot rdf:type noa:Hotspot.
noa:Fire rdfs:subClassOf geo:Feature.
noa:fire rdf:type noa:Fire.

gag:OlympiaMuni geo:hasGeometry ex:oGeo.
ex:oGeo rdf:type sf:Polygon.
ex:oGeo geo:asWKT "POLYGON((..))"^^geo:wktLiteral.
noa:hotspot geo:hasGeometry ex:rec.
ex:rec geo:asWKT "POLYGON((..))"^^geo:wktLiteral.
gag:WestGreece geo:sfContains gag:OlympiaMuni.
noa:hotspot geo:sfContains noa:fire.
RDFi by example (cont’d)
Query: Find fires inside the
region of West Greece.
[Figure: map showing West Greece and Olympia]

GeoSPARQL query:
CERTAIN SELECT ?f
WHERE {
  ?f rdf:type noa:Fire.
  gag:WestGreece geo:sfContains ?f.
}
RDFi by example (cont’d)
Query: Find fires inside the
region of West Greece.
[Figure: map showing West Greece and Olympia, annotated with two "contains" relations]

GeoSPARQL query:
CERTAIN SELECT ?f
WHERE {
  ?f rdf:type noa:Fire.
  gag:WestGreece geo:sfContains ?f.
}
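The query above can return noa:fire as a certain answer only if containment can be chained from gag:WestGreece down to noa:fire. The following minimal Python sketch illustrates that chaining. It rests on two assumptions that go beyond the slides: the hotspot polygon ex:rec is taken to lie inside the Olympia polygon ex:oGeo (the WKT is elided in the example, so this fact is hypothetical), and containment is treated as transitive. The function name `contains_closure` is illustrative and is not part of RDFi or GeoSPARQL.

```python
from collections import defaultdict


def contains_closure(facts):
    """Return every (outer, inner) pair derivable by transitivity of containment."""
    children = defaultdict(set)
    for outer, inner in facts:
        children[outer].add(inner)
    closure = set()
    for start in list(children):
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(children.get(node, ()))
    return closure


# Qualitative facts stated in the example.
stated = {
    ("gag:WestGreece", "gag:OlympiaMuni"),
    ("noa:hotspot", "noa:fire"),
}
# ASSUMPTION: obtained by comparing the (elided) polygons ex:oGeo and ex:rec.
from_geometry = {("gag:OlympiaMuni", "noa:hotspot")}

fires = {"noa:fire"}
closure = contains_closure(stated | from_geometry)
certain = sorted(f for f in fires if ("gag:WestGreece", f) in closure)
print(certain)
```

Without the geometric fact, the two stated containment triples alone do not connect West Greece to the fire, which is why qualitative and quantitative information have to be combined.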
Challenges: Theory
② Efficient computation of the entailment relation

Φ⊨Θ
• where Φ and Θ are quantifier-free first-order
formulas of a constraint language expressing the
topological relations of various frameworks (RCC-8,
DE-9IM, etc.)
Challenges: Theory
③ Computing entailment is equivalent to checking
consistency of formulas with constraint networks
• Constraint networks:

o Spatial relations among regions
o Regions might be constant ones (exact geometric
information) or identified by a URI

• Most recent results considered basic and complete
RCC-5 networks with polygonal regions

• For RCC-8, deciding consistency is NP-complete
• No published algorithm for checking consistency
• Are there tractable cases?
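The reduction from entailment to consistency can be sketched as follows: Φ entails a constraint exactly when Φ together with the negation of that constraint is inconsistent. The Python sketch below is an illustration under a stated assumption: it uses the point algebra (<, =, >) over integers as a small stand-in for RCC-8, because a consistent point-algebra network over n variables always has a model with values in 0..n-1, which makes a brute-force check possible. The names `consistent` and `entails` are hypothetical.

```python
from itertools import product

# Base relations of the point algebra, used here as a stand-in for RCC-8.
HOLDS = {
    "<": lambda a, b: a < b,
    "=": lambda a, b: a == b,
    ">": lambda a, b: a > b,
}


def consistent(variables, constraints):
    """Brute-force check for a satisfying assignment.

    `constraints` is a list of (x, relations, y); `relations` is a set of base
    relations read as a disjunction. Values 0..n-1 suffice for n variables.
    """
    n = len(variables)
    for values in product(range(n), repeat=n):
        env = dict(zip(variables, values))
        if all(any(HOLDS[r](env[x], env[y]) for r in rels)
               for x, rels, y in constraints):
            return True
    return False


def entails(variables, phi, x, rels, y):
    """phi entails (x rels y) iff phi plus the negated constraint is inconsistent."""
    negated = set(HOLDS) - set(rels)
    return not consistent(variables, phi + [(x, negated, y)])


variables = ["a", "b", "c"]
phi = [("a", {"<"}, "b"), ("b", {"<", "="}, "c")]
print(entails(variables, phi, "a", {"<"}, "c"))
print(entails(variables, phi, "b", {"<"}, "c"))
```

The brute-force search is exponential in the number of variables and is only meant to make the reduction concrete; it says nothing about how entailment would be decided for RCC-8 with constants.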
Challenges: Practice
④ Scale to billions of triples

• Reasoners from QSR scale only up to hundreds of regions
with complex spatial relations
How do they perform in our case?

• Setting:
o Real linked geospatial datasets
o No constants
o Only base RCC-8 relations
o Evaluation of consistency checking using the well-known
path-consistency algorithm
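For readers unfamiliar with path consistency, a naive version can be sketched in Python as below. This is a sketch under assumptions, not the implementation evaluated in the talk: to stay short it plugs in the 3x3 composition table of the point algebra instead of the 8x8 RCC-8 table, and it uses a simple fixpoint loop rather than a queue. The names `compose` and `path_consistency` are illustrative.

```python
from itertools import product

BASE = frozenset({"<", "=", ">"})
INVERSE = {"<": ">", "=": "=", ">": "<"}
# ASSUMPTION: point-algebra composition table, standing in for the RCC-8 table.
COMPOSE = {
    ("<", "<"): {"<"}, ("<", "="): {"<"}, ("<", ">"): set(BASE),
    ("=", "<"): {"<"}, ("=", "="): {"="}, ("=", ">"): {">"},
    (">", "<"): set(BASE), (">", "="): {">"}, (">", ">"): {">"},
}


def compose(r1, r2):
    """Compose two relation sets as the union over pairs of base relations."""
    out = set()
    for a, b in product(r1, r2):
        out |= COMPOSE[(a, b)]
    return out


def path_consistency(n, constraints):
    """Refine a network over nodes 0..n-1; return None if a relation empties.

    The n x n matrix is the quadratic memory cost. Each sweep visits all
    n^3 triples, and sweeps repeat until nothing changes.
    """
    net = [[set(BASE) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        net[i][i] = {"="}
    for i, rels, j in constraints:
        net[i][j] &= set(rels)
        net[j][i] &= {INVERSE[r] for r in rels}
        if not net[i][j] or not net[j][i]:
            return None
    changed = True
    while changed:
        changed = False
        for i, k, j in product(range(n), repeat=3):
            refined = net[i][j] & compose(net[i][k], net[k][j])
            if refined != net[i][j]:
                if not refined:
                    return None
                net[i][j] = refined
                net[j][i] = {INVERSE[r] for r in refined}
                changed = True
    return net


chain = [(0, {"<"}, 1), (1, {"<"}, 2)]
cycle = chain + [(2, {"<"}, 0)]
print(path_consistency(3, chain)[0][2])
print(path_consistency(3, cycle))
```

Swapping in another calculus would only require replacing `BASE`, `INVERSE`, and `COMPOSE`; the refinement loop itself does not depend on the choice of relations.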
Experimental evaluation
• Computation of the complete constraint network
• Running time: O(n^3)
• Memory requirements: O(n^2)
n ≈ thousands to millions
[Figure: results chart; annotations: "after one day", "hundreds of regions", and "thousands of regions" (three times)]
Setup: Intel Xeon E5620, 2.4 GHz, 12MB L3, 48GB RAM, RAID 5, Ubuntu 12.04
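To get a feel for the quadratic memory requirement, the short Python calculation below estimates the size of a complete constraint network. It assumes, hypothetically, that each unordered pair of regions stores one byte (an 8-bit mask over the eight RCC-8 base relations) and that the 48GB in the setup line means decimal gigabytes; neither detail comes from the slides.

```python
def complete_network_bytes(n, bytes_per_edge=1):
    """Bytes needed to store one relation set per unordered pair of n regions."""
    return n * (n - 1) // 2 * bytes_per_edge


# ASSUMPTION: 48GB read as decimal gigabytes.
RAM_BYTES = 48 * 10**9

for n in (10**3, 10**4, 10**5, 10**6):
    needed = complete_network_bytes(n)
    print(f"n={n:>9,}  {needed / 10**9:>10.3f} GB  fits={needed <= RAM_BYTES}")
```

Under these assumptions a complete network over a million regions needs roughly 500 GB, about ten times the memory of the machine described above, which is one way to read the interest in sparse network structure.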
Network structure
• We have started working on algorithms taking into
account the structure of these networks:
o Node degrees fit a power-law distribution
o Network is sparse
Network structure (cont’d)
• Edges of three kinds:
o non-tangential proper part
o externally connected
o equals

• Reflect networks composed of components with
hierarchical structure
o R-tree extensions (Papadias, Kalnis, Mamoulis, AAAI’99)

• Parallel algorithms combined with backward-chaining
techniques for lazy query processing
o Graph partitioning
o Path compression data structures and indexes
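One way to picture how hierarchical structure and path compression could support lazy query answering is the Python sketch below. It is purely illustrative and is not the authors' algorithm: it merges regions linked by "equals" edges with a union-find structure, records one "proper part" parent per equivalence class, and answers a containment question by walking up the hierarchy on demand. It assumes each class has at most one direct container, which the slides do not guarantee. The class name `RegionHierarchy` and the region names are hypothetical.

```python
class RegionHierarchy:
    """Union-find over 'equals' edges plus parent pointers for 'proper part' edges."""

    def __init__(self):
        self._rep = {}      # union-find parent pointers
        self._inside = {}   # class representative -> a member of its container class

    def find(self, x):
        """Return the representative of x's equivalence class, compressing the path."""
        self._rep.setdefault(x, x)
        root = x
        while self._rep[root] != root:
            root = self._rep[root]
        while self._rep[x] != root:
            nxt = self._rep[x]
            self._rep[x] = root
            x = nxt
        return root

    def add_equals(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self._rep[ra] = rb
            if ra in self._inside:
                # Carry the container over to the new representative.
                self._inside.setdefault(rb, self._inside.pop(ra))

    def add_proper_part(self, inner, outer):
        self._inside[self.find(inner)] = self.find(outer)

    def is_inside(self, inner, outer):
        """Walk up from `inner` only as far as needed; True if `outer` is reached."""
        target = self.find(outer)
        node = self.find(inner)
        seen = set()
        while node in self._inside and node not in seen:
            seen.add(node)
            node = self.find(self._inside[node])
            if node == target:
                return True
        return False


h = RegionHierarchy()
h.add_proper_part("region:B", "region:A")
h.add_proper_part("region:C", "region:B")
h.add_equals("region:D", "region:C")
print(h.is_inside("region:D", "region:A"))
print(h.is_inside("region:A", "region:D"))
```

Nothing is materialized in advance: the derived relation between region:D and region:A is computed only when asked for, in contrast to computing the complete constraint network up front.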
Related work: Spatial
• Qualitative spatial reasoning
- Efficient algorithms for consistency checking of constraint
networks (complex spatial relations, small number of regions)
- Does not consider query processing

• Description logic reasoners
- PelletSpatial: RCC-8 reasoning (cannot handle disjunctions)
- RacerPro: RCC-8 reasoning
Related work: Temporal
• Chaudhuri (VLDB’88)
• The knowledge representation language Telos (TOIS’90)

• Foundations of temporal constraint databases (Koubarakis,
PhD thesis, ’94)
• Qualitative temporal reasoning community (since the ’80s)
• SQL+i system (BNCOD’96)
• Later system (IEEE’97)

• Hurtado and Vaisman (2006)
Conclusions
• What’s the CHALLENGE?
Implementing an efficient query processing system
for incomplete geospatial information in RDFi
• The desired system should:
o reason about qualitative and quantitative spatial
information that might be incomplete
o be scalable to billions of triples in the most useful cases
Thank you
Dataset characteristics

Editor's Notes

  • #4: Ordnance Survey is Great Britain's national mapping authority. It offers digital and paper map products for a wide range of business and outdoor uses.
  • #5: GADM is a spatial database of the location of the world's administrative areas for use in GIS and similar software.
  • #6: NUTS is a hierarchical system defined by the Eurostat office of the European Union for dividing the economic territory of the EU into 4 levels.
  • #20:
    Chaudhuri (VLDB’88): Framework for temporal relationships in a database employing a graph model (limited to definite information).
    The knowledge representation language Telos (1991): Preliminary Prolog implementation by M. Koubarakis and T. Topaloglou. The most efficient implementation of Telos (ConceptBase) does not consider incomplete information.
    Foundations of temporal constraint databases (Koubarakis, PhD thesis 1994): Database models for (indefinite) temporal constraint databases.
    SQL+i (1996): Temporal RDBMS for modeling and querying indeterminate temporal facts. Representation and reasoning employing constraint networks.
    Later system (1997): Querying of temporal knowledge bases. Limited query language (no disjunctive expressions).