
Virtuoso Relational To RDF Mapping

  • 1. Mapping Relational Databases to RDF with OpenLink Virtuoso
      Orri Erling - Lead Developer, Virtuoso Team
      © 2008 OpenLink Software, All rights reserved.
  • 2. Who Wants to Map?
      • Semantic Web Scalers: expose whatever there is as RDF; the next guy will unify terms and build search and apps
      • Data Warehouse Keepers: data is spread out and has implicit semantics, complex schemas, heterogeneous sources, and ambiguous terms, but we must make it join and aggregate cleanly
  • 3. Present State
      • SPARQL to SQL exists, but complex integrations are still built as data warehouses
      • We'd really like to map, but...
      • Can it be otherwise?
  • 4. Why RDF Data Warehouse?
      • Pros:
          • Even query performance across all data
          • Possibility of forward-chaining inference
          • Some SPARQL features may be better supported, e.g. unspecified predicates
      • Cons:
          • Keeping data up-to-date
          • Complex setup, needs dedicated servers: you don't build them on a whim
  • 5. Why Map?
      • No copying, no timeliness issues
      • RDBMS outperforms RDF for analytics workloads
      • Agile reconfiguration without reloading data
  • 6. Virtuoso
      • Mapping of SPARQL to SQL against any existing schema, whether stored in Virtuoso or elsewhere
      • Physical quad store
      • Federated/local RDBMS
  • 7. For Mapping to Deliver...
      • Tackle any SQL analytics workload in SPARQL without extra cost
      • Deal with arbitrary SQL schema
      • Produce single SQL statements, optimizable by the target RDBMS
      • Have intelligence for cases where one RDF entity can come from many relational sources
  • 8. The Cases of Integration
      • Bring similar but heterogeneous schemas into a unified ontology - Union View
      • Translate FKs of one schema to PKs in another - Distributed Join
      • Hide differences in normalization - Views for hiding joins - Unit/Terminology conversions
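The union-view case can be illustrated with a small sketch. This is not Virtuoso's mapping syntax: it uses Python's built-in sqlite3 module, and the two tables, their columns, and the thousand-dollar unit are hypothetical names chosen only to show similar but heterogeneous schemas being brought into one shape, with a unit conversion hidden inside the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical sources with similar but differently shaped schemas.
    CREATE TABLE crm_customer (cust_id INTEGER PRIMARY KEY, full_name TEXT, revenue_usd REAL);
    CREATE TABLE erp_client (client_no INTEGER PRIMARY KEY, name TEXT, revenue_kusd REAL);
    INSERT INTO crm_customer VALUES (1, 'Acme', 1500.0);
    INSERT INTO erp_client VALUES (7, 'Globex', 2.5);

    -- Union view: one unified shape, with the unit conversion hidden inside.
    CREATE VIEW customer_unified AS
        SELECT 'crm' AS source, cust_id AS id, full_name AS name, revenue_usd AS revenue_usd
          FROM crm_customer
        UNION ALL
        SELECT 'erp', client_no, name, revenue_kusd * 1000.0
          FROM erp_client;
""")

# Queries now see a single relation, regardless of which source a row came from.
rows = conn.execute(
    "SELECT source, id, name, revenue_usd FROM customer_unified ORDER BY source"
).fetchall()
```

Keeping a source column in the view is a design choice: it lets a later step tell the origins apart instead of attempting joins that cannot match.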
  • 9. Defining a Mapping
      • Use SPARQL/SQL to:
          • Define URI formats and their subclass relations
          • Define which key-column-value combinations make a triple
      • Arbitrary SQL is allowed for mapping values and filtering
      • A single RDF node can be a composite of many columns, e.g. a multipart key
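As a rough illustration of what a mapping definition carries, the sketch below pairs a URI format with a list of key-column-value rules and applies them to rows. It is an assumption-laden toy, not Virtuoso's declaration language: the example.com IRIs, the `rows_to_triples` helper, and the sample table are all hypothetical, and it materializes triples eagerly, whereas the slides describe rewriting SPARQL into SQL at query time.

```python
import sqlite3

# Hypothetical URI format for one kind of node; a multipart key fills two slots.
LINEITEM_IRI = "http://example.com/tpch/lineitem/{l_orderkey}/{l_linenumber}"

# Each entry: (predicate IRI, column supplying the object value). Illustrative names.
PROPERTY_COLUMNS = [
    ("http://example.com/schema#quantity", "l_quantity"),
    ("http://example.com/schema#shipmode", "l_shipmode"),
]

def rows_to_triples(conn, table, iri_format, property_columns, where="1 = 1"):
    """Yield (subject, predicate, object) for each key/column/value combination.

    The table name and WHERE text are interpolated directly, so this is only
    suitable for trusted, illustrative input.
    """
    conn.row_factory = sqlite3.Row
    for row in conn.execute(f"SELECT * FROM {table} WHERE {where}"):
        subject = iri_format.format(**dict(row))
        for predicate, column in property_columns:
            yield (subject, predicate, row[column])

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lineitem (l_orderkey INTEGER, l_linenumber INTEGER, "
    "l_quantity INTEGER, l_shipmode TEXT, PRIMARY KEY (l_orderkey, l_linenumber))"
)
conn.execute("INSERT INTO lineitem VALUES (1, 2, 17, 'AIR')")

# The WHERE argument stands in for "arbitrary SQL for filtering".
triples = list(
    rows_to_triples(conn, "lineitem", LINEITEM_IRI, PROPERTY_COLUMNS, where="l_quantity > 10")
)
```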
  • 10. The TPC-H Case
      • The 22 queries as extended SPARQL
      • Each generates a single SQL statement, executable by Virtuoso, Oracle, and others
      • Next: make several TPC-H databases on different servers and run the queries against the union
      • http://demo.openlinksw.com/tpc-h/
  • 11. Where Problems Begin
      • In OpenLink Data Spaces, 6 collaborative apps are all mapped to SIOC:
          select * from <ods>
          where { ?s ?p ?o . ?s has_comment ?c . ?c has_author <xxx> }
      • Trivially, this becomes a union of everything, 1000+ lines of SQL
      • Intelligently (once per app), it becomes a union of:
          select post.* from post, comment, user
          where c_post = p_id and c_author = u_id and u_name = f ('xxx')
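The per-application SQL on this slide can be tried out in a minimal, self-contained form. The sketch below assumes a toy schema in SQLite and a hypothetical user URI format under example.com; the function `f` here is only a stand-in for whatever turns the author URI back into a `u_name` value, which the slide does not define.

```python
import sqlite3

def f(uri):
    """Hypothetical inverse of a user URI format: map the URI back to u_name."""
    prefix = "http://example.com/user/"
    return uri[len(prefix):] if uri.startswith(prefix) else None

conn = sqlite3.connect(":memory:")
conn.create_function("f", 1, f)  # make f callable from SQL
conn.executescript("""
    CREATE TABLE user (u_id INTEGER PRIMARY KEY, u_name TEXT);
    CREATE TABLE post (p_id INTEGER PRIMARY KEY, p_title TEXT);
    CREATE TABLE comment (c_id INTEGER PRIMARY KEY, c_post INTEGER, c_author INTEGER);
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO post VALUES (10, 'Hello'), (11, 'World');
    INSERT INTO comment VALUES (100, 10, 1), (101, 11, 2);
""")

# Same join shape as on the slide: posts that have a comment by the given author.
posts = conn.execute(
    """
    SELECT post.* FROM post, comment, user
     WHERE c_post = p_id AND c_author = u_id AND u_name = f(?)
    """,
    ("http://example.com/user/alice",),
).fetchall()
```

Because `f` is applied to the constant rather than to the column, the comparison stays a plain equality on `u_name`, which a relational optimizer can serve from an index.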
  • 12. What One Must Know
      • Mapping for integration is not trivial
      • Be careful when mapping multiple tables/columns to one class/property
      • Make URI schemes which encode type and source, so that senseless joins are not attempted when types are not specified in the query
      • Understand what the mapping logic can and cannot optimize
      • Understand what SQL can and cannot optimize
      • View the resulting SQL as a sanity check
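To see why URI schemes that encode type and source help, consider the sketch below. It assumes three hypothetical URI formats under example.com and a numeric key; given a URI, only the formats that could have produced it remain candidates, so a query processor working this way would have no reason to generate joins against the other tables. This illustrates the idea only and does not describe Virtuoso's internal logic.

```python
import re

# Hypothetical URI formats; each encodes both the source application and the type.
URI_FORMATS = {
    ("blog", "post"): "http://example.com/blog/post/{id}",
    ("blog", "comment"): "http://example.com/blog/comment/{id}",
    ("wiki", "page"): "http://example.com/wiki/page/{id}",
}

def format_to_regex(fmt):
    """Turn a format with one {id} slot into an anchored regular expression."""
    prefix, suffix = fmt.split("{id}")
    return re.compile(re.escape(prefix) + r"(\d+)" + re.escape(suffix) + r"\Z")

def candidate_sources(uri):
    """Return the (source, type, key) combinations that could have produced uri."""
    matches = []
    for (source, type_), fmt in URI_FORMATS.items():
        m = format_to_regex(fmt).match(uri)
        if m:
            matches.append((source, type_, int(m.group(1))))
    return matches

# Only one table can hold this node, so no union over all apps is needed.
comment_sources = candidate_sources("http://example.com/blog/comment/42")
# A URI that matches no format cannot join with anything in the mapping.
unknown_sources = candidate_sources("http://example.com/other/1")
```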
  • 13. SQL Extensions
      • Mapping must work against any RDBMS/schema, as is
      • But there is Virtuoso SQL between the mapping and the target RDBMS(s):
          • Location- and latency-conscious distributed cost model
          • Breakup, for making a wide result set into a row per property
          • Inverse functions
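The effect of turning a wide result set into a row per property can be emulated in standard SQL with UNION ALL. The sketch below does that in SQLite with a hypothetical `product` table; it shows the shape of the output only and is not the syntax of Virtuoso's breakup extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (p_id INTEGER PRIMARY KEY, p_name TEXT, p_price REAL);
    INSERT INTO product VALUES (1, 'Widget', 9.5);
""")

# One wide row becomes one (subject key, property, value) row per column.
PER_PROPERTY_SQL = """
    SELECT p_id, 'name' AS property, p_name AS value FROM product
    UNION ALL
    SELECT p_id, 'price', p_price FROM product
    ORDER BY p_id, property
"""
property_rows = conn.execute(PER_PROPERTY_SQL).fetchall()
```

Each output row corresponds to one triple about the same subject, which is the form a pattern such as `?s ?p ?o` needs.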
  • 14. Use Cases
      • OpenLink Data Spaces - Blog, Wiki, News, Social Network, Feed Aggregation, Tag Clouds, Bookmarks, etc.
      • OpenLink's own MIS - “total information awareness”: URI for any CRM Object, Account, Product, Support Case, Email, etc.
      • Musicbrainz
      • phpBB, Drupal, MediaWiki, WordPress, Bugzilla, and others
  • 15. OpenLink Software
      Thank You!
      http://virtuoso.openlinksw.com