RDF and Graph benchmarking
www.ldbc.eu
GRAPH-TA
Barcelona
Feb 19, 2013
Agenda
- Why benchmarking
- Presentation of LDBC
  - Remarks
  - Who is who
  - Project overview: WP and Task Forces
  - TUC, Technical Users Community
- Benchmarking RDF and GDB
- Common issues
- Open questions
Why benchmarking
• Two main objectives:
– Allow end users to assess the performance of the
software they want to buy
– Push the technology to the limit to allow for
progress

• Main effort in DB benchmarking up to now
– TPC: Transaction Processing Performance Council
• Relational DBs: Transactional and DSS
LDBC
• Objectives:
– Benchmarks for the emerging field of RDF and
Graph database management systems (GDBs)
– Spur industry cooperation around benchmarks
– Create LDBC foundation during 1Q 2013.
– Become a technology push effort, making
improvements measurable
– Become the de-facto research benchmark, usable,
interesting and open to inputs.
Preliminary remarks
• Nature:

– LDBC is different from other EC projects.
– The objective for LDBC is to survive after the end of the project.

• Opportunity:

– Have a benchmarking effort sponsored by the EC.
– Focal point for the vendor community.
– Showcase to the user community.

• Collaboration:

– LDBC should lead to great achievements and world
recognition, with the help of all the community:
industry, technologists and users.
Who is involved
• FORTH, research centre, Greece
• TUM, research centre, Germany
• UIBK, technology centre, Austria
• Neo Technologies, Graph management, Sweden
• OGL, RDF management, UK
• ONT, RDF management, Bulgaria
• VUA, research institution, Netherlands
• DAMA-UPC, research institution, Spain
Project Overview: the WPs
Matrix Organization
• EU project reporting activities (WPs)
• LDBC benchmark task force activities (TFs)
[Figure: matrix crossing work packages WP1 to WP4 with task forces TF1 to TF5; each cell is (part of) a deliverable / task force artifact]
Task forces
• Focuses on a specific benchmarking effort (e.g.
transactional, analytical, integration for RDF/GDB)
• Decides on the Use Case to be used
• Procedure:
– Designs and implements data generation (characteristics,
scale, etc.)
– Incorporates the generic methodology
– Designs specific workload
• Choke points specific for the effort
• Incorporating the needs from users
• Incorporating the opinions from industry
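
The data generation step can be illustrated with a minimal sketch. It is not LDBC's generator: the function name, the `scale_factor`, `persons_per_sf` and `avg_friends` parameters, and the uniform random choice of friendships are all assumptions made for illustration. It only shows the two properties the procedure names, controllable scale and reproducible characteristics (via a fixed seed).

```python
import random


def generate_social_graph(scale_factor, seed=42, persons_per_sf=100, avg_friends=5):
    """Return (persons, friendships) for a hypothetical social network.

    Friend pairs are drawn uniformly at random, which is an assumption:
    real social networks have skewed degree distributions and correlated
    attributes that a realistic generator would need to reproduce.
    """
    rng = random.Random(seed)  # fixed seed so every run yields the same data set
    n = scale_factor * persons_per_sf
    persons = list(range(n))
    friendships = set()
    target = n * avg_friends // 2  # each undirected edge serves two persons
    while len(friendships) < target:
        a, b = rng.sample(persons, 2)  # two distinct persons
        friendships.add((min(a, b), max(a, b)))  # store undirected edge once
    return persons, sorted(friendships)


persons, edges = generate_social_graph(scale_factor=1)
print(len(persons), len(edges))  # prints: 100 250
```

Doubling `scale_factor` doubles both the number of persons and the number of friendships, so results at different sizes stay comparable.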
Technical User Community (TUC)
• It will be the driving force for LDBC:
– To help understand users' needs and decide use cases
– To decide the type of problem/scenarios to be tackled,
i.e. task forces to be deployed
– To provide typical queries posed to RDF and GDBs

• First TUC meeting, Barcelona 19-20 Nov.
– Start with an on-line questionnaire:
http://goo.gl/PwGtK
– The outcomes will determine important directions
Common issues
• Use case for RDF and GDBs:

• Social Network Analysis
• Semantic Publishing, specific for RDF (SP)

• Methodology:

• Audited benchmarks
• Specific rules, similar to TPC

• Workload for RDF
– Throughput, concurrency
– Traversals
– Reasoning
– Data updates
– Integration: LOD, geonames, etc.
– SP: semantic annotation support,
relationship between ontologies and
instances, links to other content, text
and metadata.

• Workload for GDBs

– Throughput, concurrency
– Traversals, shortest paths, pattern
matching, clustering algorithms
– Data updates, transactionality
– Update semantics (serializable/ACID vs
delayed commit vs batch)
– Raw traversal speed, use of indexes
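
As a rough illustration of two of the workload items above, shortest paths and throughput, the sketch below runs breadth-first search over an in-memory adjacency list and reports queries per second. The function names and the toy graph are hypothetical, and a single-threaded timing loop like this ignores concurrency, updates and the audit rules a real benchmark would impose.

```python
import time
from collections import deque


def bfs_shortest_path_length(adjacency, source, target):
    """Number of hops from source to target, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def measure_throughput(adjacency, queries):
    """Run all (source, target) queries; return results and queries per second."""
    start = time.perf_counter()
    results = [bfs_shortest_path_length(adjacency, s, t) for s, t in queries]
    elapsed = time.perf_counter() - start
    qps = len(queries) / elapsed if elapsed > 0 else float("inf")
    return results, qps


# Toy graph (an assumption): the path 0-1-2-3 plus an isolated node 4.
toy = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
results, qps = measure_throughput(toy, [(0, 3), (1, 1), (0, 4)])
print(results)  # prints: [3, 0, None]
```

The measured rate depends on the machine and on the query mix, which is one reason the methodology calls for specific rules and audited runs.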
Open questions
• Use cases:
– How realistic do you consider synthetic data generation?
– Use of real data like Twitter, Facebook or Open Ontologies?
– Any suggestions for use cases?
– Any suggestions for scenarios: analytical, transactional, integration, others?

• Would it make sense to propose open benchmark scenarios?
• The tight rules of TPC:

– Are those against the realism of benchmarks?
– Can we solve this in any flexible way?

• Will RDF and GDB move towards the same technology/solutions?
• GDBs: no standard language. How to proceed?

GRAPH-TA 2013 - RDF and Graph benchmarking - Jose Lluis Larriba Pey
