May 15th, 2019
HOW REPRESENTATIVE IS A SPARQL
BENCHMARK? AN ANALYSIS OF RDF
TRIPLESTORE BENCHMARKS
Muhammad Saleem, Gábor Szárnyas, Felix Conrads,
Syed Ahmad Chan Bukhari, Qaiser Mehmood,
Axel-Cyrille Ngonga Ngomo
The Web Conference 2019, San Francisco
MOTIVATION
 Various RDF triplestores
 e.g., Virtuoso, Fuseki, Blazegraph, Stardog, RDF-3X etc.
 Various triplestore benchmarks
 e.g., WatDiv, FEASIBLE, LDBC, BSBM, SP2Bench etc.
 Varying workloads on triplestores
 Various important SPARQL query features
 Which benchmark is more representative?
 Which benchmark is more suitable to test a given
triplestore?
 How do SPARQL features affect query runtimes?
QUERYING BENCHMARK
COMPONENTS
 Dataset(s)
 Queries
 Performance metrics
 Execution rules
IMPORTANT RDF DATASET
FEATURES
RDF datasets used in a querying benchmark should vary
in:
 Number of triples
 Number of classes
 Number of resources
 Number of properties
 Number of objects
 Average properties per class
 Average instances per class
 Average in-degree and out-degree
 Structuredness or coherence
 Relationship specialty
IMPORTANT SPARQL QUERY FEATURES
 Number of triple patterns
 Number of projection variables
 Number of BGPs
 Number of join vertices
 Mean join vertex degree
 Query result set sizes
 Mean triple pattern selectivity
 BGP-restricted triple pattern selectivity
 Join-restricted triple pattern selectivity
 Overall diversity score (average coefficient of variation)
 Join vertex types ('star', 'path', 'hybrid', 'sink')
 SPARQL clauses used (e.g., LIMIT, UNION, OPTIONAL, FILTER etc.)
SPARQL queries as directed hypergraph
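To make the feature list above concrete, here is a rough, purely illustrative Python sketch that pulls a few of the listed features (projection variables, triple patterns, clause usage) out of a query string with regular expressions. The example query is hypothetical, and regex counting is only a crude approximation; an exact analysis requires a proper SPARQL parser.

```python
import re

def sketch_features(query: str) -> dict:
    """Very rough SPARQL feature counts via regex.

    Illustration only: exact counts need a real SPARQL parser."""
    clauses = ["DISTINCT", "FILTER", "OPTIONAL", "UNION", "LIMIT"]
    # Projection variables: ?vars between SELECT and WHERE.
    select = re.search(r"SELECT\s+(.*?)\s+WHERE", query, re.S | re.I)
    proj = re.findall(r"\?\w+", select.group(1)) if select else []
    # Triple patterns: crude count of '.'-terminated patterns in the body.
    body = query[query.find("{"):]
    triple_patterns = len(re.findall(r"[^{}.\s][^{}.]*\.", body))
    return {
        "projection_vars": len(set(proj)),
        "triple_patterns": triple_patterns,
        "clauses": [c for c in clauses if re.search(c, query, re.I)],
    }

# Hypothetical query over an invented vocabulary:
q = """SELECT DISTINCT ?drug ?name WHERE {
  ?drug a :Drug .
  ?drug :label ?name .
  OPTIONAL { ?drug :synonym ?syn . }
} LIMIT 10"""
print(sketch_features(q))
# {'projection_vars': 2, 'triple_patterns': 3,
#  'clauses': ['DISTINCT', 'OPTIONAL', 'LIMIT']}
```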
IMPORTANT PERFORMANCE METRICS
(1/2)
 Query Processing Related
 Query execution time
 Query Mix per Hour (QMpH)
 Queries per Second (QpS)
 CPU and memory usage
 Intermediate results
 Number of disk/memory swaps
 Result Set Related
 Result set correctness
 Result set completeness
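The two throughput metrics above are simple ratios over measured wall-clock times. A minimal sketch, with hypothetical timings:

```python
def qps(runtimes_s):
    """Queries per Second: number of queries over total elapsed time."""
    return len(runtimes_s) / sum(runtimes_s)

def qmph(mix_runtimes_s):
    """Query Mixes per Hour: completed mixes over elapsed hours."""
    return len(mix_runtimes_s) / (sum(mix_runtimes_s) / 3600.0)

# Hypothetical measurements: 4 single queries, and 2 full query-mix runs.
print(qps([0.5, 0.25, 1.0, 0.25]))   # 4 queries in 2 s  -> 2.0 QpS
print(qmph([90.0, 90.0]))            # 2 mixes in 180 s -> 40.0 QMpH
```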
IMPORTANT PERFORMANCE METRICS
(2/2)
 Data Storage Related
 Data loading time
 Storage space
 Index size
 Parallelism with/without Updates
 Parallel querying agents
 Parallel data update agents
BENCHMARKS SELECTION CRITERIA
 Target query runtime performance evaluation of triplestores
 RDF Datasets available
 SPARQL queries available
 No reasoning required to get complete results
SELECTED BENCHMARKS
 Real data and/or queries
benchmarks
 FishMark
 BioBench
 FEASIBLE
 DBpedia SPARQL Benchmark (DBPSB)
 Synthetic benchmarks
 Bowlogna
 TrainBench
 Berlin SPARQL Benchmark (BSBM)
 SP2Bench
 WatDiv
 Social Networking Benchmark (SNB)
 Real-world datasets and queries
 DBpedia 3.5.1
 Semantic Web Dog Food (SWDF)
 NCBIGene
 SIDER
 DrugBank
DATASETS ANALYSIS:
STRUCTUREDNESS
 Duan et al. assumption
 Real datasets are less structured
 Synthetic datasets are highly structured
The dataset structuredness problem
is well covered in recent synthetic
data generators (e.g., WatDiv,
TrainBench)
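A simplified reading of Duan et al.'s structuredness (coherence) metric: the coverage of a class is the filled fraction of its instance-by-property matrix, and coherence is a weighted average of per-class coverage. The sketch below uses that simplification with a hypothetical toy dataset; the paper's exact formulation differs in detail.

```python
def coverage(instances):
    """Coverage of one class: fraction of (instance, property) cells
    actually set, over all properties used by the class."""
    props = set().union(*instances)          # properties of the class
    filled = sum(len(inst & props) for inst in instances)
    return filled / (len(props) * len(instances))

def coherence(classes):
    """Weighted mean of per-class coverage (weights: |P(t)| + |I(t)|)."""
    weights = {t: len(set().union(*insts)) + len(insts)
               for t, insts in classes.items()}
    total = sum(weights.values())
    return sum(weights[t] / total * coverage(insts)
               for t, insts in classes.items())

# Hypothetical toy dataset: each instance is the set of properties it sets.
classes = {
    "Drug":   [{"name", "formula"}, {"name", "formula"}],  # fully structured
    "Person": [{"name"}, {"name", "age"}, {"age"}],        # partly structured
}
print(coverage(classes["Drug"]))     # 1.0
print(round(coverage(classes["Person"]), 3))
```

Highly structured (synthetic-style) data approaches coverage 1.0; real datasets typically sit well below it.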
DATASETS ANALYSIS:
RELATIONSHIP SPECIALTY
 Qiao et al. assumption
 Synthetic datasets have low relationship
specialty
The low relationship specialty
problem in synthetic datasets still
exists in general and needs to be
covered in future synthetic
benchmark generation approaches
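Qiao et al. characterize relationship specialty by how unevenly a predicate's occurrence count is distributed over subjects; one common way to quantify such unevenness is the kurtosis of the per-subject counts. The stdlib sketch below works under that assumption (the paper's exact formulation differs), and the triples are hypothetical.

```python
from collections import Counter

def kurtosis(xs):
    """Pearson (non-excess) kurtosis: m4 / m2**2."""
    n = len(xs)
    mu = sum(xs) / n
    m2 = sum((x - mu) ** 2 for x in xs) / n
    m4 = sum((x - mu) ** 4 for x in xs) / n
    return m4 / (m2 ** 2)

def predicate_specialty(triples, predicate):
    """Kurtosis of the per-subject occurrence counts of `predicate`."""
    counts = Counter(s for s, p, o in triples if p == predicate)
    return kurtosis(list(counts.values()))

# Hypothetical triples: one very prolific author among many one-off authors,
# the kind of skew real datasets show and uniform generators miss.
triples = ([("alice", "wrote", f"p{i}") for i in range(10)]
           + [(f"author{i}", "wrote", f"q{i}") for i in range(9)])
print(predicate_specialty(triples, "wrote"))   # 73/9, approx. 8.11
```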
QUERIES ANALYSIS: OVERALL
DIVERSITY SCORE
Benchmark query diversity (high to low): FEASIBLE > BioBench >
FishMark > WatDiv > Bowlogna > SP2Bench > BSBM > DBPSB >
SNB-BI > SNB-INT > TrainBench
QUERIES ANALYSIS: DISTRIBUTION OF SPARQL CLAUSES
AND JOIN VERTEX TYPES
Only FEASIBLE and BioBench
neither completely miss nor
overuse features
Synthetic benchmarks often
fail to contain important
SPARQL clauses
PERFORMANCE METRICS
BSBM reports results for the
largest number of metrics among
the selected benchmarks
SPEARMAN’S CORRELATION WITH RUNTIMES
Highest impact on query runtimes:
PV > JV > TP > Result > JVD >
JTPS > TPS > BGPs > LSQ > BTPS
The SPARQL query features we selected
have a weak correlation with query
execution time, suggesting that query
runtime is a complex measure affected by
multidimensional SPARQL query features
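Spearman's rho is the Pearson correlation computed on rank vectors (with ties given average ranks). A self-contained stdlib sketch, with hypothetical feature/runtime pairs:

```python
def rank(xs):
    """Average ranks (ties share the mean of their positions)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical pairing: number of projection variables vs. runtime (ms).
proj_vars = [1, 2, 3, 5, 8]
runtimes  = [12.0, 15.0, 14.0, 40.0, 90.0]
print(round(spearman(proj_vars, runtimes), 3))   # 0.9
```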
EFFECT OF DATASET STRUCTUREDNESS
CONCLUSIONS
 The dataset structuredness problem is well covered in recent synthetic data
generators (e.g., WatDiv, TrainBench)
 The low relationship specialty problem in synthetic datasets still exists in
general and needs to be covered in future synthetic benchmark generation
approaches
 The FEASIBLE framework employed on DBpedia generated the most diverse
benchmark in our evaluation
 The SPARQL query features we selected have a weak correlation with query
execution time, suggesting that the query runtime is a complex measure
affected by multidimensional SPARQL query features
 Still, the number of projection variables, join vertices, triple patterns, the
result sizes, and the join vertex degree are the top five SPARQL features
that most impact the overall query execution time
 Synthetic benchmarks often fail to contain important SPARQL clauses such
as DISTINCT, FILTER, OPTIONAL, LIMIT and UNION
 The dataset structuredness has a direct correlation with the result sizes
and execution times of queries and an indirect correlation with dataset