Practical SPARQL Benchmarking
Rob Vesse
rvesse@yarcdata.com
     @RobVesse




                      1
• Regardless of what technology your solution will be built on (RDBMS, RDF + SPARQL, NoSQL, etc.), you need to know that it performs sufficiently to meet your goals
• You need to justify option X over option Y
  ◦ Business – price vs. performance
  ◦ Technical – does it perform sufficiently?
• No guarantee that a standard benchmark accurately models your usage




                                                           2
• Berlin SPARQL Benchmark (BSBM)
  ◦ Relational-style data model
  ◦ Access pattern simulates replacing a traditional RDBMS with a triple store
• Lehigh University Benchmark (LUBM)
  ◦ More typical RDF data model
  ◦ Stores require reasoning to answer the queries correctly
• SPARQL2Bench (SP2B)
  ◦ Again a typical RDF data model
  ◦ Queries designed to be hard – cross products, filters, etc.
  ◦ Generates artificially massive, unrealistic result sets
  ◦ Tests clever optimization and join performance




                                                                      3
• Often no standardized methodology
  ◦ E.g. only BSBM provides a test harness
• Lack of transparency as a result
  ◦ If I say I'm 10x faster than you, is that really true or did I measure differently?
  ◦ Are the figures you're comparing against even current?
• What actually got measured?
  ◦ Time to start responding
  ◦ Time to count all results
  ◦ Something else?
• Even if you run a benchmark, does it actually tell you anything useful?



                                                                            4
• Java command-line tool (and API) for benchmarking
• Designed to be highly configurable
  ◦ Runs any set of SPARQL queries you can devise against any HTTP-based SPARQL endpoint
  ◦ Runs single- and multi-threaded benchmarks
  ◦ Generates a variety of statistics
• Methodology (sketched in code after this list)
  ◦ Runs some quick sanity tests to check that the provided endpoint is up and working
  ◦ Optionally runs W warm-up runs prior to actual benchmarking
  ◦ Runs a query mix N times
    ▪ Randomizes query order for each run
    ▪ Discards outliers (best and worst runs)
  ◦ Calculates averages, variances and standard deviations over the runs
  ◦ Generates reports as CSV and XML
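To make the methodology concrete, below is a minimal sketch of that benchmarking loop in Java. It is purely illustrative and not the tool's actual code: the BenchmarkSketch class and runMix() helper are hypothetical, and the actual query execution against the endpoint is omitted.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Illustrative sketch of the benchmarking loop; class and method
    // names are hypothetical and do not reflect the real tool's API.
    public class BenchmarkSketch {

        // Executes one full query mix in randomized order and returns
        // the total runtime in nanoseconds
        static long runMix(List<String> queries) {
            Collections.shuffle(queries); // randomize query order for each run
            long start = System.nanoTime();
            for (String query : queries) {
                // issue the query against the HTTP SPARQL endpoint and
                // count all results (omitted)
            }
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            List<String> queries = new ArrayList<>(); // load the query mix here
            int warmups = 5, runs = 25;

            // Optional warm-up runs whose timings are discarded
            for (int i = 0; i < warmups; i++) runMix(queries);

            List<Long> times = new ArrayList<>();
            for (int i = 0; i < runs; i++) times.add(runMix(queries));

            // Discard outliers: drop the best and the worst run
            Collections.sort(times);
            times = times.subList(1, times.size() - 1);

            // Averages, variances and standard deviations over the runs
            final double mean = times.stream().mapToLong(Long::longValue).average().orElse(0);
            double variance = times.stream()
                    .mapToDouble(t -> (t - mean) * (t - mean))
                    .average().orElse(0);
            System.out.printf("mean=%.0f ns, stddev=%.0f ns%n", mean, Math.sqrt(variance));
        }
    }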


                                                                        5
• Response Time (see the timing sketch after this list)
  ◦ Time from when the query is issued to when results start being received
• Runtime
  ◦ Time from when the query is issued to when all results have been received and counted
  ◦ Exact definition may vary according to configuration
• Queries per Second
  ◦ How many times a given query can be executed per second
• Query Mixes per Hour
  ◦ How many times a query mix can be executed per hour
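As a rough illustration of the difference between Response Time and Runtime, the sketch below takes both measurements for a single query issued over HTTP. It is a hedged example rather than the tool's implementation: the endpoint URL and query are placeholders, and the tool's exact measurement points (e.g. first parsed result rather than first byte) may differ.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLEncoder;

    // Illustrative only: measures response time (first byte received)
    // vs runtime (entire result stream consumed) for one query
    public class TimingSketch {
        public static void main(String[] args) throws Exception {
            String endpoint = "http://localhost:3030/ds/sparql"; // placeholder endpoint
            String query = "SELECT * WHERE { ?s ?p ?o } LIMIT 100";
            URL url = new URL(endpoint + "?query=" + URLEncoder.encode(query, "UTF-8"));

            long issued = System.nanoTime();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (InputStream in = conn.getInputStream()) {
                in.read(); // block until the first byte of results arrives
                long responseTime = System.nanoTime() - issued;

                byte[] buffer = new byte[8192];
                while (in.read(buffer) != -1) { /* result counting omitted */ }
                long runtime = System.nanoTime() - issued;

                double runtimeSecs = runtime / 1e9;
                System.out.printf("response=%.3fs runtime=%.3fs%n", responseTime / 1e9, runtimeSecs);
                // Derived metrics: queries per second for this query; query
                // mixes per hour would be 3600 / (whole mix runtime in seconds)
                System.out.printf("queries/second=%.2f%n", 1.0 / runtimeSecs);
            } finally {
                conn.disconnect();
            }
        }
    }

An almost instant response time paired with a much longer runtime suggests streaming execution; a long response time with little difference to the runtime suggests batch execution.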




                                                                       6
7
• SP2B at 10k, 50k and 250k triples, run with 5 warm-ups and 25 runs
  ◦ All options left as defaults, i.e. full result counting
  ◦ Runs for 50k and 250k skipped if a store was incapable of completing the run in reasonable time
• Run on the following systems
  ◦ *nix-based stores run on a late-2011 MacBook Pro (quad core, 8GB RAM, SSD)
    ▪ Java heap space set to 4GB
  ◦ Windows-based stores run on an HP laptop (dual core, 4GB RAM, HDD)
  ◦ Both are low-powered systems compared to servers
• Benchmarked stores
  ◦ Jena TDB 0.9.1
  ◦ Sesame 2.6.5 (Memory and Native stores)
  ◦ Bigdata 1.2 (WORM store)
  ◦ Dydra
  ◦ Virtuoso 6.1.3 (Open Source Edition)
  ◦ dotNetRDF (in-memory store)
  ◦ Stardog 0.9.4 (in-memory and disk stores)
  ◦ OWLIM


                                                                             8
9
10
11
• Code release is management-approved
  ◦ Currently undergoing legal and IP clearance
  ◦ Should be open sourced shortly under a BSD license
  ◦ Will be available from https://sourceforge.net/p/sparql-query-bm
  ◦ Apologies that this isn't yet available at the time of writing
• Example results data available from:
  ◦ https://dl.dropbox.com/u/590790/semtech2012.tar.gz




12
13

Editor's Notes

  • #2: Introduce myself. May want to add a disclaimer here about views/opinions expressed primarily being my personal ones and not those of the company, à la DVD extras disclaimers ;-)
  • #3: What it says on the slide ;-)
  • #4: Describe the benchmarks – shown on slides. Discuss the deficiencies of each benchmark. BSBM: relational – not really showing off the capabilities of a SPARQL engine. LUBM: need for reasoning – the implementation thereof can make a huge difference in performance (forward vs. backward chaining reasoning). SP2B: queries are unrealistic; focuses on optimization.
  • #5: Self-explanatory slide for the most part. Highlight that just because the store you are interested in is good/bad at a particular benchmark doesn't tell you whether the store is good/bad for your use case.
  • #6: Describe the methodology in detail. Note that this is based on an amalgamation of the BSBM style and Revelytix SP2B methodologies.
  • #7: Key point is to cover the difference between Response Time and Runtime. Note that this stat can give some interesting information about how stores execute queries – an almost instant response time but much longer runtime indicates streaming execution, while a long response time with only a small difference to the runtime indicates batch execution.
  • #8: Run through a brief demo of the command line tool – make sure to have a running Stardog/Fuseki instance to run against – likely safer to use Fuseki as it is easier to ensure it is running, and it is open source so there is no appearance of bias towards a commercial product. Run on SP2B 10k – it will complete in reasonable time while I'm talking – suggest using a limited number of runs for demo purposes. Show the output data (CSV and XML). The key difference is that CSV converts to seconds while XML uses raw nanoseconds; XML is better for post-processing, CSV is useful for quick import into spreadsheet tools.
  • #9: Discuss the setup for the example results – why were these stores chosen? Ease of availability (open source, runnable on *nix, personal interest, etc.). Be sure to highlight YMMV. Disclaimer – state that this is just an arbitrarily selected sample of stores and that the performance indicated here may not be representative of the true performance of any store. Most importantly, Cray/YarcData is not endorsing any specific store. Again, point out the importance of people running their own benchmarks.
  • #10: Note how, as dataset size increases, many stores can't complete within reasonable time on the machines we used. Logarithmic scale. Make sure to mention that the fact that many stores did not complete at the 50k and 250k sizes doesn't mean they are defective, merely that with the machine resources available they couldn't run in a timely fashion. This leads nicely to the point that it is important to benchmark on the hardware you actually intend to use.
  • #11: Discuss the variation in average runtime – some stores are way ahead of others. Note that some stores' results are heavily influenced by poor performance on certain queries – see the next slide. Logarithmic scale.
  • #12: Highlight the variation in performance both between stores and between queries. Note how certain queries are just fundamentally hard even with clever optimisation. In-memory trumps disk for the relevant stores in most cases.