Improvements in Bitsy 1.5
Sridhar Ramachandran
Founder, LambdaZen LLC
Background
● Bitsy is a small, fast, embeddable, durable,
in-memory graph database that implements
the Tinkerpop Blueprints API.
● The original presentation on Bitsy is
available at
http://slideshare.net/lambdazen/bitsy-graphdatabase
● Bitsy 1.5 is faster and leaner than before!
○ Has a smaller memory footprint
○ Uses (mostly) lock-free read algorithms
● This presentation covers the improvements
in the 1.5 release.
Major features in the 1.5 release
● The 1.5 release features:
○ Memory-efficient data structures
○ Mostly lock-free read algorithms
● Bitsy’s new memory-efficient data structures
are designed to reduce the overhead of
maintaining adjacency lists and properties.
● Bitsy’s new read algorithms are designed to
use the latest Java “compare-and-set” (CAS)
concurrency features to reduce the overhead
of locks in highly threaded scenarios.
Memory-efficient data structures
● Bitsy 1.0 relied on Java Collections to
maintain adjacency lists and properties of
vertices.
● Java Collections aren’t memory efficient for
small-sized data structures because they
create many holder objects.
● The 1.5 release stores small adjacency lists
(N<24) and small properties (N<16) in
hand-coded objects with minimal overhead.
Memory-efficient data structures
● Different concrete
classes capture
adjacency lists and
properties for small N.
○ This approach reduces
the overall number of
objects.
○ Large adjacency lists are
stored in a compact hash set,
keyed by label, that refers to
memory-efficient lists.
Adjacency lists for out-degree 0, 1 and 2
Vertex properties for N = 0, 1 and 2
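As a rough illustration of the idea on this slide, the Java sketch below uses a different concrete class for each small N and only falls back to a general-purpose map for larger sizes. All names here (PropertyBag, EmptyBag, OneBag, TwoBag, MapBag) are hypothetical and are not taken from Bitsy's source. The sketch promotes to a map at N = 3 purely to stay short, whereas the slides give N<16 as the threshold for properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: every update returns the bag to use afterwards,
// which may be an instance of a different concrete class.
interface PropertyBag {
    Object get(String key);
    PropertyBag with(String key, Object value);
    int size();
}

// N = 0: one shared instance, so a vertex without properties allocates nothing.
final class EmptyBag implements PropertyBag {
    static final EmptyBag INSTANCE = new EmptyBag();
    private EmptyBag() {}
    public Object get(String key) { return null; }
    public PropertyBag with(String key, Object value) { return new OneBag(key, value); }
    public int size() { return 0; }
}

// N = 1: a single object with two fields; no array and no per-entry holder objects.
final class OneBag implements PropertyBag {
    private final String k0;
    private Object v0;
    OneBag(String k0, Object v0) { this.k0 = k0; this.v0 = v0; }
    public Object get(String key) { return k0.equals(key) ? v0 : null; }
    public PropertyBag with(String key, Object value) {
        if (k0.equals(key)) { v0 = value; return this; }
        return new TwoBag(k0, v0, key, value);
    }
    public int size() { return 1; }
}

// N = 2: still a single object, now with four fields.
final class TwoBag implements PropertyBag {
    private final String k0, k1;
    private Object v0, v1;
    TwoBag(String k0, Object v0, String k1, Object v1) {
        this.k0 = k0; this.v0 = v0; this.k1 = k1; this.v1 = v1;
    }
    public Object get(String key) {
        if (k0.equals(key)) return v0;
        if (k1.equals(key)) return v1;
        return null;
    }
    public PropertyBag with(String key, Object value) {
        if (k0.equals(key)) { v0 = value; return this; }
        if (k1.equals(key)) { v1 = value; return this; }
        // Assumed threshold for this sketch only: promote to a map at N = 3.
        MapBag big = new MapBag();
        big.with(k0, v0);
        big.with(k1, v1);
        return big.with(key, value);
    }
    public int size() { return 2; }
}

// Larger N: fall back to a standard collection.
final class MapBag implements PropertyBag {
    private final Map<String, Object> map = new HashMap<>();
    public Object get(String key) { return map.get(key); }
    public PropertyBag with(String key, Object value) { map.put(key, value); return this; }
    public int size() { return map.size(); }
}

final class PropertyBagDemo {
    public static void main(String[] args) {
        PropertyBag bag = EmptyBag.INSTANCE.with("name", "alice").with("age", 30);
        System.out.println(bag.getClass().getSimpleName() + " size=" + bag.size());
    }
}
```

Because callers always keep the bag returned by with(), a vertex can move from one representation to the next without the caller knowing which concrete class is in use.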
Lock-free reading
● Bitsy 1.5 also introduces lock-free reading
using sequential locks (seqlock).
● Read operations track the sequence
numbers at the start and end.
○ If they are the same -- Success.
○ If they are different -- Retry!
● Reads don’t start till the counter is even.
● Writers increment the counter twice:
○ Before the write, to make the counter an odd number
○ After the write, to make the counter an even number
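The protocol described on this slide can be sketched in Java roughly as follows. This is a minimal illustration, not Bitsy's implementation: the class SeqLockPair and its fields are hypothetical, the protected data is just two numbers, and writers are serialised with a plain monitor. The data fields are declared volatile so that, under the Java memory model, the data reads cannot drift past the second read of the counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class: a seqlock guarding a pair of values.
final class SeqLockPair {
    // Even value = data is stable, odd value = a write is in progress.
    private final AtomicLong seq = new AtomicLong(0);

    // volatile keeps the data reads ordered between the two counter reads.
    private volatile long x;
    private volatile long y;

    // Writers take a lock (here the object's monitor) and bump the counter twice.
    synchronized void write(long newX, long newY) {
        seq.incrementAndGet(); // counter becomes odd: readers back off
        x = newX;
        y = newY;
        seq.incrementAndGet(); // counter becomes even: readers may proceed
    }

    // Readers take no lock; they retry until they see a stable snapshot.
    long[] read() {
        while (true) {
            long start = seq.get();
            if ((start & 1L) != 0) { // a write is in progress, do not start yet
                Thread.yield();
                continue;
            }
            long rx = x;
            long ry = y;
            if (seq.get() == start) { // same number at start and end: success
                return new long[] { rx, ry };
            }
            // different number: a writer interfered, so retry
        }
    }
}

final class SeqLockDemo {
    public static void main(String[] args) throws InterruptedException {
        SeqLockPair pair = new SeqLockPair();
        Thread writer = new Thread(() -> {
            for (long i = 1; i <= 100_000; i++) pair.write(i, -i);
        });
        writer.start();
        for (int i = 0; i < 100_000; i++) {
            long[] snapshot = pair.read();
            // Every write keeps x + y == 0, so any other sum would be a torn read.
            if (snapshot[0] + snapshot[1] != 0) throw new AssertionError("torn read");
        }
        writer.join();
        System.out.println("no torn reads observed");
    }
}
```

Note that the reader in this sketch retries without bound, which is exactly the situation the next slide addresses.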
(Mostly) lock-free reading
● Bitsy’s sequential locks can cause “live lock”
situations when there are too many writers.
● To avoid this, readers degrade to RW locks
after a certain number of retries.
● Seqlocks are faster than RW locks in highly
threaded environments where the number of active
threads exceeds the number of cores.
● Bitsy uses locks on writes because
○ write-retries are complex with transactions, and
○ locking is not the bottleneck for writes -- the file
system is the bottleneck.
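One way to combine the optimistic read with a read-write lock fallback is sketched below. The class MostlyLockFree, its method names, and the retry limit of 3 are assumptions made for illustration; the slides only say "a certain number of retries". The sketch also assumes that the state touched by the query is safely visible across threads (for example volatile or atomic fields) and that the query tolerates running against a half-finished write, since its result is discarded in that case.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper: optimistic seqlock reads that degrade to a read lock.
final class MostlyLockFree {
    // Assumed value for this sketch; not taken from Bitsy.
    private static final int MAX_RETRIES = 3;

    private final AtomicLong seq = new AtomicLong(0);
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    // Writers always take the write lock and bump the counter around the mutation.
    void write(Runnable mutation) {
        rw.writeLock().lock();
        try {
            seq.incrementAndGet(); // odd: write in progress
            try {
                mutation.run();
            } finally {
                seq.incrementAndGet(); // even again, even if the mutation threw
            }
        } finally {
            rw.writeLock().unlock();
        }
    }

    // Readers first try without locking, then fall back to the read lock.
    <T> T read(Supplier<T> query) {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            long start = seq.get();
            if ((start & 1L) == 0) {
                T result = query.get();
                if (seq.get() == start) {
                    return result; // no writer interfered
                }
            }
            Thread.yield();
        }
        // Too many retries: block writers briefly so the read is guaranteed to finish.
        rw.readLock().lock();
        try {
            return query.get();
        } finally {
            rw.readLock().unlock();
        }
    }
}

final class MostlyLockFreeDemo {
    public static void main(String[] args) {
        MostlyLockFree lock = new MostlyLockFree();
        AtomicLong value = new AtomicLong();
        lock.write(() -> value.set(42));
        Long seen = lock.read(value::get);
        System.out.println("read " + seen);
    }
}
```

The fallback bounds the work a reader can waste under heavy write load, at the cost of occasionally paying for the lock that the optimistic path avoids.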
Benchmarks
● The plot below shows the read throughput* of a test
application† that repeatedly loops through a graph.
* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200rpm hard disk.
† The code for this test can be found in BitsyGraphTest.java under the method testMultiThreadedCommits().
Benchmarks
● The lock-free read algorithms in Bitsy 1.5 show a
significantly higher throughput than Bitsy 1.0.
○ Bitsy 1.0 had a drop in performance when the
number of threads exceeded the number of cores.
○ The read throughput exceeds 10M reads/sec!
● Bitsy is now comparable to Neo4J in read throughput*.
○ This is an apples-to-apples comparison since Neo4J
is embedded and the graph is fully cached.
○ Most “bad” Neo4J benchmarks are taken when the
graph doesn’t fit in memory.
○ Neo4J is extremely fast when the graph fits in
memory -- and now, so is Bitsy!
Another read benchmark
● The following plot shows the traversal performance of
Bitsy 1.5 vs Neo4J 1.9.2 in a multi-threaded setting on a
bipartite graph with 1M vertices and out-degree of 3.
● Again, you can see that the performance is comparable.
Benchmarks for write
● As with the 1.0 release, Bitsy’s write throughput is much
higher than Neo4J because of the “No Seek” principle.
○ For more info, please refer to the project page at
http://bitbucket.org/lambdazen/bitsy/
Wrap-up
● The 1.5 release introduces memory-efficient
data structures and (mostly) lock-free
reading to the Bitsy graph database.
○ With these improvements, Bitsy’s read performance
is comparable to Neo4J’s when the graph is fully cached.
○ Bitsy’s “No Seek” write algorithms continue to
outperform other graph databases, including Neo4J.
● Bitsy is a dual-licensed product with
○ an AGPL license for open-source projects, and
○ a liberal unlimited-use OEM/end-user license for
commercial projects. Details at lambdazen.com.
