Improvements in Bitsy 1.5

Sridhar Ramachandran
Founder, LambdaZen LLC

Background
● Bitsy is a small, fast, embeddable, durable,
in-memory graph database that implements
the Tinkerpop Blueprints API.
● The original presentation on Bitsy is
available at
http://slideshare.net/lambdazen/bitsy-graphdatabase
● Bitsy 1.5 is faster and leaner than before!
○ Has a smaller memory footprint
○ Uses (mostly) lock-free read algorithms
● This presentation covers the improvements
in the 1.5 release.

Major features in the 1.5 release
● The 1.5 release features:
○ Memory-efficient data structures
○ Mostly lock-free read algorithms
● Bitsy’s new memory-efficient data structures
are designed to reduce the overhead of
maintaining adjacency lists and properties.
● Bitsy’s new read algorithms are designed to
use the latest Java “compare-and-set” (CAS)
concurrency features to reduce the overhead
of locks in highly threaded scenarios.

Memory-efficient data structures
● Bitsy 1.0 relied on Java Collections to
maintain adjacency lists and properties of
vertices.
● Java Collections aren’t memory efficient for
small-sized data structures because they
create many holder objects.
● The 1.5 release stores small adjacency lists
(N<24) and small properties (N<16) in hand-
coded objects with minimal overhead.
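
As a rough illustration of the idea above, the sketch below stores properties in purpose-built classes for N = 0, 1 and 2 and only falls back to a Java collection beyond that. All names (PropertyDict, Props0, Props1, Props2, PropsMap) are hypothetical and are not taken from Bitsy's source; the cut-over at N = 3 is chosen only to keep the example short, whereas the slide states that Bitsy hand-codes properties up to N<16.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface; not Bitsy's actual API.
interface PropertyDict {
    Object get(String key);
    // Returns the dictionary to use after the update, which may be a different class.
    PropertyDict set(String key, Object value);
    int size();
}

// N = 0: a shared, stateless singleton.
final class Props0 implements PropertyDict {
    static final Props0 INSTANCE = new Props0();
    private Props0() {}
    public Object get(String key) { return null; }
    public PropertyDict set(String key, Object value) { return new Props1(key, value); }
    public int size() { return 0; }
}

// N = 1: key and value are plain fields, so there is no array or entry object.
final class Props1 implements PropertyDict {
    private final String key;
    private Object value;
    Props1(String key, Object value) { this.key = key; this.value = value; }
    public Object get(String k) { return key.equals(k) ? value : null; }
    public PropertyDict set(String k, Object v) {
        if (key.equals(k)) { value = v; return this; }
        return new Props2(key, value, k, v);   // grow to the next concrete class
    }
    public int size() { return 1; }
}

// N = 2: two key/value pairs held directly in fields.
final class Props2 implements PropertyDict {
    private final String key1, key2;
    private Object value1, value2;
    Props2(String key1, Object value1, String key2, Object value2) {
        this.key1 = key1; this.value1 = value1;
        this.key2 = key2; this.value2 = value2;
    }
    public Object get(String k) {
        if (key1.equals(k)) return value1;
        if (key2.equals(k)) return value2;
        return null;
    }
    public PropertyDict set(String k, Object v) {
        if (key1.equals(k)) { value1 = v; return this; }
        if (key2.equals(k)) { value2 = v; return this; }
        // Assumption for this sketch only: fall back to a HashMap at N = 3.
        PropsMap big = new PropsMap();
        big.set(key1, value1);
        big.set(key2, value2);
        big.set(k, v);
        return big;
    }
    public int size() { return 2; }
}

// Fallback for larger N, backed by a standard Java collection.
final class PropsMap implements PropertyDict {
    private final Map<String, Object> map = new HashMap<>();
    public Object get(String key) { return map.get(key); }
    public PropertyDict set(String key, Object value) { map.put(key, value); return this; }
    public int size() { return map.size(); }
}
```

A caller would keep a single PropertyDict reference per vertex and reassign it on every update, e.g. `props = props.set("name", "alice")`, so that the concrete class can change as the property count grows.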

Memory-efficient data structures
● Different concrete classes capture adjacency lists and
properties for small N.
○ This approach reduces the overall number of objects.
○ Large adjacency lists are stored in a compact hash-set by
label referring to memory-efficient lists.
[Figure: Adjacency lists for out-degree 0, 1 and 2]
[Figure: Vertex properties for N = 0, 1 and 2]

Lock-free reading
● Bitsy 1.5 also introduces lock-free reading
using sequential locks (seqlock).
● Read operations track the sequence
numbers at the start and end.
○ If they are the same -- Success.
○ If they are different -- Retry!
● Reads don’t start till the counter is even.
● Writers increment the counter twice:
○ Before the write to make the counter an odd number
○ After the write to make the counter an even number
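
The protocol described on this slide can be sketched in a few lines of Java. This is a minimal illustration, not Bitsy's implementation: the class name SeqLockPair and its two fields are made up for the example. It assumes writers are serialized (here with synchronized) and declares the protected fields volatile so that the optimistic reads are well-defined under the Java memory model.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: a pair of values guarded by a sequence counter.
final class SeqLockPair {
    private final AtomicInteger seq = new AtomicInteger(); // even = stable, odd = write in progress
    private volatile long a;
    private volatile long b;

    // Writers are serialized with a monitor; only readers are lock-free.
    synchronized void write(long newA, long newB) {
        seq.incrementAndGet();      // counter becomes odd: write in progress
        a = newA;
        b = newB;
        seq.incrementAndGet();      // counter becomes even: write complete
    }

    long[] read() {
        while (true) {
            int start = seq.get();
            if ((start & 1) != 0) { // a write is in progress, so do not start reading yet
                Thread.yield();
                continue;
            }
            long ra = a;
            long rb = b;
            if (seq.get() == start) {
                return new long[] { ra, rb }; // same sequence number: success
            }
            // Different sequence number: a writer intervened, so retry.
        }
    }

    int sequence() { return seq.get(); }
}
```

Note that this plain version retries without bound, so a steady stream of writers can starve a reader.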

(Mostly) lock-free reading
● Bitsy’s sequential locks can cause “live lock”
situations when there are too many writers.
● To avoid this, readers degrade to RW locks
after a certain number of retries.
● Seqlocks are faster than RW locks in highly
threaded environments where the # of active
threads exceeds the # of cores.
● Bitsy uses locks on writes because
○ write-retries are complex with transactions, and
○ locking is not the bottleneck for writes -- the file
system is the bottleneck.
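
One way to combine optimistic reads with a read-write lock fallback is sketched below. It is an assumption-laden illustration rather than Bitsy's code: the class name HybridSeqLock, the MAX_RETRIES value of 3, and the Runnable/Supplier-based API are all invented for the example, since the slides do not give the retry limit or the internal structure. The data touched by the reader should be volatile or atomic, and the reader must tolerate seeing a half-finished update during an optimistic attempt.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper: optimistic seqlock reads that degrade to a read lock.
final class HybridSeqLock {
    private static final int MAX_RETRIES = 3; // assumed value; not from the slides

    private final AtomicInteger seq = new AtomicInteger();
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    // Writers always take the write lock and bump the counter around the update.
    void write(Runnable update) {
        rw.writeLock().lock();
        try {
            seq.incrementAndGet();          // odd: write in progress
            try {
                update.run();
            } finally {
                seq.incrementAndGet();      // even: write complete
            }
        } finally {
            rw.writeLock().unlock();
        }
    }

    <T> T read(Supplier<T> reader) {
        // Optimistic phase: no lock is taken.
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            int start = seq.get();
            if ((start & 1) == 0) {
                T result = reader.get();
                if (seq.get() == start) {
                    return result;          // no writer intervened
                }
            }
        }
        // Fallback phase: the read lock blocks new writers, so the read cannot be invalidated.
        rw.readLock().lock();
        try {
            return reader.get();
        } finally {
            rw.readLock().unlock();
        }
    }
}
```

The bounded retry count is what prevents the live lock: after a fixed number of failed optimistic attempts the reader waits its turn on the lock instead of spinning.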

Benchmarks
● The plot below shows the read throughput* of a test
application† that repeatedly loops through a graph.
* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200rpm hard disk.
† The code for this test can be found in BitsyGraphTest.java under the method testMultiThreadedCommits().

Benchmarks
● The lock-free read algorithms in Bitsy 1.5 show a
significantly higher throughput than Bitsy 1.0.
○ Bitsy 1.0 had a drop in performance when the
number of threads exceeded the number of cores.
○ The read throughput exceeds 10M reads/sec!
● Bitsy is now comparable to Neo4J in read throughput*.
○ This is an apples-to-apples comparison since Neo4J
is embedded and the graph is fully cached.
○ Most “bad” Neo4J benchmarks are taken when the
graph doesn’t fit in memory.
○ Neo4J is extremely fast when the graph fits in
memory -- and now, so is Bitsy!

Another read benchmark
● The following plot shows the traversal performance of
Bitsy 1.5 vs Neo4J 1.9.2 in a multi-threaded setting on a
bipartite graph with 1M vertices and out-degree of 3.
● Again, you can see that the performance is comparable.

Benchmarks for write
● As with the 1.0 release, Bitsy’s write throughput is much
higher than Neo4J because of the “No Seek” principle.
○ For more info, please refer to the project page at
http://bitbucket.org/lambdazen/bitsy/

Wrap-up
● The 1.5 release introduces memory-efficient
data structures and (mostly) lock-free
reading to the Bitsy graph database.
○ With these improvements, Bitsy’s read performance
is comparable to Neo4J’s when the graph is fully cached.
○ Bitsy’s “No Seek” write algorithms continue to
outperform other graph databases, including Neo4J.
● Bitsy is a dual-licensed product with
○ an AGPL license for open-source projects, and
○ a liberal unlimited-use OEM/end-user license for
commercial projects. Details at lambdazen.com.
