cache2k, Java Caching, Turbo Charged, FOSDEM 2015

Java Caching, Turbo Charged
JavaDevRoom, FOSDEM 2015
Jens Wilke, headissue GmbH
twitter.com/cruftex
github.com/cruftex
http://cache2k.org

cache2k Overview
● Started in 2000 as an in-house product and evolving since
● Focus on in-memory (on-heap) caching (persistence and off-heap storage are on the way)
● Research on optimized performance / modern eviction policies
● Open sourced in 2013
● Contains features not found in (all) cache products, e.g.:
– On time expiry
– Extensive statistics
– Support for exceptions and nulls
– Blocking fetch for multiple requests on the same key
(read through configuration)
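The "blocking fetch" behaviour in a read through configuration can be illustrated with a minimal sketch. It is not cache2k code: the class name ReadThroughCache and the loadCount() helper are made up for illustration, and the sketch relies on ConcurrentHashMap.computeIfAbsent, which applies the loader at most once per absent key and blocks concurrent callers for that key until the value is available. The sketch has no eviction, and unlike the feature list above it does not cache nulls or exceptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Illustrative read-through cache; names are hypothetical, not the cache2k API. */
class ReadThroughCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> source;
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> source) {
        this.source = source;
    }

    /** Returns the cached value, loading it from the source on a miss. */
    V get(K key) {
        // Concurrent requests for the same missing key wait here;
        // the source is called once and all callers see the same value.
        return map.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return source.apply(k);
        });
    }

    /** Number of times the source was actually called. */
    int loadCount() {
        return loads.get();
    }
}
```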

Eviction Algorithms
flickr:alexander

LRU
[Figure: LRU list with entries 1 to 7. A cache access moves the accessed entry (4 in the example) to the front; the entry at the opposite end is the LRU entry.]
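As a minimal sketch of the move-to-front idea, the JDK's LinkedHashMap can be put into access order. The class name LruCache is an assumption for illustration and is not taken from any of the cache products discussed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache sketch built on LinkedHashMap's access order. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: get() and put() move the entry
        // to the most-recently-used end of the internal list
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insert: drop the least recently used entry
        // once the cache holds more than 'capacity' entries
        return size() > capacity;
    }
}
```

Note that in access order every get() modifies the internal list, so a shared LRU cache of this kind needs exclusive locking even for reads.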

CLOCK
[Figure: CLOCK. Entries arranged in a circle, each marked 1 = hit or 0 = no hit, with a hand pointing at one entry.]
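A minimal single-threaded sketch of CLOCK follows, assuming non-null keys and a capacity of at least one. The class name ClockCache and its layout (parallel arrays plus an index map) are illustrative choices, not the cache2k implementation. A hit only sets a flag; the hand clears flags as it sweeps and evicts the first entry whose flag is already 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal single-threaded CLOCK sketch (illustrative, non-null keys only). */
class ClockCache<K, V> {
    private final K[] keys;
    private final V[] values;
    private final boolean[] referenced;            // 1 = hit, 0 = no hit
    private final Map<K, Integer> index = new HashMap<>();
    private int hand = 0;

    @SuppressWarnings("unchecked")
    ClockCache(int capacity) {
        keys = (K[]) new Object[capacity];
        values = (V[]) new Object[capacity];
        referenced = new boolean[capacity];
    }

    V get(K key) {
        Integer slot = index.get(key);
        if (slot == null) {
            return null;                            // miss
        }
        referenced[slot] = true;                    // hit: only a flag write
        return values[slot];
    }

    void put(K key, V value) {
        Integer slot = index.get(key);
        if (slot != null) {                         // update in place
            values[slot] = value;
            return;
        }
        // Sweep: entries with the flag set get a second chance
        while (keys[hand] != null && referenced[hand]) {
            referenced[hand] = false;
            hand = (hand + 1) % keys.length;
        }
        if (keys[hand] != null) {
            index.remove(keys[hand]);               // evict the victim
        }
        keys[hand] = key;
        values[hand] = value;
        referenced[hand] = false;
        index.put(key, hand);
        hand = (hand + 1) % keys.length;
    }
}
```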

Improving on LRU...
protect the working set
● For completeness: Least frequently used
– LFU
– LRFU
– …
● Split set of entries into cold and hot, to protect the working set
– 2Q
– LIRS
– ARC – Adaptive Replacement Cache
● Nimrod Megiddo and Dharmendra S. Modha (USENIX 2003), patented by IBM
– Clock-Pro
● Song Jiang, Feng Chen and Xiaodong Zhang (USENIX 2005)
[Figure: entries split into a cold set and a hot set]

Improving on LRU...
history of seen entries
● Keep an LRU list of the evicted keys
● If seen again, insert directly into hot set
[Figure: cold set, hot set, and a ghost set that holds only keys]
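The cold / hot / ghost idea can be sketched in a deliberately simplified form. This is neither 2Q, ARC nor Clock-Pro, and not cache2k code: the class name GhostCache, the fixed set sizes and the rule that entries reach the hot set only via the ghost set are assumptions made to keep the example short. Values are assumed to be non-null.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

/** Simplified cold/hot cache with a ghost set of evicted keys (illustrative only). */
class GhostCache<K, V> {
    private final int coldCapacity;
    private final int hotCapacity;
    private final int ghostCapacity;
    private final LinkedHashMap<K, V> cold = new LinkedHashMap<>();               // FIFO order
    private final LinkedHashMap<K, V> hot = new LinkedHashMap<>(16, 0.75f, true); // LRU order
    private final LinkedHashMap<K, Boolean> ghost = new LinkedHashMap<>();        // keys only

    GhostCache(int coldCapacity, int hotCapacity, int ghostCapacity) {
        this.coldCapacity = coldCapacity;
        this.hotCapacity = hotCapacity;
        this.ghostCapacity = ghostCapacity;
    }

    V get(K key) {
        V v = hot.get(key);
        return v != null ? v : cold.get(key);
    }

    void put(K key, V value) {
        if (hot.containsKey(key)) { hot.put(key, value); return; }
        if (cold.containsKey(key)) { cold.put(key, value); return; }
        if (ghost.remove(key) != null) {
            // Key was evicted recently and is seen again: insert directly into the hot set
            hot.put(key, value);
            if (hot.size() > hotCapacity) removeEldest(hot);
            return;
        }
        cold.put(key, value);
        if (cold.size() > coldCapacity) {
            // Remember only the key of the evicted entry
            ghost.put(removeEldest(cold), Boolean.TRUE);
            if (ghost.size() > ghostCapacity) removeEldest(ghost);
        }
    }

    boolean isHot(K key) {
        return hot.containsKey(key);
    }

    /** Removes and returns the first key in the map's iteration order. */
    private static <A, B> A removeEldest(LinkedHashMap<A, B> map) {
        Iterator<A> it = map.keySet().iterator();
        A eldest = it.next();
        it.remove();
        return eldest;
    }
}
```

In this sketch a scan of new keys only churns the cold set, so entries already in the hot set stay cached.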

Clock-Pro+
[Figure: Clock-Pro+. Entries carry hit counters (0 to 5 hits in the example); there are two hands, a hot hand and a cold hand.]

Clock-Pro+ Evaluation
– Only an inexpensive operation on access, no exclusive access needed
– Better efficiency than LRU for most analyzed workloads
– Downside
● Eviction overhead increases as achievable hitrates get high (e.g. 3 entries scanned per eviction at 50% hitrate, 10 entries scanned at 95%)
● High complexity, no straightforward by-the-book implementation, lots of tuning needed (and possible)
– Still missing:
● Optimal selection of cold / hot space sizes

Benchmarks
flickr:bantam10

Benchmark Setup
● Cache implementations:
– cache2k Version 0.21 (to be released next week)
– EHCache Version 2.9.0
– Guava 18
– Infinispan 7.1.0.CR2
● Oracle JRE 1.8-25
● Hardware
– Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

Test workload
– Keys and values are integers
– Read through configuration, the cache source just returns the key
– Not a practical workload: it emphasizes the caching overhead
// run the benchmark
Integer[] trace = ….
for (Integer v : trace) {
    cache.get(v);
}
// implementation of the cache source: count the miss, return the key as value
public Integer get(Integer o) {
    incrementMissCount();
    return o;
}
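The workload can be turned into a small self-contained harness, sketched here under some assumptions: the read-through LRU cache is a stand-in built on LinkedHashMap (not one of the benchmarked products), and the names TraceBenchmark, CountingSource, ReadThroughLru and hitRatePercent are made up for illustration. The hitrate follows from the miss counter as (requests - misses) / requests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative trace replay: counts misses in the source and derives the hitrate. */
class TraceBenchmark {

    /** Cache source as on the slide: counts the miss and returns the key. */
    static final class CountingSource {
        long missCount;

        Integer get(Integer key) {
            missCount++;
            return key;
        }
    }

    /** Stand-in read-through LRU cache (assumption, not a benchmarked product). */
    static final class ReadThroughLru {
        private final CountingSource source;
        private final LinkedHashMap<Integer, Integer> map;

        ReadThroughLru(final int capacity, CountingSource source) {
            this.source = source;
            this.map = new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Integer> e) {
                    return size() > capacity;
                }
            };
        }

        Integer get(Integer key) {
            Integer v = map.get(key);
            if (v == null) {               // miss: read through to the source
                v = source.get(key);
                map.put(key, v);
            }
            return v;
        }
    }

    /** Replays the trace and returns the hitrate in percent. */
    static double hitRatePercent(Integer[] trace, int capacity) {
        CountingSource source = new CountingSource();
        ReadThroughLru cache = new ReadThroughLru(capacity, source);
        for (Integer v : trace) {
            cache.get(v);
        }
        return 100.0 * (trace.length - source.missCount) / trace.length;
    }

    public static void main(String[] args) {
        // "Hits"-style trace: 500 different values, repeated
        Integer[] trace = new Integer[3_000_000];
        for (int i = 0; i < trace.length; i++) {
            trace[i] = i % 500;
        }
        long start = System.nanoTime();
        double hitRate = hitRatePercent(trace, 500);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("hitrate " + hitRate + "%, runtime " + millis + " ms");
    }
}
```

A single timed pass like this ignores JIT warm-up, so the printed runtime is only a rough indication.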

Runtime for artificial traces
3 million requests on a cache with capacity 500
Except Hits2000: cache with capacity 2000
Hits: repeat 500 different values
Random: random selection from 1000 values
Eff90 / Eff95: random trace with approx. 90% and 95% hitrate on LRU
[Chart: Runtime of 3 million cache requests. Y axis: runtime in seconds (0 to 6). Series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava.]
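The two simplest traces, Hits and Random, could be generated as sketched below. The class and method names are hypothetical, and the fixed seed is an assumption added for repeatability. How the Eff90 / Eff95 traces were constructed is not stated here, so they are left out.

```java
import java.util.Random;

/** Illustrative generators for the "Hits" and "Random" traces. */
class Traces {

    /** Cycles through 'distinct' different values, e.g. 0..499 repeated. */
    static Integer[] hits(int length, int distinct) {
        Integer[] trace = new Integer[length];
        for (int i = 0; i < length; i++) {
            trace[i] = i % distinct;
        }
        return trace;
    }

    /** Uniformly random selection from 'distinct' values, e.g. 1000. */
    static Integer[] random(int length, int distinct, long seed) {
        Random rnd = new Random(seed);     // fixed seed: same trace on every run
        Integer[] trace = new Integer[length];
        for (int i = 0; i < length; i++) {
            trace[i] = rnd.nextInt(distinct);
        }
        return trace;
    }
}
```

With uniformly random keys, no eviction policy can predict the next request, so for 1000 values and a capacity of 500 the hitrate settles near 50% for every implementation; such a trace mainly measures overhead.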

Runtime for mostly hits
[Chart: Runtime of 3 million cache hits. Y axis: runtime in seconds (0 to 2). Series: HashMap+Counter, cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava.]
The first four times for Hits: 20 ms, 50 ms, 50 ms, 70 ms

Runtime with two threads
[Chart: 3 million cache requests Eff95 per thread count. Y axis: runtime in seconds (0 to 2.5). Series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava.]
Some CPU consuming computation is done on cache miss
Eff95Threads2: same trace executed in a separate thread with index offset
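Replaying one trace from several threads with an index offset might look like the sketch below. Everything specific is an assumption: the class name TwoThreadReplay, the offset of trace.length / threadCount per thread, and the unbounded ConcurrentHashMap used as a stand-in for the cache under test. The CPU consuming computation on a miss is omitted.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative multi-threaded replay of one trace with a per-thread index offset. */
class TwoThreadReplay {

    /** Replays the trace in each thread and returns the total number of requests. */
    static long replay(final Integer[] trace, final int threadCount) {
        // Unbounded stand-in for the cache under test (assumption)
        final ConcurrentHashMap<Integer, Integer> cache = new ConcurrentHashMap<>();
        final AtomicLong requests = new AtomicLong();
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            // Each thread starts at a different position in the same trace
            final int offset = t * (trace.length / threadCount);
            workers[t] = new Thread(() -> {
                long count = 0;
                for (int i = 0; i < trace.length; i++) {
                    Integer key = trace[(i + offset) % trace.length];
                    cache.computeIfAbsent(key, k -> k);   // read through: source returns the key
                    count++;
                }
                requests.addAndGet(count);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return requests.get();
    }
}
```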

Hitrate comparison - Artificial traces
[Chart: Hitrate of 3 million cache requests. Y axis: 0 to 100 (labelled "runtime in seconds" on the slide). Series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava.]

Multi2 trace
[Chart: Hitrates for Multi2 trace. Y axis: 0 to 80. Series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND.]

Hitrates comparison - Web12 trace
[Chart: Hitrates for Web12 trace. Y axis: 0 to 90. Series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND.]

Sprite trace
[Chart: Hitrates for Sprite trace. Y axis: 0 to 100. Series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND.]

Take away
● The goal:
– Eviction algorithm doing better than LRU
– Self tuning / adapting
– Minimal overhead on cache access
Clock-Pro+ comes close

Get involved...
● Try it: cache2k is on Maven Central
● Source on GitHub:
● http://github.com/headissue/cache2k
● http://github.com/headissue/cache2k-benchmarks
● Ask questions on Stack Overflow!

Thanks & Enjoy Life!
http://cruftex.net http://cache2k.org
