KeyValue Stores
 Jedi Master Edition
Who?
Antonio Garrote
@antoniogarrote



Mauro Pompilio
@malditogeek



Pablo Delgado
@pablete
Agenda
•Why?
•Definitions
•CouchDB
•Redis
•Cassandra
•Ruby Libraries
•Demo application
•Data modeling
•Benchmark
Why?
•Scalability
•Availability
•Fault Tolerance
•Schema-free
•Ease of use
•Performance
•Elasticity
•blah blah blah
NO
silver bullet!
NoSQL != NoSQL
 No SQL → Not Only SQL
Taxonomy
•Key-value stores:
Redis, Voldemort, Cassandra
•Column-oriented datastores:
Cassandra, HBase
•Document collection databases:
CouchDB, MongoDB
•Graph database:
Neo4J, AllegroGraph
•Data structure store:
Redis
CouchDB
   relax!
 •Damien Katz
 •Erlang - OTP compliant
 •schema-less documents
 •high availability
 •completely distributed
 •made for the web
CouchDB


B-Trees . MapReduce . MVCC
Ruby Libraries
•CouchDB

 •Pure: net/http + JSON implementation

 •Thin wrapper: Couchrest
 http://github.com/jchris/couchrest


 •ORM/ActiveRecord: ActiveCouch,
 CouchObject, RelaxDB ..etc
 http://github.com/arunthampi/activecouch
 http://github.com/paulcarey/relaxdb
CouchDB
•Rocks
  •Simplicity and elegance
  •Much more than a DB
  •New possibilities for web apps

•Sucks
  •Speed
  •Speed
  •Speed
Redis
       il meglio d'Italia (the best of Italy)

classy as a Giulietta, tasty as a pizza
Redis
•Salvatore 'antirez' Sanfilippo
•ANSI C - POSIX compliant

•MemCache-like (on steroids)
•Data structures store:
  •strings
  •counters
  •lists
  •sets + sorted sets (>= 1.1)
Ruby Libraries
•Redis

  •Client: redis-rb
  http://github.com/ezmobius/redis-rb


  •Hash/Object mapper: Ohm
  http://github.com/soveran/ohm


  •ORM: RedisRecord
  http://github.com/malditogeek/redisrecord
Redis
require 'redis'
redis = Redis.new

# Strings
redis['foo'] = 'bar' # => 'bar'
redis['foo']         # => 'bar'

# Expirations
redis.expire('foo', 5) # will expire existing key 'foo' in 5 sec
redis.set('foo', 'bar', 5) # set 'foo' with 5 sec expiration

# Counters
redis.incr('counter')     # => 1
redis.incr('counter', 10) # => 11
redis.decr('counter')     # => 10
Redis
# Lists
%w(1st 2nd 3rd).each { |item| redis.push_tail('logs', item) }
redis.list_range('logs', 0, -1) # => ["1st", "2nd", "3rd"]
redis.pop_head('logs')          # => "1st"
redis.pop_tail('logs')          # => "3rd"


# Sets
%w(one two).each { |item| redis.set_add('foo-tags', item) }
%w(two three).each { |item| redis.set_add('bar-tags', item) }
redis.set_intersect('foo-tags', 'bar-tags') # => ["two"]
redis.set_union('foo-tags', 'bar-tags')     # => ["three", "two", "one"]
Redis
•Rocks
  •Speed, in memory dataset
  •Asynchronous, non-blocking persistence
  •Non-blocking replication
  •Data structures with atomic operations
  •Ease of use and deployment
•Sucks
  •Sharding (client-side only at the moment)
  •Datasets > RAM
  •Very frequent code updates (?)
Redis
Upcoming coolness...


   •1.1
          •Sorted sets (ZSET), append-only journaling
   •1.2
          •HASH type, JSON dump tool
   •1.3
          •Virtual memory (datasets > RAM)
   •1.4
          •Redis-cluster proxy: consistent hashing and
          fault-tolerant nodes
   •1.5
          •Optimizations, UDP GET/SET
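
Consistent hashing, mentioned for the planned cluster proxy, can be sketched on the client side. This is a minimal illustration, not the Redis-cluster design: the `HashRing` class, the node names and the replica count are all assumptions made for the example.

```ruby
require 'digest/md5'

# Hypothetical client-side ring; not the Redis-cluster proxy itself.
class HashRing
  def initialize(nodes, replicas = 64)
    @replicas = replicas
    @ring = {} # ring position => node name
    nodes.each { |node| add(node) }
  end

  # Each node is placed on the ring at several pseudo-random positions.
  def add(node)
    @replicas.times { |i| @ring[position("#{node}:#{i}")] = node }
    @sorted = @ring.keys.sort
  end

  def remove(node)
    @ring.delete_if { |_, owner| owner == node }
    @sorted = @ring.keys.sort
  end

  # First ring position clockwise from the key's hash, wrapping around.
  def node_for(key)
    pos = position(key)
    slot = @sorted.bsearch { |p| p >= pos } || @sorted.first
    @ring[slot]
  end

  private

  # 32-bit position derived from an MD5 digest.
  def position(str)
    Digest::MD5.hexdigest(str)[0, 8].to_i(16)
  end
end

ring = HashRing.new(%w[redis-a redis-b redis-c])
keys = (1..1000).map { |i| "Post:#{i}:title" }
before = keys.map { |k| [k, ring.node_for(k)] }.to_h

ring.remove('redis-c')
moved = keys.count { |k| before[k] != ring.node_for(k) }
moved_from_survivors = keys.count do |k|
  before[k] != 'redis-c' && before[k] != ring.node_for(k)
end
```

When a node leaves, only the keys it owned are remapped; keys on the surviving nodes stay where they were, which is the property that makes the scheme attractive for sharding.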
Cassandra

BigTable (by Google) + Dynamo (by Amazon)
Cassandra
Structured storage system over a P2P network


             •Developed at Facebook
             •Java

             •Dynamo: partition and
             replication
             •Bigtable: Log-structured
             ColumnFamily data model
Ruby Libraries
•Cassandra

  •Client: cassandra
  http://github.com/fauna/cassandra


  •ORM: cassandra_object
  http://github.com/NZKoz/cassandra_object


  •ORM: BigRecord
  http://github.com/openplaces/bigrecord
Cassandra
•Rocks
  •High Availability
  •Incremental Scalability
  •Minimal Administration
  •No Single Point of Failure
•Sucks
  •Thrift API (...not so bad)
  •Schema changes require a server restart
  •The Logo
Demo Application




http://github.com/antoniogarrote/conf_rails_hispana_2009
Data Modeling
•Class mapping
•ID generation
•Relationships
   •one-to-one
   •one-to-many
   •many-to-many
•Index sorting
•Pagination
•Data filtering
Cassandra
•Class mapping
    • ColumnFamily :Blog, :Post

•ID generation
  •UUID.new(Time.now)

•Relationships
  •Use ColumnFamily :PostsforUser to
  hold all posts that belong to a user
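
The three ideas above can be sketched with nested hashes standing in for the keyspace. This is an in-memory illustration only: `STORE`, `time_id`, `insert`, `create_post` and `posts_for` are hypothetical names, and `time_id` is a simplified stand-in for a time-based UUID.

```ruby
require 'securerandom'

# In-memory stand-in for a keyspace:
# ColumnFamily => row key => { column name => value }.
# Hypothetical sketch; a real client would talk to Cassandra over Thrift.
STORE = Hash.new { |cfs, cf| cfs[cf] = Hash.new { |rows, key| rows[key] = {} } }

# Time-prefixed ID so that IDs sort by creation time
# (a simplified stand-in for UUID.new(Time.now)).
def time_id(time = Time.now)
  format('%016x-%s', (time.to_f * 1_000_000).to_i, SecureRandom.hex(4))
end

def insert(column_family, key, columns)
  STORE[column_family][key].merge!(columns)
end

def create_post(user_id, title, time = Time.now)
  id = time_id(time)
  insert(:Post, id, 'title' => title, 'user_id' => user_id)
  # One row per user; each column name is a post ID, so columns sort by time.
  insert(:PostsforUser, user_id, id => title)
  id
end

def posts_for(user_id)
  STORE[:PostsforUser][user_id].keys.sort.map { |id| STORE[:Post][id] }
end

t = Time.at(1_258_000_000)
create_post('pablo', 'second post', t + 60)
create_post('pablo', 'first post', t)
create_post('mauro', 'other post', t + 30)
titles = posts_for('pablo').map { |post| post['title'] }
```

The relationship row duplicates the post IDs under the user's key, so reading a user's posts needs one row lookup instead of a scan over every post.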
Cassandra
•Index sorting
  •Columns within a ColumnFamily are stored in
  sorted order. Keys are also sorted (if
  OrderPreservingPartitioner)
•Pagination
  •for keys: get_range(start, finish, count)
  •for columns: get_slice(start, finish, count)
•Data filtering
  •Use get_range/get_slice and play around with
  start/finish
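
Paging with start/finish/count can be sketched against a plain hash. The `get_slice` below is an in-memory imitation written for this example, not the client's method, and `each_page` is a hypothetical helper.

```ruby
# Hypothetical in-memory version of a slice over sorted column names;
# the real get_slice call goes through the Cassandra client.
# An empty string for start or finish means "unbounded" on that side.
def get_slice(columns, start, finish, count)
  names = columns.keys.sort
  names = names.select { |n| n >= start } unless start.empty?
  names = names.select { |n| n <= finish } unless finish.empty?
  names.first(count).map { |n| [n, columns[n]] }
end

# Page through a row by fetching one extra column per slice and
# using it as the start of the next slice.
def each_page(columns, page_size)
  start = ''
  pages = []
  loop do
    slice = get_slice(columns, start, '', page_size + 1)
    pages << slice.first(page_size)
    break if slice.size <= page_size

    start = slice.last.first # the extra column becomes the next start
  end
  pages
end

row = { 'c' => 3, 'a' => 1, 'e' => 5, 'b' => 2, 'd' => 4 }
pages = each_page(row, 2)
filtered = get_slice(row, 'b', 'd', 10).map(&:first)
```

Because columns are kept sorted, filtering is the same operation as paging: narrowing `start` and `finish` selects a contiguous range.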
Redis
•Class mapping
  • Namespaced keys: 'Post:5:title'

•ID generation
  •Redis counters: incr('Post:ids')

•Relationships
  •Redis lists: push_tail('Post:5:_rating_ids', 4)
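
A minimal sketch of this mapping follows. To keep it runnable without a server it uses `FakeRedis`, a tiny in-memory stand-in written for this example that mimics only the calls named on the slide; the `create` helper is also hypothetical.

```ruby
# Tiny in-memory stand-in exposing only the calls used on the slide
# (an assumption for illustration; swap in a real client to hit a server).
class FakeRedis
  def initialize
    @data = {}
  end

  def []=(key, value)
    @data[key] = value.to_s
  end

  def [](key)
    @data[key]
  end

  def incr(key)
    @data[key] = (@data[key].to_i + 1).to_s
    @data[key].to_i
  end

  def push_tail(key, value)
    (@data[key] ||= []) << value.to_s
  end

  def list_range(key, first, last)
    (@data[key] || [])[first..last] || []
  end
end

# Hypothetical helper: one key per attribute, namespaced as Class:id:attribute,
# with the ID taken from a per-class counter.
def create(redis, klass, attrs)
  id = redis.incr("#{klass}:ids")
  attrs.each { |name, value| redis["#{klass}:#{id}:#{name}"] = value }
  id
end

redis = FakeRedis.new
post_id   = create(redis, 'Post', title: 'Hello', body: 'First!')
rating_id = create(redis, 'Rating', stars: 4)

# One-to-many: the post keeps a list of the IDs of its ratings.
redis.push_tail("Post:#{post_id}:_rating_ids", rating_id)

stars = redis.list_range("Post:#{post_id}:_rating_ids", 0, -1)
             .map { |id| redis["Rating:#{id}:stars"] }
```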
Redis
•Index sorting
   •Redis sort:
      •sort 'Post:list', by 'Post:*:score', get
      'Post:*:id'


•Pagination
   •Redis lists: list_range('Post:list', 0, 9)

•Data filtering
  •Lookups: 'Post:permalink:fifth_post' => 5
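
The sort, pagination and lookup patterns can be imitated with a plain hash. This is a rough emulation for illustration: `sort_by_pattern` and `page_bounds` are hypothetical helpers, and the sample scores are made up.

```ruby
# Plain Hash standing in for the Redis keyspace (illustration only).
db = {
  'Post:list' => %w[1 2 3 4],
  'Post:1:id' => '1', 'Post:1:score' => '10',
  'Post:2:id' => '2', 'Post:2:score' => '40',
  'Post:3:id' => '3', 'Post:3:score' => '20',
  'Post:4:id' => '4', 'Post:4:score' => '30',
  'Post:permalink:fifth_post' => '5'
}

# Rough emulation of SORT key BY pattern GET pattern: the '*' in each
# pattern is replaced by the list element before the key is looked up.
def sort_by_pattern(db, list_key, by:, get:)
  db[list_key]
    .sort_by { |element| db[by.sub('*', element)].to_f }
    .map { |element| db[get.sub('*', element)] }
end

# Page n (1-based) of a list maps to an inclusive index range.
def page_bounds(page, per_page)
  first = (page - 1) * per_page
  [first, first + per_page - 1]
end

sorted_ids  = sort_by_pattern(db, 'Post:list', by: 'Post:*:score', get: 'Post:*:id')
second_page = page_bounds(2, 10)
post_id     = db['Post:permalink:fifth_post'] # lookup key maps a permalink to an ID
```

Filtering by an attribute is done by maintaining a dedicated lookup key per value, since the store has no query language to search inside values.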
CouchDB
•Type attribute in each document
•CouchDB automatic ID generation
•Related document IDs in the
attributes
•Views with complex keys
•Special attributes for view functions
CouchDB
   View: relation_blog_posts

function(doc){
   if(doc.type=="post"){
       emit([doc.blog_id,
             doc.created_at],
             doc);
   }
}
CouchDB
    View: relation_blog_posts


GET /db/design_doc/relation_blog_posts?startkey=[blog_1]
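
How a view with a complex key answers "all posts of a blog, by date" can be sketched in Ruby. This is an in-memory imitation of the map function and a key-range query, not CouchDB itself; the sample documents, the `query` helper and the `HIGH` sentinel are assumptions.

```ruby
docs = [
  { 'type' => 'post', 'blog_id' => 'blog_1', 'created_at' => '2009-11-02', 'title' => 'B' },
  { 'type' => 'post', 'blog_id' => 'blog_2', 'created_at' => '2009-11-01', 'title' => 'X' },
  { 'type' => 'post', 'blog_id' => 'blog_1', 'created_at' => '2009-11-01', 'title' => 'A' },
  { 'type' => 'blog', 'name' => 'blog_1' }
]

# Ruby counterpart of the map function: emit [blog_id, created_at] => doc
# for every document of type "post", then keep the rows sorted by key.
def relation_blog_posts(docs)
  rows = docs.select { |doc| doc['type'] == 'post' }
             .map { |doc| { key: [doc['blog_id'], doc['created_at']], value: doc } }
  rows.sort_by { |row| row[:key] }
end

# Rows whose key falls within [startkey, endkey].
# Array keys compare element by element, shorter arrays first.
def query(rows, startkey, endkey)
  rows.select do |row|
    (row[:key] <=> startkey) >= 0 && (row[:key] <=> endkey) <= 0
  end
end

HIGH = "\u{fff0}" # sentinel that sorts after any ordinary string

rows = query(relation_blog_posts(docs), ['blog_1'], ['blog_1', HIGH])
titles = rows.map { |row| row[:value]['title'] }
```

Putting `blog_id` first in the key groups each blog's posts together, and `created_at` second keeps them in date order inside that group, so one range query returns a sorted relationship.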
VPork
•Utility for load-testing a distributed hash table.
•Allows you to test raw throughput via
concurrent read/write operations.
•Hardware:
   •2 x commodity servers: CoreDuo 2.5 GHz, 4 GB RAM,
   7200 RPM disks
   •CouchDB: 2 instances, round-robin balanced
   •Cassandra: 2 instances
   •Redis: 1 instance

http://github.com/antoniogarrote/vpork
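
What "read probability" means for a mixed workload can be sketched as follows. This is a hypothetical single-threaded mini load generator against a Ruby Hash; it does not reproduce VPork's implementation, its concurrency, or the results on the following slides.

```ruby
# Hypothetical mini load generator; VPork itself is a separate tool and
# this does not reproduce its implementation or its results.
# Each operation is a read with probability read_probability, else a write.
def run_workload(store, operations:, read_probability:, rng: Random.new(42))
  reads = writes = 0
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  operations.times do
    key = "key:#{rng.rand(1000)}"
    if rng.rand < read_probability
      store[key]
      reads += 1
    else
      store[key] = rng.bytes(64)
      writes += 1
    end
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { reads: reads, writes: writes, ops_per_sec: operations / elapsed }
end

results = [0.2, 0.5, 0.8].map { |probability|
  [probability, run_workload({}, operations: 10_000, read_probability: probability)]
}.to_h
```

Any object that responds to `[]` and `[]=` can be passed as `store`, so the same loop could be pointed at a thin wrapper around a real client.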
VPork
Throughput with read probability 0.2
VPork
Throughput with read probability 0.5
VPork
Throughput with read probability 0.8
Conclusions
•Complementary to relational solutions
•Each K/V store addresses a different problem
•Best use case:
  •CouchDB: distributed/scalable
  JavaScript-only app (no backend)
  •Cassandra: large volume of writes, no
  SPOF
  •Redis: datasets < RAM, lookups,
  cache, buffers
Credits
•All sponsored products, company names, brand names,
trademarks and logos are the property of their respective
owners.
•Alfa Romeo Giulietta: http://www.flickr.com/photos/mauboi/3296469097/
•Pizza: http://reportingfrombelgium.wordpress.com/2009/05/20/belgian-summer-holidays/
•Sammy: http://www.yuddy.com/celebrity/Sammy-Davis-Jr/bio
•Everything else is from teh internets and is free.
