(Yet-Another-NoSQL-DB) Presented By John Lynch [email_address]
Web App Developers
  • ORM
  • Focus on the App
  • Speed of Development
  • DB Agnostic(ish)
  • Fixed Schema
  • Limits design choices
  • Migration Hell
  • Scaling at DB layer
  • Rails!
Shiny New Toys
  • Decades of research and best practices
  • Awesome ad-hoc query capability
  • Zillions of vendors/tools/libraries/code
  • Flexibility of schema-less design
  • Ability to scale… Web-scale
Web Scale
  • Changing App Types
  • Social Games
  • Marketing / Advertising
  • Freemium Business Models
  • 1M free => 10K paying customers
NoSQL Landscape
  • Pure Key/Value (Redis, Tokyo Cabinet, etc.)
  • Key/Value+ (CouchDB, MongoDB, Riak)
  • BigTable Type (HBase, Hypertable)
  • Choose wisely! No standard API.
  • (A good general overview can be found here: http://cattell.net/datastores/Datastores.pdf)
MongoDB
  • Popular with Ruby community
  • Combines Key/Value with ability to do Indexed Queries
Scaling MongoDB
  • Master, Slave, Replica Set, Replica Pair, Shard Server, Connection Pool, ack!
Scaling MongoDB  
If all you want is NoSQL…
  • NoSQL on MySQL
  • Leverages all MySQL skills, tools, techniques, stability, dependability
If you want NoSQL + Scalability…
  • … not so much.
Riak
  • Developed by Basho.com
  • Used on several large production sites
  • Written in Erlang
  • Distributed – Fault Tolerant
  • Buckets – Keys – Values
  • Values can be anything (JSON, binary, etc.)
  • Ruby & Rails Client (Ripple project @ GitHub)
Riak speaks HTTP
  > curl -i http://host:8098/riak/bucket1/key1
  HTTP/1.1 200 OK
  X-Riak-Vclock: awpcFAA==
  Content-Type: text/plain
  Content-Length: 9
  Last-Modified: Wed, 01 Se…
  Etag: 45364657

  I am a value
  • Leverage existing HTTP infrastructure, tools, etc.
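
The same request can be prepared from Ruby with only the standard library. This is a minimal sketch, assuming the host, port, and `/riak/<bucket>/<key>` path shape from the curl call above; the helper name `riak_object_uri` is hypothetical, and the request is built but not sent, so no running node is needed.

```ruby
require "uri"
require "net/http"

# Hypothetical helper: build the URL for one object, following the
# /riak/<bucket>/<key> path shape used in the curl example.
def riak_object_uri(host, bucket, key, port: 8098)
  URI::HTTP.build(host: host, port: port, path: "/riak/#{bucket}/#{key}")
end

uri = riak_object_uri("host", "bucket1", "key1")
request = Net::HTTP::Get.new(uri) # built here, not sent

puts uri            # http://host:8098/riak/bucket1/key1
puts request.method # GET

# To fetch the value from a running node (not executed in this sketch):
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
#   puts response["X-Riak-Vclock"], response.body
```

Because this is plain HTTP, the same URL works unchanged through a load balancer or cache placed in front of the nodes.
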
Scaling Riak
  [Diagram: five Riak nodes; HTTP Load Balancer; Varnish (cache); Standard HTTP Protocol; five Rails servers]
Scaling Riak (alt)
  [Diagram: five machines, each labeled Riak, Rails, and Nginx; HTTP Load Balancer; Varnish (cache); Standard HTTP Protocol]
Key Differentiator - Distributed
  • Inspired by Amazon’s Dynamo
  • Uses consistent hashing algorithm
  • No “Master Node”
  • No single point of failure
  • Any node can service any request
  • Automatically rebalances as nodes join
  • Tunable CAP Properties: Consistency, Availability, Partition Tolerance
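
Consistent hashing is what lets nodes join without reshuffling every key. The sketch below is a generic toy ring, not Riak's own ring implementation; the class name `HashRing`, the node names, and the 64 points per node are all assumptions made for illustration.

```ruby
require "digest"

# Toy consistent-hash ring. Each node owns several points on the ring;
# a key belongs to the first point at or after the key's own hash.
class HashRing
  def initialize(nodes, points_per_node: 64)
    @points_per_node = points_per_node
    @ring = {} # ring position => node name
    nodes.each { |node| add(node) }
  end

  # Place the node's points on the ring and refresh the sorted positions.
  def add(node)
    @points_per_node.times { |i| @ring[position("#{node}:#{i}")] = node }
    @sorted = @ring.keys.sort
  end

  def node_for(key)
    pos = position(key)
    # Binary search for the first point >= pos; wrap to the start if none.
    owner = @sorted.bsearch { |point| point >= pos } || @sorted.first
    @ring[owner]
  end

  private

  # Interpret the SHA-1 digest as a 160-bit integer position.
  def position(string)
    Digest::SHA1.hexdigest(string).to_i(16)
  end
end

keys = (1..1000).map { |i| "key#{i}" }
ring = HashRing.new(%w[riak1 riak2 riak3 riak4])
before = keys.to_h { |key| [key, ring.node_for(key)] }

ring.add("riak5")
moved = keys.count { |key| ring.node_for(key) != before[key] }
puts "#{moved} of #{keys.size} keys moved after adding a fifth node"
```

Only the keys that now fall on the new node's points change owner; every other key stays where it was, which is the property behind "automatically rebalances as nodes join".
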
N  R  W
  • N = # of copies of the data
  • R = # of nodes necessary to read
  • W = # of nodes necessary to write
  • Tunable by the application, on a per-bucket and per-query basis
Low Value Data (N=2 R=1 W=1)
  • Logging
Web Content (N=4 R=1 W=4)
  • Maximum availability and consistency
Financial Data (N=4 R=1 W=4 DW=4)
  • DW is “Durable Write”
Riak cluster of 4 Physical Computers
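
The arithmetic behind these settings can be worked through in a few lines. This is a sketch under simplifying assumptions: it takes the three N/R/W tuples from the slide, pairs each with the label that precedes it, and treats a request as successful when enough of the N replicas are reachable (it ignores any fallback mechanisms a real cluster may use). The `Profile` struct and its method names are hypothetical.

```ruby
# N/R/W settings taken from the slide; helper names are illustrative only.
Profile = Struct.new(:name, :n, :r, :w) do
  # Replicas that can be unreachable while a read still gets R answers.
  def read_slack
    n - r
  end

  # Replicas that can be unreachable while a write still gets W acks.
  def write_slack
    n - w
  end

  # If R + W > N, any R replicas and any W replicas share at least one member,
  # so a read set always includes a replica that saw the latest write.
  def overlapping?
    r + w > n
  end
end

profiles = [
  Profile.new("Low Value Data", 2, 1, 1),
  Profile.new("Web Content", 4, 1, 4),
  Profile.new("Financial Data", 4, 1, 4)
]

profiles.each do |profile|
  puts format("%-15s reads survive %d down, writes survive %d down, overlap: %s",
              profile.name, profile.read_slack, profile.write_slack, profile.overlapping?)
end
```

With N=2, R=1, W=1 a read and a write may land on different replicas (1 + 1 is not greater than 2). With N=4, R=1, W=4 they always overlap, but a write needs all four replicas to answer.
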
Network Split
Network Split
Network Split
Map/Reduce
  • Map steps run on each node
  • Final reduce runs on single node

  results = Riak::MapReduce.new(client).
    add("albums").
    map("function(v){ return [JSON.parse(v.values[0].data).title]; }", :keep => true).run
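
The data flow can be imitated in plain Ruby without a cluster. This is a sketch only: the `nodes` hash stands in for data spread across machines, the album records are sample data, the sorting reduce step is added for illustration (the query above has only a map phase), and `run_map_reduce` is a hypothetical helper, not part of the Riak client.

```ruby
require "json"

# Run map_fn over every value held by every "node", then hand all of the
# map output to reduce_fn exactly once, as a single final step.
def run_map_reduce(nodes, map_fn, reduce_fn)
  mapped = nodes.values.flat_map do |local_values|
    local_values.flat_map { |raw| map_fn.call(raw) }
  end
  reduce_fn.call(mapped)
end

# Stand-in for a cluster: each node holds some raw JSON values.
nodes = {
  "riak1" => ['{"title":"Abbey Road","year":1969}'],
  "riak2" => ['{"title":"Revolver","year":1966}', '{"title":"Help!","year":1965}'],
  "riak3" => []
}

# Map: mirrors the JavaScript function above, returning a one-element list.
map_title = ->(raw) { [JSON.parse(raw)["title"]] }

# Reduce: illustrative final step that sorts the collected titles.
sort_titles = ->(titles) { titles.sort }

results = run_map_reduce(nodes, map_title, sort_titles)
puts results.inspect # ["Abbey Road", "Help!", "Revolver"]
```
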
Links
  • Riak documents can have links to other documents; each link can be “tagged”
  • Link data is separate from doc data
  • Easy URL access to walk these links

  GET /riak/artists/TheBeatles/albums,_,_/tracks,_,1
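
The link-walking path is regular enough to assemble programmatically. The sketch below assumes each path segment after the key has the form `bucket,tag,keep`, with `_` meaning "no restriction"; that reading of the URL above, and the names `LinkStep` and `link_walk_path`, are assumptions made for illustration.

```ruby
# One link-walk step: the bucket and tag to follow, and whether to keep
# that step's results. nil is rendered as "_" (no restriction).
LinkStep = Struct.new(:bucket, :tag, :keep) do
  def to_s
    [bucket, tag, keep].map { |part| part.nil? ? "_" : part.to_s }.join(",")
  end
end

# Join the starting object and the steps into a request path.
def link_walk_path(bucket, key, steps)
  (["/riak", bucket, key] + steps.map(&:to_s)).join("/")
end

path = link_walk_path(
  "artists", "TheBeatles",
  [LinkStep.new("albums", nil, nil), LinkStep.new("tracks", nil, 1)]
)
puts path # /riak/artists/TheBeatles/albums,_,_/tracks,_,1
```
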
When NOT to use Riak
  • Single machine
  • Small scale or bog-standard apps
  • Need rich ad-hoc indexed queries
  • Need mature tools and libraries
Any questions? (First round at Rock Bottom generously sponsored by Basho.com)

Rolling With Riak
