Simple, Available Cloud Storage
For Cloudstack
Overview
On March 27, 2012, Basho announced a new product called Riak CS
On September 5, 2012, Basho joined Apache CloudStack
On March 20, 2013, Riak CS became open source
Riak CS is...
enterprise cloud storage built on top of Riak, offering
     • S3-compatibility
     • multi-tenancy
     • per user reporting
     • large object storage
Enabling you to host your own
PUBLIC & PRIVATE CLOUDS
or...
Reliable Storage Behind Apps
Basho's Commits

@john_burwell's contribution:
S3-backed secondary storage feature in CloudStack 4.1.0
Uses S3 to sync secondary storage across zones


Long term: (shhhhhh!)
Native S3 Support
Federated authentication and authorization
DataPipe
  blog.datapipe.com/datapipe-cloudstack

“Riak CS provides the high-performance, distributed datastore we need to deliver a sound foundation for our cloud storage needs now and for many years into the future”
- Ed Laczynski, VP Cloud Strategy, Datapipe
Yahoo!
“Today, Yahoo! leverages Riak CS Enterprise to offer an S3-compatible public cloud storage service, as well as dedicated hosting options ... Yahoo! is highly supportive of open source software and we view Basho’s (OSS) announcement as a positive move that will work to accelerate its ability to innovate and ultimately strengthen our cloud platform.”
- Shingo Saito, cloud product manager, Yahoo!
About Riak
Riak
   Dynamo-inspired key/value store

   Written in Erlang with C/C++

   Open source under Apache 2 license

   Thousands of production deployments
Riak
  High availability

  Low-latency

  Horizontal scalability

  Fault-tolerance

  Ops friendliness
Riak
   Masterless
    • No master/slave or different roles
    • All nodes are equal
    • Write availability and scalability
    • All nodes can accept/route requests
Riak
   No Sharding
     • Consistent hashing
     • Prevents “hot spots”
     • Lowers operational burden of scale
     • Data rebalanced automatically
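The consistent-hashing idea above can be sketched in a few lines of Python. This is a toy illustration, not Riak's actual code: the partition count of 64, the `bucket/key` string encoding, and the round-robin ownership rule are all assumptions made for the example.

```python
import hashlib

RING_SIZE = 2 ** 160      # size of the SHA-1 output space
NUM_PARTITIONS = 64       # assumed partition count for this sketch


def partition_for(bucket: str, key: str) -> int:
    """Map a bucket/key pair to one of NUM_PARTITIONS equal slices of the ring."""
    # Assumption: hash a simple "bucket/key" string; a real system may encode the pair differently.
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_SIZE // NUM_PARTITIONS)


def owner_for(partition: int, nodes: list) -> str:
    """Assign partitions to nodes round-robin (a simplification of ring ownership)."""
    return nodes[partition % len(nodes)]
```

Because the hash spreads keys evenly over the partitions, and partitions are spread evenly over nodes, no single node becomes a hot spot and no manual shard map is needed.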
Riak
  Availability and Fault-Tolerance
    • Automatically replicates data
    • Read and write data during hardware
       failure and network partition
    • Hinted handoff
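A toy model of hinted handoff, to make the bullet above concrete. The `Node` class and the `write` and `handoff` functions are hypothetical names for this sketch; it only models the bookkeeping, not Riak's implementation.

```python
class Node:
    """A minimal stand-in for a storage node."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value for writes this node owns
        self.hints = {}   # intended owner name -> {key: value} held on its behalf


def write(key, value, primaries, fallbacks):
    """Write to every primary; if one is down, park the write on a fallback with a hint.

    Assumes there are at least as many fallbacks as unavailable primaries.
    """
    spare = iter(fallbacks)
    for node in primaries:
        if node.up:
            node.data[key] = value
        else:
            fallback = next(spare)
            fallback.hints.setdefault(node.name, {})[key] = value


def handoff(fallback, recovered):
    """Replay hinted writes to the recovered owner and drop the hints."""
    pending = fallback.hints.pop(recovered.name, {})
    recovered.data.update(pending)
    return len(pending)
```

The write succeeds even while a primary is down, and the data finds its way home once that node returns.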
How It Works
Riak CS
Stanchion
Riak
1 Riak CS node for every node of Riak
Large Object
   1. User uploads an object
   2. Riak CS breaks object into 1 MB chunks
   3. Riak CS streams chunks to Riak nodes
   4. Riak replicates and stores chunks
[Diagram: five Riak CS nodes, each exposing an S3 API and a Reporting API, sit in front of a cluster of five Riak nodes; the uploaded object is drawn as a row of 1 MB chunks.]
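Steps 2 and 3 can be sketched as a small streaming loop. This is a minimal illustration, not Riak CS code: `iter_chunks`, `store_object`, and the `put_block` callback are hypothetical names, and 1 MB is taken to mean 1024 * 1024 bytes.

```python
from typing import BinaryIO, Callable, Iterator, Tuple

CHUNK_SIZE = 1024 * 1024  # 1 MB, the chunk size named on the slide


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, block) pairs without loading the whole object into memory."""
    index = 0
    while True:
        block = stream.read(chunk_size)
        if not block:          # empty read means end of stream
            break
        yield index, block
        index += 1


def store_object(stream: BinaryIO, put_block: Callable[[int, bytes], None]) -> int:
    """Send each chunk to the store as it is read; return the number of chunks."""
    count = 0
    for index, block in iter_chunks(stream):
        put_block(index, block)   # hypothetical callback standing in for a write to Riak
        count += 1
    return count
```

Streaming chunk by chunk keeps memory use flat regardless of object size, and small blocks are far easier to move around a cluster than one multi-gigabyte value.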
BASIC CONCEPTS
USERS
multi-tenancy: Riak CS will track individual usage/stats
users identified by access_key
users authenticated by secret_key
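How the two keys work together can be sketched as follows, assuming the S3 signature version 2 scheme (HMAC-SHA1 over a canonical string). The function names are made up for this example, and the sketch covers only requests without `x-amz-*` headers.

```python
import base64
import hashlib
import hmac


def string_to_sign(method, resource, date, content_md5="", content_type=""):
    """Build the v2 string-to-sign for a request that carries no x-amz-* headers."""
    return "\n".join([method, content_md5, content_type, date, resource])


def authorization_header(access_key, secret_key, method, resource, date,
                         content_md5="", content_type=""):
    """Return an Authorization header: access_key identifies, secret_key signs."""
    to_sign = string_to_sign(method, resource, date, content_md5, content_type)
    digest = hmac.new(secret_key.encode("utf-8"),
                      to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"AWS {access_key}:{signature}"
```

The secret_key never travels over the wire: the server looks up the user by access_key, recomputes the signature with its stored copy of the secret, and compares.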
BASIC CONCEPTS
BUCKETS
users create buckets.
buckets are like folders.
store objects in buckets.
names are globally unique.
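Global uniqueness means bucket creation has to be decided in one place. A toy first-creator-wins registry shows the idea; `BucketRegistry` is a hypothetical class for this sketch and says nothing about how Stanchion is actually built.

```python
import threading


class BucketRegistry:
    """Serialize bucket creation so each name has exactly one owner."""

    def __init__(self):
        self._owners = {}
        self._lock = threading.Lock()

    def create(self, name: str, access_key: str) -> bool:
        """Return True if access_key owns the bucket after the call."""
        with self._lock:
            # setdefault keeps the first owner and ignores later claims
            owner = self._owners.setdefault(name, access_key)
            return owner == access_key

    def owner(self, name: str):
        """Return the owning access_key, or None if the bucket does not exist."""
        with self._lock:
            return self._owners.get(name)
```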
BASIC CONCEPTS
OBJECTS
stored in buckets.
objects are opaque.
store any file type.
Features
Riak CS
   Large Object Support
     • Started with 5GB / object
     • Now have multipart upload
     • Content agnostic
Riak CS
   S3-Compatible API
     • Use existing S3 libraries and tools
     • RESTful operations
     • Multipart upload
     • S3-style ACLs for object/bucket
       permissions
     • S3 authentication scheme
Riak CS
   Administration and Users
     • Interface for user creation, deletion,
       and credentials
     • Configure so only admins can create
       users
Riak CS
   New Stuff in Riak CS 1.3
    • Multipart upload: parts between 5MB
       and 5GB
    • Support for GET range queries
    • Restrict access to buckets based on
       source IP
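A client has to pick part boundaries that respect the multipart limits above. A minimal sketch, assuming the limits are binary units (MiB, GiB) and that, as in S3, the final part may be smaller than the minimum; `plan_parts` is a hypothetical helper.

```python
MIN_PART = 5 * 1024 ** 2   # 5 MB lower bound per part (assumed binary units)
MAX_PART = 5 * 1024 ** 3   # 5 GB upper bound per part


def plan_parts(total_size: int, part_size: int = 16 * 1024 ** 2):
    """Return a list of (part_number, offset, length) tuples covering the object."""
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part_size must be between 5 MB and 5 GB")
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    parts = []
    offset = 0
    number = 1                 # part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)   # last part may be short
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

For a 40 MB object with the default 16 MB part size this yields three parts of 16 MB, 16 MB, and 8 MB.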
Riak CS
   Packages
     • Debian
     • Ubuntu
     • FreeBSD
     • Mac
     • Red Hat Enterprise
     • Fedora
     • SmartOS
     • Solaris
     • Source
Operations
built-in stats & DTrace support
     • track access & storage per user
     • inspect ops with DTrace probes
     • monitor total cluster ops
OPERATIONAL STATS
       exposed via HTTP resource: /riak-cs/stats



HISTOGRAMS & COUNTERS
     • block: GET, PUT, DELETE
     • bucket: LIST KEYS, CREATE, DELETE, GET/PUT ACL
     • object: GET, PUT, DELETE, HEAD, GET/PUT ACL
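Reading the stats resource is a plain HTTP GET. A minimal sketch under several assumptions: the port default of 8080, a JSON response body, and an endpoint reachable without request signing are all guesses for illustration, and a real cluster may require a signed admin request. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def stats_url(host: str, port: int = 8080, scheme: str = "http") -> str:
    """Build the URL of the stats resource named on the slide."""
    return f"{scheme}://{host}:{port}/riak-cs/stats"


def fetch_stats(url: str, opener=urlopen) -> dict:
    """GET the stats resource and decode it (assumed to be JSON).

    `opener` is injectable so the function can be exercised without a live cluster.
    """
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage against a local node would look like `fetch_stats(stats_url("127.0.0.1"))`.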
THE “USAGE” BUCKET
TRACK INDIVIDUAL USERS’ ACCESS & STORAGE
QUERY USAGE STATS

 Storage and access statistics tracked on
 per-user basis, as rollups for slices of time

 • Access: Operations, Count, BytesIn, BytesOut, + system and user error
 • Storage: Objects, Bytes
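The speaker notes for this slide mention a limit of 31 days per usage query. A longer reporting period can be covered by splitting it into windows, sketched here with a hypothetical `usage_windows` helper; the half-open window convention is an assumption.

```python
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(days=31)   # limit taken from the speaker notes


def usage_windows(start: datetime, end: datetime, max_window: timedelta = MAX_WINDOW):
    """Split [start, end) into consecutive windows no longer than max_window."""
    if end <= start:
        raise ValueError("end must be after start")
    windows = []
    cursor = start
    while cursor < end:
        stop = min(cursor + max_window, end)
        windows.append((cursor, stop))
        cursor = stop
    return windows
```

Each window can then be queried separately and the per-window rollups summed on the client side.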
Enterprise
Multi-Datacenter Replication

        • For active backups, availability zones,
          disaster recovery, global traffic
        • Real-time or full-sync
        • 24/7 support
        • Per-node or storage-based pricing
SIGN UP FOR AN ENTERPRISE DEVELOPER TRIAL
            basho.com

     http://docs.basho.com/
Riak London
A distributed systems
   meet/drink up

  www.meetup.com/riak-london
github.com/basho
twitter.com/basho
 docs.basho.com
Q&A
 @_stu_

Riak CS in Cloudstack


Editor's Notes

  • #2: Welcome/Intro
  • #3: Here’s the basics
  • #7: Very high level discussion, segue into brief discussion of Riak
  • #8: What you get is a platform on which you can host your own public and private clouds.
  • #9: You can think of Riak in many ways as a distributed filesystem. Riak is awesome because all nodes are equal. It has distribution protocols that allow incredibly straightforward scaling and even balancing of load by tokenizing a huge keyspace and using consistent hashing, etc. This, however, is not conducive to large object storage because of latency and other network limitations when moving files around during topology changes. Also, developing on top of Riak is non-trivial, so interactions with the database can be a pain. Riak CS abstracts away the complexity of those interactions via a simple, S3-compatible API and uses Riak’s inherent functionality to provide a solution for large object storage.
  • #19: A Riak CS stack is composed of 3 critical components. Riak CS exposes an API to the users and is responsible for logging/tracking stats. All the data is stored in and retrieved from Riak. Run multiple instances of Riak and Riak CS for scale. There's a third component, a single instance of a piece of software called Stanchion, that is responsible for tying it all together. Stanchion in essence provides the S3-like behavior at an architectural level, ensures user and bucket uniqueness globally, etc.
  • #20: 1-to-1 pairing, and why.
  • #21: 1. user PUTs object into Riak CS. The request will be via an S3 API and signed by their credentials. 2. once authenticated, object is chunked (remind why this is important) 3. as object is chunked, chunks sent to Riak. (you can use haproxy in the middle here) 4. Riak stores the chunks, yay!
  • #22: Riak CS is multi-tenant. Each user is assigned an access_key and a secret_key. Users are authenticated by the system by signing requests using a combination of both keys. If the keys are valid, the requests will be allowed; else, denied. User details are stored in the “user” bucket, identified by access_key. Furthermore, every user’s activity will be tracked by Riak CS and stored for billing/metering purposes (more later).
  • #23: Objects are stored in buckets. Users can create and remove buckets as well as list their contents. Buckets are essentially a namespace, and are very much like folders. Bucket names must be globally unique, so if two users both try to create a bucket named “kittens”, whoever creates that bucket first will own it, etc.
  • #24: Put objects in buckets. Objects are chunked and replicated, but that all happens behind the scenes and not exposed to the user.
  • #33: Riak CS provides stats on user activity and total cluster operations, and ships with DTrace probes you can use to inspect/debug a live system. So at any given time you can monitor a Riak CS cluster for both expected behavior and anomalies. From an administrative perspective (as mentioned earlier), Riak CS will track each individual user’s activity, so that you can define usage limits and billing policies if necessary.
  • #34: Riak CS, just like Riak, uses Boundary’s Folsom stats library for monitoring cluster operations. These stats start when Riak CS starts and are not persisted to disk. Get stats with an HTTP request to /riak-cs/stats. You’ll get back counters and histograms that track the total number of operations performed on blocks, buckets and objects. For instance, see the total number of GET or PUT operations on objects in the Riak CS cluster. These stats are going to be most useful if you’re trying to diagnose unexpected behavior. Hopefully that’s never the case, but shit happens.
  • #35: Riak CS has a reserved namespace for tracking user activity. This is the “usage” bucket, and it is the foundation for metering and building custom billing policies in Riak CS. Every time a user performs an operation, Riak CS will store this data in an object in the usage bucket identified by that user’s access_key. You can configure the frequency with which these reports are persisted as well as the ability for users to request their own usage statistics.
  • #36: Limitation: usage queries cannot cover periods greater than 31 days
  • #41: Introduce yourself