Radical NoSQL Scalability with Cassandra
@tlberglund
Tuesday, October 16, 12

Data Model for developers

Column
    Key/Value pair:  full_name: “Tim Berglund”
    Timestamp:       20120425T1832


Column
    ‣ Key-value pair
    ‣ Optionally typed
    ‣ Timestamped
    ‣ Fundamental unit
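The column properties listed above can be sketched as a small Python type. This is only an illustration: the class name, the integer timestamps, the second value, and the "later write wins" helper are assumptions made for the example, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    """One column: a name/value pair plus the timestamp of the write."""
    name: str
    value: Any       # "optionally typed": a validator may constrain this
    timestamp: int   # assumption: an integer that grows with time


def newest(a: Column, b: Column) -> Column:
    """Pick between two versions of the same column: the later write wins."""
    return a if a.timestamp >= b.timestamp else b


# Hypothetical data: two writes to the same column name.
old = Column("full_name", "Tim Berglund", 1)
new = Column("full_name", "T. Berglund", 2)
print(newest(old, new).value)
```

Keeping the timestamp inside the column is what lets replicas compare versions of a single value without any other bookkeeping.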



Row
    row key | column | column | column

    Row Key       Columns
    tlberglund    name: Tim    bday: 06-15    role: teacher

    Sorted by UTF8Type comparator:
    tlberglund    bday: 06-15    name: Tim    role: teacher
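A minimal sketch of the ordering shown above, assuming that sorting by the UTF-8 bytes of each column name is a fair stand-in for the UTF8Type comparator. The function name is made up for the example.

```python
from typing import Dict, List, Tuple


def sorted_row(columns: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the columns ordered by the UTF-8 bytes of their names."""
    return sorted(columns.items(), key=lambda kv: kv[0].encode("utf-8"))


row_key = "tlberglund"
columns = {"name": "Tim", "bday": "06-15", "role": "teacher"}

# Columns come back in name order regardless of insertion order.
print(row_key, sorted_row(columns))
# tlberglund [('bday', '06-15'), ('name', 'Tim'), ('role', 'teacher')]
```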




Table
    Outer hash key: the row key
    Inner hash key: the column name

    tim        name: Tim        role: teacher      status: Cool
    kristen    name: Kristen    role: marketing
    billy      role: CEO
    matt       name: Matt       role: founder      status: ubercool

    Sparse
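The "outer hash key, inner hash key" picture maps naturally onto a dictionary of dictionaries. The sketch below uses the rows from the slide; the helper name `get_column` is an assumption for the example.

```python
# Outer key: row key. Inner key: column name. Rows are sparse, so each
# row carries only the columns that were actually written to it.
table = {
    "tim":     {"name": "Tim", "role": "teacher", "status": "Cool"},
    "kristen": {"name": "Kristen", "role": "marketing"},
    "billy":   {"role": "CEO"},
    "matt":    {"name": "Matt", "role": "founder", "status": "ubercool"},
}


def get_column(table, row_key, column_name, default=None):
    """Two hash lookups: first the row, then the column inside it."""
    return table.get(row_key, {}).get(column_name, default)


print(get_column(table, "billy", "role"))    # CEO
print(get_column(table, "billy", "status"))  # None: billy has no such column
```

A missing column costs nothing to store, which is the practical meaning of "sparse" here.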




Database
    Tables:
        Accounts
        ClickStream
        Orders
        InventoryEvents




Cluster
    System Database
    Application Database




Secondary Indexes
    ‣ Ubiquitous in relational databases
    ‣ Supported in Cassandra, with qualifications

Secondary Indexes
    tim        name: Tim        email: tb@a.com    role: teacher
    kristen    name: Kristen    email: k@ds.com
    billy      role: CEO
    matt       name: Matt       email: m@ds.com    status: ubercool




Secondary Indexes
    ‣ In relational databases: performant for high cardinality
    ‣ In Cassandra: the reverse
    ‣ Not suitable for lookup-by-email
    ‣ Suitable for lookup by: region code, gender, state, etc.
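One commonly given explanation for the points above, which the slides do not spell out, is that each node indexes only the rows it stores, so an indexed lookup may have to ask many nodes. The toy model below illustrates that idea only; the `Node` class, the four-node cluster, and the sample accounts are all made up for the example.

```python
from collections import defaultdict


class Node:
    """A toy node that keeps an index over its own rows only."""

    def __init__(self):
        self.index = defaultdict(set)  # (column, value) -> local row keys

    def put(self, row_key, columns):
        for column, value in columns.items():
            self.index[(column, value)].add(row_key)


def lookup(nodes, column, value):
    """Ask every node, because any of them may hold a matching row."""
    hits = set()
    for node in nodes:
        hits |= node.index.get((column, value), set())
    return hits, len(nodes)


# Hypothetical data: 8 accounts spread over 4 nodes.
nodes = [Node() for _ in range(4)]
for i in range(8):
    state = "CO" if i % 2 == 0 else "NY"
    nodes[i % 4].put(f"user{i}", {"email": f"user{i}@example.com", "state": state})

print(lookup(nodes, "state", "CO"))                 # 4 rows for 4 nodes asked
print(lookup(nodes, "email", "user3@example.com"))  # 1 row for 4 nodes asked
```

In this model a low-cardinality value such as a state returns many rows per node contacted, while a unique value such as an email address pays the same fan-out for a single row.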

Why BigTable?
    Flexibility
    Performance


Query Language for developers

CQL (Cassandra Query Language)



CREATE KEYSPACE

    CREATE KEYSPACE DemoKeyspace
      WITH strategy_class='SimpleStrategy'
      AND strategy_options:replication_factor=1;




CREATE TABLE

    CREATE TABLE accounts
      (KEY text PRIMARY KEY)
      WITH comparator=text
      AND default_validation=text;

    CREATE TABLE accounts
      (KEY text PRIMARY KEY,
       name text,
       email text,
       signed_up_at timestamp)
      WITH comparator=text;




INSERT

    INSERT INTO accounts
      (KEY, name, email, signed_up_at)
      VALUES
      ('tlberglund',
       'Tim Berglund',
       'tlberglund@gmail.com',
       '2012-04-25');

    INSERT INTO events
      (KEY, 0, 1, 2, 3, 4)
      VALUES
      ('2012-04-25T11:04:34-0700',
       55.4, 56.2, 59.6, 65.3, 79)
      USING CONSISTENCY QUORUM
      AND TTL 86400;
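The QUORUM level in the last statement asks for a majority of the replicas to acknowledge the write. A minimal sketch of that arithmetic follows; the function names are assumptions for the example, and only the three levels ONE, QUORUM, and ALL are modeled.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """How many replicas must acknowledge for the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # a strict majority
        "ALL": replication_factor,
    }
    return levels[level]


def write_succeeds(acks: int, level: str, replication_factor: int) -> bool:
    """A write is reported successful once enough replicas have answered."""
    return acks >= required_acks(level, replication_factor)


print(required_acks("QUORUM", 3))       # 2
print(write_succeeds(1, "QUORUM", 3))   # False
print(write_succeeds(2, "QUORUM", 3))   # True
```

With a replication factor of 3, a QUORUM write therefore tolerates one unavailable replica.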




SELECT

    SELECT *
      FROM accounts
      WHERE KEY='tlberglund';

    SELECT 1..3
      FROM events
      WHERE KEY='2012-04-25T11:04:34-0700';

    SELECT *
      FROM accounts
      WHERE KEY='tlberglund'
      USING CONSISTENCY ONE;




UPDATE

    UPDATE accounts
      SET last_login='2012-04-25T09:37:35-0700'
      WHERE KEY='tlberglund';




Distribution Model

Hash Ring
    [Diagram: a ring of eight nodes with tokens 0000, 2000, 4000, 6000, 8000, A000, C000, E000]


Writing a Key
    [Diagram: on the ring of eight nodes (tokens 0000 through E000), "name: Tim" hashes to 3D97 and is stored on node 4000; "role: Teacher" hashes to 9C4F and is stored on node A000]


Reading a Key
    [Diagram: on the ring of eight nodes (tokens 0000 through E000), a read for 3D97 returns "name: Tim"; a read for 9C4F returns "role: Teacher"]
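The placement shown in the ring diagrams (3D97 lands on node 4000, 9C4F on node A000) is consistent with "the first node whose token is at or after the hash, wrapping around at the top". A minimal sketch under that assumption, taking the hash values from the slides as given rather than computing them:

```python
import bisect

# Node tokens from the ring diagram, in ascending order.
TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def owner(key_hash: int, tokens=TOKENS) -> int:
    """Return the token of the first node at or after key_hash."""
    i = bisect.bisect_left(tokens, key_hash)
    return tokens[i % len(tokens)]  # past the last token, wrap to the first


print(f"{owner(0x3D97):04X}")  # 4000
print(f"{owner(0x9C4F):04X}")  # A000
```

Reads and writes use the same function, so any node can work out where a key lives without a central directory.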


Replication
    Replication factor (N)
    Pluggable placement strategies



Replication Strategies
    Simple
    Network Topology Aware



Simple Strategy
    [Diagram: on the ring of eight nodes (tokens 0000 through E000) with N=3, 3D97 is written to node 4000 and then to the next two nodes around the ring, 6000 and 8000]
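The Simple Strategy diagram places the N=3 copies of 3D97 on nodes 4000, 6000, and 8000, which matches "the owning node plus the next N-1 nodes around the ring". A sketch under that reading; the function name is an assumption for the example.

```python
import bisect

# Node tokens from the ring diagram, in ascending order.
TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def simple_replicas(key_hash: int, n: int, tokens=TOKENS) -> list:
    """Owner of key_hash followed by the next n-1 nodes, wrapping around."""
    start = bisect.bisect_left(tokens, key_hash)
    count = min(n, len(tokens))  # cannot place more copies than nodes
    return [tokens[(start + i) % len(tokens)] for i in range(count)]


print([f"{t:04X}" for t in simple_replicas(0x3D97, 3)])
# ['4000', '6000', '8000']
```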


Topology Aware
    [Diagram: the nodes are split across two data centers. DC1 holds 2000, 6000, A000, E000; DC2 holds 0000, 4000, 8000, C000. 3D97 is written to 4000 and 8000 in DC2 and to 6000 and A000 in DC1]
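The topology-aware diagram ends with two copies of 3D97 in each data center. One way to reproduce that placement is to walk the ring from the owning node and take nodes until every data center has its quota. This is a simplified sketch: the per-data-center count of 2 is read off the diagram, the function name is made up, and rack awareness is ignored.

```python
import bisect

# (token, data center) pairs from the diagram, in ring order.
RING = [
    (0x0000, "DC2"), (0x2000, "DC1"), (0x4000, "DC2"), (0x6000, "DC1"),
    (0x8000, "DC2"), (0xA000, "DC1"), (0xC000, "DC2"), (0xE000, "DC1"),
]


def topology_replicas(key_hash, per_dc, ring=RING):
    """Walk clockwise from the owner, filling each data center's quota."""
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key_hash)
    remaining = dict(per_dc)  # copies still needed in each data center
    chosen = []
    for i in range(len(ring)):
        token, dc = ring[(start + i) % len(ring)]
        if remaining.get(dc, 0) > 0:
            chosen.append(token)
            remaining[dc] -= 1
    return chosen


picked = topology_replicas(0x3D97, {"DC1": 2, "DC2": 2})
print([f"{t:04X}" for t in picked])
# ['4000', '6000', '8000', 'A000']
```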




Writing

Client Connections
    [Diagram: a client machine, a load balancer, and the ring of eight nodes; requests for 3D97 and 9C4F pass from the client through the load balancer to the ring]




Load Balancer
    Hardware VIP
    HAProxy
    Round-robin DNS
    Client-side (Hector does this)
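The client-side option in the list above can be as simple as rotating through a list of known nodes. The sketch below illustrates the idea only and is not Hector's API; the class name and host names are made up for the example.

```python
import itertools


class RoundRobinPool:
    """Hand out node addresses in rotation so no single node takes every request."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one host is required")
        self._cycle = itertools.cycle(hosts)

    def next_host(self):
        return next(self._cycle)


# Hypothetical host names, one per node.
pool = RoundRobinPool(["node-0000", "node-2000", "node-4000"])
print([pool.next_host() for _ in range(4)])
# ['node-0000', 'node-2000', 'node-4000', 'node-0000']
```

A production client would also need to skip nodes that are down, which this sketch leaves out.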


Tuesday, October 16, 12
Write Consistency


Tuesday, October 16, 12
Write Consistency
                                                    0000


                                             E000          2000




                                          C000                4000
     Client Machine


                                             A000          6000


                          Load Balancer             8000




Tuesday, October 16, 12
Write Consistency
                                                      0000


                                               E000          2000




                                            C000                4000
     Client Machine


                                               A000          6000


                          Load Balancer               8000

                                          14C7

Tuesday, October 16, 12
Write Consistency
                                                      0000


                                               E000          2000




                                            C000                4000
     Client Machine


                                               A000          6000


                          Load Balancer               8000

                                          14C7

Tuesday, October 16, 12
Write Consistency
                                                      0000


                                               E000          2000




                                            C000                4000
     Client Machine


                                               A000          6000


                          Load Balancer               8000

                                          14C7

Tuesday, October 16, 12
Write Consistency
[Figure: a ring of eight nodes with tokens 0000, 2000, 4000, 6000, 8000, A000, C000, and E000. A client machine sends a write for key 14C7 through a load balancer to a coordinator node on the ring. In later frames one node (2000) is shown as unavailable (“----”), a failed delivery is marked with an X, and the coordinator stores a “hint”.]

Write Consistency

          ‣ ANY
                 At least one node (hinted handoffs allowed)
          ‣ ONE
                 At least one replica node (no hinted handoffs)
          ‣ QUORUM
                 (N/2)+1 replica nodes, where N is the replication factor
          ‣ LOCAL_QUORUM
                 (N/2)+1 replica nodes in the local data center
          ‣ EACH_QUORUM
                 (N/2)+1 replica nodes in each data center
          ‣ ALL
                 Write successfully to all replicas
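
The levels above can be sketched as a small decision function. This is an illustrative model only, not Cassandra's code: the names `WriteLevel`, `quorum`, and `write_succeeds` are hypothetical, N is taken to be the replication factor, and the per-data-center levels are left out to keep it short.

```python
from enum import Enum


class WriteLevel(Enum):
    ANY = "ANY"
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def quorum(replication_factor: int) -> int:
    """(N/2)+1 using integer division, where N is the replication factor."""
    return replication_factor // 2 + 1


def write_succeeds(level, replication_factor, replica_acks, hint_stored=False):
    """Decide whether a write meets its consistency level.

    replica_acks: number of replicas that acknowledged the write.
    hint_stored:  True if the coordinator saved a hint for a down replica.
    """
    if level is WriteLevel.ANY:
        # A stored hint is enough for ANY, but for no other level.
        return replica_acks >= 1 or hint_stored
    if level is WriteLevel.ONE:
        return replica_acks >= 1
    if level is WriteLevel.QUORUM:
        return replica_acks >= quorum(replication_factor)
    if level is WriteLevel.ALL:
        return replica_acks >= replication_factor
    raise ValueError(f"unsupported level: {level}")
```

With a replication factor of 3, `quorum(3)` is 2, so a QUORUM write tolerates one unavailable replica while an ALL write tolerates none.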



Reading

Read Consistency
[Figure: the same eight-node ring. A read for key 9C4F arrives at a coordinator. In the first sequence three responses come back and all agree (“Tim”, timestamped TODAY). In the second sequence one response carries an older value (“Jim”, timestamped YESTERDAY) while the other two carry “Tim”, TODAY; the result is labeled “Inconsistent”.]

Passive Read Repair
          ‣ Initiated by coordinator
          ‣ Cleans up entropy in a single row
          ‣ Happens regardless of consistency level
          ‣ Just reading the database reduces its entropy
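
A rough sketch of the idea, assuming each replica returns a value plus a timestamp and that the newest timestamp wins. The names `Versioned`, `coordinate_read`, and `read_repair` are hypothetical, and replicas are modeled as plain dictionaries rather than network peers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # larger means more recent


def coordinate_read(responses):
    """Pick the newest value and list the replicas holding an older one.

    responses: dict mapping replica name -> Versioned returned for one row.
    Returns (newest, stale_replica_names).
    """
    newest = max(responses.values(), key=lambda v: v.timestamp)
    stale = sorted(name for name, v in responses.items()
                   if v.timestamp < newest.timestamp)
    return newest, stale


def read_repair(replicas, key):
    """Read `key` from every replica, then overwrite the stale copies."""
    responses = {name: store[key] for name, store in replicas.items()}
    newest, stale = coordinate_read(responses)
    for name in stale:
        replicas[name][key] = newest  # push the newest version back
    return newest.value
```

For example, if two replicas hold `Versioned("Tim", 2)` and one holds `Versioned("Jim", 1)` for the same key, `read_repair` returns "Tim" and leaves all three replicas holding the newer version.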

Read Consistency

          ‣ ONE
                 Get response from the closest replica
          ‣ QUORUM
                 Get responses from (N/2)+1 replicas, return the value
                 with the most recent timestamp
          ‣ LOCAL_QUORUM
                 (N/2)+1 replica nodes in the local data center
          ‣ EACH_QUORUM
                 (N/2)+1 replica nodes in each data center
          ‣ ALL
                 Wait for all replicas to respond



But what about Column Families?

Replication For Real

                          Rows are replicated
                          “Key” is row key
                          “Value” is the row data
                          Implications for row size
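
To make the row-key-to-node mapping concrete, here is a minimal sketch using the eight ring tokens from the diagrams. It assumes each node owns the token range ending at its own token, and that further replicas are simply the next nodes clockwise. The 16-bit truncated MD5 in `token_for` is an illustrative stand-in for a real partitioner, and both function names are hypothetical.

```python
import bisect
import hashlib

# Node tokens from the ring diagrams, in ascending order.
RING = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def token_for(row_key: str) -> int:
    """Hash a row key to a 16-bit token (illustrative: truncated MD5)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def replicas_for(token: int, replication_factor: int = 3):
    """Return the node tokens that store the row with this token.

    The first replica is the first node whose token is >= the row's token,
    wrapping around to 0000; the others are the next nodes clockwise.
    """
    start = bisect.bisect_left(RING, token) % len(RING)
    return [RING[(start + i) % len(RING)] for i in range(replication_factor)]
```

Under these assumptions, `replicas_for(0x14C7)` selects nodes 2000, 4000, and 6000. Because the whole row is the replicated value, every one of those nodes must hold the entire row, which is where the row-size implication comes from.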

Gossip
          ‣ Naive heartbeats don’t scale
          ‣ Cluster still needs to know its state
          ‣ Computes a real-valued “suspicion” for each node
          ‣ Probabilistic
          ‣ Just like the real thing
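
One way to arrive at a real-valued suspicion is an accrual-style failure detector: the longer a peer stays silent relative to its usual heartbeat gap, the higher its score. The sketch below is a simplification that assumes exponentially distributed gaps; the class name `SuspicionDetector` and its methods are hypothetical.

```python
import math
from collections import deque


class SuspicionDetector:
    """Tracks heartbeat arrivals for one peer and scores its silence."""

    def __init__(self, window: int = 1000):
        self.intervals = deque(maxlen=window)  # recent gaps between heartbeats
        self.last_arrival = None

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat that arrived at time `now` (in seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def suspicion(self, now: float) -> float:
        """Return phi = -log10(P(gap > elapsed)) for exponential gaps."""
        if not self.intervals:
            return 0.0  # not enough history to judge
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        # P(gap > t) = exp(-t / mean), so -log10 of it is t / (mean * ln 10).
        return elapsed / (mean * math.log(10))
```

A node would compare the score against a threshold to decide whether to treat the peer as down; the score grows smoothly with silence instead of flipping at a fixed timeout.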

[Figure: the eight-node ring, animated as a gossip exchange. A node asks “How are you, E000?” and hears “I'm cool.” It asks “How about you, A000?” and hears “Oh, I'm fine.” It concludes “I trust E000. Not so sure about A000...” Another node then asks “C000, what do you know?” The replies are “I'm Great!”, “E000 is doing well.”, and “Poor A000 is having trouble lately.” The asking node thinks “Hmmm, so...” and records ✔ C000, ✔ E000, ✘ A000.]

Gossip Config

          ‣ A new node needs “seed nodes”
          ‣ Seed nodes configured in
            $CASSANDRA_HOME/conf/cassandra.yaml
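
As an illustration, the seed list in cassandra.yaml typically looks like the fragment below. The IP addresses are placeholders, and the exact layout may differ between Cassandra versions, so check the file shipped with your install.

```yaml
# cassandra.yaml (fragment); the addresses are placeholder values
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"
```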



Storage

Storage Engine
          ‣ “Log-structured storage”
          ‣ All writes are sequential
          ‣ All writes are immutable
          ‣ Designed to avoid seeks
          ‣ Writes are faster than reads


Write Sequence
          Write from Coordinator → Commit Log → Memtable → SSTable(s) → Compaction
                 Memtable to SSTable(s): (lots of tuning here)
                 Compaction: (more tuning here)
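
The write sequence can be sketched as a toy storage node. This is a simplified model under stated assumptions, not Cassandra's implementation: the class `Node`, the JSON file formats, and the row-count flush trigger are all made up for illustration, the commit log is fsynced on every write for simplicity, and commit log cleanup after a flush is not modeled.

```python
import json
import os
import tempfile


class Node:
    """Toy storage node: commit log -> memtable -> immutable SSTable files."""

    def __init__(self, data_dir: str, memtable_limit: int = 2):
        self.data_dir = data_dir
        self.memtable_limit = memtable_limit  # rows held before a flush
        self.memtable = {}                    # row key -> {column: (value, ts)}
        self.sstables = []                    # flushed file paths, oldest first
        self.commit_log = open(os.path.join(data_dir, "commit.log"), "a")

    def write(self, row_key, column, value, timestamp):
        # 1. Append to the commit log and sync it before acknowledging.
        record = {"k": row_key, "c": column, "v": value, "t": timestamp}
        self.commit_log.write(json.dumps(record) + "\n")
        self.commit_log.flush()
        os.fsync(self.commit_log.fileno())
        # 2. Apply the change to the in-memory table.
        self.memtable.setdefault(row_key, {})[column] = (value, timestamp)
        # 3. Flush to disk when the memtable is "full".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        return True  # the node reports a successful write

    def flush(self):
        """Write the memtable out as a new SSTable file that is never modified."""
        path = os.path.join(self.data_dir, f"sstable-{len(self.sstables)}.json")
        with open(path, "w") as f:
            json.dump(dict(sorted(self.memtable.items())), f)
        self.sstables.append(path)
        self.memtable = {}

    def close(self):
        self.commit_log.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        node = Node(data_dir, memtable_limit=2)
        node.write("tlberglund", "name", "Tim", timestamp=1)
        node.write("someone_else", "name", "Jim", timestamp=2)  # triggers a flush
        print(len(node.sstables), "SSTable(s) on disk")
        node.close()
```

Every disk operation here is an append or a whole-file write, which is the property that lets a log-structured engine avoid seeks on the write path.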

Commit Log
          ‣ Writes go here first
          ‣ Commit succeeds before node reports successful write
          ‣ Append-only, sequential writes
          ‣ One per server
          ‣ Good to have a dedicated spindle

MemTable
          ‣ An in-memory structure
          ‣ One per column family
          ‣ Holds most recent row changes
          ‣ Tunable memory use
          ‣ Flushed to disk when “full”


SSTable
          ‣ Memtables flushed to disk here
          ‣ Many SSTable files per column family
          ‣ Every SSTable is immutable
          ‣ SSTables are accessed during reads
          ‣ Must be compacted


Read Sequence

          Read from Coordinator
          Memtable: Are all columns here?
                 YES: stop and return result.
                 NO: continue.
          Newest SSTable: Are all columns here?
                 YES: stop and return result.
                 NO: continue.
          Next Oldest SSTable: How about now?
                 YES: great!
                 NO: keep looking...
          And so on.

          Doesn’t this get old?

Read Sequence

          ‣ Check Memtable first
          ‣ Read SSTables from newest to oldest
          ‣ Bloom filters prevent most reads
          ‣ Compaction shrinks number of files
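
The read sequence above can be sketched as follows. This follows the slides' simplified description (stop as soon as every wanted column has been found) and is not Cassandra's actual read path; `BloomFilter`, `SSTable`, and `read_row` are hypothetical names, and SSTables are modeled as in-memory dictionaries.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, hashes: int = 3):
        self.size_bits = size_bits
        self.hashes = hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable rows plus a Bloom filter over the row keys."""

    def __init__(self, rows: dict):
        self.rows = rows
        self.filter = BloomFilter()
        for key in rows:
            self.filter.add(key)
        self.disk_reads = 0  # lookups that got past the filter

    def get(self, row_key: str) -> dict:
        if not self.filter.might_contain(row_key):
            return {}  # skipped without touching "disk"
        self.disk_reads += 1
        return self.rows.get(row_key, {})


def read_row(row_key, wanted, memtable, sstables_newest_first):
    """Collect the wanted columns: memtable first, then SSTables newest to oldest."""
    found = {c: v for c, v in memtable.get(row_key, {}).items() if c in wanted}
    for table in sstables_newest_first:
        if len(found) == len(wanted):
            break  # all columns here: stop and return the result
        for column, value in table.get(row_key).items():
            if column in wanted and column not in found:
                found[column] = value  # the first (newest) hit wins
    return found
```

The fewer SSTables there are, the shorter this loop runs, which is why compaction helps reads.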



Compaction
          ‣ Combine many SSTables into one
          ‣ Performed in the background
          ‣ Node still operates
          ‣ Requires extra disk space
          ‣ Three tunable varieties

Compaction
          ‣ Major
                 All SSTables are merged into one clean one
          ‣ Minor
                 Similarly-sized SSTables are merged together
                 after reaching a threshold
          ‣ Leveled
                 http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
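
A minimal sketch of the first two varieties, assuming each SSTable is a dictionary of rows whose columns carry a timestamp. `compact` and `pick_minor_candidates` are hypothetical names, the `threshold` and `ratio` defaults are illustrative values, and deletes (tombstones) are ignored.

```python
def compact(sstables):
    """Merge SSTables into one, keeping the newest version of each column.

    Each SSTable is a dict: row key -> {column: (value, timestamp)}.
    """
    merged = {}
    for table in sstables:
        for row_key, columns in table.items():
            row = merged.setdefault(row_key, {})
            for column, (value, ts) in columns.items():
                if column not in row or ts > row[column][1]:
                    row[column] = (value, ts)
    # Emit rows in key order, as a sorted table would store them.
    return dict(sorted(merged.items()))


def pick_minor_candidates(sizes, threshold=4, ratio=2.0):
    """Return the first group of similarly sized SSTables that reaches the threshold.

    sizes: dict of SSTable name -> size in bytes. A table joins the current
    group while it is at most `ratio` times the size of the group's smallest.
    """
    bucket = []
    for name, size in sorted(sizes.items(), key=lambda item: item[1]):
        if bucket and size > sizes[bucket[0]] * ratio:
            if len(bucket) >= threshold:
                return bucket
            bucket = []
        bucket.append(name)
    return bucket if len(bucket) >= threshold else []
```

A major compaction in this model is `compact` applied to every SSTable; a minor compaction applies it only to the group chosen by `pick_minor_candidates`. Either way the inputs stay on disk until the merged output is complete, which is why extra disk space is needed.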




http://www.datastax.com/products/community

                           Download this!

Tim Berglund
    tlberglund@github.com
    @tlberglund


   Thank You

Radical NoSQL Scalability with Cassandra - Tim Berglund

  • 1. Radical NoSQL Scalability with @tlberglund Tuesday, October 16, 12
  • 2. Data Model for developers
  • 7. Column: full_name: “Tim Berglund” (key/value pair), 20120425T1832 (timestamp)
  • 12. Column ‣ Key-value pair ‣ Optionally typed ‣ Timestamped ‣ Fundamental unit
  • 17. Row: row key, followed by column, column, column
  • 20. Row: row key tlberglund; columns name: Tim, bday: 06-15, role: teacher
  • 23. Row: tlberglund; bday: 06-15, name: Tim, role: teacher (sorted by UTF8Type comparator)
  • 33. Table (labels: Outer hash key, Inner hash key, Sparse)
        tim: name: Tim, role: teacher, status: Cool
        kristen: name: Kristen, role: marketing
        billy: role: CEO
        matt: name: Matt, role: founder, status: ubercool
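The column, row, and table slides above can be read as a map of maps. The following is a minimal sketch of that reading, not Cassandra's actual implementation: the names `Column`, `put`, and `get_row` are hypothetical, and it assumes the "outer hash key" is the row key and the "inner hash key" is the column name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """A column as the slides describe it: key/value pair plus a timestamp."""
    name: str
    value: str
    timestamp: int  # larger means newer (assumption for this sketch)


# Outer key: row key. Inner key: column name.
# Rows need not share the same columns, which is what makes the table sparse.
table = {}


def put(row_key, name, value, timestamp):
    """Store a column, keeping the newest timestamp if the column already exists."""
    row = table.setdefault(row_key, {})
    current = row.get(name)
    if current is None or timestamp >= current.timestamp:
        row[name] = Column(name, value, timestamp)


def get_row(row_key):
    """Return the row's columns sorted by column name (a string comparator)."""
    return sorted(table.get(row_key, {}).values(), key=lambda c: c.name)


put("tlberglund", "name", "Tim", 1)
put("tlberglund", "bday", "06-15", 1)
put("tlberglund", "role", "teacher", 1)
put("billy", "role", "CEO", 1)  # a sparse row with a single column
```

Reading `tlberglund` back yields the columns in the order bday, name, role, matching the sorted row on the slide.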
  • 35. Database: tables Accounts, ClickStream, Orders, InventoryEvents
  • 36. Cluster: System Database, Application Database
  • 39. Secondary Indexes ‣ Ubiquitous in relational databases ‣ Supported in Cassandra, with qualifications
  • 42. Secondary Indexes: example rows
        tim: name: Tim, email: tb@a.com, role: teacher
        kristen: name: Kristen, email: k@ds.com
        billy: role: CEO
        matt: name: Matt, email: m@ds.com, status: ubercool
  • 47. Secondary Indexes ‣ In relational databases: performant for high cardinality ‣ In Cassandra: the reverse ‣ Not suitable for lookup-by-email ‣ Suitable for lookup by: region code, gender, state, etc.
  • 52. Why BigTable? Flexibility, Performance
  • 53. Query Language for developers
  • 56. CQL (Cassandra Query Language)
  • 59. CREATE KEYSPACE: CREATE KEYSPACE DemoKeyspace WITH strategy_class='SimpleStrategy' AND strategy_options:replication_factor=1;
  • 61. CREATE TABLE: CREATE TABLE accounts (KEY text PRIMARY KEY) WITH comparator=text AND default_validation=text;
  • 63. CREATE TABLE: CREATE TABLE accounts (KEY text PRIMARY KEY, name text, email text, signed_up_at timestamp) WITH comparator=text;
  • 66. INSERT: INSERT INTO accounts (KEY, name, email, signed_up_at) VALUES ('tlberglund', 'Tim Berglund', 'tlberglund@gmail.com', '2012-04-25');
  • 68. INSERT: INSERT INTO events (KEY, 0, 1, 2, 3, 4) VALUES ('2012-04-25T11:04:34-0700', 55.4, 56.2, 59.6, 65.3, 79) USING CONSISTENCY QUORUM AND TTL 86400;
  • 71. SELECT: SELECT * FROM accounts WHERE KEY='tlberglund';
  • 73. SELECT: SELECT 1..3 FROM events WHERE KEY='2012-04-25T11:04:34-0700';
  • 75. SELECT: SELECT * FROM accounts WHERE KEY='tlberglund' USING CONSISTENCY ONE;
  • 78. UPDATE: UPDATE accounts SET last_login='2012-04-25T09:37:35-0700' WHERE KEY='tlberglund';
  • 81. Ring of eight nodes with tokens 0000, 2000, 4000, 6000, 8000, A000, C000, E000
  • 82. Writing a Key
  • 86. name: Tim hashes to 3D97; 3D97: Tim is stored at node 4000
  • 89. role: Teacher hashes to 9C4F; 9C4F: Teacher is stored at node A000
  • 90. Reading a Key
  • 92. 3D97? returns name: Tim
  • 94. 9C4F? returns role: Teacher
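The ring slides place 3D97 next to node 4000 and 9C4F next to node A000, which is consistent with each key going to the first node whose token is at or after the key's hash. A minimal sketch of that lookup follows; the function name `owner` is hypothetical, the key hashes are taken as given, and the wrap-around from E000 back to 0000 is an assumption the slides do not show.

```python
from bisect import bisect_left

# Node tokens from the ring diagram, in sorted order.
TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def owner(key_token, tokens=TOKENS):
    """Return the token of the first node at or clockwise after key_token."""
    index = bisect_left(tokens, key_token)
    # A key hashed past the last token wraps around to the first node.
    return tokens[index % len(tokens)]


print(hex(owner(0x3D97)))  # node for "name: Tim"
print(hex(owner(0x9C4F)))  # node for "role: Teacher"
```

Reads use the same function: the client hashes the key again and lands on the node that stored it.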
  • 98. Replication: Replication factor (N), Pluggable placement strategies
  • 101. Replication Strategies: Simple, Network Topology Aware
  • 106. Simple Strategy (N=3): 3D97: Tim is written at node 4000, then at 6000, then at 8000
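With N=3 the Simple Strategy slides show 3D97 landing at 4000, 6000, and 8000, that is, the owning node followed by the next nodes around the ring. The sketch below reproduces that walk; `simple_replicas` is a hypothetical name, and capping N at the ring size is an assumption of this sketch.

```python
from bisect import bisect_left

TOKENS = [0x0000, 0x2000, 0x4000, 0x6000, 0x8000, 0xA000, 0xC000, 0xE000]


def simple_replicas(key_token, tokens, n):
    """Return the owning node's token plus the next n-1 tokens clockwise."""
    ordered = sorted(tokens)
    start = bisect_left(ordered, key_token) % len(ordered)
    count = min(n, len(ordered))  # cannot place more replicas than nodes
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


print([hex(t) for t in simple_replicas(0x3D97, TOKENS, 3)])
```

The placement ignores racks and data centers entirely, which is the gap the topology-aware strategy addresses.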
  • 108. Topology Aware: two rings, DC1 (2000, 6000, A000, E000) and DC2 (0000, 4000, 8000, C000)
  • 113. Topology Aware: 3D97: Tim is placed at 4000, then 6000, then 8000, then A000
  • 117. Client Connections: eight-node ring and a Client Machine. Which node does it contact?
  • 122. Client Connections: Client Machine, Load Balancer, ring; request 3D97?
  • 127. Client Connections: Client Machine, Load Balancer, ring; request 9C4F?
  • 133. Load Balancer: Hardware VIP, HAProxy, Round-robin DNS, Client-side (Hector does this)
  • 141. Write Consistency: Client Machine sends write 14C7 through the Load Balancer to the ring; one node acts as Coordinator
  • 147. Write Consistency: node 2000 is shown as ---- and the write of 14C7 to it fails (X); Coordinator stores a “hint”
  • 151. Write Consistency ‣ ANY: At least one node (hinted handoffs allowed) ‣ ONE: At least one node (no hinted handoffs) ‣ QUORUM: (N/2)+1 nodes
  • 155. Write Consistency ‣ LOCAL_QUORUM: (N/2)+1 nodes in this availability zone ‣ EACH_QUORUM: (N/2)+1 nodes in all availability zones ‣ ALL: Write successfully to all replicas
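The consistency levels above reduce to a count of acknowledgements the coordinator waits for. The sketch below works through that arithmetic under some assumptions: `required_acks` and its parameters are hypothetical names, (N/2)+1 is read as integer division, QUORUM is taken over the total replica count, and EACH_QUORUM is read as a quorum in every zone.

```python
def quorum(n):
    """(N/2)+1 with integer division, e.g. 2 of 3 or 3 of 5."""
    return n // 2 + 1


def required_acks(level, rf_by_zone, local_zone):
    """Return how many replica acknowledgements a write at this level needs.

    rf_by_zone maps a zone name to its replication factor (assumed layout).
    """
    total = sum(rf_by_zone.values())
    if level in ("ANY", "ONE"):
        # Same count; per the slide, ANY also accepts a stored hint, ONE does not.
        return 1
    if level == "QUORUM":
        return quorum(total)
    if level == "LOCAL_QUORUM":
        return quorum(rf_by_zone[local_zone])
    if level == "EACH_QUORUM":
        return sum(quorum(rf) for rf in rf_by_zone.values())
    if level == "ALL":
        return total
    raise ValueError(f"unknown consistency level: {level}")


zones = {"dc1": 3, "dc2": 3}
for level in ("ONE", "QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL"):
    print(level, required_acks(level, zones, "dc1"))
```

With three replicas in each of two zones, LOCAL_QUORUM waits for 2 acknowledgements while ALL waits for 6, which is the latency versus durability trade the slide is pointing at.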
  • 163. Read Consistency: Coordinator receives 9C4F?; three replicas each answer Tim TODAY
  • 169. Read Consistency: 9C4F?; replicas answer Tim TODAY, Jim YESTERDAY, Tim TODAY: Inconsistent
  • 174. Passive Read Repair ‣ Initiated by coordinator ‣ Cleans up entropy in a single row ‣ Happens regardless of consistency level ‣ Just reading the database reduces its entropy
  • 177. Read Consistency ‣ ONE: Get response from the closest replica ‣ QUORUM: Get (N/2)+1 nodes, return most recent timestamp
  • 181. Read Consistency ‣ LOCAL_QUORUM: (N/2)+1 nodes in this availability zone ‣ EACH_QUORUM: (N/2)+1 nodes in all availability zones ‣ ALL: Wait for all replicas to respond
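The "Tim TODAY" versus "Jim YESTERDAY" slides and the read repair bullets suggest a simple rule: the coordinator returns the value with the most recent timestamp and pushes it to any replica that answered with something older. A minimal sketch of that rule follows; `resolve` is a hypothetical name, the replica names are illustrative, and timestamps are plain integers.

```python
def resolve(responses):
    """Pick the newest answer and list the replicas that need repair.

    responses maps a replica name to a (value, timestamp) pair.
    """
    winner = max(responses.values(), key=lambda answer: answer[1])
    stale = sorted(name for name, answer in responses.items() if answer != winner)
    return winner, stale


TODAY, YESTERDAY = 2, 1
answers = {
    "A000": ("Tim", TODAY),
    "C000": ("Jim", YESTERDAY),  # this replica missed the latest write
    "E000": ("Tim", TODAY),
}
value, to_repair = resolve(answers)
print(value, to_repair)
```

In this sketch the coordinator would then send the winning column to each replica in `to_repair`, which is how a plain read reduces entropy.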
  • 183. But what about Column Families?
  • 188. Replication For Real: Rows are replicated; “Key” is row key; “Value” is the row data; Implications for row size
  • 195. Gossip ‣ Naive heartbeats don’t scale ‣ Cluster still needs to know its state ‣ Computes a real-valued “suspicion” for each node ‣ Probabilistic ‣ Just like the real thing
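One way to compute a real-valued suspicion is an accrual-style failure detector: the longer a node stays silent relative to its usual heartbeat interval, the higher its score. The sketch below is an illustration only, not the deck's or Cassandra's code. It assumes heartbeat gaps are exponentially distributed, so the score is -log10 of the probability that a heartbeat is merely late; `SuspicionTracker` is a hypothetical name.

```python
import math


class SuspicionTracker:
    """Tracks heartbeat arrivals for one peer and scores how suspicious silence is."""

    def __init__(self):
        self.last_seen = None
        self.intervals = []

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_seen is not None:
            self.intervals.append(now - self.last_seen)
        self.last_seen = now

    def phi(self, now):
        """Return the suspicion level; 0.0 until an interval has been observed."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        # P(still alive after t of silence) = exp(-t / mean); phi = -log10(P).
        return (now - self.last_seen) / (mean * math.log(10))


peer = SuspicionTracker()
for second in (0, 1, 2, 3):
    peer.heartbeat(second)
print(peer.phi(3.5), peer.phi(13))
```

A node would compare the score against a threshold of its choosing instead of a fixed timeout, so a peer that is merely slow is suspected gradually, not declared dead at once.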
  • 196. Ring of eight nodes with tokens 0000, 2000, 4000, 6000, 8000, A000, C000, E000
  • 197. “How are you, E000?”
  • 198. “I'm cool.”
  • 200. “How about you, A000?”
  • 201. “Oh, I'm fine.”
  • 202. “I trust E000. Not so sure about A000...”
  • 204. “C000, what do you know?”
  • 205. “I'm Great!”
  • 206. “E000 is doing well.”
  • 207. “Poor A000 is having trouble lately.”
  • 208. “Hmmm, so...”
  • 209. ✔ C000 ✔ E000 ✘ A000
  • 212. Gossip Config ‣ A new node needs “seed nodes” ‣ Seed nodes configured in $CASSANDRA_HOME/conf/cassandra.yaml
  • 219. Storage Engine ‣ “Log-structured storage” ‣ All writes are sequential ‣ All writes are immutable ‣ Designed to avoid seeks ‣ Writes are faster than reads
  • 224. Write Sequence: Write from Coordinator, then Commit Log, then Memtable (lots of tuning here), then SSTable(s), then Compaction (more tuning here)
  • 230. Commit Log ‣ Writes go here first ‣ Commit succeeds before node reports successful write ‣ Append-only, sequential writes ‣ One per server ‣ Good to have a dedicated spindle
  • 236. MemTable ‣ An in-memory structure ‣ One per column family ‣ Holds most recent row changes ‣ Tunable memory use ‣ Flushed to disk when “full”
  • 242. SSTable ‣ Memtables flushed to disk here ‣ Many SSTable files per column family ‣ Every SSTable is immutable ‣ SSTables are accessed during reads ‣ Must be compacted
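The commit log, memtable, and SSTable slides describe a write path that can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not the real storage engine: `Node` and `memtable_limit` are hypothetical names, "full" is taken to mean a fixed number of keys, and nothing is actually written to disk.

```python
class Node:
    """Toy write path: commit log first, then memtable, flushed to SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []  # append-only record of every write
        self.memtable = {}    # most recent change per key, held in memory
        self.sstables = []    # flushed tables, oldest first, never modified
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        # 1. Append to the commit log before anything else.
        self.commit_log.append((key, value, timestamp))
        # 2. Apply to the memtable (assumes writes arrive in timestamp order).
        self.memtable[key] = (value, timestamp)
        # 3. Flush once the memtable is "full".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable as a key-sorted, immutable SSTable."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


node = Node(memtable_limit=2)
node.write("a", "1", 1)
node.write("b", "2", 2)  # second key fills the memtable and triggers a flush
node.write("a", "3", 3)  # newer value for "a" lives only in the new memtable
```

Because an update never touches an existing SSTable, the old value of `a` stays on "disk" until compaction merges it away.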
  • 243. Read Sequence: Read from Coordinator
  • 247. Read Sequence: Memtable. Are all columns here? YES: stop and return result. NO: continue.
  • 251. Read Sequence: Newest SSTable. Are all columns here? YES: stop and return result. NO: continue.
  • 255. Read Sequence: Next Oldest SSTable. How about now? YES: great! NO: keep looking...
  • 257. Read Sequence: And so on. Doesn’t this get old?
  • 262. Read Sequence ‣ Check Memtable first ‣ Read SSTables from newest to oldest ‣ Bloom filters prevent most reads ‣ Compaction shrinks number of files
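The read sequence above can be sketched as a loop that checks the memtable, then walks SSTables from newest to oldest, skipping any table whose Bloom filter rules the key out. This is a simplified illustration: it treats a row as a single value instead of a set of columns, and `BloomFilter`, `SSTable`, and `read` are hypothetical names with arbitrary filter parameters.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size=64, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))


class SSTable:
    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def read(key, memtable, sstables_newest_first):
    """Return (value, number_of_sstables_opened); value is None if not found."""
    if key in memtable:
        return memtable[key], 0
    opened = 0
    for sstable in sstables_newest_first:
        if not sstable.bloom.might_contain(key):
            continue  # the filter says the key is definitely not in this file
        opened += 1
        if key in sstable.rows:
            return sstable.rows[key], opened
    return None, opened


memtable = {"a": "new"}
tables = [SSTable({"b": "b2"}), SSTable({"a": "old", "b": "b1"})]
print(read("a", memtable, tables), read("b", memtable, tables))
```

The second return value counts how many files were actually opened, which is the cost that Bloom filters and compaction both work to reduce.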
  • 269. Compaction ‣ Combine many SSTables into one ‣ Performed in the background ‣ Node still operates ‣ Requires extra disk space ‣ Three tunable varieties
  • 273. Compaction ‣ Major: All SSTables are merged into one clean one ‣ Minor: Similarly-sized SSTables are merged together after reaching a threshold ‣ Leveled: http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
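A major compaction, as described above, merges every SSTable into one clean table. The sketch below shows the core of that merge under simplifying assumptions: each SSTable is a plain dict of key to (value, timestamp), the newest timestamp wins, and deletes are not modeled. The name `compact` is hypothetical.

```python
def compact(sstables):
    """Merge SSTables into one key-sorted table, keeping the newest value per key."""
    merged = {}
    for sstable in sstables:
        for key, (value, timestamp) in sstable.items():
            if key not in merged or timestamp > merged[key][1]:
                merged[key] = (value, timestamp)
    # Build the replacement table; the inputs are left untouched.
    return dict(sorted(merged.items()))


older = {"a": ("old", 1), "b": ("x", 1)}
newer = {"a": ("new", 2)}
print(compact([older, newer]))
```

The inputs stay in place until the merged table is complete, which is one reason compaction requires extra disk space while it runs.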
  • 275. http://www.datastax.com/products/community Download this!
  • 276. Tim Berglund tlberglund@github.com @tlberglund Thank You