N-O-SQL
new database technologies on the rise




                                                                              http://www.flickr.com/photos/wolfgangstaudt/2215246206/



      IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org
Who am I



» Steven Noels - stevenn@outerthought.org

» Outerthought : scalable content applications

» makers of Daisy and Lily open source CMS




Agenda


» raison d’être: what brought us here

» concepts: required theory readings

» market overview: trees & the forest

» experiences and (h)in(d)sights




Raison d’être

History

[Figure: hierarchical databases, IMS, XMLDB, OODBMS; 1. standardization; RDBMS; 2. simplification]
Inconsistency through slave lag




                                                                                John Quinn (Digg)
Scaling writes (1)




                                                                                 John Quinn (Digg)
Scaling writes (2)




                                                                                 John Quinn (Digg)
Issues with partitioning


» lose the ability to make arbitrary queries

» have to predict data access patterns when
 formulating partitioning strategy
» complex and fragile systems
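The first two issues can be sketched as follows. This is a minimal illustration, assuming a hypothetical user store sharded by numeric user id over in-memory Maps; `shardFor`, `findByUserId` and `findByCity` are made-up names, not part of any product. A lookup on the partition key touches one shard, while any other query has to visit all of them.

```javascript
// Hypothetical setup: 4 shards, each an in-memory Map keyed by user id.
const SHARD_COUNT = 4;
const shards = Array.from({ length: SHARD_COUNT }, () => new Map());

// The partitioning strategy is fixed up front: route by numeric user id.
function shardFor(userId) {
  return shards[userId % SHARD_COUNT];
}

function save(user) {
  shardFor(user.id).set(user.id, user);
}

// Query on the partition key: exactly one shard is consulted.
function findByUserId(userId) {
  return shardFor(userId).get(userId);
}

// Query on any other attribute: scatter to every shard, gather the results.
function findByCity(city) {
  const hits = [];
  for (const shard of shards) {
    for (const user of shard.values()) {
      if (user.city === city) hits.push(user);
    }
  }
  return hits;
}

save({ id: 1, name: "An", city: "Gent" });
save({ id: 2, name: "Bo", city: "Brussel" });
save({ id: 6, name: "Cy", city: "Gent" });
```

Changing `SHARD_COUNT` later would move most users to a different shard, which is one reason the partitioning strategy has to anticipate access patterns.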




Replication complexity




Scaling relational systems




» When scaling relational systems you lose
 their advantages but retain their overhead




History


[Figure: RDBMS; 3. scaling (caching, denormalisation, sharding, replication ...); 4. rethinking the problem; NOSQL]
Moore vs Kryder

» seek time is
 constant (network
 latency as well?)
» transfer rate ! spindles !

» as a principle, writes are
 hard to scale


Cambrian Explosion




Buzz-oriented development ?
Cambrian Explosion




                               N-O-SQL




The Perspective of Cost




Common themes

» SCALE SCALE SCALE

» new datamodels

» devops

» N-O-SQL

» The Cloud :
 technology is of no interest anymore


Numbers of scale




                   http://qos.doubleclick.net/counters/
Types of scaling
» scaling for usage
 » volume of users
 » volume of data

» scaling types of ops
 » concurrent read
 » concurrent write

   availability, replication, partitioning, consistency

   distribution
Distributed
systems are
hard !
8 fallacies of distributed computing
» The network is reliable.

» Latency is zero.

» Bandwidth is infinite.

» The network is secure.

» Topology doesn't change.

» There is one administrator.

» Transport cost is zero.

» The network is homogeneous.

                                                    Peter Deutsch and James Gosling
New Data

» sparse structures

» weak schemas

» graphs

» semi-structured

» document-oriented



N-O-SQL =
not only SQL !

The NOSQL footprint
[Figure: the NOSQL footprint. One axis runs from "referential integrity, typed data" to "free-structured or sparse data"; the other from "ACID, constraints, simple operational" to "highly scalable and available (complexity)". Regions: SQL and NOSQL (MongoDB, CouchDB, neo4j, Cassandra, HBase).]
NOSQL, if you need ...


» horizontal scaling (out rather than up)

» unusually common data (aka free-structured)

» speed (especially for writes)

» the bleeding edge




SQL/RDBMS, if you need ...


» SQL

» ACID

» normalisation

» a defined liability




Theory

Robust systems




Academic background



» Amazon Dynamo

» Google BigTable

» Eric Brewer CAP theorem




Amazon Dynamo
» popularised the term ‘eventual consistency’

» consistent hashing




Consistent hashing




                                                     http://horicky.blogspot.com/2009/11/nosql-patterns.html
Consistent hashing



                                   - node C
                                   + node D




                                                             http://www.lexemetech.com/2007/11/consistent-hashing.html
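The ring on these two slides can be sketched in a few lines. This is a minimal illustration under stated assumptions: one position per node (no virtual nodes), an FNV-1a string hash picked only for brevity, and made-up names (`HashRing`, `nodeFor`). It is not the implementation of Dynamo or of any store discussed here.

```javascript
// FNV-1a, a simple 32-bit string hash (an assumption, chosen for brevity).
function hash32(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

class HashRing {
  constructor() {
    this.points = []; // sorted list of { position, node }
  }

  addNode(node) {
    this.points.push({ position: hash32(node), node });
    this.points.sort((a, b) => a.position - b.position);
  }

  removeNode(node) {
    this.points = this.points.filter((p) => p.node !== node);
  }

  // A key belongs to the first node at or after its position, wrapping around.
  nodeFor(key) {
    const position = hash32(key);
    const owner = this.points.find((p) => p.position >= position);
    return (owner || this.points[0]).node;
  }
}

const ring = new HashRing();
["A", "B", "C"].forEach((n) => ring.addNode(n));
const keys = Array.from({ length: 1000 }, (_, i) => "key-" + i);
const before = new Map(keys.map((k) => [k, ring.nodeFor(k)]));

ring.removeNode("C");
// Only keys that lived on C change owner; everything else stays put.
const moved = keys.filter((k) => before.get(k) !== ring.nodeFor(k));
```

With plain `hash(key) % nodeCount` placement, removing a node would remap most keys. On the ring, only the departed node's keys move, and adding node D would likewise only take over keys from its clockwise neighbour.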


Google BigTable
» multi-dimensional column-oriented database

» on top of the Google File System (GFS)

» object versioning




CAP theorem


             strong                                high
           consistency                          availability



                               partition-
                               tolerance



CAP
» Strong Consistency: all clients see the
 same view, even in the presence of updates
» High Availability: all clients can find some
 replica of the data, even in the presence of
 failures
» Partition-tolerance: the system
 properties hold even when the system is
 partitioned

Consistency


» Where is my data I just updated?

» Ideal world :

 The result of every write-operation is
 reflected by subsequent read-operations.
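How a real setup falls short of this ideal can be sketched with asynchronous replication. This is a minimal simulation; the `Primary` and `Replica` classes and the explicit `catchUp()` step are hypothetical stand-ins for a database's replication log and its lag.

```javascript
// Hypothetical primary/replica pair with asynchronous replication.
class Primary {
  constructor() {
    this.data = new Map();
    this.log = [];
  }

  write(key, value) {
    this.data.set(key, value);
    this.log.push([key, value]); // shipped to replicas later
  }
}

class Replica {
  constructor(primary) {
    this.primary = primary;
    this.data = new Map();
    this.applied = 0; // how many log entries have been replayed
  }

  read(key) {
    return this.data.get(key);
  }

  // Replay whatever part of the primary's log has not been applied yet.
  catchUp() {
    while (this.applied < this.primary.log.length) {
      const [key, value] = this.primary.log[this.applied++];
      this.data.set(key, value);
    }
  }
}

const primary = new Primary();
const replica = new Replica(primary);

primary.write("profile:42", "v1");
replica.catchUp();
primary.write("profile:42", "v2");

const staleRead = replica.read("profile:42"); // replica lags: still sees v1
replica.catchUp();
const freshRead = replica.read("profile:42"); // now sees v2
```

The window between the second write and the second `catchUp()` is where "Where is my data I just updated?" comes from.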



Consistency




Sunny-day scenario




Network partitioning




Culture Clash
» Classic distributed systems: focus on ACID
 » atomic
 » consistent
 » isolated
 » durable

» Modern internet systems: focus on BASE
 » basically available
 » soft-state (or scalable)
 » eventually consistent

Culture Clash
» ACID
 » highest priority: strong consistency for transactions
 » availability less important
 » pessimistic
 » rigorous analysis
 » complex mechanisms

» BASE
 » availability and scaling highest priorities
 » weak consistency
 » optimistic
 » best effort
 » simple and fast

» ACID ⬌ BASE: a spectrum
Building for failure

» defensive programming

» creating replicas

» disk flushing

» watch out for failure of utility infrastructure

» conscious sync/async decisions



Possible storage failures
» Application errors

» Repeatable DB failures

» Unrepeatable DB failures

» OS errors

» Local cluster HW failure

» Local cluster network partitioning

» Disaster

» WAN network failure between remote clusters

                                                    Michael Stonebraker
Availability ≠
total async !

✘
The Enterprise Service Bus




                                bus =

                          congestion



Bus systems

» objects don’t fit in a pipe

» object ➙ message

» serialization / de-serialization cost

» message size

» queuing = cost



Use a mixture of both



» async + sync



                                        stuff which matters !




Numbers of scale




                   http://qos.doubleclick.net/counters/
Processing large datasets :

Map/Reduce

Smart Data
» sparse as a feature

» weak schemas

» ad-hoc indexing

» organic analytics

» near-data processing

» live(ly) datawarehouse

» distribution ➙ parallelization ➙ performance
Hadoop: HDFS + MapReduce
» single filesystem + single execution-space




MapReduce example: WordCount
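The slide's diagram did not survive extraction, so here is a rough in-memory sketch of the WordCount idea. It is not Hadoop code: the `map`, `shuffle` and `reduce` functions below are illustrative and run in a single process, whereas a real job would spread the map and reduce phases over many nodes.

```javascript
// Map phase: emit a (word, 1) pair for every word in one input line.
function map(line) {
  return line
    .toLowerCase()
    .split(/\W+/)
    .filter(Boolean)
    .map((word) => [word, 1]);
}

// Shuffle phase: group the emitted values by key.
function shuffle(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reduce phase: sum the counts collected for one word.
function reduce(word, counts) {
  return counts.reduce((a, b) => a + b, 0);
}

function wordCount(lines) {
  const grouped = shuffle(lines.flatMap((line) => map(line)));
  const result = {};
  for (const [word, counts] of grouped) {
    result[word] = reduce(word, counts);
  }
  return result;
}

const counts = wordCount(["the quick brown fox", "the lazy dog", "The end"]);
```

Because each line is mapped independently and each word is reduced independently, both phases can run in parallel; only the shuffle needs to move data between workers.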




MapReduce




MapReduce and HDFS




                                                                                © lars george

Physical architecture




Processing large datasets with MR

» Benefit from parallelisation

» Less modelling upfront (ad-hoc processing)

» Compartmentalized approach reduces
 operational risks
» AsterData et al. have SQL/MR hybrids for
 huge-scale BI


Market
overview

Categories


» key-value stores

» column stores

» document stores

» graph databases




Key-value stores



» Redis

» Voldemort

» Tokyo Cabinet




Redis



» REmote DIctionary Server

» http://code.google.com/p/redis/

» vmware




Redis Features
» persisted memcache, ‘awesome’

» RAM-based + persistable

» key ➙ values: string, list, set

» higher-level ops
 » e.g. push/pop and sort for lists
» fast (very)

» configurable durability

» client-managed sharding

Voldemort




» http://project-voldemort.com/

» LinkedIn




Voldemort


» persistent

» distributed

» fault-tolerant

» hash table




Voldemort


                                                        API: GET, PUT, DELETE
Voldemort




                    routing logic moving up the stack,
                    smaller latency

Voldemort data format

» key+values = arrays of bytes

» So how do we map objects ⬌ bytes ?

 » json
 » string
 » java-serialization
 » protobuf
 » identity
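The idea of pluggable serializers in front of a byte-array store can be sketched as below. This is not Voldemort's API: the `serializers` table, `put` and `get` are hypothetical, and only three of the listed formats (json, string, identity) are shown, using the standard `TextEncoder` and `TextDecoder` globals.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Each serializer turns a value into bytes and back again.
const serializers = {
  json: {
    toBytes: (value) => encoder.encode(JSON.stringify(value)),
    toObject: (bytes) => JSON.parse(decoder.decode(bytes)),
  },
  string: {
    toBytes: (value) => encoder.encode(value),
    toObject: (bytes) => decoder.decode(bytes),
  },
  identity: {
    toBytes: (bytes) => bytes, // caller already supplies bytes
    toObject: (bytes) => bytes,
  },
};

// The store itself only ever sees bytes. Key bytes are joined into a
// string here purely so they can serve as a Map key in this sketch.
const store = new Map();
const keyOf = (key) => serializers.string.toBytes(key).join(",");

function put(key, value, format) {
  store.set(keyOf(key), serializers[format].toBytes(value));
}

function get(key, format) {
  return serializers[format].toObject(store.get(keyOf(key)));
}

put("member:1", { name: "Ada", skills: ["cms", "hbase"] }, "json");
const member = get("member:1", "json");
const rawBytes = store.get(keyOf("member:1"));
```

The store stays format-agnostic, and the choice of serializer becomes a trade-off between readability (json), size, and speed.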


Tokyo Cabinet




» http://1978th.net/tokyocabinet/

» mixi.jp (the "Facebook of Japan")




Product Family




Tokyo Cabinet
» memory or filesystem

» hash, b-tree, fixed-length, table




Column stores



» BigTable

» HBase

» Cassandra




BigTable



» http://labs.google.com/papers/bigtable.html

» Google

» layered on top of GFS




HBase




» http://hadoop.apache.org/hbase/

» StumbleUpon / Adobe / Cloudera




HBase
» sorted
» distributed
» column-oriented
» multi-dimensional
» highly-available
» high-performance
» persisted
» storage system

» adds random access reads and writes atop HDFS
HBase data model
» Distributed multi-dimensional sparse map

» Multi-dimensional keys:
 (table, row, family:column, timestamp) → value




» Keys are arbitrary strings

» Access to row data is atomic
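The multi-dimensional key can be pictured as nested maps. The sketch below is an in-memory toy, not the HBase client API: `SparseTable`, `put` and `get` are hypothetical names, one instance stands for one table, and persistence, sorting and region distribution are all left out.

```javascript
// Hypothetical model: row -> "family:column" -> timestamp -> value.
class SparseTable {
  constructor() {
    this.rows = new Map();
  }

  put(row, column, timestamp, value) {
    if (!this.rows.has(row)) this.rows.set(row, new Map());
    const columns = this.rows.get(row);
    if (!columns.has(column)) columns.set(column, new Map());
    columns.get(column).set(timestamp, value);
  }

  // Return the newest version at or before `asOf` (default: the latest).
  get(row, column, asOf = Infinity) {
    const columns = this.rows.get(row);
    const versions = columns && columns.get(column);
    if (!versions) return undefined; // sparse: absent cells take no space
    let newest = -Infinity;
    let result;
    for (const [timestamp, value] of versions) {
      if (timestamp <= asOf && timestamp > newest) {
        newest = timestamp;
        result = value;
      }
    }
    return result;
  }
}

const pages = new SparseTable();
pages.put("com.example.www", "contents:html", 1, "<p>first</p>");
pages.put("com.example.www", "contents:html", 2, "<p>second</p>");
pages.put("com.example.www", "anchor:home", 1, "Home");

const latest = pages.get("com.example.www", "contents:html");
const older = pages.get("com.example.www", "contents:html", 1);
const missing = pages.get("org.example", "contents:html");
```

Rows only store the columns they actually have, which is what makes the map sparse, and the timestamp dimension gives versioning for free.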
Storage architecture




                                                                                © lars george

Cassandra




» http://cassandra.apache.org/

» Rackspace / Facebook




Cassandra
» Key-value store (with added structure)

» Reliability (identical nodes)

» Eventually consistent

» Distributed

» Tunable
 » Partitioning
 » Replication

[Figure: CAP triangle (C, A, P)]
Cassandra write pattern




Cassandra applicability
» FIT
 » Scalable reliability (through identical nodes)
 » Linear scaling
 » Write throughput
 » Large Data Sets

» NO FIT
 » Flexible indexing
 » Only PK-based querying
 » Big Binary Data
 » 1 Row must fit in RAM entirely
Document stores


» CouchDB

» MongoDB

» Riak

» MarkLogic




CouchDB




» http://couchdb.apache.org/

» couch.io




CouchDB


» fault-tolerant

» schema-free

» document-oriented

» accessible via a RESTful HTTP/JSON API




CouchDB documents

{
    "_id": "BCCD12CBB",
    "_rev": "AB764C",
    "type": "person",
    "name": "Darth Vader",
    "age": 63,
    "headware": ["Helmet", "Sombrero"],
    "dark_side": true
}


CouchDB REST API


» HTTP
 » PUT /db/docid
 » GET /db/docid
 » POST /db/docid
 » DELETE /db/docid




CouchDB Views
» MapReduce-based

» Filter, Collate, Aggregate

» Javascript

 map

 function (doc) {
   // emit one (tag, 1) row per tag on the document
   for (var i in doc.tags)
     emit(doc.tags[i], 1);
 }

 reduce

 function (Key, Values) {
   // sum the 1s emitted for a tag
   var sum = 0;
   for (var i in Values)
     sum += Values[i];
   return sum;
 }
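To see what such a view computes, the two functions can be run over a few sample documents outside CouchDB. The harness below is an assumption-laden sketch: `emit`, the sample `docs` and the grouping step are stand-ins for what the CouchDB view server does (roughly a grouped query), and the rereduce step is ignored.

```javascript
// Stand-in for CouchDB's emit(): collect the rows the map function produces.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// The view functions from the slide, given names so they can be called.
function map(doc) {
  for (var i in doc.tags)
    emit(doc.tags[i], 1);
}

function reduce(keys, values) {
  var sum = 0;
  for (var i in values)
    sum += values[i];
  return sum;
}

// Hypothetical sample documents.
const docs = [
  { _id: "a", tags: ["nosql", "hbase"] },
  { _id: "b", tags: ["nosql"] },
  { _id: "c" }, // no tags: the map function emits nothing
];

docs.forEach((doc) => map(doc));

// Group the emitted rows by key, then reduce each group.
const grouped = {};
for (const { key, value } of rows) {
  (grouped[key] = grouped[key] || []).push(value);
}
const tagCounts = {};
for (const key of Object.keys(grouped)) {
  tagCounts[key] = reduce([key], grouped[key]);
}
```

The result is a tag-to-count table, built without declaring any schema for the documents up front.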



CouchDB


» be careful with semantics
 » replication ≠ partitioning/sharding !
 » distributed database = distributable database

» sharded / distributed deployment
 requires proxy layer



MongoDB




» http://www.mongodb.org/

» 10gen




MongoDB
» cf. CouchDB, really

» except for:
 » C++
 » performance focus
 » runtime queries (mapreduce still available)
 » native drivers (no REST/HTTP layering)
 » no MVCC: update-in-place
 » auto sharding (alpha)

Riak




» http://riak.basho.com/

» Basho Technologies




Riak

» buckets/keys, links

» values/content = bucket + metadata

» pluggable storage engines (fs, (D)ETS, InnoDB)

» HTTP/REST API

» automatic distribution

» mapreduce using Javascript


Jackrabbit




» http://jackrabbit.apache.org/

» Day Software




Jackrabbit

» reference
 implementation for
 JSR 170 & 283
» remoting: WebDAV &
 RMI
» persistence: RDBMS,
 fs, memory


Jackrabbit

» Java-centric (duh)

» complex repository model (nodes+properties)
 » mixins, inheritance

» workspaces

» query language

» no partitioning/sharding
JCR API levels




Graph databases




» Neo4j

» AllegroGraph (RDF)




Neo4j




» http://neo4j.org/

» Neo Technology




Neo4j
» data = nodes + relationships + key/value properties
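That data model can be sketched as a tiny in-memory property graph. This is not the Neo4j API: the `Graph` class, `relate` and `neighbours` are hypothetical names used only to show nodes, typed relationships, and key/value properties on both.

```javascript
// Hypothetical in-memory property graph.
class Graph {
  constructor() {
    this.nodes = new Map();
    this.relationships = [];
  }

  addNode(id, properties = {}) {
    this.nodes.set(id, { id, properties });
  }

  // Relationships are directed, typed, and carry their own properties.
  relate(from, type, to, properties = {}) {
    this.relationships.push({ from, type, to, properties });
  }

  // Follow outgoing relationships of one type from a node.
  neighbours(id, type) {
    return this.relationships
      .filter((r) => r.from === id && r.type === type)
      .map((r) => this.nodes.get(r.to));
  }
}

const graph = new Graph();
graph.addNode("alice", { name: "Alice" });
graph.addNode("bob", { name: "Bob" });
graph.addNode("carol", { name: "Carol" });
graph.relate("alice", "KNOWS", "bob", { weight: 3 });
graph.relate("alice", "KNOWS", "carol", { weight: 1 });
graph.relate("bob", "KNOWS", "carol");

const friends = graph.neighbours("alice", "KNOWS").map((n) => n.properties.name);
// Friends of friends: a two-step traversal instead of a self-join.
const friendsOfFriends = graph
  .neighbours("alice", "KNOWS")
  .flatMap((n) => graph.neighbours(n.id, "KNOWS"))
  .map((n) => n.properties.name);
```

A traversal follows relationships from node to node, which is why this model suits data that is awkward to express as joins.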




Neo4j
» many language bindings, little remoting

» ‘whiteboard’ friendly

» scaling to complexity (rather than volume?)

» lots of focus on domain modelling

» SPARQL/SAIL impl for triple geeks

» mostly RAM centric (with disk swapping &
 persistence)

Experiences &
(h)in(d)sights

NOSQL applicability

» Horizontal scaling

» Multi-Master

» Data representation
 » search for simplicity
 » data that doesn’t fit the E-R model
   (graphs, trees, versions)
» Speed


Tools for the trade


» non-relational data: Couch, Mongo, Riak

» massive quantities: Cassandra, HBase

» persistent caching: Redis, Voldemort

» graphs: neo4j




Tool selection
» be careful with the marketese:
 beware of smoke and mirrors!
» monitor dev list, IRC, Twitter, blogs

» monitor project ‘sponsors’

» mix-and-match

» DON’T NOSQL WITHOUT INTERNAL SYS
 ARCHS & DEV(OP)S !

[Figure: aptness plotted against complexity, with regions for NOSQL and SQL; labels: internet, enterprise, corporate, community]
Our NOSQL-based project: Lily

» (open source)

» scalable store (Apache HBase)

» and search (Apache SOLR)

» content repository

» α due mid 2010

» www.lilycms.org or @outerthought


Lily architecture
[Figure: Lily architecture. Lily clients; Lily store nodes; distributed process coordination and configuration (ZooKeeper); Lily Store Server (query, update, indexer, WAL, MQ, M/R); HBase Region Server (documents, secondary indexes, WAL / MQ); Hadoop DFS; REST; SOLR (inverted index, index replicas).]
When combining store
and search, make sure
your (search) index
doesn’t become the
store.

Key lessons learned

» importance of keyspace design

» secondary indexing

» data de-normalization

» schema vs. code flexibility?

» distribution is everywhere
 and you shouldn’t forget about it


Reading material

» Amazon Dynamo, Google BigTable, CAP

» http://nosql.mypopescu.com/

» http://nosql-database.org/

» http://twitter.com/nosqlupdate

» http://highscalability.com/
Questions?




                                                                  http://www.flickr.com/photos/leehaywood/4237636853/


Thanks for your
                                    attention !



                                » stevenn@outerthought.org

                                »           @stevenn

IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org   112

More Related Content

PDF
KVIV / NoSQL : the new generation of database servers
PDF
Lily for the Bay Area HBase UG - NYC edition
PDF
The Lily RowLog library
PPTX
Making the most out of your internet search
PPTX
A Patient's request to exchange medical costs in last year of life for Hep C Tx.
PDF
A day in the life of Brad the bread loaf
PDF
San Mateo County Real Estate Price Report
PDF
Learning Lessons: Building a CMS on top of NoSQL technologies
KVIV / NoSQL : the new generation of database servers
Lily for the Bay Area HBase UG - NYC edition
The Lily RowLog library
Making the most out of your internet search
A Patient's request to exchange medical costs in last year of life for Hep C Tx.
A day in the life of Brad the bread loaf
San Mateo County Real Estate Price Report
Learning Lessons: Building a CMS on top of NoSQL technologies

Similar to N-O-SQL, new database technologies on the rise (20)

KEY
Building a CMS on top of NoSQL (for ParisJUG)
PDF
Welcome to the Age of Data
PDF
Hadoop World 2011: Lily: Smart Data at Scale, Made Easy
PDF
Outerthought / Lily Partnerships
PDF
NoSQL intro for YaJUG / NoSQL UG Luxembourg
PDF
Lily @ Work Webinar
PDF
NoSQL with Hadoop and HBase
PDF
Devoxx 2010 | Tools In Action : Kauri and Lily
PDF
Sirris innovate2011 - Lily, Smart Data at scale made easy, Steven Noels, Oute...
PDF
Devoxx 2010 | LAB : ReST in Java
PDF
Revitalizing Aging Architectures with Microservices
PDF
Lily at HUG UK
PDF
Huguk lily
PDF
MongoDB and the Internet of Things
PDF
NewSQL Database Overview
PDF
From Content Storage to Scaling Smart Data
PPTX
Binary Analysis - Luxembourg
PDF
Federated Approach for Interoperating AEC/FM Ontologies
PDF
Possibilities of generative models
PDF
The world is the computer and the programmer is you
Building a CMS on top of NoSQL (for ParisJUG)
Welcome to the Age of Data
Hadoop World 2011: Lily: Smart Data at Scale, Made Easy
Outerthought / Lily Partnerships
NoSQL intro for YaJUG / NoSQL UG Luxembourg
Lily @ Work Webinar
NoSQL with Hadoop and HBase
Devoxx 2010 | Tools In Action : Kauri and Lily
Sirris innovate2011 - Lily, Smart Data at scale made easy, Steven Noels, Oute...
Devoxx 2010 | LAB : ReST in Java
Revitalizing Aging Architectures with Microservices
Lily at HUG UK
Huguk lily
MongoDB and the Internet of Things
NewSQL Database Overview
From Content Storage to Scaling Smart Data
Binary Analysis - Luxembourg
Federated Approach for Interoperating AEC/FM Ontologies
Possibilities of generative models
The world is the computer and the programmer is you
Ad

More from NGDATA (6)

PDF
NGDATA Corporate Presentation
PDF
20110514 appsforghent
PPT
Big Data
PDF
Devoxx 2010 | Tools In Action : Kauri and Lily
KEY
NoSQL BOF at Devoxx
KEY
NoSQL "Tools in Action" talk at Devoxx
NGDATA Corporate Presentation
20110514 appsforghent
Big Data
Devoxx 2010 | Tools In Action : Kauri and Lily
NoSQL BOF at Devoxx
NoSQL "Tools in Action" talk at Devoxx
Ad

N-O-SQL, new database technologies on the rise

  • 1. N-O-SQL new database technologies on the rise http://www.flickr.com/photos/wolfgangstaudt/2215246206/ IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org
  • 2. Who am I » Steven Noels - stevenn@outerthought.org » Outerthought : scalable content applications » makers of Daisy and Lily open source CMS
  • 3. Agenda » raison d’être: what brought us here » concepts: required theory readings » market overview: trees & the forest » experiences and (h)in(d)sights
  • 4. Raison d’être
  • 5. History (diagram) » hierarchical databases, IMS, XMLDB, OODBMS » RDBMS » 1. standardization » 2. simplification
  • 6. Inconsistency through slave lag (John Quinn, Digg)
  • 7. Scaling writes (1) (John Quinn, Digg)
  • 8. Scaling writes (2) (John Quinn, Digg)
  • 9. Issues with partitioning » lose the ability to make arbitrary queries » have to predict data access patterns when formulating partitioning strategy » complex and fragile systems
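The first point on this slide can be illustrated with a minimal sketch of hash-based sharding. Everything here is an assumption made for illustration (the shard count, the hash function, the `users`-style records and all function names); it is not taken from the Digg setup the preceding slides refer to. A lookup by the partitioning key touches one shard, while a query on any other attribute has to visit all of them.

```javascript
// Illustrative only: three in-memory "shards" partitioned by user id.
const SHARD_COUNT = 3; // assumed value
const shards = Array.from({ length: SHARD_COUNT }, () => new Map());

// Simple stable string hash; any stable hash would do.
function shardFor(id) {
  let h = 0;
  for (const ch of String(id)) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h % SHARD_COUNT;
}

function put(user) {
  shards[shardFor(user.id)].set(user.id, user);
}

// Lookup by partition key: exactly one shard is consulted.
function getById(id) {
  return shards[shardFor(id)].get(id);
}

// Query on a non-key attribute: scatter to all shards, gather the results.
function findByCity(city) {
  let shardsVisited = 0;
  const hits = [];
  for (const shard of shards) {
    shardsVisited++;
    for (const user of shard.values()) {
      if (user.city === city) hits.push(user);
    }
  }
  return { hits, shardsVisited };
}

put({ id: "u1", city: "Gent" });
put({ id: "u2", city: "Brussel" });
put({ id: "u3", city: "Gent" });
```

Choosing `id` as the partitioning key is the "predicted access pattern": `getById` stays cheap, but every other query shape pays the scatter-gather cost.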
  • 10. Replication complexity
  • 11. Scaling relational systems » When scaling relational systems you lose their advantages but retain their overhead
  • 12. History (diagram) » RDBMS » 3. scaling: caching, denormalisation, sharding, replication, ... » 4. rethinking the problem » NOSQL
  • 13. Moore vs Kryder » seek time is constant (network latency as well?) » transfer rate ! spindles ! » as a principle, writes are hard to scale
  • 14. Cambrian Explosion
  • 15. Buzz-oriented development ?
  • 16. Cambrian Explosion N-O-SQL
  • 17.
  • 18. The Perspective of Cost
  • 19. Common themes » SCALE SCALE SCALE » new datamodels » devops » N-O-SQL » The Cloud : technology is of no interest anymore
  • 20. Numbers of scale http://qos.doubleclick.net/counters/
  • 21. Types of scaling » scaling for usage: volume of users, volume of data » scaling types of ops: concurrent read, concurrent write » availability, partitioning, replication, consistency, distribution
  • 22. Distributed systems are hard !
  • 23. 8 fallacies of distributed computing (Peter Deutsch and James Gosling) » The network is reliable. » Latency is zero. » Bandwidth is infinite. » The network is secure. » Topology doesn't change. » There is one administrator. » Transport cost is zero. » The network is homogeneous.
  • 24. New Data » sparse structures » weak schemas » graphs » semi-structured » document-oriented
  • 25. N-O-SQL = not only SQL !
  • 26. The NOSQL footprint (diagram) » NOSQL: MongoDB, CouchDB, neo4j, Cassandra, HBase » free-structured or sparse data » highly scalable and available » simple operational (complexity) constraints » SQL: ACID, referential integrity, typed data
  • 27. NOSQL, if you need ... » horizontal scaling (out rather than up) » unusually common data (aka free-structured) » speed (especially for writes) » the bleeding edge
  • 28. SQL/RDBMS, if you need ... » SQL » ACID » normalisation » a defined liability
  • 29. Theory
  • 30. Robust systems
  • 31. Academic background » Amazon Dynamo » Google BigTable » Eric Brewer CAP theorem
  • 32. Amazon Dynamo » coined the term ‘eventual consistency’ » consistent hashing
  • 33. Consistent hashing http://horicky.blogspot.com/2009/11/nosql-patterns.html
  • 34. Consistent hashing - node C + node D http://www.lexemetech.com/2007/11/consistent-hashing.html
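The two consistent-hashing slides are diagrams only, so the idea can be sketched in code. This is a minimal, illustrative ring and not Dynamo's implementation: the FNV-1a hash, the number of virtual points per node and the class and method names are all assumptions. Nodes and keys are hashed onto the same ring, and a key belongs to the first node point found clockwise from its hash.

```javascript
// 32-bit FNV-1a string hash (chosen for brevity; any well-spread hash works).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

class HashRing {
  constructor(virtualPoints = 64) {
    this.virtualPoints = virtualPoints; // points per node (assumed value)
    this.ring = [];                     // sorted array of { point, node }
  }
  addNode(node) {
    for (let i = 0; i < this.virtualPoints; i++) {
      this.ring.push({ point: fnv1a(node + "#" + i), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }
  removeNode(node) {
    this.ring = this.ring.filter((entry) => entry.node !== node);
  }
  // Walk clockwise from the key's hash to the first node point (linear scan for clarity).
  lookup(key) {
    const h = fnv1a(key);
    for (const entry of this.ring) {
      if (entry.point >= h) return entry.node;
    }
    return this.ring[0].node; // wrapped around the end of the ring
  }
}

const ring = new HashRing();
["A", "B", "C"].forEach((n) => ring.addNode(n));
const ownerBefore = ring.lookup("user:42");
ring.addNode("D");                          // "+ node D" from the slide
const ownerAfter = ring.lookup("user:42");  // either unchanged or now "D"
```

The property worth noticing is that adding node D only takes keys over from existing nodes; no key moves between A, B and C, which is what keeps rebalancing cheap compared with `hash(key) % nodeCount`.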
  • 35. Google BigTable » multi-dimensional column-oriented database » on top of Google File System » object versioning
  • 36. CAP theorem » strong consistency » high availability » partition-tolerance
  • 37. CAP » Strong Consistency: all clients see the same view, even in the presence of updates » High Availability: all clients can find some replica of the data, even in the presence of failures » Partition-tolerance: the system properties hold even when the system is partitioned
  • 38. Consistency » Where is the data I just updated? » Ideal world: the result of every write operation is reflected by subsequent read operations.
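To show how the "ideal world" breaks down, here is a small sketch of one primary with one asynchronously updated replica. The `Primary` and `Replica` classes and the explicit `catchUp` step are hypothetical, introduced only to make replication lag visible; real systems ship the log continuously over the network.

```javascript
// Illustrative sketch: one primary, one replica, asynchronous replication.
class Primary {
  constructor() {
    this.data = new Map();
    this.log = []; // writes not yet shipped to the replica
  }
  write(key, value) {
    this.data.set(key, value);
    this.log.push([key, value]);
  }
}

class Replica {
  constructor() {
    this.data = new Map();
  }
  read(key) {
    return this.data.get(key);
  }
  // Apply everything the primary has logged so far; the lag ends here.
  catchUp(primary) {
    for (const [key, value] of primary.log) this.data.set(key, value);
    primary.log.length = 0;
  }
}

const primary = new Primary();
const replica = new Replica();

primary.write("title", "N-O-SQL");
const staleRead = replica.read("title"); // undefined: the replica has not caught up yet
replica.catchUp(primary);
const freshRead = replica.read("title"); // "N-O-SQL"
```

Between the write and `catchUp`, a client reading from the replica does not see its own update. Eventually consistent systems accept that window in exchange for availability.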
  • 39. Consistency
  • 40. Sunny-day scenario
  • 41. Network partitioning
  • 42. Culture Clash » Classic distributed systems: focus on ACID (atomic, consistent, isolated, durable) » Modern internet systems: focus on BASE (basically available, soft-state (or scalable), eventually consistent)
  • 43. Culture Clash » ACID: highest priority is strong consistency for transactions; availability less important; pessimistic; rigorous analysis; complex mechanisms » BASE: availability and scaling are the highest priorities; weak consistency; optimistic; best effort; simple and fast » the two ends form a spectrum
  • 44. Building for failure » defensive programming » creating replicas » disk flushing » watch out for failure of utility infrastructure » conscious sync/async decisions
  • 45. Possible storage failures (Michael Stonebraker) » Application errors » Repeatable DB failures » Unrepeatable DB failures » OS errors » Local cluster HW failure » Local cluster network partitioning » Disaster » WAN network failure between remote clusters
  • 46. Availability ≠ total async!
  • 47. ✘ The Enterprise Service Bus » bus = congestion
  • 48. Bus systems » objects don’t fit in a pipe » object ➙ message » serialization / de-serialization cost » message size » queuing = cost
  • 49. Use a mixture of both » async + sync stuff which matters!
  • 50. Numbers of scale http://qos.doubleclick.net/counters/
  • 51. Processing large datasets : Map/Reduce
  • 52. Smart Data » sparse as a feature » weak schemas » ad-hoc indexing » organic analytics » near-data processing » live(ly) datawarehouse » distribution ➙ parallelization ➙ performance
  • 53. Hadoop: HDFS + MapReduce » single filesystem + single execution-space
  • 54. MapReduce example: WordCount
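The WordCount slide is an image, so here is a single-process sketch of the same idea. It is not Hadoop code and uses no Hadoop API; the `map`, `shuffle`, `reduce` and `wordCount` function names are illustrative. It only shows the three phases that a MapReduce framework would distribute across machines.

```javascript
// Map phase: emit a (word, 1) pair for every word in one input line.
function map(line) {
  return line
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word.length > 0)
    .map((word) => [word, 1]);
}

// Shuffle phase: group all emitted values by key.
function shuffle(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reduce phase: sum the counts collected for one word.
function reduce(key, values) {
  return values.reduce((a, b) => a + b, 0);
}

function wordCount(lines) {
  const pairs = lines.flatMap((line) => map(line));
  const result = {};
  for (const [word, counts] of shuffle(pairs)) {
    result[word] = reduce(word, counts);
  }
  return result;
}

const counts = wordCount(["the quick brown fox", "the lazy dog", "The end"]);
// counts.the is 3; every other word appears once
```

Because `map` looks at one line at a time and `reduce` at one key at a time, both phases can run in parallel on different nodes; only the shuffle needs to move data between them.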
  • 55. MapReduce
  • 56. MapReduce and HDFS © lars george
  • 57. Physical architecture
  • 58. Processing large datasets with MR » Benefit from parallelisation » Less modelling upfront (ad-hoc processing) » Compartmentalized approach reduces operational risks » AsterData et al. have SQL/MR hybrids for huge-scale BI
  • 59. Market overview
  • 60. Categories » key-value stores » column stores » document stores » graph databases
  • 61. Key-value stores » Redis » Voldemort » Tokyo Cabinet
  • 62. Redis » REmote DIctionary Server » http://code.google.com/p/redis/ » vmware
  • 63. Redis Features » persisted memcache, ‘awesome’ » RAM-based + persistable » key ➙ values: string, list, set » higher-level ops, e.g. push/pop and sort for lists » fast (very) » configurable durability » client-managed sharding
  • 64. Voldemort » http://project-voldemort.com/ » LinkedIn
  • 65. Voldemort » persistent » distributed » fault-tolerant » hash table
  • 66. Voldemort API: GET, PUT, DELETE
  • 67. Voldemort routing logic moving up the stack, smaller latency
  • 68. Voldemort data format » key+values = arrays of bytes » So how do we map objects ⬌ bytes? » json » string » java-serialization » protobuf » identity
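The split between a byte-array store and pluggable serializers can be sketched as follows. This is not Voldemort's client API: `ByteStore`, `StoreClient` and the `serializers` table are hypothetical names, and only three of the listed formats (string, json, identity) are shown. It assumes a runtime with the standard `TextEncoder`/`TextDecoder` globals, such as a recent Node.js or a browser.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Serializers turn application values into bytes and back (illustrative names).
const serializers = {
  string: {
    toBytes: (s) => encoder.encode(s),
    fromBytes: (b) => decoder.decode(b),
  },
  json: {
    toBytes: (obj) => encoder.encode(JSON.stringify(obj)),
    fromBytes: (b) => JSON.parse(decoder.decode(b)),
  },
  identity: {
    toBytes: (b) => b, // caller already supplies a Uint8Array
    fromBytes: (b) => b,
  },
};

// The store itself only ever sees byte arrays: GET, PUT, DELETE.
class ByteStore {
  constructor() {
    this.table = new Map();
  }
  static hex(bytes) {
    return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  }
  put(keyBytes, valueBytes) {
    this.table.set(ByteStore.hex(keyBytes), valueBytes);
  }
  get(keyBytes) {
    return this.table.get(ByteStore.hex(keyBytes));
  }
  delete(keyBytes) {
    return this.table.delete(ByteStore.hex(keyBytes));
  }
}

// Typed facade: one serializer for keys, one for values.
class StoreClient {
  constructor(store, keySerializer, valueSerializer) {
    this.store = store;
    this.keys = keySerializer;
    this.values = valueSerializer;
  }
  put(key, value) {
    this.store.put(this.keys.toBytes(key), this.values.toBytes(value));
  }
  get(key) {
    const bytes = this.store.get(this.keys.toBytes(key));
    return bytes === undefined ? undefined : this.values.fromBytes(bytes);
  }
  delete(key) {
    return this.store.delete(this.keys.toBytes(key));
  }
}

const client = new StoreClient(new ByteStore(), serializers.string, serializers.json);
client.put("member:1", { name: "Steven", company: "Outerthought" });
const member = client.get("member:1");
```

Keeping serialization at the client edge means the storage and routing layers never need to understand the data, which is what lets the format be chosen per store.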
  • 69. Tokyo Cabinet » http://1978th.net/tokyocabinet/ » mixi.jp (i.e. Facebook Japan)
  • 70. Product Family
  • 71. Tokyo Cabinet » memory or filesystem » hash, b-tree, fixed-length, table
  • 72. Column stores » BigTable » HBase » Cassandra
  • 73. BigTable » http://labs.google.com/papers/bigtable.html » Google » layered on top of GFS
  • 74. HBase » http://hadoop.apache.org/hbase/ » StumbleUpon / Adobe / Cloudera
  • 75. HBase » sorted » persisted » distributed » storage system » column-oriented » multi-dimensional » highly-available » high-performance » adds random access reads and writes atop HDFS
  • 76. HBase data model » Distributed multi-dimensional sparse map » Multi-dimensional keys: (table, row, family:column, timestamp) → value » Keys are arbitrary strings » Access to row data is atomic
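The "multi-dimensional sparse map" can be pictured as nested maps. The sketch below is an in-memory toy for one table, not the HBase client API; `SparseTable` and its methods are hypothetical names, and the example row key and column names are made up. It shows the (row, family:column, timestamp) → value addressing and the convention that a read without a timestamp returns the newest version.

```javascript
// Illustrative model of one table: row -> "family:column" -> timestamp -> value.
class SparseTable {
  constructor() {
    this.rows = new Map();
  }
  put(row, column, timestamp, value) {
    if (!this.rows.has(row)) this.rows.set(row, new Map());
    const columns = this.rows.get(row);
    if (!columns.has(column)) columns.set(column, new Map());
    columns.get(column).set(timestamp, value);
  }
  // Without a timestamp, return the newest version of the cell.
  get(row, column, timestamp) {
    const columns = this.rows.get(row);
    const versions = columns && columns.get(column);
    if (!versions) return undefined; // sparse: absent cells cost nothing
    if (timestamp !== undefined) return versions.get(timestamp);
    const newest = Math.max(...versions.keys());
    return versions.get(newest);
  }
  // Row keys in [startRow, stopRow); this sketch sorts on demand.
  scan(startRow, stopRow) {
    return [...this.rows.keys()]
      .filter((row) => row >= startRow && row < stopRow)
      .sort();
  }
}

const table = new SparseTable();
table.put("org.outerthought/www", "content:html", 1, "<html>v1</html>");
table.put("org.outerthought/www", "content:html", 2, "<html>v2</html>");
table.put("org.outerthought/blog", "meta:lang", 1, "en");

const latest = table.get("org.outerthought/www", "content:html");    // "<html>v2</html>"
const older = table.get("org.outerthought/www", "content:html", 1);  // "<html>v1</html>"
const rowsInRange = table.scan("org.outerthought/", "org.outerthought0");
```

Because rows are addressed and scanned by key order, the way row keys are composed largely decides which queries are cheap.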
  • 77. Storage architecture © lars george
  • 78. Cassandra » http://cassandra.apache.org/ » Rackspace / Facebook
  • 79. Cassandra » Key-value store (with added structure) » Reliability (identical nodes) » Eventually consistent » Distributed » Tunable » Partitioning » Replication » (diagram: CAP triangle with A, C, P)
  • 80. Cassandra write pattern
  • 81. Cassandra applicability » FIT: Scalable reliability (through identical nodes); Linear scaling; Write throughput; Large Data Sets » NO FIT: Flexible indexing; Only PK-based querying; Big Binary Data; 1 Row must fit in RAM entirely
  • 82. Document stores » CouchDB » MongoDB » Riak » MarkLogic
  • 83. CouchDB » http://couchdb.apache.org/ » couch.io
  • 84. CouchDB » fault-tolerant » schema-free » document-oriented » accessible via a RESTful HTTP/JSON API
  • 85. CouchDB documents { "_id": "BCCD12CBB", "_rev": "AB764C", "type": "person", "name": "Darth Vader", "age": 63, "headware": ["Helmet", "Sombrero"], "dark_side": true }
  • 86. CouchDB REST API » HTTP » PUT /db/docid » GET /db/docid » POST /db/docid » DELETE /db/docid
  • 87. CouchDB Views » MapReduce-based » Filter, Collate, Aggregate » Javascript » map: function (doc) { for (var i in doc.tags) emit(doc.tags[i], 1); } » reduce: function (Key, Values) { var sum = 0; for (var i in Values) sum += Values[i]; return sum; }
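To see what the slide's two functions compute, they can be run through a tiny local harness. The harness (`runView`) is hypothetical and greatly simplified: CouchDB itself provides `emit` as a global inside map functions, builds the view incrementally, and passes richer arguments to reduce, whereas here `emit` is passed in as a parameter and the values are simply grouped by key. The sample documents are made up.

```javascript
// Simplified local stand-in for a view engine; not CouchDB's implementation.
function runView(docs, mapFn, reduceFn) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of docs) mapFn(doc, emit);

  // Group emitted values by key, then reduce each group.
  const grouped = new Map();
  for (const { key, value } of rows) {
    if (!grouped.has(key)) grouped.set(key, []);
    grouped.get(key).push(value);
  }
  const result = {};
  for (const [key, values] of grouped) result[key] = reduceFn(key, values);
  return result;
}

// The slide's map function, with emit threaded through as a parameter.
const mapTags = function (doc, emit) {
  for (var i in doc.tags) emit(doc.tags[i], 1);
};

// The slide's reduce function: sum the emitted 1s per tag.
const sumValues = function (key, values) {
  var sum = 0;
  for (var i in values) sum += values[i];
  return sum;
};

const docs = [
  { _id: "a", tags: ["nosql", "hbase"] },
  { _id: "b", tags: ["nosql"] },
  { _id: "c" }, // no tags: the map function emits nothing
];
const tagCounts = runView(docs, mapTags, sumValues);
// tagCounts is { nosql: 2, hbase: 1 }
```

Document `c` shows why schema-free documents work well with views: a document that lacks the field simply contributes no rows.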
  • 88. CouchDB » be careful about semantics » replication ≠ partitioning/sharding! » distributed database = distributable database » sharded / distributed deployment requires proxy layer
  • 89. MongoDB » http://www.mongodb.org/ » 10gen
  • 90. MongoDB » cfr. CouchDB, really » except for: » C++ » performance focus » runtime queries (mapreduce still available) » native drivers (no REST/HTTP layering) » no MVCC: update-in-place » auto sharding (alpha)
  • 91. Riak » http://riak.basho.com/ » Basho Technologies
  • 92. Riak » buckets/keys, links » values/content = bucket + metadata » pluggable storage engines (fs, (D)ETS, InnoDB) » HTTP/REST API » automatic distribution » mapreduce using Javascript
  • 93. Jackrabbit » http://jackrabbit.apache.org/ » Day Software
  • 94. Jackrabbit » reference implementation for JSR 170 & 283 » remoting: WebDAV & RMI » persistence: RDBMS, fs, memory
  • 95. Jackrabbit » Java-centric (duh) » complex repository model (nodes+properties) » mixins, inheritance » workspaces » query language » no partitioning/sharding
  • 96. JCR API levels
  • 97. Graph databases » Neo4j » AllegroGraph (RDF)
  • 98. Neo4j » http://neo4j.org/ » Neo Technology
  • 99. Neo4j » data = nodes + relationships + key/value properties
  • 100. Neo4j » many language bindings, little remoting » ‘whiteboard’ friendly » scaling to complexity (rather than volume?) » lots of focus on domain modelling » SPARQL/SAIL impl for triple geeks » mostly RAM centric (with disk swapping & persistence)
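The "nodes + relationships + key/value properties" model can be sketched with a few lines of plain JavaScript. This is not the Neo4j API: the `Graph` class, its methods and the sample data are hypothetical, and the traversal is a simple breadth-first walk over one relationship type.

```javascript
// Illustrative property graph: nodes and relationships both carry key/value properties.
class Graph {
  constructor() {
    this.nodes = new Map();  // id -> { id, properties }
    this.relationships = []; // { from, type, to, properties }
  }
  addNode(id, properties = {}) {
    this.nodes.set(id, { id, properties });
  }
  relate(from, type, to, properties = {}) {
    this.relationships.push({ from, type, to, properties });
  }
  // Follow outgoing relationships of one type, breadth-first, up to maxDepth hops.
  traverse(startId, type, maxDepth) {
    const seen = new Set([startId]);
    const found = [];
    let frontier = [startId];
    for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
      const next = [];
      for (const id of frontier) {
        for (const rel of this.relationships) {
          if (rel.from === id && rel.type === type && !seen.has(rel.to)) {
            seen.add(rel.to);
            next.push(rel.to);
            found.push(rel.to);
          }
        }
      }
      frontier = next;
    }
    return found;
  }
}

const graph = new Graph();
graph.addNode("alice", { name: "Alice" });
graph.addNode("bob", { name: "Bob" });
graph.addNode("carol", { name: "Carol" });
graph.addNode("acme", { kind: "company" });
graph.relate("alice", "KNOWS", "bob", { since: 2008 });
graph.relate("bob", "KNOWS", "carol", { since: 2009 });
graph.relate("alice", "WORKS_AT", "acme");

const friendsOfFriends = graph.traverse("alice", "KNOWS", 2); // ["bob", "carol"]
```

The same question in a relational schema needs one self-join per hop, which is the sense in which graph databases scale "to complexity" rather than to volume.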
  • 101. Experiences & (h)in(d)sights
  • 102. NOSQL applicability » Horizontal scaling » Multi-Master » Data representation » search of simplicity » data that doesn’t fit the E-R model (graphs, trees, versions) » Speed
  • 103. Tools for the trade » non-relational data: Couch, Mongo, Riak » massive quantities: Cassandra, HBase » persistent caching: Redis, Voldemort » graphs: neo4j
  • 104. Tool selection » beware of the marketese: smoke and mirrors! » monitor dev list, IRC, Twitter, blogs » monitor project ‘sponsors’ » mix-and-match » DON’T NOSQL WITHOUT INTERNAL SYS ARCHS & DEV(OP)S !
  • 105. (diagram: aptness of NOSQL and SQL plotted against complexity; labels: internet, enterprise, corporate, community)
  • 106. Our NOSQL-based project: Lily » (open source) » scalable store (Apache HBase) » and search (Apache SOLR) » content repository » α due mid 2010 » www.lilycms.org or @outerthought
  • 107. Lily architecture (diagram) » distributed process coordination and configuration (ZooKeeper) » Lily: query, update, indexer; Lily Store Server; store client, M/R client, MQ client; store nodes with WAL / MQ and 2ary WAL » HBase Region Server: documents, indexes » Hadoop DFS: replicas » SOLR: REST, inverted index, index replicas
  • 108. When combining store and search, make sure your (search) index doesn’t become the store.
  • 109. Key lessons learned » importance of keyspace design » secondary indexing » data de-normalization » schema vs. code flexibility? » distribution is everywhere and you shouldn’t forget about it
  • 110. Reading material » Amazon Dynamo, Google BigTable, CAP » http://nosql.mypopescu.com/ » http://nosql-database.org/ » http://twitter.com/nosqlupdate » http://highscalability.com/
  • 111. Questions? http://www.flickr.com/photos/leehaywood/4237636853/
  • 112. Thanks for your attention ! » stevenn@outerthought.org » @stevenn