Polyglot persistence for Java
  developers - moving out of the
      relational comfort zone



Chris Richardson

Author of POJOs in Action
Founder of CloudFoundry.com
Chris.Richardson@SpringSource.Com
@crichardson
Overall presentation goal


The joy and pain of
   building Java
  applications that
     use NoSQL

     5/3/11   Copyright (c) 2011 Chris Richardson. All rights reserved.
                                                                          Slide 2
About Chris
              •    Grew up in England and live in Oakland, CA
              •    More than 25 years of software development
                   experience including 14 years of Java
              •    Speaker at JavaOne, SpringOne, NFJS,
                   JavaPolis, Spring Experience, etc.
              •    Organize the Oakland JUG and the Groovy
                   Grails meetup




                                     http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/



Agenda
o  The trouble with relational
   databases
o  Overview of NoSQL databases
o  Introduction to Spring Data
o  NoSQL case study: POJOs in Action




Relational databases are great
o  SQL = Rich, declarative query language
o  Database enforces referential integrity
o  ACID semantics
o  Well understood by developers
o  Well supported by frameworks and tools, e.g. Spring
   JDBC, Hibernate, JPA
o  Well understood by operations
     n    Configuration
     n    Care and feeding
     n    Backups
     n    Tuning
     n    Failure and recovery
     n    Performance characteristics
o  But….


The trouble with relational databases
o  Object/relational impedance mismatch
   n  Complicated to map rich domain model to
       relational schema
o  Relational schema is rigid
   n  Difficult to handle semi-structured data, e.g.
       varying attributes
   n  Schema changes = downtime or $$
o  Extremely difficult/impossible to scale writes:
   n  Vertical scaling is limited/requires $$
   n  Horizontal scaling is limited or requires $$
o  Performance can be suboptimal for some use
   cases

NoSQL databases have emerged…
Each one offers some
combination of:
o  High performance
o  High scalability
o  Rich data-model
o  Schema less
In return for:
o  Limited transactions
o  Relaxed consistency
o  …

… but there are few commonalities
o  Everyone and their dog has written one
o  Different data models
   n    Key-value
   n    Column
   n    Document
   n    Graph
o  Different APIs – No JDBC, Hibernate,
   JPA (generally)
o  “Same sorry state as the database
   market in the 1970s before SQL was
   invented” http://queue.acm.org/detail.cfm?id=1961297

How do I access my data?




Reality Check - Relational DBs are not going away


  §  NoSQL usage small by comparison…
  §  But growing…




Future = multi-paradigm data storage
for enterprise applications




                                                     IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg




Agenda
o  The trouble with relational databases
o  Overview of NoSQL databases
o  Introduction to Spring Data
o  NoSQL case study: POJOs in Action




Redis
o  Advanced key-value store
   n  Think memcached on steroids (the good kind)
   n  Values can be binary strings, Lists, Sets, Sorted
       Sets, Hash maps, ..
   n  Operations for each data type, e.g. appending
       to a list, adding to a set, retrieving a slice of a
       list, …
   [Diagram: keys K1, K2, K3 mapped to values V1, V2, V2]
o  Very fast:
   n    In-memory operations
   n    ~100K operations/second on entry-level hardware
o  Persistent
   n  Periodic snapshots of memory OR append
       commands to log file
   n  Limits are size of keys retained in memory.
o  Has transactions
   n    Commands can be batched and executed
         atomically



Redis CLI
redis> sadd myset a
(integer) 1
redis> sadd myset b
(integer) 1
redis> smembers myset
1. "a"
2. "b"
redis> srem myset a
(integer) 1
redis> smembers myset
1. "b"

Scaling Redis
o  Master/slave replication
   n  Tree of Redis servers
   n  Non-persistent master can replicate to a persistent slave
   n  Use slaves for read-only queries
o  Sharding
   n  Client-side only – consistent hashing based on key
   n  Server-side sharding – coming one day
o  Run multiple servers per physical host
   n  Server is single threaded => Leverage multiple CPUs
   n  32 bit more efficient than 64 bit
o  Optional "virtual memory"
   n  Ideally data should fit in RAM
   n  Values (not keys) written to disc


Redis use cases
o  Use in conjunction with another database as the
   SOR
o  Drop-in replacement for Memcached
     n  Session state
     n  Cache of data retrieved from SOR
     n  Denormalized datastore for high-performance queries
o    Hit counts using INCR command
o    Randomly selecting an item – SRANDMEMBER
o    Queuing – Lists with LPOP, RPUSH, ….
o    High score tables – Sorted sets

o  Notable users: github, guardian.co.uk, ….

Cassandra
o  An Apache open-source project originally
   developed by Facebook for inbox search
o  Extremely scalable
o  Fast writes = append to a log
o  Data is replicated and sharded
o  Rack and datacenter aware
o  Column-oriented database
  n  The data model will hurt your brain
  n  4 or 5-dimensional hash map

Cassandra data model
My Column family (within a key space)

   Keys    Columns
   a       colA: value1    colB: value2    colC: value3
   b       colA: value     colD: value     colE: value

o  4-D map: keySpace x key x columnFamily x
  column -> value
o  Column names are dynamic; can contain
   data
o  Arbitrary number of columns
o  One CF row = one DDD aggregate
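
The 4-D map idea can be made concrete with nested Java maps. This is only an in-memory model of the data structure, not a Cassandra client; the class and method names are invented for the sketch, and the nesting order (keyspace, then column family, then row key) is a choice made for readability.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** In-memory model of keySpace x key x columnFamily x column -> value. */
final class FourDimensionalMap {
    // keyspace -> column family -> row key -> (column name -> value), columns sorted by name
    private final Map<String, Map<String, Map<String, TreeMap<String, String>>>> data = new HashMap<>();

    /** Inserts or overwrites a single column; rows and columns are created on demand. */
    void insert(String keyspace, String columnFamily, String key, String column, String value) {
        data.computeIfAbsent(keyspace, k -> new HashMap<>())
            .computeIfAbsent(columnFamily, k -> new HashMap<>())
            .computeIfAbsent(key, k -> new TreeMap<>())
            .put(column, value);
    }

    /** Returns the row's columns sorted by name, or an empty map if the row does not exist. */
    SortedMap<String, String> row(String keyspace, String columnFamily, String key) {
        Map<String, Map<String, TreeMap<String, String>>> families = data.get(keyspace);
        Map<String, TreeMap<String, String>> rows = families == null ? null : families.get(columnFamily);
        TreeMap<String, String> columns = rows == null ? null : rows.get(key);
        if (columns == null) {
            return new TreeMap<String, String>();
        }
        return Collections.unmodifiableSortedMap(columns);
    }

    /** Returns one column value, or null if the row or column is missing. */
    String get(String keyspace, String columnFamily, String key, String column) {
        return row(keyspace, columnFamily, key).get(column);
    }
}
```

Because each row is its own map, rows in the same column family can hold different column names, which is what "column names are dynamic" means in practice.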
Cassandra data model – insert/update
My Column family (within a key space)

  Keys    Columns
  a       colA: value1    colB: value2    colC: value3
  b       colA: value     colD: value     colE: value

Transaction = updates to a row within a ColumnFamily

Insert(key=a, columnName=colZ, value=foo)

  Keys    Columns
  a       colA: value1    colB: value2    colC: value3    colZ: foo
  b       colA: value     colD: value     colE: value


Cassandra query example – slice
  Keys    Columns
  a       colA: value1    colB: value2    colC: value3    colZ: foo
  b       colA: value     colD: value     colE: value

slice(key=a, startColumn=colA, endColumnName=colC)

  Keys    Columns
  a       colA: value1    colB: value2

You can also do a rangeSlice which returns a range of keys
– less efficient
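
A slice over one row can be modelled with a sorted map. The sketch below is a plain-Java illustration, not Cassandra client code; the class name is made up, and it treats both bounds as inclusive, which is an assumption of the sketch (so a slice from colA to colC here also returns colC).

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Models a column slice: the columns of one row whose names fall in a given range. */
final class ColumnSlice {
    /**
     * Returns a copy of the columns named between startColumn and endColumn.
     * Assumption of this sketch: both bounds are inclusive.
     */
    static NavigableMap<String, String> slice(NavigableMap<String, String> row,
                                              String startColumn, String endColumn) {
        return new TreeMap<>(row.subMap(startColumn, true, endColumn, true));
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rowA = new TreeMap<>();
        rowA.put("colA", "value1");
        rowA.put("colB", "value2");
        rowA.put("colC", "value3");
        rowA.put("colZ", "foo");
        // colZ sorts after colC, so it is left out of the slice.
        System.out.println(slice(rowA, "colA", "colC"));
    }
}
```

The slice is cheap because the columns of a row are already kept in sorted order; a range over keys has no such guarantee in this model.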



Super Column Families – one more
dimension
My Column family (within a key space)

  Keys    Super columns
          ScA                                ScB
  a       colA: value1    colB: value2       colC: value3
  b       colA: value     colD: value        colE: value

Insert(key=a, superColumn=scB, columnName=colZ, value=foo)

keySpace x key x columnFamily x superColumn x column -> value

  Keys    Super columns
          ScA                                ScB
  a       colA: value1    colB: value2       colC: value3    colZ: foo
  b       colA: value     colD: value        colE: value

Getting data with super slice
My Column family (within a key space)

  Keys    Super columns
          ScA                                ScB
  a       colA: value1    colB: value2       colC: value3
  b       colA: value     colD: value        colE: value

slice(key=a, startColumn=scB, endColumnName=scC)

keySpace x key x columnFamily x Super column x column -> value

  Keys    Super columns
          ScB
  a       colC: value3



Cassandra CLI
$ bin/cassandra-cli -h localhost
Connected to: "Test Cluster" on localhost/9160
Welcome to cassandra CLI.
[default@unknown] use Keyspace1;
Authenticated to keyspace: Keyspace1
[default@Keyspace1] list restaurantDetails;
Using default limit of 100
-------------------
RowKey: 1
=> (super_column=attributes,
     (column=json, value={"id":
    1,"name":"Ajanta","menuItems"....

[default@Keyspace1] get restaurantDetails['1']['attributes'];
=> (column=json, value={"id":1,"name":"Ajanta","menuItems"....




Scaling Cassandra
• Client connects to any node
• Dynamically add/remove nodes
• Reads/Writes specify how many nodes
• Configurable # of replicas
    •  adjacent nodes
    •  rack and data center aware

[Diagram: ring of four nodes with "Replicates to" arrows between adjacent nodes]
  Node 1    Token = A    Keys = [D, A]
  Node 2    Token = B    Keys = [A, B]
  Node 3    Token = C    Keys = [B, C]
  Node 4    Token = D    Keys = [C, D]
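
One way to read the ring is sketched below in plain Java. It is an illustration only: the class is hypothetical, it compares raw keys against tokens (as an order-preserving layout would, whereas a real cluster may hash keys first), it places replicas on the next nodes around the ring, and it ignores rack and data center awareness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Token-ring sketch: a key belongs to the first node whose token is >= the key, wrapping around. */
final class TokenRing {
    // token -> node name, ordered by token
    private final TreeMap<String, String> nodesByToken = new TreeMap<>();

    void addNode(String token, String node) {
        nodesByToken.put(token, node);
    }

    /** Returns the owning node followed by the next nodes on the ring, up to the replica count. */
    List<String> replicasFor(String key, int replicas) {
        List<String> result = new ArrayList<>();
        if (nodesByToken.isEmpty()) {
            return result;
        }
        String token = nodesByToken.ceilingKey(key);
        if (token == null) {
            token = nodesByToken.firstKey(); // past the last token: wrap to the start of the ring
        }
        int count = Math.min(replicas, nodesByToken.size());
        while (result.size() < count) {
            result.add(nodesByToken.get(token));
            token = nodesByToken.higherKey(token);
            if (token == null) {
                token = nodesByToken.firstKey();
            }
        }
        return result;
    }
}
```

With tokens A to D, a key just above C is owned by Node 4 and, with two replicas, is also held by Node 1, which matches the wrap-around range [D, A] shown for Node 1.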


Cassandra use cases
o  Use cases
  •  Big data
  •  Persistent cache
  •  (Write intensive) Logging

o  Who is using it
  n  Digg, Facebook, Twitter, Reddit, Rackspace
  n  Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
  n  The largest production cluster has over 100
      TB of data in over 150 machines. –
      Cassandra web site
MongoDB
o  Document-oriented database
  n  JSON-style documents: Lists, Maps, primitives
  n  Documents organized into collections (~table)
o  Full or partial document updates
  n  Transactional update in place on one document
  n  Atomic Modifiers
o  Rich query language for dynamic queries
o  Index support – secondary and compound
o  GridFS for efficiently storing large files
o  Map/Reduce

Data Model = Binary JSON documents
 {
     "name" : "Ajanta",
     "type" : "Indian",
     "serviceArea" : [
        "94619",
        "94618"
     ],
     "openingHours" : [
        {
           "dayOfWeek" : "Monday",
           "open" : 1730,
           "close" : 2130
        }
     ],
     "_id" : ObjectId("4bddc2f49d1505567c6220a0")
 }

One document = one DDD aggregate



o  Sequence of bytes on disk = fast I/O
     n    No joins/seeks
     n    In-place updates when possible => no index updates
o  Transaction = update of single document

MongoDB CLI
$ bin/mongo
> use mydb
> r1 = {name: 'Ajanta'}
{name: 'Ajanta'}
> r2 = {name: 'Montclair Egg Shop'}
{name: 'Montclair Egg Shop'}
> db.restaurants.save(r1)
> r1
{ _id: ObjectId("98…"), name: "Ajanta"}
> db.restaurants.save(r2)
> r2
{ _id: ObjectId("66…"), name: "Montclair Egg Shop"}
> db.restaurants.find({name: /^A/})
{ _id: ObjectId("98…"), name: "Ajanta"}
> db.restaurants.update({name: "Ajanta"},
                        {name: "Ajanta Restaurant"})

MongoDB query by example
o  Find a restaurant that serves the 94619 zip
   code and is open at 6pm on a Monday
  {
      serviceArea:"94619",
      openingHours: {
        $elemMatch : {
             "dayOfWeek" : "Monday",
             "open": {$lte: 1800},
             "close": {$gte: 1800}
         }
      }
  }
            DBCursor cursor = collection.find(qbeObject);
            while (cursor.hasNext()) {
               DBObject o = cursor.next();
               …
             }
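
What the $elemMatch condition means can be shown with an in-memory check in plain Java. This sketch does not use the MongoDB driver; the class and method names are hypothetical. The point it illustrates is that a single openingHours element must satisfy the day, open, and close conditions together.

```java
import java.util.List;

/** In-memory illustration of the query: zip code served AND one time range covers the day and time. */
final class ElemMatchSketch {
    static final class TimeRange {
        final String dayOfWeek;
        final int open;
        final int close;

        TimeRange(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    /** True when zip is in serviceArea and a single openingHours element matches all three conditions. */
    static boolean matches(List<String> serviceArea, List<TimeRange> openingHours,
                           String zip, String day, int time) {
        if (!serviceArea.contains(zip)) {
            return false;
        }
        for (TimeRange range : openingHours) {
            // All conditions are tested against the same element, as $elemMatch requires.
            if (range.dayOfWeek.equals(day) && range.open <= time && range.close >= time) {
                return true;
            }
        }
        return false;
    }
}
```

For a restaurant open Monday 1130 to 1430 and 1730 to 2130, a delivery at 1500 does not match: one range opens early enough and the other closes late enough, but no single range does both.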


Scaling MongoDB
[Diagram: client, mongos, config servers and two shards]
  Shard 1: Mongod (master), Mongod (replica), Mongod (replica)
  Shard 2: Mongod (master), Mongod (replica), Mongod (replica)
  Config Server: mongod, mongod, mongod
  mongos
  client

A shard consists of a replica set = generalization of master slave
Collections spread over multiple shards



MongoDB use cases
o  Use cases
  n    Real-time analytics
  n    Content management systems
  n    Single document partial update
  n    Caching
  n    High volume writes
o  Who is using it?
  n    Shutterfly, Foursquare
  n    Bit.ly Intuit
  n    SourceForge, NY Times
  n    GILT Groupe, Evite,
  n    SugarCRM

Other NoSQL databases
o  SimpleDB – “key-value”
o  Neo4j – graph database
o  CouchDB – document-oriented
o  Membase – key-value
o  Riak – key-value + links
o  HBase – column-oriented
o  …
 http://nosql-database.org/ has a list of 122 NoSQL databases




Agenda
o  The trouble with relational databases
o  Overview of NoSQL databases
o  Introduction to Spring Data
o  NoSQL case study: POJOs in Action




NoSQL Java APIs


Database             Libraries
Redis                Jedis, JRedis, JDBC-Redis, RJC

Cassandra            Raw Thrift if you are a masochist
                     Hector, …

MongoDB              MongoDB provides a Java driver




Spring Data Project Goals
o  Bring classic Spring value propositions
   to a wide range of NoSQL databases:
  n  Productivity
  n  Programming model consistency: E.g.
      <NoSQL>Template classes
  n  “Portability”
o  Many entry points to use
  n    Auto-generated repository implementations
  n    Opinionated APIs (Think JdbcTemplate)
  n    Object Mapping (Java and GORM)
  n    Cross Store Persistence Programming model
  n    Productivity support in Roo and Grails
Spring Data sub-projects
§ Commons: Polyglot persistence
§ Key-Value: Redis, Riak
§ Document: MongoDB, CouchDB
§ Graph: Neo4j
§ GORM for NoSQL
§ Various milestone releases
  § Key Value 1.0.0.M3 (April 6, 2011)
  § Document 1.0.0.M2 (April 9, 2011)
  § Graph - Neo4j Support 1.0.0 (April 19, 2011)
  § …             http://www.springsource.org/spring-data

MongoTemplate
Simplifies data access
Translates exceptions
POJO <-> DBObject mapping

MongoTemplate
  Properties:
    databaseName
    userId
    password
    defaultCollectionName
    writeConcern
    writeResultChecking
  Methods:
    save()
    insert()
    remove()
    updateFirst()
    findOne()
    find()
    …
  Uses:
    Mongo (Java Driver class)
    <<interface>> MongoConverter
      write(Object, DBObject)
      read(Class, DBObject)
      SimpleMongoConverter, MongoMappingConverter


Richer mapping

Annotations define mapping:
  @Document, @Id, @Indexed,
  @PersistenceConstructor,
  @CompoundIndex, @DBRef,
  @GeoSpatialIndexed, @Value
Map fields instead of properties
  -> no getters or setters required
Non-default constructor
Index generation

@Document
public class Person {

 @Id
 private ObjectId id;
 private String firstname;

 @Indexed
 private String lastname;

 @PersistenceConstructor
 public Person(String firstname, String lastname) {
   this.firstname = firstname;
   this.lastname = lastname;
 }

….
}
Generic Mongo Repositories
interface PersonRepository extends MongoRepository<Person, ObjectId> {
   List<Person> findByLastname(String lastName);
}

<beans>
 <mongo:repositories
  base-package="net.chrisrichardson.mongodb.example.mongorepository"
     mongo-template-ref="mongoTemplate" />
</beans>



Person p = new Person("John", "Doe");
personRepository.save(p);

Person p2 = personRepository.findOne(p.getId());

List<Person> johnDoes = personRepository.findByLastname("Doe");
assertEquals(1, johnDoes.size());

Support for the QueryDSL project
 Generated from domain model class
 Type-safe composable queries


QPerson person = QPerson.person;

Predicate predicate =
       person.homeAddress.street1.eq("1 High Street")
              .and(person.firstname.eq("John"));

List<Person> people = personRepository.findAll(predicate);

assertEquals(1, people.size());
assertPersonEquals(p, people.get(0));

Cross-store/polyglot persistence
@Entity
public class Person {
  // In Database
  @Id private Long id;
  private String firstname;
  private String lastname;

  // In MongoDB
  @RelatedDocument private Address address;
  …
}

Person person = new Person(…);
entityManager.persist(person);
Person p2 = entityManager.find(…)

{ "_id" : ObjectId("….."),
  "_entity_id" : NumberLong(1),
  "_entity_class" : "net.. Person",
  "_entity_field_name" : "address",
  "zip" : "94611", "street1" : "1 High Street", …}

Agenda
o  The trouble with relational databases
o  Overview of NoSQL databases
o  Introduction to Spring Data
o  NoSQL case study: POJOs in
   Action




Food to Go
o  Customer enters delivery address and delivery
   time
o  System displays available restaurants
    n  = restaurants that serve the zip code of the delivery
        address AND are open at the delivery time
class Restaurant {
  long id;
  String name;
  Set<String> serviceArea;
  Set<TimeRange> openingHours;
  List<MenuItem> menuItems;
}

class TimeRange {
  long id;
  int dayOfWeek;
  int openingTime;
  int closingTime;
}

class MenuItem {
  String name;
  double price;
}



Database schema
RESTAURANT table
ID    Name                 …
1     Ajanta
2     Montclair Eggshop

RESTAURANT_ZIPCODE table
Restaurant_id    zipcode
1                94707
1                94619
2                94611
2                94619

RESTAURANT_TIME_RANGE table
Restaurant_id    dayOfWeek    openTime    closeTime
1                Monday       1130        1430
1                Monday       1730        2130
2                Tuesday      1130        …

SQL for finding available
 restaurants
Straightforward three-way join

select r.*
from restaurant r
 inner join restaurant_time_range tr
   on r.id = tr.restaurant_id
 inner join restaurant_zipcode sa
   on r.id = sa.restaurant_id
where '94619' = sa.zip_code
and tr.day_of_week = 'Monday'
and tr.openingtime <= 1930
and 1930 <= tr.closingtime

Redis - Persisting restaurants is
    “easy”
rest:1:details           [ name: “Ajanta”, … ]

rest:1:serviceArea       [ “94619”, “94611”, …]

rest:1:openingHours      [10, 11]

timerange:10             [“dayOfWeek”: “Monday”, ..]

timerange:11             [“dayOfWeek”: “Tuesday”, ..]


                               OR

rest:1                    [ name: “Ajanta”,
                            “serviceArea:0” : “94611”, “serviceArea:1” : “94619”,
                            “menuItem:0:name” : “Chicken Vindaloo”,
                            …]



                               OR


 rest:1                   { .. A BIG STRING/BYTE ARRAY, E.G. JSON }
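
The second option above, one hash per restaurant with indexed field names, can be sketched in plain Java. The class and method names are hypothetical and only the fields shown on the slide are covered; the resulting map is the kind of field/value set that could be sent to Redis with an HMSET command.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Flattens a restaurant into the field/value pairs of a single Redis hash. */
final class RestaurantHashFields {
    /** Builds the Redis key for a restaurant, e.g. rest:1. */
    static String key(long restaurantId) {
        return "rest:" + restaurantId;
    }

    /** Collections are flattened by putting the element index into the field name. */
    static Map<String, String> toHashFields(String name, List<String> serviceArea,
                                            List<String> menuItemNames) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", name);
        for (int i = 0; i < serviceArea.size(); i++) {
            fields.put("serviceArea:" + i, serviceArea.get(i));
        }
        for (int i = 0; i < menuItemNames.size(); i++) {
            fields.put("menuItem:" + i + ":name", menuItemNames.get(i));
        }
        return fields;
    }
}
```

Whichever layout is chosen, the restaurant is still only reachable through its key, which is the limitation the next slide turns to.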



BUT…
o  … we can only retrieve them via
   primary key
 -> Queries, rather than the data model,
    drive NoSQL database design
 -> We need to implement indexes
o  But how can a key-value store
   support a query that has the following?
  n  A 3-way join
  n  Multiple =
  n  > and <

Denormalization eliminates joins
 Restaurant_id   Day_of_week     Open_time                    Close_time                 Zip_code


 1               Monday          1130                         1430                       94707
 1               Monday          1130                         1430                       94619
 1               Monday          1730                         2130                       94707
 1               Monday          1730                         2130                       94619
 2               Monday          0700                         1430                       94619
 …


     SELECT restaurant_id, open_time
      FROM time_range_zip_code
      WHERE day_of_week = 'Monday'
        AND zip_code = 94619
        AND 1815 < close_time
        AND open_time < 1815    -- Application filters out opening times after delivery time

     One simple query
     No joins
     Two = and one <
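
The denormalized table above is the cross product of a restaurant's opening time ranges and the zip codes it serves. A minimal Java sketch of that idea, using plain in-memory lists rather than a database; the class and method names (`Denormalizer`, `TimeRange`, `Row`, `denormalize`, `findOpen`) are hypothetical and not taken from the original application:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; an in-memory sketch, not the application's code.
final class Denormalizer {

    /** One opening period on a given day; times are HHmm integers. */
    static final class TimeRange {
        final String dayOfWeek;
        final int openTime;
        final int closeTime;

        TimeRange(String dayOfWeek, int openTime, int closeTime) {
            this.dayOfWeek = dayOfWeek;
            this.openTime = openTime;
            this.closeTime = closeTime;
        }
    }

    /** One row of the denormalized time_range_zip_code table. */
    static final class Row {
        final int restaurantId;
        final TimeRange range;
        final String zipCode;

        Row(int restaurantId, TimeRange range, String zipCode) {
            this.restaurantId = restaurantId;
            this.range = range;
            this.zipCode = zipCode;
        }
    }

    /** Emits one row per (time range, zip code) pair so no join is needed at query time. */
    static List<Row> denormalize(int restaurantId, List<TimeRange> hours, List<String> serviceArea) {
        List<Row> rows = new ArrayList<>();
        for (TimeRange range : hours) {
            for (String zip : serviceArea) {
                rows.add(new Row(restaurantId, range, zip));
            }
        }
        return rows;
    }

    /** Two equality tests and one range test, followed by the application-side opening-time filter. */
    static List<Integer> findOpen(List<Row> rows, String day, String zip, int time) {
        List<Integer> ids = new ArrayList<>();
        for (Row r : rows) {
            boolean matchesQuery = r.range.dayOfWeek.equals(day)
                    && r.zipCode.equals(zip)
                    && time < r.range.closeTime;
            if (matchesQuery && r.range.openTime < time && !ids.contains(r.restaurantId)) {
                ids.add(r.restaurantId);
            }
        }
        return ids;
    }
}
```

The price of the simpler query is write amplification: a restaurant with H time ranges and Z zip codes produces H x Z rows.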

Eliminate multiple =’s with concatenation
 SELECT restaurant_id, open_time
  FROM time_range_zip_code
  WHERE day_of_week = 'Monday'
    AND zip_code = 94619
    AND 1815 < close_time




  94619:Monday       ….


     GET 94619:Monday


Sorted sets support range queries
  Key                                         Sorted Set [ Entry:Score, …]

  closingTimes:94707:Monday                   [1130_1:1430, 1730_1:2130]

  closingTimes:94619:Monday                   [0700_2:1430, 1130_1:1430, 1730_1:2130]

  Member: OpeningTime_RestaurantId
  Score:  ClosingTime

  ZRANGEBYSCORE closingTimes:94619:Monday 1815 2359
  ->
  {1730_1:2130}

 1730 is before 1815 => Ajanta is open
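
The range query plus opening-time filter can be tried without a Redis server. The following is a sketch only: a `TreeMap` stands in for one sorted set, and the class and method names (`ClosingTimesIndex`, `add`, `rangeByScore`, `openAt`) are hypothetical. Unlike Redis, it does not enforce that a member appears under only one score.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory stand-in for one Redis sorted set, e.g. closingTimes:94619:Monday.
// Hypothetical names; no Redis client is involved.
final class ClosingTimesIndex {

    // score (closing time, HHmm) -> members ("<openingTime>_<restaurantId>") with that score
    private final TreeMap<Integer, TreeSet<String>> membersByScore = new TreeMap<>();

    /** Plays the role of ZADD: stores the member under its closing-time score. */
    void add(int openTime, int closeTime, String restaurantId) {
        String member = String.format("%04d_%s", openTime, restaurantId);
        membersByScore.computeIfAbsent(closeTime, k -> new TreeSet<>()).add(member);
    }

    /** Plays the role of ZRANGEBYSCORE min max: members whose score lies in [min, max]. */
    List<String> rangeByScore(int min, int max) {
        List<String> result = new ArrayList<>();
        for (TreeSet<String> members : membersByScore.subMap(min, true, max, true).values()) {
            result.addAll(members);
        }
        return result;
    }

    /** Ids of restaurants that close at or after the delivery time and opened at or before it. */
    List<String> openAt(int deliveryTime) {
        String padded = String.format("%04d", deliveryTime);
        List<String> ids = new ArrayList<>();
        for (String member : rangeByScore(deliveryTime, 2359)) {
            int sep = member.indexOf('_');
            // Zero-padded opening times compare correctly as strings.
            if (member.substring(0, sep).compareTo(padded) <= 0) {
                ids.add(member.substring(sep + 1));
            }
        }
        return ids;
    }
}
```

The score handles the `<` on closing time inside the store; the opening time is packed into the member so the application can apply the second comparison itself.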



Querying my data


              What did you
              just tell me
                to do!?



RedisTemplate-based code
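// Package names follow current Spring Data Redis and Apache Commons Lang 2.x;
// milestone builds from 2011 may use different packages.
import java.util.ArrayList;
import java.util.Collection;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.commons.lang.StringUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.BoundZSetOperations;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Repository;

// timeOfDay/dayOfWeek, AvailableRestaurantKeys and FormattingUtil are application helpers (not shown).
@Repository
public class AvailableRestaurantRepositoryRedisImpl implements AvailableRestaurantRepository {

  // Not final: Spring injects this field after construction.
  @Autowired private StringRedisTemplate redisTemplate;

  // Sorted set of "<openingTime>_<restaurantId>" members scored by closing time.
  private BoundZSetOperations<String, String> closingTimes(int dayOfWeek, String zipCode) {
    return redisTemplate.boundZSetOps(AvailableRestaurantKeys.closingTimesKey(dayOfWeek, zipCode));
  }

  public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
    String zipCode = deliveryAddress.getZip();
    int timeOfDay = timeOfDay(deliveryTime);
    int dayOfWeek = dayOfWeek(deliveryTime);

    // Time ranges that close at or after the delivery time.
    Set<String> closingTrs = closingTimes(dayOfWeek, zipCode).rangeByScore(timeOfDay, 2359);
    Set<String> restaurantIds = new HashSet<String>();
    String paddedTimeOfDay = FormattingUtil.format4(timeOfDay);
    for (String trId : closingTrs) {
      // Keep only ranges that opened at or before the delivery time.
      if (trId.substring(0, 4).compareTo(paddedTimeOfDay) <= 0)
        restaurantIds.add(StringUtils.substringAfterLast(trId, "_"));
    }

    // Fetch the JSON for all matching restaurants in one call.
    Collection<String> jsonForRestaurants = redisTemplate.opsForValue().multiGet(
        AvailableRestaurantKeys.timeRangeRestaurantInfoKeys(restaurantIds));
    List<AvailableRestaurant> restaurants = new ArrayList<AvailableRestaurant>();
    for (String json : jsonForRestaurants) {
      restaurants.add(AvailableRestaurant.fromJson(json));
    }
    return restaurants;
  }
}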




Redis – Spring configuration
@Configuration
public class RedisConfiguration extends AbstractDatabaseConfig {

    @Bean
    public RedisConnectionFactory jedisConnectionFactory() {
      JedisConnectionFactory factory = new JedisConnectionFactory();
      factory.setHostName(databaseHostName);
      factory.setPort(6379);
      factory.setUsePool(true);
      JedisPoolConfig poolConfig = new JedisPoolConfig();
      poolConfig.setMaxActive(1000);
      factory.setPoolConfig(poolConfig);
      return factory;
    }

    @Bean
    public StringRedisTemplate stringRedisTemplate(RedisConnectionFactory factory) {
      StringRedisTemplate template = new StringRedisTemplate();
      template.setConnectionFactory(factory);
      return template;
    }
}


Deleting/Updating a restaurant
o  Need to delete members of the sorted
   sets
o  But we can’t “find by a foreign key”
o  To delete a restaurant:
  n  GET JSON details of Restaurant (incl.
      openingHours + serviceArea)
  n  Re-compute sorted set keys and
      members and delete them
  n  Delete the JSON


Cassandra: Easy to store restaurants

  Column Family: RestaurantDetails

  Keys   Columns
  1      name: Ajanta                 type: Indian       …
  2      name: Montclair Egg Shop     type: Breakfast    …

                            OR

  Column Family: RestaurantDetails

  Keys   Columns
  1      details: { JSON DOCUMENT }
  2      details: { JSON DOCUMENT }




But we can’t query this
o  Similar challenges to using Redis
o  No joins => denormalize
o  Can use composite/concatenated
   keys
  n  Prefix - equality match
  n  Suffix - can be range scan
o  Some limited querying options
  n  Row key – exact or range
  n  Column name – exact or range

Cassandra: Find restaurants that close after the delivery time and then filter

  Keys        Super Columns

              1430                      1430                        2130
  94619:Mon   0700_2: JSON FOR EGG      1130_1: JSON FOR AJANTA     1730_1: JSON FOR AJANTA


                        SuperSlice
                         key = 94619:Mon
                         SliceStart = 1815
                         SliceEnd = 2359

  Keys        Super Columns

              2130
  94619:Mon   1730_1: JSON FOR AJANTA


  18:15 is after 17:30 => {Ajanta}
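
The slice-then-filter step can be modelled with nested sorted maps. This is a sketch under stated assumptions, not the Hector API: one object represents the row `94619:Mon`, super column names are closing times as zero-padded strings, and columns with the same closing time are assumed to share one super column. The names `SuperSliceSketch`, `insert` and `availableAt` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of one super column family row; not the Hector API.
final class SuperSliceSketch {

    // super column name (closing time, "HHmm") -> column name ("<openTime>_<id>") -> value
    private final NavigableMap<String, NavigableMap<String, String>> superColumns = new TreeMap<>();

    void insert(String superColumn, String column, String value) {
        superColumns.computeIfAbsent(superColumn, k -> new TreeMap<>()).put(column, value);
    }

    /**
     * Slices super columns named in [deliveryTime, "2359"], then keeps the columns
     * whose opening time is at or before the delivery time.
     */
    List<String> availableAt(String deliveryTime) {
        List<String> values = new ArrayList<>();
        for (NavigableMap<String, String> columns
                : superColumns.subMap(deliveryTime, true, "2359", true).values()) {
            for (Map.Entry<String, String> column : columns.entrySet()) {
                String openTime = column.getKey().substring(0, 4);
                if (openTime.compareTo(deliveryTime) <= 0) {
                    values.add(column.getValue());
                }
            }
        }
        return values;
    }
}
```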


Cassandra/Hector code
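import java.util.List;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.HSuperColumn;
import me.prettyprint.hector.api.beans.SuperSlice;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.SuperSliceQuery;
import org.springframework.beans.factory.annotation.Autowired;

// SuperSliceResultMapper and SuperColumnRowProcessor are application classes (not shown).
public class CassandraHelper {
  // Not final: Spring injects this field after construction.
  @Autowired private Cluster cluster;

    public <T> List<T> getSuperSlice(String keyspace, String columnFamily,
                                     String key, String sliceStart, String sliceEnd,
                                     SuperSliceResultMapper<T> resultMapper) {

        // Row key, super column name, column name and value are all strings.
        SuperSliceQuery<String, String, String, String> q =
         HFactory.createSuperSliceQuery(HFactory.createKeyspace(keyspace, cluster),
             StringSerializer.get(), StringSerializer.get(), StringSerializer.get(), StringSerializer.get());
        q.setColumnFamily(columnFamily);
        q.setKey(key);
        // Super columns named in [sliceStart, sliceEnd], not reversed, at most 10000.
        q.setRange(sliceStart, sliceEnd, false, 10000);

        QueryResult<SuperSlice<String, String, String>> qr = q.execute();

        SuperColumnRowProcessor<T> rowProcessor = new SuperColumnRowProcessor<T>(resultMapper);

        for (HSuperColumn<String, String, String> superColumn : qr.get().getSuperColumns()) {
          List<HColumn<String, String>> columns = superColumn.getColumns();
          rowProcessor.processRow(key, superColumn.getName(), columns);
        }
        return rowProcessor.getResult();
    }
}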

MongoDB = easy to store
{
    "_id": "1234",
    "name": "Ajanta",
    "serviceArea": ["94619", "99999"],
    "openingHours": [
        {
           "dayOfWeek": 1,
           "open": 1130,
           "close": 1430
        },
        {
           "dayOfWeek": 2,
           "open": 1130,
           "close": 1430
        },
        …
     ]
}
MongoDB = easy to query
{
    "serviceArea": "94619",
    "openingHours": {
       "$elemMatch": {
          "open": { "$lte": 1815 },
          "dayOfWeek": 4,
          "close": { "$gte": 1815 }
       }
    }
}

db.availableRestaurants.ensureIndex({serviceArea: 1})
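
The point of `$elemMatch` is that all three conditions must hold for the same `openingHours` element, not for different elements of the array. A plain-Java sketch of that matching rule, with no MongoDB driver involved; the names `ElemMatchSketch`, `OpeningHours` and `matches` are hypothetical:

```java
import java.util.List;

// Hypothetical in-memory equivalent of the query's matching rule; not the MongoDB driver.
final class ElemMatchSketch {

    static final class OpeningHours {
        final int dayOfWeek;
        final int open;
        final int close;

        OpeningHours(int dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    /**
     * True when the zip code is in the service area and a single openingHours
     * element satisfies the day, open and close conditions together.
     */
    static boolean matches(List<String> serviceArea, List<OpeningHours> openingHours,
                           String zip, int dayOfWeek, int time) {
        if (!serviceArea.contains(zip)) {
            return false;
        }
        for (OpeningHours h : openingHours) {
            if (h.dayOfWeek == dayOfWeek && h.open <= time && time <= h.close) {
                return true;
            }
        }
        return false;
    }
}
```

Without the per-element rule, a restaurant open on day 1 from 1700 to 2100 and on day 4 from 1130 to 1430 would wrongly match a day 4 delivery at 1815, because each condition is satisfied by some element.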
MongoTemplate-based code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl
                               implements AvailableRestaurantRepository {

  // Not final: Spring injects this field after construction.
  @Autowired private MongoTemplate mongoTemplate;

  @Override
  public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress,
                                                            Date deliveryTime) {
    int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
    int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);

    // where(...) is a static import of Criteria.where.
    // Match restaurants that serve the zip code and have one opening-hours
    // entry covering the delivery time on that day.
    Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
        .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
               .and("openingTime").lte(timeOfDay)
               .and("closingTime").gte(timeOfDay)));

    return mongoTemplate.find(AVAILABLE_RESTAURANTS_COLLECTION, query,
                              AvailableRestaurant.class);
  }
}

mongoTemplate.ensureIndex("availableRestaurants",
   new Index().on("serviceArea", Order.ASCENDING));
MongoDB – Spring Configuration
@Configuration
public class MongoConfig extends AbstractDatabaseConfig {
 private @Value("#{mongoDbProperties.databaseName}")
 String mongoDbDatabase;

    public @Bean MongoFactoryBean mongo() {
      MongoFactoryBean factory = new MongoFactoryBean();
      factory.setHost(databaseHostName);
      MongoOptions options = new MongoOptions();
      options.connectionsPerHost = 500;
      factory.setMongoOptions(options);
      return factory;
    }

    public @Bean
    MongoTemplate mongoTemplate(Mongo mongo) throws Exception {
      MongoTemplate mongoTemplate = new MongoTemplate(mongo, mongoDbDatabase);
      mongoTemplate.setWriteConcern(WriteConcern.SAFE);
      mongoTemplate.setWriteResultChecking(WriteResultChecking.EXCEPTION);
      return mongoTemplate;
    }
}


Is NoSQL webscale?
Benchmarking is still a work in progress but so far
http://www.youtube.com/watch?v=b2F-DItXtZs

                              Redis       Mongo       Cassandra
Insert for PK                 Awesome     Awesome     Fast*
Find by PK                    Awesome     Awesome     Fast
Insert for find available     Fast        Awesome     Ok*
Find available restaurants    Awesome     Ok          Ok

* Cassandra can be clustered for improved write performance

In other words: it depends
Summary
o  Relational databases are great but
   n    Object/relational impedance mismatch
   n    Relational schema is rigid
   n    Extremely difficult/impossible to scale writes
   n    Performance can be suboptimal
o  Each NoSQL database can solve some
   combination of those problems BUT
   n  Limited transactions
   n  Query-driven, denormalized database design
   n  …
                         =>
o  Carefully pick the NoSQL DB for your application
o  Consider a polyglot persistence architecture

Thank you!

                                           My contact info

                                           chris.richardson@springsource.com


                                           @crichardson





Polyglot persistence for Java developers - moving out of the relational comfort zone

  • 1. Polyglot persistence for Java developers - moving out of the relational comfort zone Chris Richardson Author of POJOs in Action Founder of CloudFoundry.com Chris.Richardson@SpringSource.Com @crichardson
  • 2. Overall presentation goal The joy and pain of building Java applications that use NoSQL 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 2
  • 3. About Chris •  Grew up in England and live in Oakland, CA •  Over 25+ years of software development experience including 14 years of Java •  Speaker at JavaOne, SpringOne, NFJS, JavaPolis, Spring Experience, etc. •  Organize the Oakland JUG and the Groovy Grails meetup http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/ 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 3
  • 4. Agenda o  The trouble with relational databases o  Overview of NoSQL databases o  Introduction to Spring Data o  NoSQL case study: POJOs in Action 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 4
  • 5. Relational databases are great o  SQL = Rich, declarative query language o  Database enforces referential integrity o  ACID semantics o  Well understood by developers o  Well supported by frameworks and tools, e.g. Spring JDBC, Hibernate, JPA o  Well understood by operations n  Configuration n  Care and feeding n  Backups n  Tuning n  Failure and recovery n  Performance characteristics o  But…. 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 5
  • 6. The trouble with relational databases o  Object/relational impedance mismatch n  Complicated to map rich domain model to relational schema o  Relational schema is rigid n  Difficult to handle semi-structured data, e.g. varying attributes n  Schema changes = downtime or $$ o  Extremely difficult/impossible to scale writes: n  Vertical scaling is limited/requires $$ n  Horizontal scaling is limited or requires $$ o  Performance can be suboptimal for some use cases 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 6
  • 7. NoSQL databases have emerged… Each one offers some combination of: o  High performance o  High scalability o  Rich data-model o  Schema less In return for: o  Limited transactions o  Relaxed consistency o  … 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 7
  • 8. … but there are few commonalities o  Everyone and their dog has written one o  Different data models n  Key-value n  Column n  Document n  Graph o  Different APIs – No JDBC, Hibernate, JPA (generally) o  “Same sorry state as the database market in the 1970s before SQL was invented” http://queue.acm.org/detail.cfm?id=1961297 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 8
  • 9. How to I access my data? 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 9
  • 10. Reality Check - Relational DBs are not going away §  NoSQL usage small by comparison … §  But growing… 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 10
  • 11. Future = multi-paradigm data storage for enterprise applications IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 11
  • 12. Agenda o  The trouble with relational databases o  Overview of NoSQL databases o  Introduction to Spring Data o  NoSQL case study: POJOs in Action 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 12
  • 13. Redis o  Advanced key-value store n  Think memcached on steroids (the good kind) K1 V1 n  Values can be binary strings, Lists, Sets, Ordered Sets, Hash maps, .. K2 V2 n  Operations for each data type, e.g. appending to a list, adding to a set, retrieving a slice of a K3 V2 list, … o  Very fast: n  In-memory operations n  ~100K operations/second on entry-level hardware o  Persistent n  Periodic snapshots of memory OR append commands to log file n  Limits are size of keys retained in memory. o  Has transactions n  Commands can be batched and executed atomically 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 13
  • 14. Redis CLI redis> sadd myset a (integer) 1 redis> sadd myset b (integer) 1 redis> smembers myset 1. "a" 2. "b" redis> srem myset a (integer) 1 redis> smembers myset 1. "b" 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 14
  • 15. Scaling Redis o  Master/slave replication n  Tree of Redis servers n  Non-persistent master can replicate to a persistent slave n  Use slaves for read-only queries o  Sharding n  Client-side only – consistent hashing based on key n  Server-side sharding – coming one day o  Run multiple servers per physical host n  Server is single threaded => Leverage multiple CPUs n  32 bit more efficient than 64 bit o  Optional "virtual memory" n  Ideally data should fit in RAM n  Values (not keys) written to disc 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 15
  • 16. Redis use cases o  Use in conjunction with another database as the SOR o  Drop-in replacement for Memcached n  Session state n  Cache of data retrieved from SOR n  Denormalized datastore for high-performance queries o  Hit counts using INCR command o  Randomly selecting an item – SRANDMEMBER o  Queuing – Lists with LPOP, RPUSH, …. o  High score tables – Sorted sets o  Notable users: github, guardian.co.uk, …. 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 16
  • 17. Cassandra o  An Apache open-source project originally developed by Facebook for inbox search o  Extremely scalable o  Fast writes = append to a log o  Data is replicated and sharded o  Rack and datacenter aware o  Column-oriented database n  The data model will hurt your brain n  4 or 5-dimensional hash map 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 17
  • 18. Cassandra data model My Column family (within a key space) Keys Columns a colA: value1 colB: value2 colC: value3 b colA: value colD: value colE: value o  4-D map: keySpace x key x columnFamily x column è value o  Column names are dynamic; can contain data o  Arbitrary number of columns o  One CF row = one DDD aggregate 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 18
  • 19. Cassandra data model – insert/update My Column family (within a key space) Keys Columns a colA: value1 colB: value2 colC: value3 b colA: value colD: value colE: value Transaction = updates to a row within a ColumnFamily Insert(key=a, columName=colZ, value=foo) Keys Columns a colA: value1 colB: value2 colC: value3 colZ: foo b colA: value colD: value colE: value 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 19
  • 20. Cassandra query example – slice Key Columns s colA: colB: colC: colZ: a value1 value2 value3 foo colA: colD: colE: b value value value slice(key=a, startColumn=colA, endColumnName=colC) Key Columns You can also do a s rangeSlice which colA: colB: a value1 value2 returns a range of keys – less efficient 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 20
  • 21. Super Column Families – one more dimension My Column family (within a key space) Keys Super columns ScA ScB a colA: value1 colB: value2 colC: value3 b colA: value colD: value colE: value Insert(key=a, superColumn=scB, columName=colZ, value=foo) keySpace x key x columnFamily x superColumn x column -> value Keys Super columns ScA ScB a colA: value1 colB: value2 colC:colZ: foo value3 b colA: value colD: value colE: value 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 21
  • 22. Getting data with super slice
    My Column family (within a key space)
      Keys   Super columns
             ScA                             ScB
      a      colA: value1   colB: value2     colC: value3
      b      colA: value    colD: value      colE: value
    slice(key=a, startColumn=scB, endColumnName=scC)
    keySpace x key x columnFamily x superColumn x column -> value
      Keys   Super columns
             ScB
      a      colC: value3
  • 23. Cassandra CLI
    $ bin/cassandra-cli -h localhost
    Connected to: "Test Cluster" on localhost/9160
    Welcome to cassandra CLI.
    [default@unknown] use Keyspace1;
    Authenticated to keyspace: Keyspace1
    [default@Keyspace1] list restaurantDetails;
    Using default limit of 100
    -------------------
    RowKey: 1
    => (super_column=attributes,
         (column=json, value={"id": 1,"name":"Ajanta","menuItems"....
    [default@Keyspace1] get restaurantDetails['1']['attributes'];
    => (column=json, value={"id":1,"name":"Ajanta","menuItems"....
  • 24. Scaling Cassandra
    o  Client connects to any node
    o  Dynamically add/remove nodes
    o  Reads/Writes specify how many nodes
    o  Configurable # of replicas
       n  adjacent nodes
       n  rack and data center aware
    Ring of four nodes, each replicating to the next:
      Node 1   Token = A   Keys = [D, A]   replicates to Node 2
      Node 2   Token = B   Keys = [A, B]   replicates to Node 3
      Node 3   Token = C   Keys = [B, C]   replicates to Node 4
      Node 4   Token = D   Keys = [C, D]   replicates to Node 1
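The ring on this slide can be approximated with a sorted map of tokens. The sketch below is a simplification under stated assumptions: keys are compared to tokens as raw strings (a real cluster maps keys to tokens through its partitioner), replicas are simply the next nodes clockwise, and rack or data center awareness is ignored. TokenRingSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Simplified token ring: a key belongs to the first node whose token is >= the key. */
final class TokenRingSketch {
    private final TreeMap<String, String> ring = new TreeMap<>(); // token -> node name

    void addNode(String token, String node) {
        ring.put(token, node);
    }

    /** Returns the owning node followed by the next (replicas - 1) nodes clockwise. */
    List<String> nodesFor(String key, int replicas) {
        List<String> result = new ArrayList<>();
        if (ring.isEmpty()) {
            return result;
        }
        String token = ring.ceilingKey(key);
        if (token == null) {
            token = ring.firstKey(); // wrap around past the highest token
        }
        int wanted = Math.min(replicas, ring.size());
        while (result.size() < wanted) {
            result.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TokenRingSketch ring = new TokenRingSketch();
        ring.addNode("A", "Node 1");
        ring.addNode("B", "Node 2");
        ring.addNode("C", "Node 3");
        ring.addNode("D", "Node 4");
        System.out.println(ring.nodesFor("Ab", 2)); // a key between tokens A and B
        System.out.println(ring.nodesFor("E", 2));  // a key past token D wraps to the start
    }
}
```

Adding a node only inserts one more token into the map, so only the keys between the new token and its predecessor change owner.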
  • 25. Cassandra use cases
    o  Use cases
       n  Big data
       n  Persistent cache
       n  (Write intensive) Logging
    o  Who is using it
       n  Digg, Facebook, Twitter, Reddit, Rackspace
       n  Cloudkick, Cisco, SimpleGeo, Ooyala, OpenX
       n  "The largest production cluster has over 100 TB of data in over 150 machines." – Cassandra web site
  • 26. MongoDB o  Document-oriented database n  JSON-style documents: Lists, Maps, primitives n  Documents organized into collections (~table) o  Full or partial document updates n  Transactional update in place on one document n  Atomic Modifiers o  Rich query language for dynamic queries o  Index support – secondary and compound o  GridFS for efficiently storing large files o  Map/Reduce 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 26
  • 27. Data Model = Binary JSON documents
    One document = one DDD aggregate
    {
      "name" : "Ajanta",
      "type" : "Indian",
      "serviceArea" : [ "94619", "94618" ],
      "openingHours" : [
        { "dayOfWeek" : "Monday", "open" : 1730, "close" : 2130 }
      ],
      "_id" : ObjectId("4bddc2f49d1505567c6220a0")
    }
    o  Sequence of bytes on disk = fast I/O
       n  No joins/seeks
       n  In-place updates when possible => no index updates
    o  Transaction = update of single document
  • 28. MongoDB CLI $ bin/mongo > use mydb > r1 = {name: 'Ajanta'} {name: 'Ajanta'} > r2 = {name: 'Montclair Egg Shop'} {name: 'Montclair Egg Shop'} > db.restaurants.save(r1) > r1 { _id: ObjectId("98…"), name: "Ajanta"} > db.restaurants.save(r2) > r2 { _id: ObjectId("66…"), name: "Montclair Egg Shop"} > db.restaurants.find({name: /^A/}) { _id: ObjectId("98…"), name: "Ajanta"} > db.restaurants.update({name: "Ajanta"}, {name: "Ajanta Restaurant"}) 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 28
  • 29. MongoDB query by example o  Find a restaurant that serves the 94619 zip code and is open at 6pm on a Monday { serviceArea:"94619", openingHours: { $elemMatch : { "dayOfWeek" : "Monday", "open": {$lte: 1800}, "close": {$gte: 1800} } } } DBCursor cursor = collection.find(qbeObject); while (cursor.hasNext()) { DBObject o = cursor.next(); … } 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 29
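The point of $elemMatch in this query is that a single openingHours element must satisfy all three conditions at once. The plain-Java sketch below illustrates only that matching rule; it does not use the MongoDB driver, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of $elemMatch semantics (no MongoDB driver involved). */
final class ElemMatchSketch {

    static final class OpeningHours {
        final String dayOfWeek;
        final int open;
        final int close;

        OpeningHours(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    static final class Restaurant {
        final String name;
        final List<String> serviceArea;
        final List<OpeningHours> openingHours;

        Restaurant(String name, List<String> serviceArea, List<OpeningHours> openingHours) {
            this.name = name;
            this.serviceArea = serviceArea;
            this.openingHours = openingHours;
        }
    }

    /** True if the zip code is served and ONE opening-hours entry matches day, open and close. */
    static boolean isAvailable(Restaurant r, String zip, String day, int time) {
        return r.serviceArea.contains(zip)
            && r.openingHours.stream().anyMatch(h ->
                   h.dayOfWeek.equals(day) && h.open <= time && h.close >= time);
    }

    public static void main(String[] args) {
        Restaurant ajanta = new Restaurant("Ajanta",
            Arrays.asList("94619", "94618"),
            Arrays.asList(new OpeningHours("Monday", 1130, 1430),
                          new OpeningHours("Tuesday", 1730, 2130)));
        // Monday exists and a range containing 1800 exists, but not in the same element.
        System.out.println(isAvailable(ajanta, "94619", "Monday", 1800));
        System.out.println(isAvailable(ajanta, "94619", "Tuesday", 1800));
    }
}
```

If the three conditions were tested against the array independently, the Monday 6pm query would wrongly match, because one element supplies the day and another supplies the time range.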
  • 30. Scaling MongoDB
    o  Shard 1: mongod (master), mongod (replica), mongod (replica)
    o  Shard 2: mongod (master), mongod (replica), mongod (replica)
    o  Config Server: mongod, mongod, mongod
    o  mongos
    o  client
    A shard consists of a replica set = generalization of master slave
    Collections spread over multiple shards
  • 31. MongoDB use cases
    o  Use cases
       n  Real-time analytics
       n  Content management systems
       n  Single document partial update
       n  Caching
       n  High volume writes
    o  Who is using it?
       n  Shutterfly, Foursquare
       n  Bit.ly, Intuit
       n  SourceForge, NY Times
       n  GILT Groupe, Evite
       n  SugarCRM
  • 32. Other NoSQL databases
    o  SimpleDB – “key-value”
    o  Neo4J – graph database
    o  CouchDB – document-oriented
    o  Membase – key-value
    o  Riak – key-value + links
    o  HBase – column-oriented
    o  …
    http://nosql-database.org/ has a list of 122 NoSQL databases
  • 33. Agenda o  The trouble with relational databases o  Overview of NoSQL databases o  Introduction to Spring Data o  NoSQL case study: POJOs in Action 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 33
  • 34. NoSQL Java APIs
      Database    Libraries
      Redis       Jedis, JRedis, JDBC-Redis, RJC
      Cassandra   Raw Thrift if you are a masochist; Hector, …
      MongoDB     MongoDB provides a Java driver
  • 35. Spring Data Project Goals o  Bring classic Spring value propositions to a wide range of NoSQL databases: n  Productivity n  Programming model consistency: E.g. <NoSQL>Template classes n  “Portability” o  Many entry points to use n  Auto-generated repository implementations n  Opinionated APIs (Think JdbcTemplate) n  Object Mapping (Java and GORM) n  Cross Store Persistence Programming model n  Productivity support in Roo and Grails Slide 35
  • 36. Spring Data sub-projects
    o  Commons: Polyglot persistence
    o  Key-Value: Redis, Riak
    o  Document: MongoDB, CouchDB
    o  Graph: Neo4j
    o  GORM for NoSQL
    o  Various milestone releases
       n  Key Value 1.0.0.M3 (April 6, 2011)
       n  Document 1.0.0.M2 (April 9, 2011)
       n  Graph - Neo4j Support 1.0.0 (April 19, 2011)
       n  …
    http://www.springsource.org/spring-data
  • 37. MongoTemplate
    o  Simplifies data access
    o  Translates exceptions
    o  POJO <-> DBObject mapping
    MongoTemplate
      Properties: databaseName, userId, password, defaultCollectionName, writeConcern, writeResultChecking
      Methods: save(), insert(), remove(), updateFirst(), findOne(), find(), …
      uses Mongo (Java Driver class)
      uses <<interface>> MongoConverter
        write(Object, DBObject)
        read(Class, DBObject)
        Implementations: SimpleMongoConverter, MongoMappingConverter
  • 38. Richer mapping
    Annotations define mapping: @Document, @Id, @Indexed, @PersistenceConstructor, @CompoundIndex, @DBRef, @GeoSpatialIndexed, @Value
    Map fields instead of properties -> no getters or setters required
    @Document
    public class Person {
      @Id
      private ObjectId id;
      private String firstname;
      @Indexed                     // index generation
      private String lastname;
      @PersistenceConstructor      // non-default constructor
      public Person(String firstname, String lastname) {
        this.firstname = firstname;
        this.lastname = lastname;
      }
      ….
    }
  • 39. Generic Mongo Repositories
    interface PersonRepository extends MongoRepository<Person, ObjectId> {
      List<Person> findByLastname(String lastName);
    }
    <beans>
      <mongo:repositories
        base-package="net.chrisrichardson.mongodb.example.mongorepository"
        mongo-template-ref="mongoTemplate" />
    </beans>
    Person p = new Person("John", "Doe");
    personRepository.save(p);
    Person p2 = personRepository.findOne(p.getId());
    List<Person> johnDoes = personRepository.findByLastname("Doe");
    assertEquals(1, johnDoes.size());
  • 40. Support for the QueryDSL project
    o  QPerson is generated from the domain model class
    o  Type-safe composable queries
    QPerson person = QPerson.person;
    Predicate predicate =
      person.homeAddress.street1.eq("1 High Street")
        .and(person.firstname.eq("John"));
    List<Person> people = personRepository.findAll(predicate);
    assertEquals(1, people.size());
    assertPersonEquals(p, people.get(0));
  • 41. Cross-store/polyglot persistence
    @Entity
    public class Person {
      // In Database
      @Id private Long id;
      private String firstname;
      private String lastname;
      // In MongoDB
      @RelatedDocument private Address address;
    }
    Person person = new Person(…);
    entityManager.persist(person);
    Person p2 = entityManager.find(…)
    { "_id" : ObjectId("….."), "_entity_id" : NumberLong(1), "_entity_class" : "net.. Person", "_entity_field_name" : "address", "zip" : "94611", "street1" : "1 High Street", …}
  • 42. Agenda o  The trouble with relational databases o  Overview of NoSQL databases o  Introduction to Spring Data o  NoSQL case study: POJOs in Action 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 42
  • 43. Food to Go
    o  Customer enters delivery address and delivery time
    o  System displays available restaurants
       n  = restaurants that serve the zip code of the delivery address AND are open at the delivery time
    class Restaurant {
      long id;
      String name;
      Set<String> serviceArea;
      Set<TimeRange> openingHours;
      List<MenuItem> menuItems;
    }
    class TimeRange {
      long id;
      int dayOfWeek;
      int openingTime;
      int closingTime;
    }
    class MenuItem {
      String name;
      double price;
    }
  • 44. Database schema
    RESTAURANT table
      ID   Name                …
      1    Ajanta
      2    Montclair Eggshop
    RESTAURANT_ZIPCODE table
      Restaurant_id   zipcode
      1               94707
      1               94619
      2               94611
      2               94619
    RESTAURANT_TIME_RANGE table
      Restaurant_id   dayOfWeek   openTime   closeTime
      1               Monday      1130       1430
      1               Monday      1730       2130
      2               Tuesday     1130       …
  • 45. SQL for finding available restaurants
    Straightforward three-way join:
    select r.*
    from restaurant r
      inner join restaurant_time_range tr on r.id = tr.restaurant_id
      inner join restaurant_zipcode sa on r.id = sa.restaurant_id
    where '94619' = sa.zip_code
      and tr.day_of_week = 'monday'
      and tr.openingtime <= 1930
      and 1930 <= tr.closingtime
  • 46. Redis - Persisting restaurants is “easy”
      rest:1:details        [ name: "Ajanta", … ]
      rest:1:serviceArea    [ "94619", "94611", … ]
      rest:1:openingHours   [ 10, 11 ]
      timerange:10          [ "dayOfWeek": "Monday", .. ]
      timerange:11          [ "dayOfWeek": "Tuesday", .. ]
    OR
      rest:1   [ name: "Ajanta", "serviceArea:0": "94611", "serviceArea:1": "94619", "menuItem:0:name": "Chicken Vindaloo", … ]
    OR
      rest:1   { .. A BIG STRING/BYTE ARRAY, E.G. JSON }
  • 47. BUT…
    o  … we can only retrieve them via primary key
       -> Queries, rather than the data model, drive NoSQL database design
       -> We need to implement indexes
    o  But how can a key-value store support a query that has:
       n  A 3-way join
       n  Multiple =
       n  > and <
  • 48. Denormalization eliminates joins
      Restaurant_id   Day_of_week   Open_time   Close_time   Zip_code
      1               Monday        1130        1430         94707
      1               Monday        1130        1430         94619
      1               Monday        1730        2130         94707
      1               Monday        1730        2130         94619
      2               Monday        0700        1430         94619
      …
    One simple query: no joins, two = and one <
    SELECT restaurant_id, open_time
    FROM time_range_zip_code
    WHERE day_of_week = 'Monday'
      AND zip_code = 94619
      AND 1815 < close_time
      AND open_time < 1815
    Application filters out opening times after delivery time (the open_time < 1815 condition), which leaves the database with two = and one <
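The denormalized table is the cross product of a restaurant's time ranges and its zip codes. A minimal sketch of producing those rows, assuming a hypothetical DenormalizeSketch helper and a comma-separated row format chosen only for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Builds one denormalized row per (time range, zip code) pair of a restaurant. */
final class DenormalizeSketch {

    static final class TimeRange {
        final String dayOfWeek;
        final int open;
        final int close;

        TimeRange(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    /** Each row is formatted as "restaurantId,day,open,close,zip" (an illustrative format). */
    static List<String> denormalize(int restaurantId, List<TimeRange> hours, List<String> zipCodes) {
        List<String> rows = new ArrayList<>();
        for (TimeRange tr : hours) {
            for (String zip : zipCodes) {
                rows.add(String.format("%d,%s,%04d,%04d,%s",
                        restaurantId, tr.dayOfWeek, tr.open, tr.close, zip));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<TimeRange> hours = Arrays.asList(
                new TimeRange("Monday", 1130, 1430),
                new TimeRange("Monday", 1730, 2130));
        for (String row : denormalize(1, hours, Arrays.asList("94707", "94619"))) {
            System.out.println(row);
        }
    }
}
```

The cost of removing the join is visible here: a restaurant with m time ranges and n zip codes occupies m x n rows, all of which must be rewritten when either list changes.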
  • 49. Eliminate multiple =’s with concatenation
    SELECT restaurant_id, open_time
    FROM time_range_zip_code
    WHERE day_of_week = 'Monday'
      AND zip_code = 94619
      AND 1815 < close_time
    Key 94619:Monday -> ….
    GET 94619:Monday
  • 50. Sorted sets support range queries
      Key                          Sorted Set [ Entry:Score, … ]
      closingTimes:94707:Monday    [1130_1:1430, 1730_1:2130]
      closingTimes:94619:Monday    [0700_2:1430, 1130_1:1430, 1730_1:2130]
    Member: OpeningTime_RestaurantId
    Score: ClosingTime
    ZRANGEBYSCORE closingTimes:94619:Monday 1815 2359 -> {1730_1:2130}
    1730 is before 1815 => Ajanta is open
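The two-step lookup (range query on closing time, then an application-side filter on opening time) can be sketched without a Redis server by simulating a sorted set with a TreeMap. This is an assumption-laden stand-in: zadd and zrangeByScore below only mimic the Redis commands ZADD and ZRANGEBYSCORE, and unlike Redis the simulation does not move a member when it is re-added with a new score.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** In-memory stand-in for Redis sorted sets, enough to show the range query plus filter. */
final class SortedSetQuerySketch {
    // sorted-set key -> (score -> members with that score)
    private final Map<String, TreeMap<Integer, TreeSet<String>>> sortedSets = new HashMap<>();

    void zadd(String key, int score, String member) {
        sortedSets.computeIfAbsent(key, k -> new TreeMap<>())
                  .computeIfAbsent(score, s -> new TreeSet<>())
                  .add(member);
    }

    /** Members whose score lies in [min, max], ordered by score. */
    List<String> zrangeByScore(String key, int min, int max) {
        List<String> members = new ArrayList<>();
        TreeMap<Integer, TreeSet<String>> set = sortedSets.get(key);
        if (set != null) {
            for (TreeSet<String> sameScore : set.subMap(min, true, max, true).values()) {
                members.addAll(sameScore);
            }
        }
        return members;
    }

    /** Step 1: closing time >= delivery time (range query). Step 2: opening time <= delivery time (filter). */
    Set<String> findOpenRestaurantIds(String zip, String day, int time) {
        String paddedTime = String.format("%04d", time);
        Set<String> ids = new TreeSet<>();
        for (String member : zrangeByScore("closingTimes:" + zip + ":" + day, time, 2359)) {
            String openingTime = member.substring(0, 4); // member = OpeningTime_RestaurantId
            if (openingTime.compareTo(paddedTime) <= 0) {
                ids.add(member.substring(member.lastIndexOf('_') + 1));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        SortedSetQuerySketch redis = new SortedSetQuerySketch();
        redis.zadd("closingTimes:94619:Monday", 1430, "0700_2");
        redis.zadd("closingTimes:94619:Monday", 1430, "1130_1");
        redis.zadd("closingTimes:94619:Monday", 2130, "1730_1");
        System.out.println(redis.findOpenRestaurantIds("94619", "Monday", 1815));
    }
}
```

Zero-padding the opening time to four digits is what makes the string comparison in the filter step agree with numeric order.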
  • 51. Querying my data What did you just tell me to do!? 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 51
  • 52. RedisTemplate-based code
    @Repository
    public class AvailableRestaurantRepositoryRedisImpl implements AvailableRestaurantRepository {
      // Not final: a field-injected dependency cannot be declared final
      @Autowired
      private StringRedisTemplate redisTemplate;

      private BoundZSetOperations<String, String> closingTimes(int dayOfWeek, String zipCode) {
        return redisTemplate.boundZSetOps(AvailableRestaurantKeys.closingTimesKey(dayOfWeek, zipCode));
      }

      public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
        String zipCode = deliveryAddress.getZip();
        int timeOfDay = timeOfDay(deliveryTime);
        int dayOfWeek = dayOfWeek(deliveryTime);
        // Time ranges that close at or after the delivery time
        Set<String> closingTrs = closingTimes(dayOfWeek, zipCode).rangeByScore(timeOfDay, 2359);
        Set<String> restaurantIds = new HashSet<String>();
        String paddedTimeOfDay = FormattingUtil.format4(timeOfDay);
        // Keep only the time ranges that open at or before the delivery time
        for (String trId : closingTrs) {
          if (trId.substring(0, 4).compareTo(paddedTimeOfDay) <= 0)
            restaurantIds.add(StringUtils.substringAfterLast(trId, "_"));
        }
        Collection<String> jsonForRestaurants = redisTemplate.opsForValue().multiGet(
            AvailableRestaurantKeys.timeRangeRestaurantInfoKeys(restaurantIds));
        List<AvailableRestaurant> restaurants = new ArrayList<AvailableRestaurant>();
        for (String json : jsonForRestaurants) {
          restaurants.add(AvailableRestaurant.fromJson(json));
        }
        return restaurants;
      }
    }
  • 53. Redis – Spring configuration @Configuration public class RedisConfiguration extends AbstractDatabaseConfig { @Bean public RedisConnectionFactory jedisConnectionFactory() { JedisConnectionFactory factory = new JedisConnectionFactory(); factory.setHostName(databaseHostName); factory.setPort(6379); factory.setUsePool(true); JedisPoolConfig poolConfig = new JedisPoolConfig(); poolConfig.setMaxActive(1000); factory.setPoolConfig(poolConfig); return factory; } @Bean public StringRedisTemplate stringRedisTemplate(RedisConnectionFactory factory) { StringRedisTemplate template = new StringRedisTemplate(); template.setConnectionFactory(factory); return template; } } 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 53
  • 54. Deleting/Updating a restaurant o  Need to delete members of the sorted sets o  But we can’t “find by a foreign key” o  To delete a restaurant: n  GET JSON details of Restaurant (incl. openingHours + serviceArea) n  Re-compute sorted set keys and members and delete them n  Delete the JSON 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 54
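Because there is no "find by foreign key", the delete path has to rebuild every sorted-set key and member from the restaurant's own data. The sketch below shows only that recomputation; the key format follows the closingTimes:zip:day layout from the sorted-set slide, which is an assumption about how the real AvailableRestaurantKeys helper builds its keys, and SortedSetCleanupSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Recomputes which sorted-set members must be removed when a restaurant is deleted. */
final class SortedSetCleanupSketch {

    /**
     * @param openingHours entries of the form {dayOfWeek, openingTime, closingTime},
     *                     with times already zero-padded to four digits
     * @return sorted-set key -> members to remove from that set
     */
    static Map<String, List<String>> membersToRemove(String restaurantId,
                                                     List<String> serviceArea,
                                                     List<String[]> openingHours) {
        Map<String, List<String>> removals = new LinkedHashMap<>();
        for (String zip : serviceArea) {
            for (String[] tr : openingHours) {
                String key = "closingTimes:" + zip + ":" + tr[0];
                String member = tr[1] + "_" + restaurantId; // OpeningTime_RestaurantId
                removals.computeIfAbsent(key, k -> new ArrayList<>()).add(member);
            }
        }
        return removals;
    }

    public static void main(String[] args) {
        Map<String, List<String>> removals = membersToRemove("1",
                Arrays.asList("94707", "94619"),
                Arrays.asList(new String[] {"Monday", "1130", "1430"},
                              new String[] {"Monday", "1730", "2130"}));
        // Each entry would become one removal call per member against the real store.
        removals.forEach((key, members) -> System.out.println(key + " -> " + members));
    }
}
```

The JSON document is deleted last so that the data needed to recompute the keys is still available if the cleanup has to be retried.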
  • 55. Cassandra: Easy to store restaurants
    Column Family: RestaurantDetails
      Keys   Columns
      1      name: Ajanta               type: Indian      …
      2      name: Montclair Egg Shop   type: Breakfast   …
    OR
    Column Family: RestaurantDetails
      Keys   Columns
      1      details: { JSON DOCUMENT }
      2      details: { JSON DOCUMENT }
  • 56. But we can’t query this
    o  Similar challenges to using Redis
    o  No joins -> denormalize
    o  Can use composite/concatenated keys
       n  Prefix - equality match
       n  Suffix - can be range scan
    o  Some limited querying options
       n  Row key – exact or range
       n  Column name – exact or range
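The prefix/suffix idea can be illustrated with a sorted map: the equality-matched fields form the key prefix and the range-scanned field forms the suffix. This is a sketch of the key layout only; it is not Cassandra code, the zip:day:closingTime:restaurantId format is an assumption made for the example, and whether a real cluster supports ordered row-key scans depends on its configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Composite keys in a sorted map: prefix = equality match, suffix = range scan. */
final class CompositeKeySketch {
    private final TreeMap<String, String> rows = new TreeMap<>();

    private static String prefix(String zip, String day) {
        return zip + ":" + day + ":";
    }

    void put(String zip, String day, int closingTime, String restaurantId, String json) {
        // Zero-padded time keeps string order equal to numeric order.
        rows.put(prefix(zip, day) + String.format("%04d", closingTime) + ":" + restaurantId, json);
    }

    /** Values for the given zip and day whose closing time is >= the given time. */
    List<String> closingAtOrAfter(String zip, String day, int time) {
        String from = prefix(zip, day) + String.format("%04d", time);
        String to = prefix(zip, day) + "2400"; // just past the latest valid time
        return new ArrayList<>(rows.subMap(from, true, to, false).values());
    }

    public static void main(String[] args) {
        CompositeKeySketch store = new CompositeKeySketch();
        store.put("94619", "Mon", 1430, "2", "JSON FOR EGG");
        store.put("94619", "Mon", 1430, "1", "JSON FOR AJANTA (lunch)");
        store.put("94619", "Mon", 2130, "1", "JSON FOR AJANTA (dinner)");
        store.put("94707", "Mon", 2130, "1", "JSON FOR AJANTA (dinner)");
        System.out.println(store.closingAtOrAfter("94619", "Mon", 1815));
    }
}
```

Only one field can take the range position, so any further inequality (here, the opening time) still has to be checked by the application.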
  • 57. Cassandra: Find restaurants that close after the delivery time and then filter
      Key         Super column   Column
      94619:Mon   1430           0700_2: JSON FOR EGG
      94619:Mon   1430           1130_1: JSON FOR AJANTA
      94619:Mon   2130           1730_1: JSON FOR AJANTA
    SuperSlice key = 94619:Mon, SliceStart = 1815, SliceEnd = 2359
      Key         Super column   Column
      94619:Mon   2130           1730_1: JSON FOR AJANTA
    18:15 is after 17:30 => {Ajanta}
  • 58. Cassandra/Hector code
    import me.prettyprint.hector.api.Cluster;

    public class CassandraHelper {
      // Not final: a field-injected dependency cannot be declared final
      @Autowired
      private Cluster cluster;

      public <T> List<T> getSuperSlice(String keyspace, String columnFamily, String key,
                                       String sliceStart, String sliceEnd,
                                       SuperSliceResultMapper<T> resultMapper) {
        SuperSliceQuery<String, String, String, String> q =
            HFactory.createSuperSliceQuery(HFactory.createKeyspace(keyspace, cluster),
                StringSerializer.get(), StringSerializer.get(),
                StringSerializer.get(), StringSerializer.get());
        q.setColumnFamily(columnFamily);
        q.setKey(key);
        q.setRange(sliceStart, sliceEnd, false, 10000);
        QueryResult<SuperSlice<String, String, String>> qr = q.execute();
        SuperColumnRowProcessor<T> rowProcessor = new SuperColumnRowProcessor<T>(resultMapper);
        for (HSuperColumn<String, String, String> superColumn : qr.get().getSuperColumns()) {
          List<HColumn<String, String>> columns = superColumn.getColumns();
          rowProcessor.processRow(key, superColumn.getName(), columns);
        }
        return rowProcessor.getResult();
      }
    }
  • 59. MongoDB = easy to store
    {
      "_id": "1234",
      "name": "Ajanta",
      "serviceArea": ["94619", "99999"],
      "openingHours": [
        { "dayOfWeek": 1, "open": 1130, "close": 1430 },
        { "dayOfWeek": 2, "open": 1130, "close": 1430 },
        …
      ]
    }
  • 60. MongoDB = easy to query
    {
      "serviceArea": "94619",
      "openingHours": {
        "$elemMatch": {
          "open": { "$lte": 1815 },
          "dayOfWeek": 4,
          "close": { "$gte": 1815 }
        }
      }
    }
    db.availableRestaurants.ensureIndex({serviceArea: 1})
  • 61. MongoTemplate-based code
    @Repository
    public class AvailableRestaurantRepositoryMongoDbImpl implements AvailableRestaurantRepository {
      // Not final: a field-injected dependency cannot be declared final
      @Autowired
      private MongoTemplate mongoTemplate;

      @Override
      public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
        int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
        int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
        Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
            .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
                .and("openingTime").lte(timeOfDay)
                .and("closingTime").gte(timeOfDay)));
        return mongoTemplate.find(AVAILABLE_RESTAURANTS_COLLECTION, query, AvailableRestaurant.class);
      }
    }
    Index creation:
    mongoTemplate.ensureIndex("availableRestaurants", new Index().on("serviceArea", Order.ASCENDING));
  • 62. MongoDB – Spring Configuration @Configuration public class MongoConfig extends AbstractDatabaseConfig { private @Value("#{mongoDbProperties.databaseName}") String mongoDbDatabase; public @Bean MongoFactoryBean mongo() { MongoFactoryBean factory = new MongoFactoryBean(); factory.setHost(databaseHostName); MongoOptions options = new MongoOptions(); options.connectionsPerHost = 500; factory.setMongoOptions(options); return factory; } public @Bean MongoTemplate mongoTemplate(Mongo mongo) throws Exception { MongoTemplate mongoTemplate = new MongoTemplate(mongo, mongoDbDatabase); mongoTemplate.setWriteConcern(WriteConcern.SAFE); mongoTemplate.setWriteResultChecking(WriteResultChecking.EXCEPTION); return mongoTemplate; } } 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 62
  • 63. Is NoSQL webscale?
    http://www.youtube.com/watch?v=b2F-DItXtZs
    Benchmarking is still work in progress but so far:
                                   Redis     Mongo     Cassandra
      Insert for PK                Awesome   Awesome   Fast*
      Find by PK                   Awesome   Awesome   Fast
      Insert for find available    Fast      Awesome   Ok*
      Find available restaurants   Awesome   Ok        Ok
    * Cassandra can be clustered for improved write performance
    In other words: it depends
  • 64. Summary
    o  Relational databases are great but
       n  Object/relational impedance mismatch
       n  Relational schema is rigid
       n  Extremely difficult/impossible to scale writes
       n  Performance can be suboptimal
    o  Each NoSQL database can solve some combination of those problems BUT
       n  Limited transactions
       n  Query-driven, denormalized database design
       n  …
    -> Carefully pick the NoSQL DB for your application
    -> Consider a polyglot persistence architecture
  • 65. Thank you! My contact info chris.richardson@springsource.com @crichardson 5/3/11 Copyright (c) 2011 Chris Richardson. All rights reserved. Slide 65