NoSQL: What Is It and Why Would I Care?
     Eberhard Wolff




21.09.11
Alternative Databases: NoSQL
►    NoSQL: Not only SQL


►    A good example of a catchy but bad name
►    Not a positive definition, rather “not something else”
►    Now: Even less clear
Why NoSQL?
►    Exponential data growth


►    More and more connected data
     >  Hypertext, blogs, user-generated content


►    Semi structured
     >  User generated content
     >  Full text search / indices instead of Query-by-Example


►    Integrating applications through a shared database is less common


►    Cloud prefers scale out over scale up
     >  Cloud supports scale up: Reboot into larger machine
     >  …but eventually you will need to scale out i.e. add more machines
NoSQL Flavors
►    Key / value store
►    Document
►    Wide Column: Lots of Columns


►    Graph Database: Graphs with nodes, relationships and properties
►    Object databases: Stores objects – not rows


►    Note: NoSQL is actually vaguely defined
Key-Value Stores
►    Maps keys to values
     >  Example: key 42 → value "Some data"
►    Just a large globally available Map
►    i.e. not a very powerful data model
►    Advantages
     >  Easy to understand
     >  Easier to build scale out solutions
        (no joins, easy sharding etc)
►    Disadvantages
     >  Simplistic data model
     >  Not a good fit for complex data
     >  Might add complexity to the application code
•    Focus on scalability
•    Redis: Think cache + Persistence
•    Riak
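The "easy sharding" point can be illustrated with a minimal sketch. It assumes an in-memory stand-in for a key-value store; the class name `ShardedKeyValueStore` and the use of an MD5 hash to pick a shard are illustrative choices, not how Redis or Riak actually distribute data.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store that spreads keys over several in-memory shards."""

    def __init__(self, shard_count=4):
        # Each shard is a plain dict; in a real system each would be a separate machine.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash ensures the same key always lands on the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKeyValueStore(shard_count=4)
store.put("42", "Some data")
print(store.get("42"))
```

Because every operation touches exactly one key, no operation ever needs data from two shards, which is what makes scaling out simpler than with joins.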
Key Value Store: Hybrid Approach
►    Might just be used to store specific data


►    E.g. scores of players in an online game
     >  No complex structure
     >  Need to scale
     >  Lots of reads and writes


►    Player name, age, address would still be in an RDBMS


►    Hybrid approach
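A rough sketch of the hybrid approach, under stated assumptions: an in-memory SQLite database stands in for the RDBMS, a plain dict stands in for the key-value store, and the table layout, the `score:` key prefix and the sample player data are all hypothetical.

```python
import sqlite3

# Relational side: player master data (in-memory SQLite stands in for the RDBMS).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE player (name TEXT PRIMARY KEY, age INTEGER, address TEXT)")
db.execute("INSERT INTO player VALUES (?, ?, ?)", ("someuser", 30, "Some Street 1"))

# Key-value side: a dict stands in for the key-value store that holds the scores.
scores = {}


def add_points(player_name, points):
    """Increment a player's score; only a single key is read and written."""
    key = "score:" + player_name
    scores[key] = scores.get(key, 0) + points
    return scores[key]


add_points("someuser", 10)
add_points("someuser", 5)

# The application joins both sources itself when it needs a combined view.
row = db.execute(
    "SELECT name, age FROM player WHERE name = ?", ("someuser",)
).fetchone()
profile_with_score = {"name": row[0], "age": row[1], "score": scores["score:someuser"]}
print(profile_with_score)
```

The frequently updated score never touches the relational database, while the rarely changing master data keeps its schema and query support.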
Key-Value Stores: Store All Data
►    Storing data as serialized blobs
     >  "user:someuser" → "someuser|someuser@example.com|more|data|here"
►    Storing data as multiple keys
     >  "user:username:someuser" → "someuser"
     >  "user:email:someuser" → "someuser@example.com"
     >  Requires multi get/set to be efficient
     >  Allows some querying if the database supports wildcards,
        like "user:email:someuser*"
►    Storing links
     >  Blob: "basket:someuser" → "...|item|1|product|product:123|..."
     >  Separate keys: "basket:someuser:item:1:product" → "product:123"
        –  Multi-get: "basket:someuser:*" loads the shopping basket and all items
►    Easy to understand, hard to implement
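The two storage layouts and the wildcard multi-get can be sketched as follows. This is only an illustration: a dict stands in for the store, `multi_get` is a hypothetical helper built on shell-style patterns from Python's `fnmatch` module, and the second basket item (`product:456`) is invented for the example.

```python
from fnmatch import fnmatchcase

store = {}

# Variant 1: one serialized blob per user.
store["user:someuser"] = "|".join(["someuser", "someuser@example.com"])

# Variant 2: one key per attribute.
store["user:username:someuser"] = "someuser"
store["user:email:someuser"] = "someuser@example.com"

# Links: each basket item points to a product key.
store["basket:someuser:item:1:product"] = "product:123"
store["basket:someuser:item:2:product"] = "product:456"


def multi_get(pattern):
    """Return all entries whose key matches a shell-style wildcard pattern."""
    return {key: value for key, value in store.items() if fnmatchcase(key, pattern)}


# Reading the blob means the application has to parse it.
username, email = store["user:someuser"].split("|")

# Reading separate keys relies on a wildcard multi-get.
basket = multi_get("basket:someuser:*")
print(email)
print(basket)
```

Both variants push work into the application: the blob needs parsing and versioning, the separate keys need a naming convention that every reader and writer follows.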
Document Stores
►    Aggregates are typically stored as "documents" (key-value collection)
►    JSON, BSON (binary JSON) and XML are common
►    Still no schema, so add any data at runtime
►    The semi-structure of the document allows the database to build indexes, allowing
     queries that address properties of the document
     >  E.g. "find all baskets that contain the product 123"
►    Relations might be modeled as links
►    Advantages
     >  Good fit for semi-structured data
     >  In particular a good fit for JSON, XML, HTML…
     >  Probably the easiest transition from RDBMS
►    Disadvantages
     >  Does not scale as well as a key/value store
►    Focus on semi-structured data e.g. JSON
►    MongoDB, CouchDB
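The query "find all baskets that contain the product 123" can be sketched with plain Python dicts as documents. This is an illustration only, not the MongoDB or CouchDB API: the document layout, the `coupon` field and the hand-built `product_index` are assumptions.

```python
from collections import defaultdict

# Documents are schemaless: the second basket has a field the first one lacks.
baskets = {
    "basket:someuser": {
        "owner": "someuser",
        "items": [{"product": "product:123", "quantity": 1}],
    },
    "basket:otheruser": {
        "owner": "otheruser",
        "items": [{"product": "product:456", "quantity": 2}],
        "coupon": "WELCOME",
    },
}

# Because documents have named properties, an index can be built on them:
# product id -> ids of the baskets that contain it.
product_index = defaultdict(set)
for basket_id, document in baskets.items():
    for item in document.get("items", []):
        product_index[item["product"]].add(basket_id)


def baskets_containing(product_id):
    """Answer the query from the index instead of scanning every document."""
    return sorted(product_index.get(product_id, set()))


print(baskets_containing("product:123"))
```

A pure key-value store could not answer this query without the application maintaining such an index itself.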
Wide Column
►    Add any "column" you like to a row
►    Not a key-value store, but a "key-(column-value)" store
►    Column families are like tables
►    E.g. in the "Users" column family
     >  "someuser" → ("username" → "someuser"),
                     ("email" → "someuser@example.com")
►    Since columns are named, some databases provide indexing
     >  E.g. Google AppEngine allows you to define columns that can be queried
►    Advantages
     >  Easy to store complex and heterogeneous data
§    Apache Cassandra
§    Amazon SimpleDB
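The "key-(column-value)" idea can be sketched as nested maps. This is a toy model, not the Cassandra or SimpleDB API; the helper names `put` and `get_row`, the second row and its `twitter` column are hypothetical.

```python
# column family -> row key -> {column name: value}
column_families = {}


def put(family, row_key, column, value):
    """Set one column of one row; rows and families are created on demand."""
    row = column_families.setdefault(family, {}).setdefault(row_key, {})
    row[column] = value


def get_row(family, row_key):
    """Return a copy of all columns stored for a row (empty if unknown)."""
    return dict(column_families.get(family, {}).get(row_key, {}))


put("Users", "someuser", "username", "someuser")
put("Users", "someuser", "email", "someuser@example.com")

# Another row in the same column family may carry entirely different columns.
put("Users", "otheruser", "username", "otheruser")
put("Users", "otheruser", "twitter", "@otheruser")

print(get_row("Users", "someuser"))
print(get_row("Users", "otheruser"))
```

Each row carries its own set of column names, which is why heterogeneous data fits without schema changes.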
Graph
►    Nodes with Properties
►    Typed relationships with properties


►    Ideal e.g. to model relations in a social network


►    Easy to find number of followers, degree of relation etc.


►    Neo4j
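A small sketch of the graph model and the two queries mentioned above, in plain Python rather than Neo4j's API. The `FOLLOWS` relationship type, the `since` property, the node properties and the extra user `someuser` are assumptions made for the example; "degree of relation" is interpreted here as the number of hops when relationships are followed in either direction.

```python
from collections import deque

# Nodes with properties.
nodes = {
    "ewolff": {"kind": "user"},
    "spring_rod": {"kind": "user"},
    "spring_juergen": {"kind": "user"},
    "someuser": {"kind": "user"},
}

# Typed relationships with properties: (source, type, target, properties).
relationships = [
    ("ewolff", "FOLLOWS", "spring_rod", {"since": 2010}),
    ("ewolff", "FOLLOWS", "spring_juergen", {"since": 2011}),
    ("spring_rod", "FOLLOWS", "someuser", {"since": 2011}),
]


def follower_count(user):
    """Count incoming FOLLOWS relationships of a node."""
    return sum(
        1 for _, rel_type, target, _ in relationships
        if rel_type == "FOLLOWS" and target == user
    )


def degree_of_relation(start, goal):
    """Breadth-first search; returns the number of hops, or None if unconnected."""
    neighbours = {}
    for source, rel_type, target, _ in relationships:
        if rel_type == "FOLLOWS":
            neighbours.setdefault(source, set()).add(target)
            neighbours.setdefault(target, set()).add(source)
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in neighbours.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


print(follower_count("spring_rod"))
print(degree_of_relation("ewolff", "someuser"))
```

A graph database performs this kind of traversal natively, whereas an RDBMS would need one self-join per hop.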
What happened to Queries?
►    Data is easily and quickly read/stored using primary key
►    Denormalize data for commonly used queries
     >  Store Twitter inbox in key/value as
        –  "inbox:someuser" → ("posts:123", "posts:234", ...)
     >  instead of doing the query (RDBMS)
        –  select p.* from POSTS p, POSTLINKS pl where p.id = pl.postId and
           pl.userid=42
►    Store reverse lookup
     >  "ewolff|following" → ("spring_rod", "spring_juergen")
     >  "post:435|RT" → ("post:42", "post:21")
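The denormalized inbox and the reverse lookup can be sketched as follows. A dict stands in for the key-value store; the functions `follow`, `publish` and `read_inbox`, the `|followers` key suffix and the post texts are hypothetical and only serve to show the write-time fan-out.

```python
store = {}


def follow(follower, followed):
    # Store both directions so each lookup is a single key read.
    store.setdefault(follower + "|following", []).append(followed)
    store.setdefault(followed + "|followers", []).append(follower)


def publish(author, post_id, text):
    store[post_id] = {"author": author, "text": text}
    # Denormalize at write time: copy the post id into every follower's inbox.
    for follower in store.get(author + "|followers", []):
        store.setdefault("inbox:" + follower, []).append(post_id)


def read_inbox(user):
    # Reading is a primary-key lookup plus one get per post, no join needed.
    return [store[post_id] for post_id in store.get("inbox:" + user, [])]


follow("ewolff", "spring_rod")
follow("ewolff", "spring_juergen")
publish("spring_rod", "posts:123", "first post")
publish("spring_juergen", "posts:234", "second post")

print(store["inbox:ewolff"])
print(read_inbox("ewolff"))
```

The trade-off is more work and more storage on every write in exchange for reads that never need a query.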
What It Means for Developers
§  More technologies to have fun with
§  Broader choice of persistence stores
§  Probably Cross Store Persistence
    •  Store name, firstname etc in RDBMS
    •  Store followers in Graph database
    •  Store Content in RDBMS
    •  Store User Generated Content in Document database


§  Spring Data
    •  Similar APIs for JPA and NoSQL
    •  Support for cross store persistence
    •  Sophisticated support for generic DAOs
    •  E.g. just add findByName() method, implementation is provided
§  QueryDSL
    •  JPA Criteria API done right
