NoSQL for SQL Professionals
Don Pinto
Product Manager
Macro Trends Driving NoSQL Technology
NoSQL: More Data + More Users + Interactive Apps
Lacking Solutions, Users Forced to Invent
Bigtable: November 2006
Dynamo: October 2007
Cassandra: August 2008
Voldemort: February 2009
Very few organizations can build and maintain database software technology.
But every organization building interactive web applications needs this technology.
What Is the Biggest Data Management Problem Driving Use of NoSQL in the Coming Year?
Lack of flexibility/rigid schemas: 49%
Inability to scale out data: 35%
Performance challenges: 29%
Cost: 16%
All of these: 12%
Other: 11%
Source: Couchbase Survey, December 2011, n = 1351.
Relational vs. NoSQL: Key Differences

Relational Technology Scales Up
Web/App Server Tier: Application Scales Out. Just add more commodity web servers.
Relational Database: RDBMS Scales Up. Get a bigger, more complex server.
Expensive and disruptive sharding, doesn't perform at web scale.
[Charts: System Cost and Application Performance plotted against Users for each tier, with the annotation "Won't scale beyond this point".]

NoSQL Database Scales Out Like App Tier
Web/App Server Tier: Application Scales Out. Just add more commodity web servers.
Couchbase Distributed Data Store: NoSQL Database Scales Out. Cost and performance mirror the app tier.
Scaling out flattens the cost and performance curves.
[Charts: System Cost and Application Performance plotted against Users for each tier.]
Relational vs Document Data Model
Relational data model: Highly-structured table organization with rigidly-defined data formats and record structure. [Diagram: a table with columns C1, C2, C3, C4.]
Document data model: Collection of complex documents with arbitrary, nested data formats and varying "record" format. [Diagram: a set of JSON documents.]
RDBMS Example: User Profile

User Info
KEY  First  Last    ZIP_id
1    Dipti  Borkar  2
2    Joe    Smith   2
3    Ali    Dodson  2
4    John   Doe     3

Address Info
ZIP_id  CITY  STATE  ZIP
1       DEN   CO     30303
2       MV    CA     94040
3       CHI   IL     60609
4       NY    NY     10010

Example: user 1 has ZIP_id 2, which resolves to MV, CA 94040.
To get information about a specific user, you perform a join across two tables.
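The join described on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (user_info, address_info, user_key) are assumptions chosen to mirror the slide, and the rows are the ones shown in the two tables.

```python
import sqlite3

# In-memory database holding the two tables from the slide.
# Table and column names are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address_info (
    zip_id INTEGER PRIMARY KEY, city TEXT, state TEXT, zip TEXT);
CREATE TABLE user_info (
    user_key INTEGER PRIMARY KEY, first TEXT, last TEXT,
    zip_id INTEGER REFERENCES address_info(zip_id));
INSERT INTO address_info VALUES
    (1, 'DEN', 'CO', '30303'), (2, 'MV', 'CA', '94040'),
    (3, 'CHI', 'IL', '60609'), (4, 'NY', 'NY', '10010');
INSERT INTO user_info VALUES
    (1, 'Dipti', 'Borkar', 2), (2, 'Joe', 'Smith', 2),
    (3, 'Ali', 'Dodson', 2), (4, 'John', 'Doe', 3);
""")


def get_user_profile(user_key):
    """Return one user's profile by joining user_info with address_info."""
    return conn.execute(
        """SELECT u.user_key, u.first, u.last, a.zip, a.city, a.state
           FROM user_info AS u
           JOIN address_info AS a ON a.zip_id = u.zip_id
           WHERE u.user_key = ?""",
        (user_key,),
    ).fetchone()


print(get_user_profile(1))
```

Every profile read pays for the join, which is the cost the document model on the next slide avoids by keeping the address fields inside the user record.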
Document Example: User Profile
All data in a single document
{
  "ID": 1,
  "FIRST": "Dipti",
  "LAST": "Borkar",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA"
}
JSON document = User Info row + Address Info row
Making a Change Using RDBMS

User Table
User ID  First  Last    Zip    Country ID
1        Dipti  Borkar  94040  001
2        Joe    Smith   94040  001
3        Ali    Dodson  94040  001
4        Sarah  Gorin   NW1    002
5        Bob    Young   30303  001
6        Nancy  Baker   10010  001
7        Ray    Jones   31311  001
8        Lee    Chen    V5V3M  008
•
•
•
50000    Doug   Moore   04252  001
50001    Mary   White   SW195  002
50002    Lisa   Clark   12425  001

Country Table
Country ID  Country name
001         USA
002         UK
003         Argentina
004         Australia
005         Aruba
006         Austria
007         Brazil
008         Canada
009         Chile
•
•
•
130         Portugal
131         Romania
132         Russia
133         Spain
134         Sweden

Photo Table
User ID  Photo ID  Comment  Country ID
2        d043      NYC      001
2        b054      Bday     007
5        c036      Miami    001
7        d072      Sunset   133
5002     e086      Spain    133

Status Table
User ID  Status ID  Text     Country ID
1        a42        At conf  134
4        b26        excited  007
5        c32        hockey   008
12       d83        Go A's   001
5000     e34        sailing  005

Affiliations Table
User ID  Affl ID  Affl Name  Country ID
2        a42      Cal        001
4        b96      USC        001
7        c14      UW         001
8        e22      Oxford     002
Making the Same Change With a Document DB
Just add information to a document
{
  "ID": 1,
  "FIRST": "Don",
  "LAST": "Pinto",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA",
  "STATUS": {
    "TEXT": "At Conf",
    "GEO_LOC": "134"
  },
  "COUNTRY": "USA"
}
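A minimal sketch of "just add information to a document", assuming a hypothetical in-memory stand-in for a document store (a Python dict mapping a key to JSON text). The key "user::1" and the helpers save and load are illustrative names, not part of any real client library.

```python
import json

# Hypothetical stand-in for a document store: document key -> JSON text.
store = {}


def save(doc_id, doc):
    """Serialize the document and store it under its key."""
    store[doc_id] = json.dumps(doc)


def load(doc_id):
    """Fetch a document by key and parse it back into a dict."""
    return json.loads(store[doc_id])


# The original profile document from the slide.
save("user::1", {
    "ID": 1, "FIRST": "Don", "LAST": "Pinto",
    "ZIP": "94040", "CITY": "MV", "STATE": "CA",
})

# The change: add a nested status and a country to this one document.
# No other document and no schema definition has to be touched.
doc = load("user::1")
doc["STATUS"] = {"TEXT": "At Conf", "GEO_LOC": "134"}
doc["COUNTRY"] = "USA"
save("user::1", doc)

print(json.dumps(load("user::1"), indent=2))
```

In the relational version of the same change, a new table and a new column on several existing tables are needed before the first country value can be stored.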
Relational vs Document Performance

Relational: a user's data is spread across four tables.

User Table
User ID  First  Last    Zip
1        Frank  Weigel  94040
2        Joe    Smith   94040
3        Ali    Dodson  94040
4        Sarah  Gorin   NW1
5        Bob    Young   30303
6        Nancy  Baker   10010
7        Ray    Jones   31311
8        Lee    Chen    V5V3
•
•
•
5000     Doug   Moore   04252
5001     Mary   White   41694
5002     Lisa   Clark   12425

Photo Table
User ID  Photo ID  Comment
2        d043      NYC
2        b054      Bday
5        c036      Miami
7        d072      Sunset
5002     e086      Spain

Status Table
User ID  Status ID  Text
1        a42        At conf
4        b26        excited
5        c32        hockey
12       d83        Go A's
5000     e34        sailing

Affiliations Table
User ID  Affiliations ID  Affiliations Name
2        a42              Cal
4        b96              USC
7        c14              UW
8        e22              Oxford

Document: each JSON document holds a user's record together with its related rows, as drawn on the slide:
User 1, Frank Weigel, 94040, with status a42 "At conf"
User 4, Sarah Gorin, NW1, with status b26 "hockey"
User 5, Bob Young, 30303, with photo c036 "Miami" and status c32 "excited"
User 8, Lee Chen, V5V3, with affiliation e22 "Oxford"
User 5002, Lisa Clark, 12425, with photo e086 "Spain"

Faster response times and higher throughput
Document Databases Easily Accommodate Unstructured Data
Hotels
{
  "ID": 1,
  "NAME": "Fairmont San Francisco",
  "DESCRIPTION": "Historic grandeur…",
  "AVG_REVIEWER_SCORE": "4.3",
  "AMENITY": [
    {"TYPE": "gym", "DESCRIPTION": "fitness center"},
    {"TYPE": "wifi", "DESCRIPTION": "free wifi"}
  ],
  "RATE_TYPE": "nightly",
  "PRICE": "$199",
  "REVIEWS": ["review_1", "review_2"],
  "ATTRACTIONS": "Chinatown"
}
{
  "ID": 2,
  "NAME": "W San Francisco",
  "DESCRIPTION": "Chic, hip accommodations..",
  "AVG_REVIEWER_SCORE": "4.0",
  "AMENITY": [
    {"TYPE": "spa", "DESCRIPTION": "Bliss Spa"},
    {"TYPE": "wifi", "DESCRIPTION": "free wifi"},
    {"TYPE": "dining", "DESCRIPTION": "bar/lounge"}
  ],
  "RATE_TYPE": "nightly",
  "PRICE": "$194",
  "REVIEWS": ["review_1", "review_2"]
}
Document Databases Easily Accommodate Unstructured Data
Hotels
{ "ID": 1, "NAME": "Fairmont San Francisco", … }
Reviews
{
  "REVIEW_ID": 1,
  "REVIEW": "Loved Hotel & Location",
  "WOULD RECOMMEND": "yes",
  "AVG_REVIEWER_SCORE": "5",
  "REVIEW_DATE": "May 29, 2013",
  "USER_PROFILE_ID": "271"
}
{
  "REVIEW_ID": 2,
  "REVIEW": "Nice, but a few kinks",
  "WOULD RECOMMEND": "yes",
  "AVG_REVIEWER_SCORE": "4",
  "REVIEW_DATE": "May 22, 2013",
  "USER_PROFILE_ID": "923"
}
Document Databases Easily Accommodate Unstructured Data
Hotel Descriptions
{ "ID": 1, "NAME": "Fairmont San Francisco", … }
Reviews
{ "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … }
{ "REVIEW_ID": 2, "REVIEW": "Nice, but …", … }
User Profiles
{
  "USER_ID": 1,
  "DISPLAY_NAME": "Ted's Trip Experience",
  "CITY": "Saratoga",
  "STATE": "California",
  "NUM_OF_REVIEWS": "8"
}
{
  "USER_ID": 2,
  "DISPLAY_NAME": "WhatWhat567",
  "CITY": "Kansas City",
  "STATE": "MO",
  "NUM_OF_REVIEWS": "3"
}
Document Databases Easily Accommodate Unstructured Data
Hotel Descriptions
{ "ID": 1, "NAME": "Fairmont San Francisco", … }
Reviews
{ "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … }
{ "REVIEW_ID": 2, "REVIEW": "Nice, but …", … }
User Profiles
{ "USER_ID": 1, "DISPLAY": "Ted's Trip…", … }
{ "USER_ID": 2, "DISPLAY": "WhatWhat …", … }
Document IDs associate related objects
Hotels point to reviews
Reviews point to users
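The way document IDs tie hotels to reviews and reviews to users can be sketched with plain Python dicts standing in for the three document collections. The document keys ("hotel_1", "271", "923") and the pairing of reviewer IDs with display names are assumptions for illustration; the slides do not state which profile wrote which review.

```python
# Three collections keyed by document ID (hypothetical keys).
hotels = {
    "hotel_1": {"ID": 1, "NAME": "Fairmont San Francisco",
                "REVIEWS": ["review_1", "review_2"]},
}
reviews = {
    "review_1": {"REVIEW_ID": 1, "REVIEW": "Loved Hotel & Location",
                 "USER_PROFILE_ID": "271"},
    "review_2": {"REVIEW_ID": 2, "REVIEW": "Nice, but a few kinks",
                 "USER_PROFILE_ID": "923"},
}
users = {
    "271": {"DISPLAY_NAME": "Ted's Trip Experience"},
    "923": {"DISPLAY_NAME": "WhatWhat567"},
}


def reviews_with_authors(hotel_id):
    """Follow hotel -> review IDs -> user IDs and collect (text, author)."""
    result = []
    for review_id in hotels[hotel_id]["REVIEWS"]:
        review = reviews[review_id]                 # lookup by document ID
        user = users[review["USER_PROFILE_ID"]]     # second lookup by ID
        result.append((review["REVIEW"], user["DISPLAY_NAME"]))
    return result


for text, author in reviews_with_authors("hotel_1"):
    print(f"{author}: {text}")
```

Each hop is a direct key lookup rather than a join, so the application decides which related documents it actually needs to fetch.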
Indexing with Document Databases
Index on AVG_REVIEWER_SCORE
Index:
…
4.0, doc_id
4.0, doc_id
4.1, doc_id
4.3, doc_id
5.0, doc_id
…
Querying with Document Databases
Query on AVG_REVIEWER_SCORE
[Diagram: a Query is run against the Index and returns the Matching Results.]
Index:
…
3.4, doc_id
3.4, doc_id
3.5, doc_id
3.6, doc_id
3.7, doc_id
3.8, doc_id
4.0, doc_id
4.1, doc_id
4.3, doc_id
4.5, doc_id
4.7, doc_id
4.9, doc_id
5.0, doc_id
…
5.0, doc_id
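The index and query slides can be illustrated with a small sketch: the index is a list of (score, doc_id) pairs kept in sorted order, and a range query is a binary search over that list. This is a simplified model under stated assumptions, not how any particular database implements its indexes; the document keys and the third hotel are made up for the example.

```python
import bisect

# Hypothetical documents; only the indexed field matters here.
docs = {
    "hotel_1": {"NAME": "Fairmont San Francisco", "AVG_REVIEWER_SCORE": "4.3"},
    "hotel_2": {"NAME": "W San Francisco", "AVG_REVIEWER_SCORE": "4.0"},
    "hotel_3": {"NAME": "Example Inn", "AVG_REVIEWER_SCORE": "3.6"},
}


def build_index(documents, field):
    """Return (value, doc_id) pairs sorted by value; skip docs lacking the field."""
    return sorted(
        (float(doc[field]), doc_id)
        for doc_id, doc in documents.items()
        if field in doc
    )


def range_query(index, low, high):
    """Return doc_ids whose indexed value lies in the closed range [low, high]."""
    keys = [value for value, _ in index]
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return [doc_id for _, doc_id in index[start:end]]


index = build_index(docs, "AVG_REVIEWER_SCORE")
print(range_query(index, 4.0, 5.0))
```

Because the index stores only the score and the document ID, the query returns IDs; the application then fetches the matching documents by key.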
Flavors of NoSQL
NoSQL catalog
Key-Value: memcached
Data Structure: redis
Document: mongoDB, couchbase
Column: cassandra
Graph: Neo4j
Storage types: Cache (memory only); Database (memory/disk)
Couchbase Open Source Project
• Leading NoSQL database project focused on distributed database technology and surrounding ecosystem
• Supports both key-value and document-oriented use cases
• All components are available under the Apache License 2.0
• Available as packaged software in both enterprise and community editions
Couchbase Server
Easy Scalability: Grow cluster without application changes, without downtime, with a single click
Consistent High Performance: Consistent sub-millisecond read and write response times with consistent high throughput
Always On 24x365: No downtime for software upgrades, hardware maintenance, etc.
Flexible Data Model: JSON document model with no fixed schema
Couchbase Server Architecture

Data Manager
Query API (port 8092): Query Engine
Port 11211: Memcapable 1.0 (Moxi)
Port 11210: Memcapable 2.0
Memcached
Couchbase EP Engine
storage interface
New Persistence Layer

Cluster Manager (Erlang/OTP)
REST management API/Web UI (HTTP, port 8091)
Erlang port mapper (port 4369)
Distributed Erlang (ports 21100 - 21199)
Components, labeled on the slide as running either on each node or one per cluster: Heartbeat, Process monitor, Global singleton supervisor, Configuration manager, Rebalance orchestrator, Node health monitor, vBucket state and replication manager
Couchbase Server Architecture

Data Manager
Query API (port 8092): Query Engine
Data access ports: 11210 / 11211
Object-managed Cache
Multi-threaded Persistence Engine

Cluster Manager (Erlang/OTP)
REST management API/Web UI (http, port 8091): Admin Console
Replication, Rebalance, Shard State Manager
Where is NoSQL a good fit?
Market Adoption
Internet Companies
• Social Gaming
• Ad Networks
• Social Networks
• Online Business Services
• E-Commerce
• Online Media
• Content Management
• Cloud Services
Enterprises
• Communications
• Retail
• Financial Services
• Health Care
• Automotive/Airline
• Agriculture
• Consumer Electronics
• Business Systems
Application Characteristics - Data driven
• 3rd party or user defined structure (Twitter feeds)
• Support for unlimited data growth (viral apps)
• Data with non-homogeneous structure
• Need to change data structure quickly and often
• Variable length documents
• Sparse data records
• Hierarchical data
NoSQL is a good fit
Application Characteristics - Performance driven
• Low latency critical (e.g. 1 millisecond)
• High throughput (e.g. 200,000 ops/sec)
• Large number of users
• Unknown demand with sudden growth of users/data
• Predominantly direct document access
• Read / Mixed / Write heavy workloads
NoSQL is a good fit
Q & A
Thank you!
don@couchbase.com
@nosqldon
www.linkedin.com/in/donpinto/
Extra - Couchbase Operations
Single node - Couchbase Write Operation
[Diagram: the App Server writes Doc 1 to the Couchbase Server Node. Doc 1 lands in the Managed Cache, then goes to the Replication Queue (to other node) and to the Disk Queue, which persists it to Disk.]
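The write path on this slide (and in the editor's note for it) can be modeled with a toy class: a write lands in memory and is acknowledged, while replication and persistence are drained from queues afterwards. This is a teaching sketch under stated assumptions, not Couchbase's implementation; the class name Node, its methods, and the "stored" acknowledgement are all hypothetical.

```python
from collections import deque


class Node:
    """Toy single-node model of the write path (hypothetical, simplified)."""

    def __init__(self):
        self.cache = {}                   # managed cache (memory)
        self.disk = {}                    # persisted documents
        self.disk_queue = deque()         # writes waiting to be persisted
        self.replication_queue = deque()  # writes waiting to go to other nodes

    def set(self, key, doc):
        """Write to memory, queue replication and persistence, then acknowledge."""
        self.cache[key] = doc
        self.replication_queue.append((key, doc))
        self.disk_queue.append((key, doc))
        return "stored"  # the caller does not wait for disk or replicas

    def flush_disk_queue(self):
        """Persist queued writes (runs asynchronously in a real system)."""
        while self.disk_queue:
            key, doc = self.disk_queue.popleft()
            self.disk[key] = doc

    def drain_replication_queue(self, replica):
        """Copy queued writes into another node's memory."""
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            replica.cache[key] = doc


node, replica = Node(), Node()
print(node.set("doc1", {"FIRST": "Dipti"}))   # acknowledged from memory
node.drain_replication_queue(replica)
node.flush_disk_queue()
```

An update follows the same path: the new version replaces the old one in the cache and is queued again for replication and disk.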
Single node - Couchbase Update Operation
[Diagram: the App Server sends Doc 1' to the Couchbase Server Node. Doc 1' replaces Doc 1 in the Managed Cache, then goes to the Replication Queue (to other node) and to the Disk Queue, which replaces Doc 1 on Disk.]
Single node - Couchbase Read Operation
[Diagram: the App Server issues GET Doc1. The Couchbase Server Node returns Doc 1 from the Managed Cache; Doc 1 is also on Disk.]
Single node - Couchbase Cache Miss
[Diagram: the App Server issues GET Doc1. The Managed Cache holds other docs (Doc 2 to Doc 5) but not Doc 1, so Doc 1 is read from Disk into the Managed Cache and returned to the App Server.]
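A read with a cache miss can be sketched as follows: the node serves hits from memory and, on a miss, loads the document from disk into the cache before returning it. The eviction policy here (least recently used) and the names CachedNode and get are assumptions made for the example; the slides do not say how the managed cache chooses what to evict.

```python
from collections import OrderedDict


class CachedNode:
    """Toy read path: memory first, disk on a miss (hypothetical, simplified)."""

    def __init__(self, disk, cache_capacity):
        self.disk = dict(disk)       # all persisted documents
        self.cache = OrderedDict()   # managed cache, oldest entry first
        self.capacity = cache_capacity

    def get(self, key):
        """Return (document, 'hit' or 'miss'); raises KeyError if unknown."""
        if key in self.cache:
            self.cache.move_to_end(key)      # mark as recently used
            return self.cache[key], "hit"
        doc = self.disk[key]                 # cache miss: go to disk
        self.cache[key] = doc
        if len(self.cache) > self.capacity:  # assumed LRU eviction
            self.cache.popitem(last=False)
        return doc, "miss"


node = CachedNode({f"doc{i}": {"n": i} for i in range(1, 7)}, cache_capacity=4)
for key in ("doc2", "doc3", "doc4", "doc5"):
    node.get(key)                # warm the cache with other documents
print(node.get("doc1"))          # not cached yet, so it is read from disk
print(node.get("doc1"))          # now served from memory
```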
COUCHBASE SERVER CLUSTER: Basic Operation
• Docs distributed evenly across servers
• Each server stores both active and replica docs
  - Only one copy of a doc is active at a time
• Client library provides app with simple interface to database
• Cluster map provides map to which server doc is on
  - App never needs to know
• App reads, writes, updates docs
• Multiple app servers can access same document at same time
User Configured Replica Count = 1
[Diagram: App Servers 1 and 2, each with a Couchbase client library and cluster map, send READ/WRITE/UPDATE requests to Servers 1 to 3. Each server holds a set of ACTIVE docs and a set of REPLICA docs, and a doc's replica sits on a different server from its active copy.]
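How a client library can use a cluster map to find the server for a document is sketched below. The idea shown is the common one of hashing the key to a shard (a vBucket, as named on the architecture slide) and looking the shard up in a map. The shard count, the CRC32-modulo hash, and the round-robin placement are assumptions for the sketch and may differ from what Couchbase actually does.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed shard count for this sketch


def vbucket_for(key):
    """Hash a document key to a vBucket number (assumed CRC32-based)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_cluster_map(servers):
    """Spread vBuckets round-robin; each replica sits on the next server."""
    n = len(servers)
    return {
        vb: {"active": servers[vb % n], "replica": servers[(vb + 1) % n]}
        for vb in range(NUM_VBUCKETS)
    }


def server_for(cluster_map, key):
    """Return the server holding the active copy of the document."""
    return cluster_map[vbucket_for(key)]["active"]


cluster_map = build_cluster_map(["server1", "server2", "server3"])
print(server_for(cluster_map, "user::1"))
```

The application only passes a key; the mapping from key to server stays inside the client library, which is why the app "never needs to know" where a doc lives.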
Add Nodes to Cluster
• Two servers added
  - One-click operation
• Docs automatically rebalanced across cluster
  - Even distribution of docs
  - Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
[Diagram: COUCHBASE SERVER CLUSTER with Servers 1 to 5, each holding ACTIVE and REPLICA docs. App Servers 1 and 2 send READ/WRITE/UPDATE requests through the Couchbase client library and cluster map.]
User Configured Replica Count = 1
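"Even distribution" with "minimum doc movement" can be illustrated with a small rebalancing sketch over shards: each server gets an equal target share, and only the surplus held by over-full servers is moved. This is one simple way to do it under stated assumptions, not Couchbase's rebalance algorithm; the function name, the 1024-shard count, and the server names are hypothetical.

```python
def rebalance(assignment, servers):
    """Even out shard ownership, moving only surplus shards.

    assignment maps shard number -> current server; every current owner
    must appear in servers. Returns (new_assignment, moved_count).
    """
    base, extra = divmod(len(assignment), len(servers))
    owned = {server: [] for server in servers}
    for shard in sorted(assignment):
        owned[assignment[shard]].append(shard)

    # Servers that already hold the most shards keep the "+1" targets,
    # so nothing is moved just to shuffle the remainder around.
    by_load = sorted(servers, key=lambda s: len(owned[s]), reverse=True)
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(by_load)}

    surplus = []
    for server in servers:
        while len(owned[server]) > target[server]:
            surplus.append(owned[server].pop())
    moved = len(surplus)
    for server in servers:
        while len(owned[server]) < target[server]:
            owned[server].append(surplus.pop())

    new_assignment = {sh: s for s, shards in owned.items() for sh in shards}
    return new_assignment, moved


old_servers = ["server1", "server2", "server3"]
before = {shard: old_servers[shard % 3] for shard in range(1024)}
after, moved = rebalance(before, old_servers + ["server4", "server5"])
print(moved, "of", len(before), "shards moved")
```

After the move, the cluster map is updated so that client libraries route requests for the moved shards to the new servers.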
Fail Over Node
• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects server failed
  - Promotes replicas of docs to active
  - Updates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
[Diagram: COUCHBASE SERVER CLUSTER with Servers 1 to 5, each holding ACTIVE and REPLICA docs. Server 3 has failed, and replicas of its docs on the other servers are promoted to active. App Servers 1 and 2 reach the cluster through the Couchbase client library and cluster map.]
User Configured Replica Count = 1
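The failover steps above can be sketched as an update to the cluster map: every shard whose active copy was on the failed server gets its replica promoted, and the failed server is dropped as a replica holder. The map layout and the names used here are assumptions for illustration, with a replica count of 1 as on the slide.

```python
def fail_over(cluster_map, failed):
    """Return a new cluster map with replicas promoted for the failed server."""
    new_map = {}
    for shard, entry in cluster_map.items():
        active, replica = entry["active"], entry["replica"]
        if active == failed:
            # Promote the replica; the shard has no replica until a rebalance.
            active, replica = replica, None
        elif replica == failed:
            replica = None
        new_map[shard] = {"active": active, "replica": replica}
    return new_map


# Hypothetical three-shard map with one replica per shard.
cluster_map = {
    0: {"active": "server1", "replica": "server2"},
    1: {"active": "server3", "replica": "server1"},
    2: {"active": "server2", "replica": "server3"},
}
after_failover = fail_over(cluster_map, "server3")
for shard, entry in sorted(after_failover.items()):
    print(shard, entry)
```

Shards left without a replica are the reason a rebalance typically follows: it recreates the missing replicas on the remaining servers.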


Editor's Notes

  • #15: The data is modeled for the application code and not for the database.
  • #31: These are the market segments
  • #37: 1. A set request comes in from the application. 2. Couchbase Server responds that the key is written. 3. Couchbase Server then replicates the data out to memory in the other nodes. 4. At the same time, it puts the data into a write queue to be persisted to disk.
  • #38: 1. A set request comes in from the application. 2. Couchbase Server responds that the key is written. 3. Couchbase Server then replicates the data out to memory in the other nodes. 4. At the same time, it puts the data into a write queue to be persisted to disk.
  • #39: 1. A set request comes in from the application. 2. Couchbase Server responds that the key is written. 3. Couchbase Server then replicates the data out to memory in the other nodes. 4. At the same time, it puts the data into a write queue to be persisted to disk.
  • #40: 1. A set request comes in from the application. 2. Couchbase Server responds that the key is written. 3. Couchbase Server then replicates the data out to memory in the other nodes. 4. At the same time, it puts the data into a write queue to be persisted to disk.
  • #41: Bulletize the text. Make sure the builds work.
  • #42: Bulletize the text. Make sure build work properly.
  • #43: Bulletize the text. Make sure build work properly.