JSON Data Modeling
Matthew D. Groves, @mgroves
Modeling Data in a Relational World
[Diagram: Customer, Billing, Connections, Purchases, and Contacts tables]
Where am I?
• GDG Indy
• https://www.meetup.com/indy-gdg
Who am I?
• Matthew D. Groves
• Developer Advocate for Couchbase
• @mgroves on Twitter
• Podcast and blog: https://crosscuttingconcerns.com
• "I am not an expert, but I am an enthusiast." – Alan Stevens
by @natelovett
AGENDA
01/ Why NoSQL?
02/ JSON Data Modeling
03/ Accessing Data
04/ Migrating Data
05/ Summary / Q&A
Why NoSQL?
NoSQL Landscape
Document
• Couchbase
• MongoDB
• DynamoDB
• Firestore
Graph
• OrientDB
• Neo4J
• CosmosDB
Key-Value
• Couchbase
• Riak
• BerkeleyDB
• Redis
Wide Column
• HBase
• Cassandra
• Hypertable
NoSQL Landscape
• Get by key(s)
• Set by key(s)
• Replace by key(s)
• Delete by key(s)
• Map/Reduce
Document
• Couchbase
• MongoDB
• DynamoDB
• Firestore
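The four key-based operations listed above can be sketched with an in-memory stand-in. This is only an illustration: the `KeyValueStore` class and its method names are hypothetical and are not any vendor's SDK. It assumes that "set" creates or overwrites, while "replace" requires the key to already exist.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a document store's key-value API."""

    def __init__(self):
        self._docs = {}

    def get(self, key):
        # Raises KeyError when the key is missing.
        return self._docs[key]

    def set(self, key, doc):
        # Creates the document or overwrites an existing one.
        self._docs[key] = doc

    def replace(self, key, doc):
        # Assumed semantics: only succeeds when the key already exists.
        if key not in self._docs:
            raise KeyError(key)
        self._docs[key] = doc

    def delete(self, key):
        del self._docs[key]


store = KeyValueStore()
store.set("CBL2015", {"Name": "Jane Smith", "DOB": "1990-01-30"})
store.replace("CBL2015", {"Name": "Jane Smith", "DOB": "1990-01-30", "Purchases": []})
customer = store.get("CBL2015")
store.delete("CBL2015")
```

Every call starts from a key, which is why key design matters so much in this model.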
What's NoSQL?
Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved.
Why NoSQL? Scalability
Why NoSQL? Flexibility
Why NoSQL? Availability
Why NoSQL? Performance
Use Cases for NoSQL
• Communication
• Gaming
• Advertising
• Travel booking
• Loyalty programs
• Fraud monitoring
• Social media
• Finance
• Caching
• Session
• User profile
• Catalog
• Content management
• Personalization
• Customer 360
• IoT
https://www.couchbase.com/customers
Use Cases
JSON Data Modeling
Properties of Real-World Data
Modeling Data in a Relational World
[Diagram: Customer, Billing, Connections, Purchases, and Contacts tables]
Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30
Customer Document, Key: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30"
}
Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30
Table: Purchases
CustomerID  Item    Amount   Date
CBL2015     laptop  1499.99  2019-03
Customer Document, Key: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    }
  ]
}
Table: Customer
CustomerID  Name        DOB
CBL2015     Jane Smith  1990-01-30
Customer Document, Key: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Purchases" : [
    {
      "item" : "laptop",
      "amount" : 1499.99,
      "date" : "2019-03"
    },
    {
      "item" : "phone",
      "amount" : 99.99,
      "date" : "2018-12"
    }
  ]
}
Table: Purchases
CustomerID  Item    Amount   Date
CBL2015     laptop  1499.99  2019-03
CBL2015     phone   99.99    2018-12
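The mapping from the Customer and Purchases rows to a single customer document can be sketched in a few lines. This is a minimal illustration rather than the speaker's tooling: the function name `build_customer_documents` is hypothetical, and the rows are plain dictionaries copied from the tables above.

```python
import json

# Rows copied from the Customer and Purchases tables on the slides.
customer_rows = [
    {"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"},
]
purchase_rows = [
    {"CustomerID": "CBL2015", "Item": "laptop", "Amount": 1499.99, "Date": "2019-03"},
    {"CustomerID": "CBL2015", "Item": "phone", "Amount": 99.99, "Date": "2018-12"},
]


def build_customer_documents(customers, purchases):
    """Nest each customer's purchase rows inside one document keyed by CustomerID."""
    documents = {}
    for row in customers:
        documents[row["CustomerID"]] = {
            "Name": row["Name"],
            "DOB": row["DOB"],
            "Purchases": [],
        }
    for row in purchases:
        # The foreign key disappears: the purchase now lives inside its customer.
        documents[row["CustomerID"]]["Purchases"].append(
            {"item": row["Item"], "amount": row["Amount"], "date": row["Date"]}
        )
    return documents


documents = build_customer_documents(customer_rows, purchase_rows)
print(json.dumps(documents["CBL2015"], indent=2))
```

The CustomerID column becomes the document key, and the join between the two tables is replaced by nesting.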
Table: Connections
CustomerID  ConnId  Relation
CBL2015     XYZ987  Brother
CBL2015     SKR007  Father
Customer Document, Key: CBL201
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5827-2842-...",
      "expiry" : "2019-03"
    }, ...
  ],
  "Connections" : [
    {
      "ConnId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "ConnId" : "SKR007",
      "Relation" : "Father"
    }
  ]
}
Document, Key: CBL2015
{
  "Name" : "Jane Smith",
  "DOB" : "1990-01-30",
  "cardnum" : "5827-2842…",
  "expiry" : "2019-03",
  "cardType" : "visa",
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "CustId" : "SKR007",
      "Relation" : "Father"
    }
  ],
  "Purchases" : [
    { "id": 12, "item": "mac", "amt": 2823.52 },
    { "id": 19, "item": "ipad2", "amt": 623.52 }
  ]
}
Table: Customer
CustomerID  Name        DOB         Cardnum     Expiry   CardType
CBL2015     Jane Smith  1990-01-30  5827-2842…  2019-03  visa
Table: Connections
CustomerID  ConnId  Relation
CBL2015     XYZ987  Brother
CBL2015     SKR007  Father
Table: Purchases
CustomerID  item   amt
CBL2015     mac    2823.52
CBL2015     ipad2  623.52
Table: Contacts
CustomerID  ConnId  Name
CBL2015     XYZ987  Joe Smith
CBL2015     SKR007  Sam Smith
Document, Key: CBL2016
{
  "Name" : "Bob Jones",
  "DOB" : "1980-01-29",
  "Billing" : [
    {
      "type" : "visa",
      "cardnum" : "5927-2842-2847-3909",
      "expiry" : "2020-03"
    },
    {
      "type" : "master",
      "cardnum" : "6273-2842-2847-3909",
      "expiry" : "2019-11"
    }
  ],
  "Connections" : [
    {
      "CustId" : "XYZ987",
      "Relation" : "Brother"
    },
    {
      "CustId" : "PQR823",
      "Relation" : "Father"
    }
  ],
  "Purchases" : [
    { "id": 12, "item": "mac", "amt": 2823.52 },
    { "id": 19, "item": "ipad2", "amt": 623.52 }
  ]
}
Table: Customer
CustomerID  Name       DOB
CBL2016     Bob Jones  1980-01-29
Table: Billing
CustomerID  Type    Cardnum  Expiry
CBL2016     visa    5927…    2020-03
CBL2016     master  6273…    2019-11
Table: Connections
CustomerID  ConnId  Relation
CBL2016     XYZ987  Brother
CBL2016     SKR007  Father
Table: Purchases
CustomerID  item   amt
CBL2016     mac    2823.52
CBL2016     ipad2  623.52
Table: Contacts
CustomerID  ConnId  Name
CBL2016     XYZ987  Joe Smith
CBL2016     SKR007  Sam Smith
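The two documents above differ in how billing is stored: CBL2015 keeps a single card as flat fields, while CBL2016 keeps a "Billing" array. One way to evolve the older shape into the newer one is sketched here; the function name `evolve_billing` is hypothetical and this is not a migration tool from the talk.

```python
FLAT_CARD_FIELDS = ("cardnum", "expiry", "cardType")


def evolve_billing(doc):
    """Return a copy of doc with flat card fields moved into a "Billing" list."""
    evolved = {k: v for k, v in doc.items() if k not in FLAT_CARD_FIELDS}
    if "cardnum" in doc:
        evolved["Billing"] = [
            {
                "type": doc.get("cardType"),
                "cardnum": doc["cardnum"],
                "expiry": doc.get("expiry"),
            }
        ]
    return evolved


old_doc = {
    "Name": "Jane Smith",
    "DOB": "1990-01-30",
    "cardnum": "5827-2842…",  # truncated on the slide, kept as-is
    "expiry": "2019-03",
    "cardType": "visa",
}
new_doc = evolve_billing(old_doc)
```

Because the schema lives in the documents rather than in the database, a change like this can be rolled out one document at a time, for example when each document is next read.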
Relationship is one-to-one or one-to-many
Store related data as nested objects
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Purchases" : [
{
"item" : "laptop",
"amount" : 1499.99,
"date" : "2019-03"
},
{
"item" : "phone",
"amount" : 99.99,
"date" : "2018-12"
}
]
}
Modeling your data: Strategies / rules of thumb
Relationship is many-to-one or many-to-many
Store related data as separate documents
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Connections" : [
"XYZ987",
"PQR823",
"PQR828"
]
}
Modeling your data: Strategies / rules of thumb
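When related data lives in separate documents, the parent holds only keys and the application (or a query) follows them. A minimal sketch, assuming a plain dictionary as the document store: the helper `load_connections` is hypothetical, the name for XYZ987 comes from the Contacts table earlier in the deck, and the name for PQR823 is invented for illustration.

```python
# A plain dict stands in for the document store; keys are document keys.
documents = {
    "CBL2015": {
        "Name": "Jane Smith",
        "DOB": "1990-01-30",
        "Connections": ["XYZ987", "PQR823", "PQR828"],
    },
    "XYZ987": {"Name": "Joe Smith"},
    "PQR823": {"Name": "Pat Smith"},  # hypothetical name, not from the slides
    # "PQR828" is deliberately absent to show a dangling reference.
}


def load_connections(store, key):
    """Follow the parent's "Connections" keys and return the documents that exist."""
    parent = store[key]
    return [store[k] for k in parent.get("Connections", []) if k in store]


connection_names = [doc["Name"] for doc in load_connections(documents, "CBL2015")]
```

Nothing enforces the references, so the code has to decide what a missing target means; here it is skipped.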
Modeling tools
• Hackolade
• Erwin DM NoSQL
• Idera ER/Studio
Accessing Data
Data reads are mostly parent fields
Store children as separate documents
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Connections" : [
"XYZ987",
"PQR823",
"PQR828"
]
}
Modeling your data: Strategies / rules of thumb
Data reads are mostly parent + child fields
Store children as nested objects
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Purchases" : [
{
"item" : "laptop",
"amount" : 1499.99,
"date" : "2019-03"
},
{
"item" : "phone",
"amount" : 99.99,
"date" : "2018-12"
}
]
}
Modeling your data: Strategies / rules of thumb
Data writes are mostly parent or child (not both)
Store children as separate documents
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Connections" : [
"XYZ987",
"PQR823",
"PQR828"
]
}
Modeling your data: Strategies / rules of thumb
Data writes are mostly parent and child (both)
Store children as nested objects
{
"Name" : "Jane Smith",
"DOB" : "1990-01-30",
"Purchases" : [
{
"item" : "laptop",
"amount" : 1499.99,
"date" : "2019-03"
},
{
"item" : "phone",
"amount" : 99.99,
"date" : "2018-12"
}
]
}
Modeling your data: Strategies / rules of thumb
If …                                               Then …
Relationship is one-to-one or one-to-many          Store related data as nested objects
Relationship is many-to-one or many-to-many        Store related data as separate documents
Data reads are mostly parent fields                Store children as separate documents
Data reads are mostly parent + child fields        Store children as nested objects
Data writes are mostly parent or child (not both)  Store children as separate documents
Data writes are mostly parent and child (both)     Store children as nested objects
Modeling your data: Strategies / rules of thumb
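The rules of thumb can be read as a lookup table. The sketch below encodes them that way; the function `recommendations` and the short labels for each condition are hypothetical names chosen for this example. It returns one recommendation per rule rather than a single verdict, since the rules can disagree for a given data set.

```python
# Each (dimension, observation) pair maps to the slide's recommendation.
RULES = {
    ("relationship", "one-to-one"): "nested",
    ("relationship", "one-to-many"): "nested",
    ("relationship", "many-to-one"): "separate",
    ("relationship", "many-to-many"): "separate",
    ("reads", "parent"): "separate",
    ("reads", "parent+child"): "nested",
    ("writes", "parent-or-child"): "separate",
    ("writes", "parent-and-child"): "nested",
}


def recommendations(relationship, reads, writes):
    """Return the rule-of-thumb answer for each of the three dimensions."""
    return {
        "relationship": RULES[("relationship", relationship)],
        "reads": RULES[("reads", reads)],
        "writes": RULES[("writes", writes)],
    }


purchases = recommendations("one-to-many", "parent+child", "parent-and-child")
connections = recommendations("many-to-many", "parent", "parent-or-child")
```

When the three answers conflict, the access pattern that dominates the workload is usually the one to favor, which is a judgment call rather than something this table can decide.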
Accessing your data (Couchbase)
[Diagram: Documents reached through Key-Value (CRUD), N1QL (SQL Query) via Indexes, Full Text (Search) via Indexes, Views (JS Query) via MapReduce, and Analytics (Query) via SQL++]
Key/Value
public ShoppingCart GetCartById(string id)
{
    return _bucket.Get<ShoppingCart>(id).Value;
}
public void CreateShoppingCart()
{
    _bucket.Insert(new Document<ShoppingCart>
    {
        Id = "shopping-cart-1",
        Content = new ShoppingCart { . . . }
    });
}
Key/Value: Recommendations for keys
• Natural Keys
• Human Readable
• Deterministic
• Semantic
Key/Value: Example keys
• author::matt
• author::matt::blogs
• blog::csharp_8_features
• blog::csharp_8_features::comments
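Keys like these are easy to build deterministically from values the application already knows. A small sketch, assuming "::" as the separator shown in the examples; the helper name `make_key` is hypothetical.

```python
SEPARATOR = "::"


def make_key(*parts):
    """Join non-empty parts with "::" so the same inputs always give the same key."""
    if not parts:
        raise ValueError("at least one key part is required")
    for part in parts:
        if not part or SEPARATOR in part:
            raise ValueError(f"invalid key part: {part!r}")
    return SEPARATOR.join(parts)


author_key = make_key("author", "matt")
author_blogs_key = make_key("author", "matt", "blogs")
comments_key = make_key("blog", "csharp_8_features", "comments")
```

Because the key for a blog's comments can be derived from the blog's key, the related document can be fetched directly by key without a query.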
Subdocument access
{
"username": "mgroves",
"profile": {
"phoneNumber": "123-456-7890",
"address": {
"street": "123 main st",
"city": "Grove City",
"state": "Ohio"
}
}
}
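Sub-document access means addressing one field inside a document by its path instead of reading or writing the whole document. The sketch below only illustrates path addressing on a local dictionary; `get_path` and `set_path` are hypothetical helpers, and in a real database the path would be evaluated by the server rather than in application code.

```python
profile_doc = {
    "username": "mgroves",
    "profile": {
        "phoneNumber": "123-456-7890",
        "address": {
            "street": "123 main st",
            "city": "Grove City",
            "state": "Ohio",
        },
    },
}


def get_path(doc, path):
    """Read the value at a dotted path such as "profile.address.city"."""
    current = doc
    for part in path.split("."):
        current = current[part]
    return current


def set_path(doc, path, value):
    """Write a value at a dotted path; parent objects must already exist."""
    *parents, leaf = path.split(".")
    current = doc
    for part in parents:
        current = current[part]
    current[leaf] = value


city = get_path(profile_doc, "profile.address.city")
set_path(profile_doc, "profile.phoneNumber", "555-000-0000")  # made-up number
```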
Subcollection (Firestore)
N1QL
Understanding your Query Plan
Full Text Search
Concept: Strategies & Recommendations
Key-Value operations provide the best possible performance
• Create an effective key naming strategy
• Create an optimized data model
Full Text Search is well-suited to text
• Facets / ranges / geography
• Language aware
N1QL queries provide the most flexibility (everything else)
• Query data regardless of how it is modeled
• Good indexing is vital
Accessing your data: Strategies and recommendations
Migrating Data
Migration options: Requirements
ETL / data cleanse / data enrichment
Migration options: Tools
Migration options: BYO
Migration options: KISS
Export
Transform
Import
Staging
NoSQL
Relational
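A keep-it-simple migration can be sketched as three small functions: export rows, transform them into documents in a staging structure, and import them into the target. This is an assumption-laden illustration: SQLite stands in for the relational source, a dictionary stands in for the NoSQL target, and all function and column names are hypothetical.

```python
import sqlite3

# SQLite in memory stands in for the relational source database.
relational = sqlite3.connect(":memory:")
relational.executescript(
    """
    CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT, DOB TEXT);
    CREATE TABLE Purchases (CustomerID TEXT, Item TEXT, Amount REAL, PurchaseDate TEXT);
    INSERT INTO Customer VALUES ('CBL2015', 'Jane Smith', '1990-01-30');
    INSERT INTO Purchases VALUES ('CBL2015', 'laptop', 1499.99, '2019-03');
    INSERT INTO Purchases VALUES ('CBL2015', 'phone', 99.99, '2018-12');
    """
)


def export_rows(conn):
    """Export: read the raw rows out of the relational tables."""
    customers = conn.execute("SELECT CustomerID, Name, DOB FROM Customer").fetchall()
    purchases = conn.execute(
        "SELECT CustomerID, Item, Amount, PurchaseDate FROM Purchases ORDER BY rowid"
    ).fetchall()
    return customers, purchases


def transform(customers, purchases):
    """Transform: build one staged document per customer with nested purchases."""
    staging = {
        cid: {"Name": name, "DOB": dob, "Purchases": []}
        for cid, name, dob in customers
    }
    for cid, item, amount, date in purchases:
        staging[cid]["Purchases"].append({"item": item, "amount": amount, "date": date})
    return staging


def import_documents(staging, target):
    """Import: write each staged document to the target under its key."""
    for key, doc in staging.items():
        target[key] = doc
    return len(staging)


nosql = {}  # a dict stands in for the NoSQL target
imported = import_documents(transform(*export_rows(relational)), nosql)
```

Keeping the staging step separate makes it possible to inspect or clean the documents before anything is written to the target.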
Migration Recommendations: Align
Migration Recommendations: Expect Failure
Migration Recommendations: Ensure
Sync NoSQL and relational? Automatic Replication
[Diagram labels: Couchbase, DCP Stream, Kafka (Source / Sink), RDBMS]
How can you sync NoSQL and relational?
RDBMS
Handler / Eventing
Couchbase
GoldenGate
https://github.com/mahurtado/CouchbaseGoldenGateAdapter
Sync NoSQL and relational? Manual.
Summary
Pick the right application
Summary
Proof of Concept
Summary
Match the data access method to requirements
Summary
https://blog.couchbase.com/proof-of-concept-move-relational/
https://blog.couchbase.com/json-data-modeling-rdbms-users/
Resources: Blog posts
Resources: Me!
• @mgroves
• twitch.tv/matthewdgroves
• forums.couchbase.com
Frequently Asked Questions
1. How is Couchbase different than Mongo?
2. Is Couchbase the same thing as CouchDB?
3. How tall are you? Do you play basketball?
4. What is the Couchbase licensing situation?
5. Is Couchbase a Managed Cloud Service (DBaaS)?
Managed Cloud Server (DBaaS)
https://www.couchbase.com/products/cloud
MongoDB vs Couchbase
• Architecture
• Memory first architecture
• Master-master architecture
• Auto-sharding
• Features
• SQL (N1QL)
• Full Text Search
• Analytics (NoETL)
Licensing
Couchbase Server Community
• Source code is Open Source (Apache 2)
• Binary release is one release behind Enterprise (except major versions)
• Free to use in dev/test/qa/prod
• Forum support only
Couchbase Server Enterprise
• Source code is mostly Open Source (Apache 2)
• Some features not available on Community (XDCR TLS, MDS, Rack Zone, etc.)
• Free to use in dev/test/qa
• Need commercial license for prod
• Paid support provided
CouchDB and Couchbase
memcached

JSON Data Modeling - GDG Indy - April 2020


Editor's Notes

  • #3: Most developers are probably already familiar with the relational way of modeling data. But if you want the benefits of a non-relational database, you have to think differently about modeling.
  • #7: Spend just a little time on why people are using NoSQL. Talk about how data is modeled differently in JSON. Let’s talk about why SQL is good and why SQL for JSON is needed. Talk about accessing data, since that has an effect on modeling. Maybe we'll get to migrating/syncing data from relational to NoSQL.
  • #8: SQL (relational) databases are great. They give you a LOT OF functionality: a great set of abstractions (tables, columns, data types, constraints, triggers, SQL, ACID TRANSACTIONS, stored procedures and more) at a highly reasonable cost. Change is inevitable. One thing an RDBMS does not handle well is CHANGE: change of schema (both logical and physical), change of hardware, change of capacity. NoSQL databases, ESPECIALLY ONES DESIGNED TO BE DISTRIBUTED, tend to help solve problems with agility, scalability, performance, and availability.
  • #9: Let’s talk about what NoSQL is, first. NoSQL generally refers to databases which lack SQL or don’t use a relational model. Once the SQL language and transactions became optional, a flurry of databases was created using distinct approaches for common use cases. Key-value databases simply provide quick access to data for a given key. Wide column databases can store a large number of arbitrary columns in each row. Graph databases store data and relationships as first-class concepts. Document databases aggregate data into a hierarchical structure; JSON is a means to that end. Document databases provide flexible schema, built-in data types, rich structure, and implicit relationships using JSON.
  • #10: When we look at document databases, they originally came with a minimal set of APIs and features. But as they continue to mature, we’re seeing more features being added, and generally I’m seeing a convergent trend between relational and NoSQL. But anyway, this set of minimal features, lacking a SQL language and tables, gives us the buzzword “NoSQL”.
  • #11: Think of a document database at its simplest as a type of key/value store, where the value is in a known format. You write code where you start with a key, and you ask the database to return the document that corresponds to that key. The same goes for creating/updating.
  • #12: If you are using a cloud-based DBaaS, this is basically what's going on behind the scenes. Elastic scaling: size your cluster for today, scale out on demand. Cost-effective scaling: commodity hardware, on premise or on cloud. Scale OUT instead of scale UP. [example: changing the channel to a soccer game or Game of Thrones, everyone makes the same API request in the same 5 minutes] [example: TV show lets watchers vote during some period of the week, so you can scale up during that period of time] [example: Black Friday]
  • #13: Schema flexibility: easier management of change in the business requirements, and easier management of change in the structure of the data. Sometimes you're pulling together data, integrating from different sources (e.g. ELT), and that flexibility helps. A document database means that you have no rigid schema. You can do whatever the heck you want. That being said, you SHOULDN’T. You should still have discipline about your data.
  • #14: If one machine goes down, customers can still use the others. Or if you need to perform maintenance, upgrades, etc., you don't have to take the whole system down. This is related to scaling. Built-in replication and failover: no application downtime when hardware fails. Online maintenance and upgrades: no application downtime.
  • #15: NoSQL systems are optimized for specific access patterns. Low response time for web and mobile user experience. Millisecond latency. Consistently high throughput to handle growth. [perf measures can be subjective – talk about architecture, integrated cache, maybe mention MDS too]
  • #16: NoSQL is very versatile and can be used for a wide variety of use cases, including, but NOT limited to, these. If you're exploring NoSQL, make sure you have the right project or the right use case. Using a NoSQL database does NOT mean you have to abandon relational databases. Most large websites use a combination. It's also worth pointing out that plenty of companies are doing most of these use cases with relational as well. Relational is usually at least mediocre; NoSQL may be BETTER. But usually the catalyst is one of the earlier reasons: performance, flexibility, scale.
  • #18: Let’s talk about data modeling a bit, because storing data in JSON is different from storing it in tables.
  • #19: Let’s look at modeling Customer data. This is an example of what a customer might look like. You might do this as part of a proof of concept, discovery, requirements gathering, planning, etc. There is a rich structure: attributes, and potentially sub-attributes (first name and last name). Relationships: to other data (other customers, to products perhaps). Value evolution: maybe we’d start with one purchase, then add more as Helen makes more purchases. Structure evolution: maybe we start with billing information being properties of Helen, then evolve later to multiple billing options.
  • #20: Let’s look at modeling Customer data. This is an example of what a customer might look like. There is a rich structure: attributes, and potentially sub-attributes (first name and last name). Relationships: to other data (other customers, to products perhaps). Value evolution: maybe we’d start with one purchase, then add more as Helen makes more purchases. Structure evolution: maybe we start with billing information being properties of Helen, then evolve later to multiple billing options.
  • #21: Let’s see how to represent customer data in JSON. The primary key (CustomerID) becomes the document key. Each column name and column value becomes a key-value pair.
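The row-to-document mapping from slide #21 can be sketched in a few lines of Python. The `row_to_document` helper is a hypothetical name used for illustration; the row values are the ones shown on the slide.

```python
import json


def row_to_document(row, key_column):
    """Split a relational row into (document key, JSON-ready dict)."""
    doc = dict(row)             # copy so the caller's row is left untouched
    key = doc.pop(key_column)   # the primary key becomes the document key
    return key, doc             # remaining columns become key-value pairs


row = {"CustomerID": "CBL2015", "Name": "Jane Smith", "DOB": "1990-01-30"}
key, doc = row_to_document(row, "CustomerID")
print(key)
print(json.dumps(doc, indent=2))
```

Note that the key is stored outside the document body here, matching the slide, where `CBL2015` is the document key and the JSON contains only `Name` and `DOB`.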
  • #22: We aren’t in normal form anymore. Rich structure and relationships: billing information is stored as a sub-document. There could be more than a single credit card, so use an array.
  • #23: Value evolution: simply add an additional array element or update a value.
  • #24: Structure evolution: simply add new key-value pairs. No downtime is needed to add new KV pairs. Applications can validate data. The structure evolves over time. Relations via reference.
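Value evolution (slide #23) and structure evolution (slide #24) can be sketched on a plain Python dict. The second purchase, the `Connections` field contents, and the helper names are assumptions made up for this example; only the customer fields and the laptop purchase come from the slides.

```python
import copy

customer = {
    "Name": "Jane Smith",
    "DOB": "1990-01-30",
    "Purchases": [{"item": "laptop", "amount": 1499.99}],
}


def add_purchase(doc, item, amount):
    """Value evolution: append one more element to an existing array."""
    updated = copy.deepcopy(doc)
    updated["Purchases"].append({"item": item, "amount": amount})
    return updated


def add_field(doc, name, value):
    """Structure evolution: add a key that older documents do not have."""
    updated = copy.deepcopy(doc)
    updated[name] = value
    return updated


def connections_of(doc):
    """Application-side validation: tolerate documents without the new key."""
    return doc.get("Connections", [])


v2 = add_purchase(customer, "mouse", 19.99)      # hypothetical second purchase
v3 = add_field(v2, "Connections", ["XYZ987"])    # hypothetical key of another customer document
```

Storing another document's key in `Connections` is one way to read "relations via reference": the related customer lives in its own document and is fetched by key when needed.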
  • #25: So, finally, you have a JSON document that represents a CUSTOMER. In a single JSON document, the relationships between the data are implicit in the use of sub-structures, arrays, and arrays of sub-structures.
  • #26: So, finally, you have a JSON document that represents a CUSTOMER. In a single JSON document, the relationships between the data are implicit in the use of sub-structures, arrays, and arrays of sub-structures.
  • #29: Hackolade supports Couchbase, MongoDB, Elasticsearch, Cassandra, DynamoDB, and Firebase/Firestore. Erwin supports MongoDB and Couchbase. Idera supports just MongoDB.
  • #30: The way you plan to get data out can also affect the way you model your data
  • #35: What types of relationships are being modeled? How are the relationships accessed?
  • #36: I've mostly been talking about key/value access. Most NoSQL databases will have at least one other way to access data besides key/value. What does this have to do with modeling? Modeling doesn't exist in a vacuum. You have to think about how you are going to interact with your data. I'm going to show you some examples from Couchbase. In Couchbase, we have N1QL, which is ANSI SQL for JSON. I'm also going to briefly cover FTS today. There are other options I'm not going to cover today, including Analytics and Views/MapReduce.
  • #37: Just to reiterate on key/value: if you know the key already, it's really simple and extremely fast to access that piece of data.
  • #38: Since key/value is so fast and easy, it would benefit us to use it as much as possible. Here are some tips to maximize your key/value usage.
  • #39: Starting from "matt", you can walk through this chain of documents with ONLY key/value access.
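Walking a chain of documents by key, as in slide #39, might look like the sketch below. The slide's actual documents are not reproduced in these notes, so the keys other than "matt", the field names, and the `::` key convention are all assumptions; a plain dict stands in for the database.

```python
# Hypothetical documents; each one stores the key(s) of the next link in the chain.
store = {
    "matt": {"type": "user", "name": "Matt", "cart": "cart::matt"},
    "cart::matt": {"type": "cart", "items": ["product::1001"]},
    "product::1001": {"type": "product", "name": "laptop"},
}


def get(key):
    """Key/value lookup: the only access path used in this example."""
    return store[key]


user = get("matt")                              # 1st lookup: the starting key is known
cart = get(user["cart"])                        # 2nd lookup: key read from the user document
products = [get(k) for k in cart["items"]]      # one lookup per referenced product
```

No query or index is involved: each document supplies the key for the next lookup, so predictable key names keep the whole walk on the fast path.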
  • #40: Another thing to consider is whether or not your NoSQL database has a "subdocument" API. If it does, then you have even more flexibility. Not all of them do, and some databases may call this something different. But the idea is this: if you only need "address", then without a subdocument API you'd have to pull the entire document over the wire when doing reads/writes. With a subdocument API, you can specify just a specific part to read/write. This can be very helpful if you have large documents, or if you are doing a lot of reads and writes that only need a small portion of the data. Firestore: there is a different concept of "subcollection". You have collections that contain documents, and then documents themselves can contain collections. Those documents have their own keys. If you DELETE a parent, then this subcollection will stick around. But when it comes to modeling considerations, with Firestore you can access these subcollections and look up individual keys in them. This is where we diverge a little bit from JSON modeling. But the idea is that you don't have to get the ENTIRE document; you can target individual pieces.
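The subdocument idea from slide #40 can be illustrated with path-based reads and writes on a local dict. This is a conceptual sketch only: `read_path` and `write_path` are hypothetical helpers, not any vendor's API, and a real subdocument API would resolve the path on the server so that only the targeted fragment crosses the wire.

```python
def read_path(doc, path):
    """Return the value at a dotted path such as 'address.city'."""
    node = doc
    for part in path.split("."):
        node = node[part]
    return node


def write_path(doc, path, value):
    """Set the value at a dotted path, leaving the rest of the document alone."""
    *parents, leaf = path.split(".")
    node = doc
    for part in parents:
        node = node[part]
    node[leaf] = value


# Hypothetical document with a nested "address" part.
customer = {
    "Name": "Jane Smith",
    "address": {"city": "Indianapolis", "state": "IN"},
}

city = read_path(customer, "address.city")
write_path(customer, "address.city", "Columbus")
```

The caller names only the part it cares about, which is the property that matters for large documents or for workloads that touch a small portion of each document.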
  • #41: In Firestore, you have a collection of documents (rooms, in this case). An individual document (roomA) can itself have a collection (messages), and so on. You can do this in plain JSON in just about every other document database. You can nest as deep as you want in JSON. Much like the subdocument I mentioned before, you can do a similar thing here. If you want to make a change to one message, you can address it directly with "rooms/roomA/messages/message1", EXCEPT that Firestore treats the documents in the subcollection kinda like separate documents. Gotcha: "Deleting a document does not delete its subcollections". So you should be careful when using this; you could end up with something kinda like orphan documents.
  • #42: N1QL is powerful in its flexibility, its declarative nature, its familiarity to developers, JOINs, etc. But note, once we step out of key/value access, we need to involve other processes: we gotta parse the query, most likely use an index service, and in the end we'll get a bunch of keys to look up the data. There is overhead involved, but sometimes this is a necessity.
  • #43: When you're outside of key/value access, you must understand the query plan. This is true for ANY database, relational or NoSQL. As an example, here's a Couchbase SQL query. I executed this and it ran in 1.2 seconds. It's using AN index on the TYPE field, but notice that name field on there. I can bring up a visualization of the query plan to see which parts are taking up the most time. In Couchbase, there is an index advisor. It suggested an index for me. After creating the index, the same query went to 146ms (about 8 times faster). A covering index could make this even faster (note the *).
  • #44: This is a search that revolves around text: things like stemming, language awareness, facets, ranking, etc. This is, again, a very simple example. I'm searching for the keyword "submarine". In my application, this query may be limited to a certain search radius, or it may be limited to a certain facet, etc. But the end results are language-aware, ranked matches. You would use this INSTEAD OF a SQL 'LIKE', for instance.
  • #45: Build a proof of concept, which will help you see if NoSQL is the right fit. It will also help you understand the access patterns better. Just to sum up. Also note that in Couchbase, you can combine FTS and N1QL.
  • #46: Migrating or syncing, because often Couchbase and relational are complementary: Couchbase can be used for engagement, relational for transactions.
  • #47: Are you going to take the time to clean up the data? Do you need to? Do you need to enrich or restructure the data to take advantage of JSON? Also, I'm using the term MIGRATION, but it might NOT be the case that you are abandoning one database in favor of another. You might want to sync data, you might want to make a copy of data into a more suitable database for your use case, etc.
  • #48: Informatica, Talend, DART, ODBC, Kafka, Debezium, Spark, NiFi
  • #49: Node / Python / bash / PowerShell / curl / REST, etc. GoldenGate / DTS / SSIS. CLI: cbimport, mongoimport, etc.
  • #50: Again, proof of concept. KISS: export to CSV and use N1QL to do any ETL that’s required. Export to CSV. Import as documents into a 'staging' bucket. Use N1QL to transform. Insert into the new bucket.
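The four steps in slide #50 (export, stage, transform, insert) can be walked through in a small sketch. Two substitutions are made so the example is self-contained: plain Python dicts stand in for the 'staging' and target buckets, and the transform is written in Python rather than N1QL. The CSV columns and the second purchase are made up for illustration.

```python
import csv
import io

# Step 1: a CSV export from the relational side (one row per purchase, customer repeated).
CSV_EXPORT = """CustomerID,Name,item,amount
CBL2015,Jane Smith,laptop,1499.99
CBL2015,Jane Smith,mouse,19.99
"""

# Step 2: import each row as a flat document into a staging "bucket".
staging = {}
for n, row in enumerate(csv.DictReader(io.StringIO(CSV_EXPORT))):
    staging[f"row::{n}"] = row

# Steps 3 and 4: transform the flat rows into one nested document per customer
# and insert the result into the new "bucket".
target = {}
for row in staging.values():
    doc = target.setdefault(
        row["CustomerID"], {"Name": row["Name"], "Purchases": []}
    )
    doc["Purchases"].append({"item": row["item"], "amount": float(row["amount"])})
```

Keeping the staging copy separate from the target means the transform can be rerun or adjusted without going back to the source database.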
  • #51: Align with your data model. The modeling step is vital. If you don't model your data to take advantage of JSON, then you are not going to see the advantages of using JSON. Don't treat a JSON database as if it were a relational database. Basically, keep it as simple as you can and plan for failure. Developers often think of the migration process as “One and Done”, but the reality is that data migration is often an ongoing headache that DevOps needs to monitor and manage in a production environment. Make everyone’s life easier by thinking about the long game as much as possible.
  • #52: Plan for failure: bad source data, hardware failure, resource limitations (proof of concept vs MVP). Developers often think of the migration process as “One and Done”, but the reality is that data migration is often an ongoing headache that DevOps needs to monitor and manage in a production environment.
  • #53: Ensure: Interruptible, restartable, logged -> predictable Make everyone’s life easier by thinking about the long game as much as possible.
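One way to make a migration interruptible, restartable, and logged (slide #53) is to record a checkpoint after every successful write. The sketch below is an assumption-laden illustration: `migrate`, the `fail_at` switch that simulates a crash, and the in-memory `checkpoint` dict are all hypothetical, and a real job would persist the checkpoint somewhere durable.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("migration")


def migrate(source_rows, target, checkpoint, fail_at=None):
    """Copy rows in order, recording progress so a rerun resumes where it stopped."""
    start = checkpoint.get("next", 0)
    for i in range(start, len(source_rows)):
        if i == fail_at:
            raise RuntimeError(f"simulated failure at row {i}")
        key, doc = source_rows[i]
        target[key] = doc            # upsert: harmless to repeat after a restart
        checkpoint["next"] = i + 1   # in real use, persist this durably
        log.info("migrated %s (%d/%d)", key, i + 1, len(source_rows))


rows = [(f"customer::{n}", {"n": n}) for n in range(5)]
target, checkpoint = {}, {}

try:
    migrate(rows, target, checkpoint, fail_at=3)   # first run is interrupted
except RuntimeError as err:
    log.warning("interrupted: %s", err)

migrate(rows, target, checkpoint)                  # second run resumes at row 3
```

Because writes are upserts and the checkpoint only advances after a write succeeds, rerunning the job after a failure is predictable: at worst one row is written twice with the same content.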
  • #54: From NoSQL to relational. You can also turn this around and use Kafka in the other direction.
  • #55: From relational to NoSQL: GoldenGate is from Oracle. CData for SSIS and Couchbase. https://github.com/mahurtado/CouchbaseGoldenGateAdapter https://www.cdata.com/drivers/couchbase
  • #56: Make it part of your application directly. It may or may not be reusable. This is a lot of work, so make sure you have a good reason.
  • #58: Focus on SOA, microservices, application/use case specific
  • #59: Modeling, Focus, Success Criteria, Review Architecture. Consider using a tool like Hackolade to define models rigorously and collaboratively.
  • #60: Use a document type and version ID. Create optimized, understandable keys. Weigh nested, referenced, or mixed designs. Add indexes: simple, compound, functional, partial, array, covering, memory optimized. N1QL, key-value, views.
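The first two tips in slide #60 (a document type plus version ID, and understandable keys) can be sketched as follows. The `type::id` key format, the field names, and the version 1 to version 2 change are assumptions for illustration; the billing change echoes the structure-evolution example from the modeling slides.

```python
CURRENT_VERSION = 2


def make_key(doc_type, natural_id):
    """Build a readable, predictable key such as 'customer::CBL2015'."""
    return f"{doc_type}::{natural_id}"


def upgrade(doc):
    """Bring an older document up to the current shape when it is read."""
    upgraded = dict(doc)
    if upgraded.get("version", 1) < 2:
        # Hypothetical change: version 1 kept a single billing object,
        # version 2 keeps a list of billing options.
        upgraded["billing"] = [upgraded["billing"]] if "billing" in upgraded else []
        upgraded["version"] = CURRENT_VERSION
    return upgraded


old = {"type": "customer", "version": 1, "billing": {"card": "visa"}}
new = upgrade(old)
key = make_key(new["type"], "CBL2015")
```

With a `type` and `version` in every document, application code can tell which shape it is reading without a schema migration, and a key built from the type and a natural ID can often be reconstructed without running a query.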
  • #62: This is my family. My enormous head barely fits in the picture.
  • #64: Couchbase Cloud is currently in limited beta
  • #65: Memory first: integrated cache, you don't need to put Redis on top of Couchbase. Master-master: easier scaling, better scaling. Auto-sharding: we call them vBuckets; you don't have to come up with a sharding scheme, it's done by CRC32. N1QL: SQL; Mongo has a more limited query language and it's not SQL-like. Full Text Search: uses the Bleve search engine, with language-aware FTS capabilities built in. Mobile & sync: Mongo has nothing like the offline-first and sync capabilities Couchbase offers. Mongo DOES have a DBaaS cloud provider.
  • #66: Everything I've shown you today is available in Community edition. The only N1QL features I can think of that are not in Community are INFER and the Query Plan Visualizer. You probably don't need the Enterprise features unless you are an enterprise developer.