Schema Design Best Practices with Buzz Moschetti

Document Database Schema Design

3
Agenda
Schema
Documents
Document Schema Design
Patterns

5
First thing that comes to mind…

6
But there are other types of schema

8
What is a Document?
{
  name: 'Dutch Constitution',
  headline: 'The Present State of Holland',
  enforced_by: 'King and Parliament',
  date: '11 October 1848',
  labels: ['legal', 'society', 'rules'],
  freedoms: [
    { name: 'Speech',
      text: 'Any censorship is absolutely forbidden' },
    { name: 'Association',
      text: 'This right can be limited by formal law,' }
  ]
}
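A minimal sketch, in plain JavaScript with no database involved, of how an application might read a document like this one. The variable names `doc` and `freedomNames` are assumptions for illustration, and the document is trimmed to a few fields.

```javascript
// Trimmed, in-memory copy of the example document.
const doc = {
  name: "Dutch Constitution",
  labels: ["legal", "society", "rules"],
  freedoms: [
    { name: "Speech", text: "Any censorship is absolutely forbidden" },
    { name: "Association", text: "This right can be limited by formal law," }
  ]
};

// Scalars, arrays and nested sub-documents all live in one structure,
// so related data is read without joining anything.
const freedomNames = doc.freedoms.map((f) => f.name);
console.log(doc.labels.length, freedomNames);
```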

10
The focus is "What I want to Build"
• We focus on how to use data
– Not on how to store it
• Use the flexibility of the schema to adapt to new features, so that iterations deliver more features
– Don't let the schema hold you back when you need to add functionality
• Scale to accommodate your application's data needs
– Don't be afraid of being successful
• Full-featured out of the box
– Text search
– Geospatial, rich queries
– MapReduce and Aggregation Framework

11
Mindset
[Figure: two panels, each labeled "Application": the relational way vs. the MongoDB way]

14
Discrete Documents
{
  policyNum: 123,
  type: "auto",
  customerId: "abc",
  payment: 899,
  deductible: 500,
  make: "Ford",
  model: "Taurus",
  VIN: "123ABC456"
}
{
  policyNum: 456,
  type: "life",
  customerId: "efg",
  payment: 240,
  policyValue: 125000,
  start: "Jan 1995",
  end: "Jan 2015"
}
{
  policyNum: 789,
  type: "home",
  customerId: "hij",
  payment: 650,
  deductible: 1000,
  floodCoverage: "No",
  street: "10 Maple Lane",
  city: "Springfield",
  state: "Maryland"
}
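A rough sketch of how an application might work with differently shaped policy documents kept together. The array `policies` is a hypothetical in-memory stand-in for a collection, and the documents are trimmed copies of the ones above.

```javascript
// Hypothetical in-memory stand-in for a "policies" collection.
const policies = [
  { policyNum: 123, type: "auto", customerId: "abc", payment: 899, deductible: 500 },
  { policyNum: 456, type: "life", customerId: "efg", payment: 240, policyValue: 125000 },
  { policyNum: 789, type: "home", customerId: "hij", payment: 650, deductible: 1000 }
];

// Fields shared by every policy can be processed uniformly...
const totalPayments = policies.reduce((sum, p) => sum + p.payment, 0);

// ...while type-specific fields are read only where they exist.
const withDeductible = policies
  .filter((p) => p.deductible !== undefined)
  .map((p) => p.policyNum);

console.log(totalPayments, withDeductible);
```

Each document carries only the fields that make sense for its type, so no column has to be left empty for the policies it does not apply to.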

15
Time Series
{
  _id: "20130310/resource/home.htm",
  metadata: {
    date: ISODate("2013-03-10T00:00:00Z"),
    site: "main-site",
    page: "home.htm",
    …
  },
  month: 3,
  total: 9120637,
  hourly: {
    0: 361012,
    1: 399034,
    …,
    23: 387010
  },
  "hour-minute": {
    0: { 0: 5678,
         1: 6745,
         2: 9212,
         …
         59: 6823
    },
    1: { 0: 8765,
         1: 8976,
         2: 8345,
         …
         59: 9812
    },
    …
    23: { 0: 7453,
          1: 7432,
          2: 7901,
          …
          59: 8764
    }
  }
}
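A minimal sketch of how counters in a pre-aggregated document like this could be maintained, written as plain JavaScript on an in-memory object. The function names `newDailyDoc` and `recordHit` and the field name `minute` (standing in for the slide's hour-minute sub-document) are assumptions. Against a real MongoDB collection the same effect would normally be achieved with the `$inc` update operator, which the slide does not show.

```javascript
// Build an empty per-page, per-day document shaped like the example.
// `minute` plays the role of the slide's "hour-minute" sub-document.
function newDailyDoc(day, page) {
  return { _id: `${day}/resource/${page}`, total: 0, hourly: {}, minute: {} };
}

// Record one page view: bump the daily total, the hour bucket and the
// minute bucket inside that hour.
function recordHit(doc, date) {
  const h = date.getUTCHours();
  const m = date.getUTCMinutes();
  doc.total += 1;
  doc.hourly[h] = (doc.hourly[h] || 0) + 1;
  doc.minute[h] = doc.minute[h] || {};
  doc.minute[h][m] = (doc.minute[h][m] || 0) + 1;
  return doc;
}

const daily = newDailyDoc("20130310", "home.htm");
recordHit(daily, new Date("2013-03-10T00:01:00Z"));
recordHit(daily, new Date("2013-03-10T00:01:30Z"));
recordHit(daily, new Date("2013-03-10T23:59:00Z"));
console.log(JSON.stringify(daily));
```

One document per page per day means a day's traffic for a page is read with a single lookup, at hour or minute granularity.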

16
Referencing vs Embedding
Embedding
{
  _id: 111,
  name: "Friso",
  beers: [
    { name: "SuperBock", comment: "AWESOME" },
    { name: "Bavaria", comment: "Boooohhhohoohoh" }
  ]
}
Referencing
{
  _id: 21,
  user_id: 111,
  name: "SuperBock",
  comment: "AWESOME"
}
{
  _id: 22,
  user_id: 111,
  name: "Bavaria",
  comment: "Boooohhhohoohoh"
}
{
  _id: 111,
  name: "Friso"
}
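A small sketch contrasting the two models from the application's point of view, using in-memory arrays in place of collections. The names `embeddedUsers`, `users`, `beers`, `beersEmbedded` and `beersReferenced` are assumptions for illustration.

```javascript
// Embedded model: one document holds the user and the beers.
const embeddedUsers = [
  { _id: 111, name: "Friso", beers: [
    { name: "SuperBock", comment: "AWESOME" },
    { name: "Bavaria", comment: "Boooohhhohoohoh" }
  ] }
];

// Referenced model: beers live in their own collection and point at the user.
const users = [{ _id: 111, name: "Friso" }];
const beers = [
  { _id: 21, user_id: 111, name: "SuperBock", comment: "AWESOME" },
  { _id: 22, user_id: 111, name: "Bavaria", comment: "Boooohhhohoohoh" }
];

// Embedded: a single lookup returns everything.
function beersEmbedded(userId) {
  const user = embeddedUsers.find((u) => u._id === userId);
  return user.beers.map((b) => b.name);
}

// Referenced: one lookup for the user, a second one for the beers.
function beersReferenced(userId) {
  const user = users.find((u) => u._id === userId);
  return beers.filter((b) => b.user_id === user._id).map((b) => b.name);
}

console.log(beersEmbedded(111), beersReferenced(111));
```

Both functions return the same names. The difference is the number of lookups, which against a real database translates into round trips.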

17
Referencing vs Embedding
Referencing
• Data grows in different ways
• Data is accessed through different access patterns and workflows
• Data has a different lifecycle
Embedding
• You want to retrieve all the information in one go (avoiding round trips to the database)
• You need atomic operations
• Data changes at the same rate and pace

19
Unbounded Arrays/Documents
db.profile.insert(doc0);
{_id: 1, selfies: [x0001]}
{_id: 2, selfies: [x0101]}
db.profile.update({_id: 1},
  {$push: {selfies: x0202}});

20
{_id: 1, selfies: [x0001, x0202]}
{_id: 2, selfies: [x0101]}
db.profile.update({_id: 1},
  {$push: {selfies: x0202}});

21
{_id: 1, selfies: [x0001, x0202]}
{_id: 2, selfies: [x0101]}
{_id: 3, selfies: [x0103…]}
{_id: 4, selfies: [x0104…]}
all_profiles.forEach(function (i) {
  db.profile.update({_id: i},
    {$push: {selfies: xXXX}});
});
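The slides show the problem but not a remedy. One common mitigation, sketched here as an assumption rather than as the speaker's recommendation, is to cap the array so that each push keeps only the newest entries. The helper `pushCapped` and its `maxLen` parameter are hypothetical and operate on a plain in-memory object. In MongoDB a comparable effect is available through `$push` with the `$each` and `$slice` modifiers.

```javascript
// Append a value but keep only the newest `maxLen` entries, so the array
// (and therefore the document) cannot grow without bound.
// `maxLen` is expected to be a positive integer.
function pushCapped(doc, field, value, maxLen) {
  const next = [...(doc[field] || []), value];
  doc[field] = next.slice(-maxLen);
  return doc;
}

const profile = { _id: 1, selfies: ["x0001"] };
for (const s of ["x0202", "x0303", "x0404"]) {
  pushCapped(profile, "selfies", s, 3);
}
console.log(profile.selfies);
```

After the fourth value arrives, the oldest one is dropped and the array stays at three elements.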

23
Overloaded Documents
{
  name: 'Norberto',
  role: 'Technical Evangelist',
  talks: [
    {
      title: 'Document Database Schema Design',
      description: 'This talk is a short introduction...',
      schedule: '12:10 - 12:25'
    },
    {
      title: 'Scalable Cluster in 15 minutes!',
      description: 'This talk is a quick introduction...',
      schedule: '14:50 - 15:05'
    }
  ],
  twitter: 'nleite',
  email: 'norberto@mongodb.com',
  bio: 'Norberto Leite is Technical Evangelist...',
  address: 'Calle Artistas, Madrid',
  supporter: { clube: 'FC Porto', description: 'Best Club in the WORLD' },
  conferences: ['GOTO', 'MongoDB World' ...],
  git_activity: [{type: 'pr', hook: '3142ji3423j342'}],
  selfies: [0x13423423423423, 0x13423434324234]
}

24
{
  name: 'Norberto',
  role: 'Technical Evangelist',
  talks: [
    {
      title: 'Document Database Schema Design',
      description: 'This talk is a short introduction...',
      schedule: '12:10 - 12:25'
    },
    {
      title: 'Scalable Cluster in 15 minutes!',
      description: 'This talk is a quick introduction...',
      schedule: '14:50 - 15:05'
    }
  ],
  twitter: 'nleite',
  email: 'norberto@mongodb.com',
  bio: 'Norberto Leite is Technical Evangelist...',
  ...
}
100% data usage

25
  ...
  address: 'Calle Artistas, Madrid',
  supporter: { clube: 'FC Porto',
               description: 'Best Club in the WORLD' },
  conferences: ['GOTO', 'MongoDB World' ...],
  git_activity: [{type: 'pr', hook: '3142ji3423j342'}],
  selfies: [0x13423423423423, 0x13423434324234]
  ...
}
0.1% data usage
?
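One way to act on this observation, sketched under assumptions, is to split the overloaded document into a frequently read part and a rarely read part that share an identifier. The helper `splitDocument`, the `hotFields` list and the `_id: 1` value are hypothetical, and the speaker document is trimmed for brevity.

```javascript
// Split one overloaded document into a frequently read "hot" part and a
// rarely read "cold" part. Both keep the same _id so they can be rejoined.
function splitDocument(doc, hotFields) {
  const hot = { _id: doc._id };
  const cold = { _id: doc._id };
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") continue;
    const target = hotFields.includes(key) ? hot : cold;
    target[key] = value;
  }
  return { hot, cold };
}

const speaker = {
  _id: 1,
  name: "Norberto",
  role: "Technical Evangelist",
  twitter: "nleite",
  address: "Calle Artistas, Madrid",
  conferences: ["GOTO", "MongoDB World"]
};

const { hot, cold } = splitDocument(speaker, ["name", "role", "twitter"]);
console.log(Object.keys(hot), Object.keys(cold));
```

The hot part can then be loaded on every request, while the cold part is fetched only by the few code paths that need it.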

26
Highly Nested Documents
{
  name: 'Some Dude',
  arguments: [
    {
      properties: [
        {
          fields: [
            {
              topics: {
                a: 1,
                ...
              }
            }
          ]
        }
      ]
    }
  ]
}
Please, don't go further than 5 levels!
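A small sketch of how nesting depth could be checked before a document is stored. The function `nestingDepth` and the constant `MAX_LEVELS` are hypothetical, and counting every enclosing object and array as one level is an assumption, since the slide does not say how levels are counted.

```javascript
// Depth of a value: scalars are 0, each enclosing object or array adds 1.
function nestingDepth(value) {
  if (value === null || typeof value !== "object") return 0;
  const children = Array.isArray(value) ? value : Object.values(value);
  return 1 + Math.max(0, ...children.map((child) => nestingDepth(child)));
}

const MAX_LEVELS = 5; // limit suggested on the slide

const flat = { name: "Some Dude", tags: ["a", "b"] };
const deep = { arguments: [{ properties: [{ fields: [{ topics: { a: 1 } }] }] }] };

console.log(nestingDepth(flat), nestingDepth(deep));
console.log(nestingDepth(deep) <= MAX_LEVELS);
```

Under this way of counting, the nested example exceeds the suggested limit while the flat document stays well within it.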

27
Collection over-Normalization

29
Final Notes
• Think about how you want your data to be used
• Don't be afraid of making mistakes
– It's normal (to normalize) and to make your first attempts with a relational mindset
• Make use of the schema's flexibility to adjust your schema design
• Talk to us if you need help!
