Schema Design Best Practices with Buzz Moschetti
Document Database Schema Design
Agenda
Schema
Documents
Document Schema Design
Patterns
Schema
First thing that comes to mind…
But there are other types of schema
Documents
What is a Document?
{
  name: 'Dutch Constitution',
  headline: 'The Present State of Holland',
  enforced_by: 'King and Parliament',
  date: '11 October 1848',
  labels: ['legal', 'society', 'rules'],
  freedoms: [
    { name: 'Speech',
      text: 'Any censorship is absolutely forbidden' },
    { name: 'Association',
      text: 'This right can be limited by formal law' }
  ]
}
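As a rough illustration of "hierarchical key-value pairs", the document above can be held as a plain JavaScript object and read without any joins. This is a minimal sketch: the values come from the slide (spelling normalized), and `freedomNames` is a hypothetical helper introduced only for this example.

```javascript
// The slide's document as a plain JavaScript object.
const constitution = {
  name: 'Dutch Constitution',
  date: '11 October 1848',
  labels: ['legal', 'society', 'rules'],
  freedoms: [
    { name: 'Speech', text: 'Any censorship is absolutely forbidden' },
    { name: 'Association', text: 'This right can be limited by formal law' }
  ]
};

// Hypothetical helper: list the names of the nested sub-documents.
function freedomNames(doc) {
  return doc.freedoms.map((freedom) => freedom.name);
}

console.log(freedomNames(constitution)); // [ 'Speech', 'Association' ]
console.log(constitution.labels.length); // 3
```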
Document Schema Design
The focus is "What I want to Build"
• We focus on how to use data
– Not on how to store it
• Use the flexibility of the schema to adjust to new features and iterate to deliver more features
– Do not let the schema restrict you when you need to add functionality
• Scale to accommodate your application's data needs
– Don't be afraid of being successful
• Full features out of the box
– Text search
– Geospatial, rich queries
– MapReduce and Aggregation Framework
Mind Set
[Figure: two diagrams side by side, labeled "Relational way" and "MongoDB way", each showing an Application]
Patterns
Discrete Documents
{
  policyNum: 123,
  type: "auto",
  customerId: "abc",
  payment: 899,
  deductible: 500,
  make: "Ford",
  model: "Taurus",
  VIN: "123ABC456"
}
{
  policyNum: 456,
  type: "life",
  customerId: "efg",
  payment: 240,
  policyValue: 125000,
  start: "Jan 1995",
  end: "Jan 2015"
}
{
  policyNum: 789,
  type: "home",
  customerId: "hij",
  payment: 650,
  deductible: 1000,
  floodCoverage: false,
  street: "10 Maple Lane",
  city: "Springfield",
  state: "Maryland"
}
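Because each policy has its own shape, application code typically reads the shared fields directly and treats the type-specific ones as optional. The following is a small sketch under that assumption; `describePolicy` is a hypothetical helper, and the sample objects are trimmed versions of the slide's documents.

```javascript
// Three policies with different shapes, as they might sit in one collection.
const policies = [
  { policyNum: 123, type: 'auto', payment: 899, deductible: 500 },
  { policyNum: 456, type: 'life', payment: 240, policyValue: 125000 },
  { policyNum: 789, type: 'home', payment: 650, deductible: 1000, floodCoverage: false }
];

// Hypothetical helper: shared fields are read directly, optional ones defensively.
function describePolicy(policy) {
  const base = `#${policy.policyNum} (${policy.type}) pays ${policy.payment}`;
  return 'deductible' in policy
    ? `${base}, deductible ${policy.deductible}`
    : `${base}, no deductible field`;
}

policies.forEach((policy) => console.log(describePolicy(policy)));
```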
Time Series
{
  _id: "20130310/resource/home.htm",
  metadata: {
    date: ISODate("2013-03-10T00:00:00Z"),
    site: "main-site",
    page: "home.htm",
    …
  },
  month: 3,
  total: 9120637,
  hourly: {
    0: 361012,
    1: 399034,
    …,
    23: 387010
  },
  "hour-minute": {
    0: { 0: 5678,
         1: 6745,
         2: 9212,
         …
         59: 6823
    },
    1: { 0: 8765,
         1: 8976,
         2: 8345,
         …
         59: 9812
    },
    …
    23: { 0: 7453,
          1: 7432,
          2: 7901,
          …
          59: 8764
    }
  }
}
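One way to keep such a pre-aggregated bucket current is a single in-place `$inc` per page hit. The sketch below only builds the filter and update documents, so it runs without a database. The field names follow the slide; the function name `hitUpdate`, the use of UTC, and the `$setOnInsert` of the metadata are assumptions made for this example.

```javascript
// Build the filter and update documents for one page hit.
function hitUpdate(site, page, when) {
  const day = when.toISOString().slice(0, 10).replace(/-/g, ''); // e.g. "20130310"
  const hour = when.getUTCHours();
  const minute = when.getUTCMinutes();
  return {
    filter: { _id: `${day}/resource/${page}` },
    update: {
      // Counters are bumped in place using dot notation.
      $inc: {
        total: 1,
        [`hourly.${hour}`]: 1,
        [`hour-minute.${hour}.${minute}`]: 1
      },
      // Metadata is written only when the bucket is first created.
      $setOnInsert: { 'metadata.site': site, 'metadata.page': page }
    }
  };
}

const spec = hitUpdate('main-site', 'home.htm', new Date('2013-03-10T14:05:00Z'));
console.log(JSON.stringify(spec, null, 2));
```

With a driver or shell, the pair could be passed to something like `collection.updateOne(spec.filter, spec.update, { upsert: true })`, so the day's bucket is created on the first hit and incremented afterwards.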
Referencing vs Embedding
Embedding
{
  _id: 111,
  name: "Friso",
  beers: [
    { name: "SuperBock", comment: "AWESOME" },
    { name: "Bavaria", comment: "Boooohhhohoohoh" }
  ]
}
Referencing
{
  _id: 111,
  name: "Friso"
}
{
  _id: 21,
  user_id: 111,
  name: "SuperBock",
  comment: "AWESOME"
}
{
  _id: 22,
  user_id: 111,
  name: "Bavaria",
  comment: "Boooohhhohoohoh"
}
Referencing vs Embedding
Referencing:
• Data grows in different ways
• Is accessed by different access patterns and workflows
• Has a different lifecycle
Embedding:
• Want to retrieve all info in one go (avoid round trips to database)
• Assure atomic operations
• Data changes at the same rate and at the same pace
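To make the two shapes concrete, the sketch below converts the embedded user from the earlier slide into the referencing layout. It works on in-memory objects only. `toReferenced` and the `nextId` generator are hypothetical names introduced for illustration, not part of any MongoDB API.

```javascript
// Embedded form, as on the slide.
const embeddedUser = {
  _id: 111,
  name: 'Friso',
  beers: [
    { name: 'SuperBock', comment: 'AWESOME' },
    { name: 'Bavaria', comment: 'Boooohhhohoohoh' }
  ]
};

// Hypothetical helper: split an embedded user into a user document plus
// one referencing document per beer. nextId is an assumed id generator.
function toReferenced(user, nextId) {
  const { beers, ...userDoc } = user;
  const beerDocs = beers.map((beer) => ({ _id: nextId(), user_id: user._id, ...beer }));
  return { userDoc, beerDocs };
}

let counter = 20;
const { userDoc, beerDocs } = toReferenced(embeddedUser, () => ++counter);
console.log(userDoc);                       // { _id: 111, name: 'Friso' }
console.log(beerDocs.map((beer) => beer._id)); // [ 21, 22 ]
```

Reading the embedded form takes one lookup. The referenced form needs a second lookup by `user_id`, but each beer comment can then grow and change independently of the user document.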
Anti-Patterns
Unbounded Arrays/Documents
db.profile.insert( doc0 );
{_id: 1, selfies: [x0001]}
db.profile.insert( doc2 );
{_id: 2, selfies: [x0101]}
db.profile.update({_id: 1},
  {$push: {selfies: x0202}});
Unbounded Arrays/Documents
db.profile.insert( doc0 );
{_id: 1, selfies: [x0001, x0202]}
db.profile.insert( doc2 );
{_id: 2, selfies: [x0101]}
db.profile.update({_id: 1},
  {$push: {selfies: x0202}});
Unbounded Arrays/Documents
db.profile.insert( doc0 );
{_id: 1, selfies: [x0001, x0202]}
db.profile.insert( doc2 );
{_id: 2, selfies: [x0101]}
for i in all_profiles:
  db.profile.update({_id: i},
    {$push: {selfies: xXXX}});
{_id: 3, selfies: [x0103…]}
{_id: 4, selfies: [x0104…]}
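The slides do not prescribe a fix, but one commonly used mitigation is to cap the array when pushing, using the `$each` and `$slice` modifiers of `$push`. The sketch below builds such an update document and simulates its effect on a plain array. The helper names and the cap values are assumptions for illustration, and `max` is expected to be a positive integer.

```javascript
// Build an update that appends one value but keeps only the newest `max` items.
function boundedPush(field, value, max) {
  return { $push: { [field]: { $each: [value], $slice: -max } } };
}

// Local simulation of the effect on an in-memory array (no database needed).
function applyBoundedPush(array, value, max) {
  return [...array, value].slice(-max);
}

console.log(JSON.stringify(boundedPush('selfies', 'x0202', 100)));
console.log(applyBoundedPush(['x0001', 'x0101'], 'x0202', 2)); // [ 'x0101', 'x0202' ]
```

Another option, in line with the referencing pattern from the earlier slides, would be to move the growing items into their own collection.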
Overloaded Documents
Overloaded Documents
{
  name: 'Norberto',
  role: 'Technical Evangelist',
  talks: [
    {
      title: 'Document Database Schema Design',
      description: 'This talk is a short introduction...',
      schedule: '12:10 - 12:25'
    },
    {
      title: 'Scalable Cluster in 15 minutes!',
      description: 'This talk is a quick introduction...',
      schedule: '14:50 - 15:05'
    }
  ],
  twitter: 'nleite',
  email: 'norberto@mongodb.com',
  bio: 'Norberto Leite is Technical Evangelist...',
  address: 'Calle Artistas, Madrid',
  supporter: { clube: 'FC Porto', description: 'Best Club in the WORLD' },
  conferences: ['GOTO', 'MongoDB World' ...],
  git_activity: [{type: 'pr', hook: '3142ji3423j342'}],
  selfies: [0x13423423423423, 0x13423434324234]
}
Overloaded Documents
{
  name: 'Norberto',
  role: 'Technical Evangelist',
  talks: [
    {
      title: 'Document Database Schema Design',
      description: 'This talk is a short introduction...',
      schedule: '12:10 - 12:25'
    },
    {
      title: 'Scalable Cluster in 15 minutes!',
      description: 'This talk is a quick introduction...',
      schedule: '14:50 - 15:05'
    }
  ],
  twitter: 'nleite',
  email: 'norberto@mongodb.com',
  bio: 'Norberto Leite is Technical Evangelist...',
  ...
}
100% data usage
Overloaded Documents
...
  address: 'Calle Artistas, Madrid',
  supporter: { clube: 'FC Porto',
               description: 'Best Club in the WORLD' },
  conferences: ['GOTO', 'MongoDB World' ...],
  git_activity: [{type: 'pr', hook: '3142ji3423j342'}],
  selfies: [0x13423423423423, 0x13423434324234]
...
}
0.1% data usage
?
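One possible response to an overloaded document is to keep the frequently read fields together and move the rarely read ones elsewhere. The sketch below shows that split on a plain object. `splitProfile` and the choice of "hot" fields are hypothetical and would depend on the page being served.

```javascript
// Hypothetical helper: separate frequently read fields from the rest.
function splitProfile(profile, hotFields) {
  const hot = {};
  const cold = {};
  for (const [key, value] of Object.entries(profile)) {
    (hotFields.includes(key) ? hot : cold)[key] = value;
  }
  return { hot, cold };
}

const profile = {
  name: 'Norberto',
  role: 'Technical Evangelist',
  twitter: 'nleite',
  address: 'Calle Artistas, Madrid',
  conferences: ['GOTO', 'MongoDB World']
};

const { hot, cold } = splitProfile(profile, ['name', 'role', 'twitter']);
console.log(Object.keys(hot));  // [ 'name', 'role', 'twitter' ]
console.log(Object.keys(cold)); // [ 'address', 'conferences' ]
```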
Highly Nested Documents
{
  name: 'Some Dude',
  arguments: [
    {
      properties: [
        {
          fields: [
            {
              topics: {
                a: 1,
                ...
              }
            }
          ]
        }
      ]
    }
  ]
}
Please, don't go further than 5 levels!
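A quick way to check a document against this rule of thumb is to measure its nesting depth. The sketch below counts only sub-documents (objects) as levels and treats arrays as transparent. That counting rule is an assumption for this example, and `nestingDepth` is a hypothetical helper.

```javascript
// Depth of a value: scalars are 0, each object adds one level,
// arrays pass through the depth of their deepest element.
function nestingDepth(value) {
  if (value === null || typeof value !== 'object') return 0;
  const childDepth = Math.max(0, ...Object.values(value).map((child) => nestingDepth(child)));
  return Array.isArray(value) ? childDepth : 1 + childDepth;
}

const nested = {
  name: 'Some Dude',
  arguments: [{ properties: [{ fields: [{ topics: { a: 1 } }] }] }]
};

console.log(nestingDepth(nested));            // 5
console.log(nestingDepth({ name: 'flat' })); // 1
```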
Collection over-Normalization
Is it Easy?
Final Notes
• Think about how you want your data to be used
• Don't be afraid of making mistakes
– It's normal (to normalize) and to make the first attempts with a relational mindset in place
• Make use of the flexibility of the schema to adjust your schema design
• Talk to us if you need help!
Editor's Notes

  • #4: In this section we will be discussing the main features of Java 8: lambda functions and the new Date API.
  • #6: If you are an experienced software developer you've seen this picture over and over again. This is the typical ERD that we produce so we can visualize our table spec.
  • #9: When we say documents we are not referring to PDF files or Microsoft Word documents. A document in MongoDB is a data structure composed of hierarchical key-value pairs that form a piece of information. These can be Python dictionaries, PHP arrays, Ruby hashes, JSON objects, etc. If one wants to, say, define Holland's constitution, we could use this JSON example.
  • #11: In MongoDB we focus on the application's purpose. We are not restricted by the need to define ahead of time how our data is going to be persisted on disk. We change the line of thought to: What do we want to build? How is our app going to use information? What kind of operations do our users demand from our system?
  • #12: In a pure relational system the application is built from the database schema upwards, while in MongoDB we focus on how the data is going to be used and then we persist it. The application models our data.
  • #15: There are different modeling patterns. We can have discrete document usage: every different object and element handled by the application has its own discrete representation in the database. Because the schema is flexible, we can adjust not only in terms of data variability but also in terms of change management. Applications change faster and require new data. When we grow to a certain size we might have problems iterating because of migrations and data management processes; using MongoDB trims these down.
  • #16: In other scenarios we need data to be stored in a particular fashion to suit the type of data usage the application requires. In time series use cases, discrete documents are not very efficient, because the database grows with every single input and constant aggregations are needed to produce the plots and graphs that the visualization requires. For these use cases we tend to use a data format that reflects the application's operations: per unit of time, we keep track of the metrics that occur in the system; we bucket that information in a pre-aggregated format so it is easier and faster to retrieve and update; and we use in-place updates to make all writes extremely fast.
  • #17: There is also the constant tension between embedding and referencing, that is, denormalizing vs normalizing data structures. In MongoDB the advantage is that moving from one data format to another is quite simple and has very low impedance. Make sure you understand the tradeoffs: faster access to data vs round trips to the database, and a single point of data updates with referencing vs cascading updates for all documents that contain duplicated data.
  • #20: We start with a small set of data in each array. Then we decide to push more info into those arrays. No problem: we use the $push operator.
  • #21: Occasionally moving documents is not a problem. Moving all documents all the time has considerable performance implications.
  • #22: Although MongoDB will try to occupy the $freelist space efficiently, if we keep repeating this iteration, all documents will be moving constantly. Apart from slower operations, this also causes a large amount of fragmentation in both your indexes and your collection structures.
  • #23: Let's take for example this page that describes a conference's speakers.
  • #24: If we want a complete profile of the user in one single document, we can achieve that easily, with all of its information: name, profile, bio, description of the talk, etc.
  • #25: But in reality what one actually uses are just a few fields of that profile. For that given page, these correspond to 100% of all data needs.
  • #26: The rest will probably be accessed 0.1% of the time. Should we keep all of it and access it all in one go? With all the required indexes? With all the burden of fetching those extra pages that are not going to be consumed in the same requests? Probably not.
  • #27: Five levels of document nesting is not that bad, but I would not go further than that: hard to maintain, gigantic queries, and the potential to hit limits on index entries and shard key sizes.