Schema Design
Software Engineer, MongoDB
Craig Wilson
#MongoDBDays
@craiggwilson
All application development is
Schema Design
Success comes from a
Proper Data Structure
Terminology
RDBMS ➜ MongoDB
Database ➜ Database
Table ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedding & Linking
Working with Documents
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}
What is a Document?
Traditional Schema Design
Focus on Data Storage
Document Schema Design
Focus on Data Usage
Traditional Schema Design
What answers do I have?
Document Schema Design
What questions do I
have?
Schema Design By Example
Library Management Application
• Patrons/Users
• Books
• Authors
• Publishers
Question:
What is a Patron’s
Address?
> patron = db.patrons.findOne({ _id: "joe" })
{
  _id: "joe",
  name: "Joe Bookreader"
}
> address = db.addresses.findOne({ _id: "joe" })
{
  _id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}
A Patron and their Address
> patron = db.patrons.findOne({ _id: "joe" })
{
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}
A Patron and their Address
One-to-One Relationships
• “Belongs to” relationships are often embedded.
• Holistic representation of entities with their
embedded attributes and relationships.
• Optimized for read performance
Question:
What are a Patron’s
Addresses?
> patron = db.patrons.findOne({ _id: "bob" })
{
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", … },
    { street: "52 Main St.", city: "Boston", … }
  ]
}
A Patron and their Addresses
> patron = db.patrons.findOne({ _id: "bob" })
{
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", … },
    { street: "52 Main St.", city: "Boston", … }
  ]
}
> patron = db.patrons.findOne({ _id: "joe" })
{
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", … }
}
A Patron and their Addresses
Migration Possibilities
• Migrate all documents when the schema changes.
• Migrate On-Demand
– As we pull up a patron’s document, we make the change.
– Any patrons that never come into the library never get
updated.
• Leave it alone
– As long as the application knows about both types…
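The on-demand option can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `migratePatron` is a hypothetical helper name, it works on in-memory objects rather than a live collection, and it assumes the old shape is a single `address` field and the new shape is an `addresses` array, as on the earlier slides.

```javascript
// Hypothetical helper: upgrade a patron document from the old single-`address`
// shape to the newer `addresses` array shape. Pure function; no database calls.
function migratePatron(patron) {
  // Already in the new shape: nothing to do.
  if (Array.isArray(patron.addresses)) {
    return patron;
  }
  // Copy every field except the old `address`, then wrap it in an array.
  const { address, ...rest } = patron;
  return { ...rest, addresses: address ? [address] : [] };
}

// Sample documents modeled on the slides (old shape and new shape).
const joe = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
};
const bob = {
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [{ street: "1 Vernon St.", city: "Newton" }],
};

const migratedJoe = migratePatron(joe);
const migratedBob = migratePatron(bob);

console.log(migratedJoe.addresses.length); // 1
console.log("address" in migratedJoe);     // false
console.log(migratedBob === bob);          // true (already migrated, returned as-is)
```

In an application, a helper like this would presumably run when a patron document is loaded, with the upgraded document saved back afterwards. Patrons that are never loaded keep the old shape, which matches the "never get updated" caveat above.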
Question:
Who is the publisher of
this book?
Book
MongoDB: The Definitive Guide,
By Kristina Chodorow and Mike Dirolf
Published: 9/24/2010
Pages: 216
Language: English
Publisher: O’Reilly Media, CA
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}
Book with embedded Publisher
Book with embedded Publisher
• Optimized for read performance of Books
• Other queries become difficult
Question:
Who are all the
publishers in the
system?
> publishers = db.publishers.find()
{
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}
{
  _id: "penguin",
  name: "Penguin",
  founded: "1983",
  location: "CA"
}
All Publishers
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  publisher_id: "oreilly",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}
> db.publishers.findOne({ _id: book.publisher_id })
{
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}
Book with linked Publisher
Question:
What are all the books a
publisher has
published?
> publisher = db.publishers.findOne({ _id: "oreilly" })
{
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA",
  books: ["123", …]
}
> books = db.books.find({ _id: { $in: publisher.books } })
Publisher with linked Books
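The two directions a publisher/book link can take may be easier to compare side by side. The sketch below uses plain in-memory arrays as stand-ins for the collections. The second book and the `filter` calls are assumptions for illustration, not MongoDB API calls: they play the role of an `$in` query on `_id` and an equality query on `publisher_id`.

```javascript
// In-memory stand-ins for the publishers and books collections.
const publisher = { _id: "oreilly", name: "O'Reilly Media", books: ["123"] };
const books = [
  { _id: "123", publisher_id: "oreilly", title: "MongoDB: The Definitive Guide" },
  { _id: "456", publisher_id: "penguin", title: "Hypothetical Other Book" }, // made-up entry
];

// Link stored on the publisher: plays the role of { _id: { $in: publisher.books } }.
const linkedIds = new Set(publisher.books);
const viaPublisherArray = books.filter((book) => linkedIds.has(book._id));

// Link stored on each book: plays the role of { publisher_id: publisher._id }.
const viaBookField = books.filter((book) => book.publisher_id === publisher._id);

console.log(viaPublisherArray.map((book) => book._id)); // [ '123' ]
console.log(viaBookField.map((book) => book._id));      // [ '123' ]
```

Both approaches return the same books. The difference is where the link lives: the `books` array on the publisher grows with every title published, while a `publisher_id` on each book keeps the publisher document at a fixed size.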
Question:
Who are the authors of a
given book?
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  authors: ["kchodorow", "mdirolf"]
}
> authors = db.authors.find({ _id: { $in: book.authors } })
{ _id: "kchodorow", name: "Kristina Chodorow", hometown: … }
{ _id: "mdirolf", name: "Mike Dirolf", hometown: … }
Books with linked Authors
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  authors: [
    { id: "kchodorow", name: "Kristina Chodorow" },
    { id: "mdirolf", name: "Mike Dirolf" }
  ]
}
Books with linked Authors
Question:
What are all the books
an author has written?
> author = db.authors.findOne({ _id: "kchodorow" })
{
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [ { id: "123", title: "MongoDB: The Definitive Guide" } ]
}
Authors with linked Books
> author = db.authors.findOne({ _id: "kchodorow" })
{
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [ { id: "123", title: "MongoDB: The Definitive Guide" } ]
}
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { id: "kchodorow", name: "Kristina Chodorow" },
    { id: "mdirolf", name: "Mike Dirolf" }
  ]
}
Links on both Authors and Books
Linking vs. Embedding
• Embedding
– Great for read performance
– Writes can be slow
– Data integrity needs to be managed
• Linking
– Flexible
– Data integrity is built-in
– Work is done during reads
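To make the "data integrity needs to be managed" point concrete, here is a small sketch that counts how many documents a change touches in each layout. It uses in-memory arrays only. The second book, the new author name, and the helper names `renameEmbedded` and `renameLinked` are assumptions made for illustration.

```javascript
// Embedded layout: each book carries its own copy of the author's name.
const embeddedBooks = [
  {
    _id: "123",
    authors: [
      { id: "kchodorow", name: "Kristina Chodorow" },
      { id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
  { _id: "456", authors: [{ id: "kchodorow", name: "Kristina Chodorow" }] }, // made-up book
];

// Linked layout: the name lives once, in the author's own document.
const authors = [
  { _id: "kchodorow", name: "Kristina Chodorow" },
  { _id: "mdirolf", name: "Mike Dirolf" },
];

// Rename an author in every embedded copy; returns the number of books changed.
function renameEmbedded(books, authorId, newName) {
  let touched = 0;
  for (const book of books) {
    let changed = false;
    for (const author of book.authors) {
      if (author.id === authorId) {
        author.name = newName;
        changed = true;
      }
    }
    if (changed) touched += 1;
  }
  return touched;
}

// Rename an author in the linked layout; returns the number of documents changed.
function renameLinked(authorDocs, authorId, newName) {
  const author = authorDocs.find((doc) => doc._id === authorId);
  if (!author) return 0;
  author.name = newName;
  return 1;
}

const touchedEmbedded = renameEmbedded(embeddedBooks, "kchodorow", "K. Chodorow");
const touchedLinked = renameLinked(authors, "kchodorow", "K. Chodorow");

console.log(touchedEmbedded); // 2
console.log(touchedLinked);   // 1
```

With embedding, every copy has to be found and rewritten, and a missed copy leaves the data inconsistent. With linking there is a single write, but each read of a book then needs a second lookup to get the author's name.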
Question:
What are all the books
about databases?
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  category: "MongoDB"
}
> category = db.categories.findOne({ _id: "MongoDB" })
{
  _id: "MongoDB",
  parent: "Databases"
}
Categories as Documents
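With categories stored as separate documents, answering "all the books about databases" means walking the category tree first. A rough sketch of that walk follows, using in-memory arrays in place of the `categories` and `books` collections. The `descendantsOf` helper, the "Programming" root, and the cookbook entry are assumptions for illustration; against a real database, each level of the walk would be another query.

```javascript
// In-memory stand-ins for the categories and books collections.
const categories = [
  { _id: "Programming", parent: null },
  { _id: "Databases", parent: "Programming" },
  { _id: "MongoDB", parent: "Databases" },
];
const books = [
  { _id: "123", title: "MongoDB: The Definitive Guide", category: "MongoDB" },
  { _id: "456", title: "Hypothetical Cookbook", category: "Cooking" }, // made-up entry
];

// Collect a category and everything beneath it, level by level.
function descendantsOf(root, categoryDocs) {
  const found = new Set([root]);
  const queue = [root];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const category of categoryDocs) {
      if (category.parent === current && !found.has(category._id)) {
        found.add(category._id);
        queue.push(category._id);
      }
    }
  }
  return found;
}

const databaseCategories = descendantsOf("Databases", categories);
const databaseBooks = books.filter((book) => databaseCategories.has(book.category));

console.log([...databaseCategories]);              // [ 'Databases', 'MongoDB' ]
console.log(databaseBooks.map((book) => book._id)); // [ '123' ]
```

The extra traversal is the price of this layout: changing the hierarchy is cheap (one `parent` field), but every "books under X" question pays for the walk.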
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  categories: ["MongoDB", "Databases", "Programming"]
}
> db.books.find({ categories: "Databases" })
Categories as an Array
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  category: "Programming/Databases/MongoDB"
}
> db.books.find({ category: /^Programming\/Databases/ })
Categories as a Path
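The anchored-prefix match on a category path can be tried out in plain JavaScript. This is a sketch over an in-memory array, not a query against a collection. The second and third books are made-up entries added to show what the `^` anchor excludes.

```javascript
// In-memory stand-in for the books collection, with categories stored as paths.
const books = [
  { _id: "123", category: "Programming/Databases/MongoDB" },
  { _id: "456", category: "Programming/Languages/JavaScript" },  // made-up entry
  { _id: "789", category: "History/Programming/Databases/Old" }, // made-up entry
];

// Anchored at the start of the string, so only paths that begin with
// "Programming/Databases" match; the same text deeper in a path does not.
const underDatabases = /^Programming\/Databases/;

const matches = books
  .filter((book) => underDatabases.test(book.category))
  .map((book) => book._id);

console.log(matches); // [ '123' ]
```

Book "789" contains the same text but not at the start of its path, so the anchor rules it out. That anchoring is also what allows a prefix match like this to use an index on the path field.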
Conclusion
• Schema design is different in MongoDB
• Basic data design principles stay the same
• Focus on how an application accesses/manipulates
data
• Evolve the schema to meet requirements as they
change
Schema Design
Software Engineer, 10gen
Craig Wilson
#MongoDBDays
@craiggwilson


Editor's Notes

  • #3: Schema Design is very important; its impact on your application is pervasive. We call the “dynamic” nature of a schema in MongoDB an “Application Defined Schema”.
  • #4: Wrong data structure will hurt you. Proper data structure can make all the pieces fall into place.
  • #7: A document is JSON. A value can be an integer, string, document, array, array of documents, etc…
  • #8: Focus on the way we store our data, neglecting the way we use it.
  • #9: Focus on how we use our data, neglecting (sort-of) how we store it.
  • #10: Has all the answers, but none can be given in an optimal way. Has zero knowledge of your application’s known queries, use cases, or client-side data structures.
  • #11: Has all the answers, but also knows what questions are going to be asked. Takes advantage of known queries, use cases, and client-side data structures.
  • #14: Imagine a patron walks up to the counter and presents his/her library card to check out some books. The first thing a librarian might want to do is confirm the patron’s address so as to have a place to send the library police when the book isn’t returned in a timely manner.
  • #15: This is entirely doable, and might be advantageous in a number of other use cases. But since we want to look up the patron and their address at the same time, this is inefficient because it requires 2 queries.
  • #16: Embedded directly into the patron document. Only 1 query is necessary. Holistic view of a patron.
  • #17: Read performance is optimized because we only need a single query and a single disk/memory hit. Write performance change is negligible.
  • #18: Business Requirements Change! A librarian wants to know all the places his/her book might be hiding out, and more addresses for a patron means more places to look.
  • #19: Now, just store addresses as an array. Embedded directly into the patron document. Only 1 query is necessary. Holistic view of a patron.
  • #20: Schema isn’t rigid, but dynamic. An application defines the schema, and having two ways to represent addresses is entirely possible.
  • #24: Duplicate publisher in every book that the publisher has published. Data duplication is OK because the publisher is immutable.
  • #25: Best way to figure out how something is going to perform is to measure.
  • #28: Still have the previous question, who is the publisher of this book? Takes 2 queries. Same problems that exist in traditional systems. Foreign keys, while keeping data integrity, tend to erase history.
  • #30: Unbounded arrays are BAD!
  • #33: Take advantage of data that’s immutable. Duplicate data is OK.
  • #39: Recursive search to find all books about databases.
  • #40: When a category hierarchy gets changed, all documents will need to be re-categorized. If one category name exists in multiple hierarchies, then further refinement would need to happen. Uses a multi-key index.
  • #41: When a category hierarchy gets changed, all documents will need to be re-categorized. Uses an index because of the anchored regular expression.