Jumpstart: Schema Design

Schema Design
Marc Schwering
Solutions Architect, MongoDB
marc@mongodb.com
@m4rcsch

All application
development is Schema
Design

Success comes from a
Proper Data Structure

Terminology
RDBMS       MongoDB
Database    Database
Table       Collection
Row         Document
Index       Index
Join        Embedding & Linking

What is a Document?
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { _id: "kchodorow", name: "Kristina Chodorow" },
    { _id: "mdirolf", name: "Mike Dirolf" }
  ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Traditional Schema Design
Focus on Data Storage

Document Schema Design
Focus on Data Usage

Traditional Schema Design
What answers do I have?

Document Schema Design
What questions do I have?

Library Management Application
• Patrons/Users
• Books
• Authors
• Publishers

Question:
What is a Patron’s Address?

A Patron and their Address
> patron = db.patrons.find({ _id : "joe" })
{
  _id: "joe",
  name: "Joe Bookreader"
}
> address = db.addresses.find({ _id : "joe" })
{
  _id: "joe",
  street: "123 Fake St.",
  city: "Faketon",
  state: "MA",
  zip: 12345
}

A Patron and their Address
{
  _id: "joe",
  name: "Joe Bookreader",
  address: {
    street: "123 Fake St.",
    city: "Faketon",
    state: "MA",
    zip: 12345
  }
}

One-to-One Relationships
• “Belongs to” relationships are often embedded.
• Holistic representation of entities with their embedded
attributes and relationships.
• Optimized for read performance

Question:
What are a Patron’s Addresses?

A Patron and their Addresses
> patron = db.patrons.find({ _id : "bob" })
{
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", … },
    { street: "52 Main St.", city: "Boston", … }
  ]
}

A Patron and their Addresses
> patrons = db.patrons.find()
{
  _id: "bob",
  name: "Bob Knowitall",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", … },
    { street: "52 Main St.", city: "Boston", … }
  ]
}
{
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", … }
}

Migration Possibilities
• Migrate all documents when the schema changes.
• Migrate On-Demand
– As we pull up a patron’s document, we make the change.
– Any patrons that never come into the library never get updated.
• Leave it alone
– As long as the application knows about both types…
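
The migrate-on-demand option could look roughly like the sketch below. It is plain JavaScript working on an in-memory document rather than a live collection; the helper name upgradePatron is hypothetical, and the application would still have to write the upgraded document back to the patrons collection.

```javascript
// Hypothetical helper: upgrade a patron document from the old shape
// (a single `address` sub-document) to the new shape (an `addresses` array).
function upgradePatron(patron) {
  // Already in the new shape: return it unchanged.
  if (Array.isArray(patron.addresses)) {
    return patron;
  }
  // Separate the old `address` field from every other field.
  const { address, ...rest } = patron;
  // Wrap the old single address, if there is one, in an array.
  return { ...rest, addresses: address ? [address] : [] };
}

// Old-style document, as shown on the earlier slide.
const joe = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 }
};

const upgradedJoe = upgradePatron(joe);
console.log(JSON.stringify(upgradedJoe));
```

Calling the helper every time a patron is loaded means patrons who never come into the library simply keep the old shape, which matches the on-demand bullet above.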

Question:
Who is the publisher of this
book?

Book
• MongoDB: The Definitive Guide,
• By Kristina Chodorow and Mike Dirolf
• Published: 9/24/2010
• Pages: 216
• Language: English
• Publisher: O’Reilly Media, CA

Book with embedded Publisher
> book = db.books.find({ _id : "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  pages: 216,
  language: "English",
  publisher: {
    name: "O’Reilly Media",
    founded: "1980",
    location: "CA"
  }
}

Book with embedded Publisher
• Optimized for read performance of Books
• Other queries become difficult
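
To see why other queries become difficult, consider listing every publisher when publishers exist only inside book documents. The sketch below is plain JavaScript over an in-memory array standing in for the books collection; the second book and the distinctPublishers helper are made up for illustration.

```javascript
// In-memory stand-in for the books collection, publisher embedded in each book.
const embeddedPublisherBooks = [
  { _id: "123", publisher: { name: "O'Reilly Media", founded: "1980", location: "CA" } },
  // Hypothetical second book from the same publisher.
  { _id: "456", publisher: { name: "O'Reilly Media", founded: "1980", location: "CA" } }
];

// Hypothetical helper: scan every book and de-duplicate publishers by name.
function distinctPublishers(allBooks) {
  const byName = new Map();
  for (const book of allBooks) {
    if (book.publisher && !byName.has(book.publisher.name)) {
      byName.set(book.publisher.name, book.publisher);
    }
  }
  return [...byName.values()];
}

const publisherList = distinctPublishers(embeddedPublisherBooks);
console.log(publisherList.map((p) => p.name));
```

Every book has to be visited to answer a question about publishers, and a publisher with no books cannot be represented at all.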

Question:
Who are all the publishers in the
system?

All Publishers
> publishers = db.publishers.find()
{
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}
{
  _id: "penguin",
  name: "Penguin",
  founded: "1983",
  location: "CA"
}

Book with linked Publisher
> book = db.books.findOne({ _id: "123" })
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  publisher_id: "oreilly",
  authors: [ "Kristina Chodorow", "Mike Dirolf" ],
  pages: 216,
  language: "English"
}
> db.publishers.find({ _id : book.publisher_id })
{
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA"
}

Question:
What are all the books a
publisher has published?

Publisher with linked Books
> publisher = db.publishers.findOne({ _id : "oreilly" })
{
  _id: "oreilly",
  name: "O’Reilly Media",
  founded: "1980",
  location: "CA",
  books: ["123", …]
}
> books = db.books.find({ _id: { $in : publisher.books } })
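
The two-step lookup above (fetch the publisher, then fetch its books by id) can be mimicked in plain JavaScript to make the flow concrete. This is only a sketch: the arrays stand in for collections, findByIds is a hypothetical helper that imitates a match on a list of _id values, and the book with _id "999" is invented.

```javascript
// In-memory stand-ins for the publishers and books collections.
const publisherDocs = [
  { _id: "oreilly", name: "O'Reilly Media", books: ["123"] }
];
const bookDocs = [
  { _id: "123", title: "MongoDB: The Definitive Guide" },
  { _id: "999", title: "Some Other Book" } // hypothetical, not linked to oreilly
];

// Hypothetical helper: return the documents whose _id is in the given list.
function findByIds(collection, ids) {
  const wanted = new Set(ids);
  return collection.filter((doc) => wanted.has(doc._id));
}

// Step 1: load the publisher. Step 2: resolve its list of book ids.
const oreilly = publisherDocs.find((p) => p._id === "oreilly");
const publishedBooks = findByIds(bookDocs, oreilly.books);
console.log(publishedBooks.map((b) => b.title));
```

The second step depends on the result of the first, so the application pays for two round trips where an embedded design would pay for one.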

Question:
Who are the authors of a given
book?

Books with linked Authors
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  pages: 216,
  language: "English",
  authors: ["kchodorow", "mdirolf"]
}
> authors = db.authors.find({ _id : { $in : book.authors } })
{ _id: "kchodorow", name: "Kristina Chodorow", hometown: … }
{ _id: "mdirolf", name: "Mike Dirolf", hometown: … }

Books with linked Authors
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  pages: 216,
  language: "English",
  authors: [
    { id: "kchodorow", name: "Kristina Chodorow" },
    { id: "mdirolf", name: "Mike Dirolf" }
  ]
}

Question:
What are all the books an author
has written?

Authors with linked Books
> authors = db.authors.find({ _id : "kchodorow" })
{
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [ { id: "123", title: "MongoDB: The Definitive Guide" } ]
}

Links on both Authors and Books
{
  _id: "kchodorow",
  name: "Kristina Chodorow",
  hometown: "Cincinnati",
  books: [ { id: "123", title: "MongoDB: The Definitive Guide" } ]
}
{
  _id: "123",
  title: "MongoDB: The Definitive Guide",
  authors: [
    { id: "kchodorow", name: "Kristina Chodorow" },
    { id: "mdirolf", name: "Mike Dirolf" }
  ]
}

Linking vs. Embedding
• Embedding
– Great for read performance
– Writes can be slow
– Data integrity needs to be managed
• Linking
– Flexible
– Data integrity is built-in
– Work is done during reads
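
One way to read this trade-off is to count how many documents a single change has to touch. The sketch below is plain JavaScript over in-memory arrays, not a MongoDB API; the second book, the new author name, and both helper functions are assumptions made for illustration.

```javascript
// Embedded shape: every book carries its own copy of the author's name.
const booksWithEmbeddedAuthors = [
  { _id: "123", authors: [{ id: "kchodorow", name: "Kristina Chodorow" },
                          { id: "mdirolf", name: "Mike Dirolf" }] },
  { _id: "456", authors: [{ id: "kchodorow", name: "Kristina Chodorow" }] } // hypothetical
];

// Linked shape: the name lives in exactly one author document.
const authorDocs = [
  { _id: "kchodorow", name: "Kristina Chodorow" },
  { _id: "mdirolf", name: "Mike Dirolf" }
];

// Rename an author in every embedded copy; returns how many books changed.
function renameEmbedded(books, authorId, newName) {
  let touched = 0;
  for (const book of books) {
    let changed = false;
    for (const author of book.authors) {
      if (author.id === authorId) {
        author.name = newName;
        changed = true;
      }
    }
    if (changed) touched += 1;
  }
  return touched;
}

// Rename an author in the single linked document; returns 1 or 0.
function renameLinked(authors, authorId, newName) {
  const author = authors.find((a) => a._id === authorId);
  if (!author) return 0;
  author.name = newName;
  return 1;
}

const embeddedWrites = renameEmbedded(booksWithEmbeddedAuthors, "kchodorow", "K. Chodorow");
const linkedWrites = renameLinked(authorDocs, "kchodorow", "K. Chodorow");
console.log(embeddedWrites, linkedWrites); // prints: 2 1
```

With embedding, the write fans out to every copy and the application must make sure none is missed; with linking, the write is a single document, but each read has to follow the link.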

Question:
What are all the books about
databases?

Categories as Documents
> book = db.books.find({ _id : "123" })
{
  _id: "123",
  category: "MongoDB"
}
> categories = db.categories.find({ _id: "MongoDB" })
{
  _id: "MongoDB",
  parent: "Databases"
}

Categories as an Array
{
  _id: "123",
  categories: ["MongoDB", "Databases", "Programming"]
}
> db.books.find({ categories: "Databases" })

Categories as a Path
{
  _id: "123",
  category: "Programming/Databases/MongoDB"
}
> db.books.find({ category: /^Programming\/Databases\// })
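
The path approach relies on an anchored prefix match. A small plain-JavaScript sketch of that idea follows; the array stands in for the books collection, the book with _id "456" is invented, and booksUnder is a hypothetical helper that builds the anchored pattern from a category path.

```javascript
// In-memory stand-in for the books collection.
const categorizedBooks = [
  { _id: "123", category: "Programming/Databases/MongoDB" },
  { _id: "456", category: "Programming/Languages/Go" } // hypothetical
];

// Hypothetical helper: find books whose category path starts with `path/`.
function booksUnder(books, path) {
  // Escape regex metacharacters so the path is matched literally.
  const escaped = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Anchor at the start and require a separator after the prefix.
  const prefix = new RegExp("^" + escaped + "/");
  return books.filter((book) => prefix.test(book.category));
}

console.log(booksUnder(categorizedBooks, "Programming/Databases").map((b) => b._id));
```

Anchoring at the start of the string is what makes this a prefix match: "Databases" on its own does not match, because no category path begins with it.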

Conclusion
• Schema design is different in MongoDB
• Basic data design principles stay the same
• Focus on how an application accesses/manipulates data
• Evolve the schema to meet requirements as they change
