Sharding in 20 minutes
Why; Who; When; Where
David Murphy, Mongo Master
Lead DBA, ObjectRocket
@dmurphy_data @objectrocket
Background
• 16 years in databases, development, and systems engineering
• Lead DBA @ ObjectRocket
• Mongo Master with a focus on sharding, chunks, and scaling MongoDB beyond normal means
What does a sharded cluster look like?
Why; Who; When; Where
i) Why do we shard?
ii) Who should I shard?
iii) When do I shard?
iv) Where (and what) is my shard key?
Why do we shard?
• Scaling out write locks (each shard holds its own locks, so writes contend less)
• Smaller dataset to search per node
• Getting more connections to the data
• More, smaller nodes instead of scaling up to expensive nodes
Who should I shard?
• Biggest collections by size
• Busiest collections by changes (writes)
• Groupings of data (examples):
    • State/Country
    • UserID
    • Company
    • Category
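One way to turn the first two bullets into a shortlist is to rank collections by data size and by write volume. The sketch below assumes you have already gathered per-collection numbers, for example from the `collStats` command plus your own write counters. The `CollectionStats` shape, the collection names, and every figure are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CollectionStats:
    # Hypothetical shape: fill these in from your own monitoring.
    name: str
    size_bytes: int        # data size of the collection
    writes_per_sec: float  # inserts + updates + deletes


def shard_candidates(stats, top_n=2):
    """Return names that rank in the top_n by size or by write rate."""
    by_size = sorted(stats, key=lambda s: s.size_bytes, reverse=True)[:top_n]
    by_writes = sorted(stats, key=lambda s: s.writes_per_sec, reverse=True)[:top_n]
    # Keep order (biggest first, then busiest) and drop repeats.
    ordered = []
    for s in by_size + by_writes:
        if s.name not in ordered:
            ordered.append(s.name)
    return ordered


# Invented sample numbers.
sample = [
    CollectionStats("events", 900 * 1024**3, 1200.0),
    CollectionStats("users", 40 * 1024**3, 300.0),
    CollectionStats("sessions", 5 * 1024**3, 4500.0),
    CollectionStats("settings", 10 * 1024**2, 0.1),
]

candidates = shard_candidates(sample)
print(candidates)  # ['events', 'users', 'sessions']
```

Note that `sessions` is small but makes the list anyway because it is the busiest by writes, which is the second bullet's point.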
When do I shard?
ALWAYS as early as possible!
Reasons:
• Not all commands work on a sharded cluster
• Future-proof: no recoding later
• Adding an index once you're live can take time you don't have!
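To make the index point concrete: a collection that already holds data needs an index starting with the shard key before it can be sharded, and building that index on a busy production system is the slow part. The sketch below only builds the command documents in order; it does not talk to a server. The names `app`, `orders`, and `customerId` are placeholders, and in practice each document would be passed to your driver's run-command call.

```python
def sharding_commands(db, coll, key_field, hashed=False):
    """Build the commands, in order, to shard db.coll on key_field.

    Nothing is sent anywhere; this only shows what has to happen.
    """
    key_spec = {key_field: "hashed" if hashed else 1}
    return [
        # 1. Index on the shard key. On a populated collection this must
        #    exist first, and building it is what costs time once live.
        {"createIndexes": coll,
         "indexes": [{"key": key_spec, "name": f"{key_field}_shardkey"}]},
        # 2. Allow collections in this database to be sharded.
        {"enableSharding": db},
        # 3. Shard the collection on the chosen key.
        {"shardCollection": f"{db}.{coll}", "key": key_spec},
    ]


commands = sharding_commands("app", "orders", "customerId", hashed=True)
for cmd in commands:
    print(cmd)
```

Doing these steps while the collection is still small or empty makes step 1 close to free, which is the argument for sharding early.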
Where (and what) is my shard key? 
You have to pick your own :/ 
But there are some quick hints…
Where (and what) is my shard key?
Sharding Quick Hints:
• Hashed shard keys
    Great for even disk usage
    Uses scatter-gathers == more connections
    Dates, increasing IDs, and text are great here
• Non-hashed keys
    Use profiling level 2 (db.setProfilingLevel(2)) & review ALL operations
    Only use fields whose values you won't change
    No dates
    No increasing ID numbers
    No text
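The hints above can be checked with a small simulation. It compares where ever-increasing IDs land under a ranged key and under a hashed key, and how many shards a range query touches in each case. This is a toy model under stated assumptions: the shard count and boundaries are invented, and md5 mapped onto equal hash ranges only stands in for MongoDB's hashed index, it is not the real implementation.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4


def ranged_shard(doc_id, boundaries):
    """Range sharding: first shard whose upper bound exceeds the id."""
    for shard, upper in enumerate(boundaries):
        if doc_id < upper:
            return shard
    return len(boundaries)  # everything above the last boundary


def hashed_shard(doc_id, num_shards=NUM_SHARDS):
    """Hashed sharding stand-in: split the 64-bit hash space into equal ranges."""
    digest = hashlib.md5(str(doc_id).encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    return (value * num_shards) >> 64


# Boundaries chosen back when only ids 0..3999 existed (1000 ids per shard).
boundaries = [1000, 2000, 3000]

# New traffic: increasing ids, all larger than anything seen before.
new_ids = range(4000, 14000)
ranged = Counter(ranged_shard(i, boundaries) for i in new_ids)
hashed = Counter(hashed_shard(i) for i in new_ids)

print("ranged inserts:", dict(ranged))  # all on the last shard
print("hashed inserts:", dict(sorted(hashed.items())))  # spread over all shards

# A range query over 100 consecutive ids.
ranged_hit = {ranged_shard(i, boundaries) for i in range(1200, 1300)}
hashed_hit = {hashed_shard(i) for i in range(1200, 1300)}
print("shards touched, ranged:", len(ranged_hit))
print("shards touched, hashed:", len(hashed_hit))
```

With the ranged key every new insert goes to one hot shard, which is why increasing IDs and dates are on the "no" list there. With the hashed key the inserts spread out, but the same range query has to ask every shard, which is the scatter-gather cost.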
Why Mongo sharding/balancing?
Modulus sharding with MySQL:
    Hard to rebalance online
    Requires application coding to support
Ring topologies like Cassandra:
    Can't change schema online
    Hard to rebalance online
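A rough illustration of the rebalancing point: with modulus sharding, adding one shard changes the home of most keys, while a chunk map only needs a few chunks moved. Everything below is a toy model. The key range, the chunk count, and the `moved_by_chunks` helper are invented for the comparison, and the helper is a simple stand-in for a balancer, not MongoDB's actual algorithm.

```python
def moved_by_modulus(keys, old_n, new_n):
    """Count keys whose shard changes when shard = key % n and n changes."""
    return sum(1 for k in keys if k % old_n != k % new_n)


def moved_by_chunks(chunk_owner, new_shard):
    """Add one shard to a chunk map and move chunks until evenly spread.

    chunk_owner maps chunk number -> shard number. Returns the chunks moved.
    """
    owners = dict(chunk_owner)
    shards = set(owners.values()) | {new_shard}
    target = len(owners) // len(shards)  # chunks per shard once balanced
    moved = []
    for chunk, shard in sorted(chunk_owner.items()):
        load = sum(1 for s in owners.values() if s == shard)
        new_load = sum(1 for s in owners.values() if s == new_shard)
        # Move a chunk only from an overloaded shard to the still-short one.
        if load > target and new_load < target:
            owners[chunk] = new_shard
            moved.append(chunk)
    return moved


keys = range(20_000)
modulus_fraction = moved_by_modulus(keys, 4, 5) / len(keys)

# 20 equal-sized chunks spread over shards 0..3, five chunks each.
chunks = {c: c % 4 for c in range(20)}
moved = moved_by_chunks(chunks, new_shard=4)
chunk_fraction = len(moved) / len(chunks)

print(f"modulus: {modulus_fraction:.0%} of keys move")  # 80%
print(f"chunks:  {chunk_fraction:.0%} of data moves")   # 20%
```

Going from four to five shards, the modulus scheme relocates 80% of the keys and the application has to know the new modulus, while the chunk map moves one chunk per existing shard.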
Further Reading 
Presentations: 
Kenny Gorman Sharding - bit.ly/1oXYDfm 
David Murphy - Adv Sharding for Operations - bit.ly/1oXYDfm 
Other Sharding MongoDB Links - bit.ly/ZTtDI1 
Picking a shard key (manual) - http://bit.ly/1ozuzMH
Choosing a shard key - http://slidesha.re/1nBnGtq
Contact 
@dmurphy_data 
@objectrocket 
david@objectrocket.com 
https://www.objectrocket.com
WE ARE HIRING! (DBA, DevOps, and more)
https://www.objectrocket.com/careers

Lightning Talk: What You Need to Know Before You Shard in 20 Minutes
