1. Sharding Internals
  • Eliot Horowitz (@eliothorowitz)
  • MongoBerlin, October 4, 2010
2. MongoDB Sharding
  • Scale horizontally for data size, index size, write and consistent read scaling
  • Distribute databases, collections, or objects in a collection
  • Auto-balancing, migrations, and management happen with no downtime
3. Choose how you partition data
  • Can convert from single master to sharded system with no downtime
  • Same features as non-sharded single master
  • Fully consistent
4. Range Based
  • Collection is broken into chunks by range
  • Chunks default to 200 MB or 100,000 objects
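
To make range-based chunking concrete, here is a minimal sketch of looking up the chunk that owns a shard key value. The chunk table, the shard names, and the `findChunk` helper are illustrative assumptions, not MongoDB's internal data structures.

```javascript
// Hypothetical chunk table for a collection sharded on a numeric key.
// Each chunk covers the half-open range [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard2" },
];

// Binary search over the sorted, non-overlapping chunk ranges.
function findChunk(chunkTable, key) {
  let lo = 0;
  let hi = chunkTable.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const c = chunkTable[mid];
    if (key < c.min) hi = mid - 1;
    else if (key >= c.max) lo = mid + 1;
    else return c;
  }
  throw new Error("no chunk covers key " + key);
}

console.log(findChunk(chunks, 150).shard); // prints "shard1"
```

Because the ranges are contiguous and sorted, a lookup needs only the chunk boundaries, which is why a router can decide where a keyed read or write goes without contacting every shard.
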
5. User profiles
  • Partition by user_id
  • Secondary indexes on location, dates, etc.
  • Reads/writes know which shard to hit
6. User Activity Stream
  • Shard by user_id
  • Loading a user’s stream hits a single shard
  • Writes are distributed across all shards
  • Can index on activity for deleting
7. Photos
  • Can shard by photo_id for best read/write distribution
  • Secondary index on tags, date
8. Logging
  • Possible shard keys: date; machine, date; logger name
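
The editor's note "don’t shard by date" can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the split points, machine names, and the string encoding of the compound (machine, date) key are all hypothetical, chosen only to show where new writes land.

```javascript
// Index of the chunk that owns `key`, given ascending split points.
function chunkIndex(splitPoints, key) {
  let i = 0;
  while (i < splitPoints.length && key >= splitPoints[i]) i++;
  return i;
}

// Hypothetical incoming log lines: three machines, increasing timestamps.
const logs = [];
for (let ts = 3000; ts < 3006; ts++) {
  for (const machine of ["web05", "web15", "web25"]) logs.push({ machine, ts });
}

// Key on date alone: every new timestamp sorts after the last split point,
// so all current writes land in the final chunk.
const byDate = new Set(logs.map((l) => chunkIndex([1000, 2000], l.ts)));

// Key on (machine, date), encoded here as a string for simplicity:
// the leading machine name spreads current writes across chunks.
const byMachineDate = new Set(
  logs.map((l) => chunkIndex(["web10", "web20"], l.machine + "|" + l.ts))
);

console.log(byDate.size);        // prints 1
console.log(byMachineDate.size); // prints 3
```

With the date-only key, one chunk (and therefore one shard) absorbs every insert, while the compound key keeps each machine's entries ordered by date and still distributes the write load.
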
9. Architecture
  • [Diagram: clients connect to one or more mongos routers, which sit in front of the shards (groups of mongod processes) and three config-server mongod processes]
10. Config Servers
  • 3 of them
  • Changes are made with 2-phase commit
  • If any are down, metadata goes read-only
  • System is online as long as 1 of 3 is up
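
The availability rules for config servers can be written down as a tiny function. This is only a sketch of the two rules stated on the slide; `configStatus` and its return fields are made-up names, not part of any MongoDB API.

```javascript
// Given one up/down flag per config server, report what the cluster can do.
function configStatus(up) {
  const alive = up.filter(Boolean).length;
  return {
    // The system stays online while at least one config server is reachable.
    online: alive >= 1,
    // Metadata changes (splits, migrations) need every config server,
    // because they are applied with a two-phase commit across all of them.
    metadataWritable: alive === up.length,
  };
}

console.log(configStatus([true, true, true]));   // online, metadata writable
console.log(configStatus([true, false, true]));  // online, metadata read-only
console.log(configStatus([false, false, false])); // offline
```

The practical consequence is that losing one config server does not stop reads and writes to existing chunks, but it does pause splitting and balancing until that server returns.
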
11. Shards
  • Can be master, master/slave, or replica sets
  • Replica sets give sharding + full auto-failover
  • Regular mongod processes
12. mongos
  • Sharding router
  • Acts just like a mongod to clients
  • Can have 1 or as many as you want
  • Can run on the app server, so no extra network traffic
13. Writes
  • Inserts: require shard key, routed
  • Removes: routed and/or scattered
  • Updates: routed or scattered
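
One way to read the write rules is as a decision on whether the operation carries the shard key. The sketch below assumes a collection sharded on `user_id` and treats "has the shard key field" as a stand-in for "can be targeted"; `routeWrite` is a hypothetical helper, not how mongos is actually implemented.

```javascript
// Assumed shard key for the collection in this example.
const SHARD_KEY = "user_id";

// `target` is the document for an insert, or the filter for an update/remove.
function routeWrite(op, target) {
  const hasKey = Object.prototype.hasOwnProperty.call(target, SHARD_KEY);
  if (op === "insert") {
    // An insert without the shard key cannot be placed in any chunk.
    if (!hasKey) throw new Error("insert requires the shard key: " + SHARD_KEY);
    return "routed";
  }
  // Updates and removes go to one shard when the filter names the shard key,
  // otherwise they are sent to every shard.
  return hasKey ? "routed" : "scattered";
}

console.log(routeWrite("insert", { user_id: 42, name: "ann" })); // prints "routed"
console.log(routeWrite("update", { user_id: 42 }));              // prints "routed"
console.log(routeWrite("remove", { location: "Berlin" }));       // prints "scattered"
```
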
14. Queries
  • By shard key: routed
  • Sorted by shard key: routed in order
  • By non-shard key: scatter-gather
  • Sorted by non-shard key: distributed merge sort
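
The "distributed merge sort" case can be sketched as follows: each shard sorts its own results, and the router merges the already-sorted streams. The `mergeSorted` function and the sample data are illustrative assumptions; arrays stand in for the cursors a real router would read from.

```javascript
// Merge per-shard result arrays, each already sorted ascending by `field`,
// into one sorted array by repeatedly taking the smallest head element.
function mergeSorted(shardResults, field) {
  const positions = shardResults.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      const p = positions[i];
      if (p >= shardResults[i].length) continue; // this shard is exhausted
      if (
        best === -1 ||
        shardResults[i][p][field] < shardResults[best][positions[best]][field]
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard is exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

// Hypothetical sorted results from three shards (the third returned nothing).
const perShard = [
  [{ ts: 1 }, { ts: 4 }],
  [{ ts: 2 }, { ts: 3 }],
  [],
];
console.log(mergeSorted(perShard, "ts").map((d) => d.ts)); // prints [ 1, 2, 3, 4 ]
```

The router never re-sorts the full result set; it only compares the current head of each shard's stream, so the expensive sorting work stays on the shards.
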
15. Operations
  • split: breaking a chunk into 2
  • migrate: moving a chunk from 1 shard to another
  • balancing: moving chunks automatically to keep the system in balance
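
The three operations can be modeled on a plain chunk table. This is a deliberately simplified sketch: the chunk shape, the helper names, and the "even out chunk counts" policy are assumptions for illustration, and the real balancer's thresholds and locking are not shown.

```javascript
// split: break a chunk [min, max) into two chunks at a middle key.
function split(chunk, middle) {
  if (!(chunk.min < middle && middle < chunk.max)) {
    throw new Error("split point must fall inside the chunk");
  }
  return [
    { min: chunk.min, max: middle, shard: chunk.shard },
    { min: middle, max: chunk.max, shard: chunk.shard },
  ];
}

// migrate: same key range, new owning shard.
function migrate(chunk, toShard) {
  return { ...chunk, shard: toShard };
}

// balance: move chunks from the fullest shard to the emptiest one until
// chunk counts differ by at most one. Assumes every chunk's shard is
// listed in `shards`.
function balance(chunks, shards) {
  const result = chunks.slice();
  for (;;) {
    const counts = new Map(shards.map((s) => [s, 0]));
    for (const c of result) counts.set(c.shard, counts.get(c.shard) + 1);
    const sorted = [...counts.entries()].sort((a, b) => a[1] - b[1]);
    const [least, leastCount] = sorted[0];
    const [most, mostCount] = sorted[sorted.length - 1];
    if (mostCount - leastCount <= 1) return result;
    const idx = result.findIndex((c) => c.shard === most);
    result[idx] = migrate(result[idx], least);
  }
}

// Start with one chunk on shard "a", split it twice, then balance over two shards.
const [left, right] = split({ min: 0, max: 100, shard: "a" }, 50);
const table = [...split(left, 25), ...split(right, 75)];
console.log(balance(table, ["a", "b"]).map((c) => c.shard));
```

Splits only change metadata (the ranges), while migrations move ownership of a range; balancing is just a loop that decides which migrations to run.
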
16. Setting it Up
  • Start servers
  • Add shards: db.runCommand( { addshard : "10.1.1.5" } )
  • Turn on partitioning: db.runCommand( { enablesharding : "test" } )
  • Shard a collection: db.runCommand( { shardcollection : "test.data" , key : { num : 1 } } )
17. Download MongoDB
  • http://www.mongodb.org
  • Let us know what you think: @eliothorowitz, @mongodb
  • 10gen is hiring! http://www.10gen.com/jobs

Editor's Notes

  • #3: for inconsistent read scaling
  • #7: don’t shard by date