Tuesday, December 4, 12
Hi! My name is Charity Majors, and I am a systems engineer at Parse.

Parse is a platform for mobile developers.

You can use our APIs to build apps for iOS, Android, and Windows phones. We take care of all of the provisioning and scaling for backend services, so you can focus on building your app
and user experience.
Replica sets

                     • Always use replica sets
                     • Distribute across Availability Zones
                     • Avoid situations where you have even # voters
                     • More voters are better than fewer



First, the basics.

* Always run with replica sets. Never run with a single node, unless you really hate your data. And always distribute your replica set members across
as many different availability zones as possible. If you have three nodes, use three zones. Do not put two nodes in one zone and one node in a second
zone. Remember, you need at least two nodes to form a quorum in case of a network split. And an even number of nodes can leave you stuck in a
situation where they can’t elect a master. If you need to run with an even number of nodes temporarily, either assign more votes to some nodes or add
an arbiter. But always, always think about how to protect yourself from situations where you can’t elect a master. Go for more votes rather than fewer,
because it’s easier to subtract if you have too many than to add if you have too few.

** Remember, if you get into a situation where you have only one node, you have no way to add another node to the replica
set. There was one time very early on, when we were still figuring mongo out, when we had to recover from an outage by bringing up a node from
snapshot with the same hostname so it would be recognized as a member of the same replica set. Bottom line, you just really don’t want to be in this
situation. Spread your eggs around in lots of baskets.
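
The quorum arithmetic behind these rules can be sketched in a few lines. This is only an illustration of the majority rule described above, not MongoDB's election code; the function names are made up for the example.

```python
def majority(voters: int) -> int:
    """Smallest number of votes that is a strict majority of `voters`."""
    return voters // 2 + 1


def can_elect(reachable: int, voters: int) -> bool:
    """True if a partition that can see `reachable` voters can elect a master."""
    return reachable >= majority(voters)


if __name__ == "__main__":
    # Three voters, one per zone: losing any single zone leaves 2 of 3.
    print("3 voters, 2 reachable:", can_elect(2, 3))
    # Four voters split 2/2 by a network partition: neither side reaches 3.
    print("4 voters, 2 reachable:", can_elect(2, 4))
    # A lone survivor of a three-node set cannot elect itself.
    print("3 voters, 1 reachable:", can_elect(1, 3))
```

The even-number problem shows up in the second case: with four voters a clean two/two split leaves both halves one vote short, which is why an arbiter or an extra vote is needed.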
Snapshots
                          • Snapshot often
                          • Lock Mongo
                          • Set snapshot node to priority = 0
                          • Always warm up a snapshot before promoting
                          • Warm up both indexes and data



Snapshots

* Snapshot regularly. We snapshot every 30 minutes. EBS snapshot actually does a differential backup, so subsequent snapshots will be faster the
more frequently you do them.

* Make sure you use a snapshot script that locks mongo. It’s not enough to just run ec2-create-snapshot on the RAID volumes; you also need to lock
mongo beforehand and unlock it after. We use a script called ec2-consistent-snapshot, though I think we may have modified it to add mongo support.

* Always set your snapshot node to config priority = 0. This will prevent it from ever getting elected master. You really, really do not want your
snapshotting host to ever become master, or your site will go down. We also like to set our primary priority to 3, and all non-snapshot secondaries to 2,
because priority 1 isn’t always visible from rs.conf(). That’s just a preference of ours.

* Never, ever switch primary over to a newly restored snapshot. Something a lot of people don’t seem to realize is that EBS blocks are actually
lazy-loaded off S3. You need to warm your fresh secondaries up. I mean, you think loading data into RAM from disk is bad, try loading into RAM from S3.
There’s just a *tiny* bit of latency there.
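
Two of the snapshot rules above (priority 0 for the snapshot node, and lock, snapshot, unlock in that order) can be sketched roughly like this. It is a minimal sketch, not the ec2-consistent-snapshot script: the host names, the `lock`/`unlock` hooks and `snapshot_volume` are placeholders invented for the example. On a real node the hooks would wrap something like the mongo shell helpers db.fsyncLock() and db.fsyncUnlock(), and the member documents would go into the replica set config.

```python
from contextlib import contextmanager


def build_members(hosts, preferred_primary, snapshot_host):
    """Return replica set member documents using the priorities from the talk.

    Snapshot node -> 0 (never elected), preferred primary -> 3, others -> 2.
    """
    if preferred_primary == snapshot_host:
        raise ValueError("the snapshot node must never be the primary")
    members = []
    for member_id, host in enumerate(hosts):
        if host == snapshot_host:
            priority = 0
        elif host == preferred_primary:
            priority = 3
        else:
            priority = 2
        members.append({"_id": member_id, "host": host, "priority": priority})
    return members


@contextmanager
def write_lock(lock, unlock):
    """Hold the database write lock for the duration of the block.

    `lock` and `unlock` are hypothetical callables supplied by the caller.
    """
    lock()
    try:
        yield
    finally:
        # Always unlock, even if a snapshot call raises.
        unlock()


def consistent_snapshot(lock, unlock, snapshot_volume, volume_ids):
    """Snapshot every volume of the stripe while writes are blocked."""
    with write_lock(lock, unlock):
        return [snapshot_volume(volume_id) for volume_id in volume_ids]
```

The `finally` is the part that matters: a failed snapshot call should never leave the node locked.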

Warming up

Lots of people seem to do this in different ways, and it kind of depends on how much data you have. If you have less data than you have RAM, you can
just use dd or vmtouch to load entire databases into memory. If you have more data than RAM, it’s a little bit trickier.
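
For the "less data than RAM" case, the dd approach amounts to reading every data file once so the blocks get pulled off S3 and into the page cache. A rough Python equivalent, assuming the data files all live under a single directory (the directory is whatever your dbpath is; nothing here is MongoDB-specific):

```python
import os


def warm_files(directory, chunk_size=1 << 20):
    """Read every file under `directory` once and return the bytes read.

    The data is discarded; the point is only to force the blocks to be
    fetched and cached, like `dd if=<file> of=/dev/null`.
    """
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            with open(os.path.join(root, name), "rb") as handle:
                while True:
                    chunk = handle.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```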

The way we do it is, first we run a script on the primary. It gets the current ops every quarter of a second or so for an hour, then sorts by most frequently
accessed collections. Then we take that list of collections and feed it into a warmup script on the secondary, which reads all the collections and indexes
into memory. The script is parallelized, but it still takes several hours to complete. You can also read collections into memory by doing a full table scan,
or a natural sort.
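
The first half of that process, turning sampled current ops into a ranked list of collections, could look something like the sketch below. It is not our actual script. It assumes each sample is a document shaped like the output of db.currentOp(), with an "inprog" list whose entries carry an "ns" (database.collection) field; the sampling loop and the warmup reader on the secondary are left out.

```python
from collections import Counter


def rank_namespaces(samples):
    """Rank namespaces by how often they appear across currentOp samples.

    `samples` is an iterable of currentOp-style documents. Operations
    without a namespace are ignored.
    """
    counts = Counter()
    for sample in samples:
        for op in sample.get("inprog", []):
            namespace = op.get("ns")
            if namespace:
                counts[namespace] += 1
    # Most frequently seen namespaces first.
    return [namespace for namespace, _count in counts.most_common()]
```

The resulting list is what would be fed to the warmup script, hottest collections first, so the most useful data is in memory earliest.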

God, what I wouldn’t give for block-level replication like Amazon’s RDS.
Chef everything


                    • Role attributes for backup volumes, cluster names

                    • Nodes are disposable
                    • Delete volumes and AWS attributes, run chef-client to reprovision




Chef

Moving along … chef! Everything we have is fully chef’d. It only takes us like 5 minutes to bring up a new node from snapshot. We use the Opscode
MongoDB and AWS cookbooks, with some local modifications so they can handle PIOPS and the ebs_optimized dedicated NICs. We haven’t open
sourced these changes, but we probably can, if there’s any demand for them. Bringing up a node looks like this:

$ knife ec2 server create -r "role[mongo-replset1-iops]" -f m2.4xlarge -G db -x ubuntu --node-name db36 -I ami-xxxxxxxx -Z us-east-1d -E production

There are some neat things in the mongo cookbook. You can create a role attribute to define the cluster name, so it automatically comes up and joins
the cluster. The backup volumes for a cluster are also just attributes for the role. So it’s easy to create a mongo backups role that automatically backs
up whatever volumes are pointed to by that attribute.


We use the m2.4xlarge flavor, which has like 68 gigs of memory. We have about a terabyte of data per replica set, so 68 gigs is just barely enough for
the working set to fit into memory.

We used to use four EBS volumes RAID 10’d, but we don’t even bother with RAID 10 anymore; we just stripe PIOPS volumes. It’s faster for us to
reprovision a replica set member than to repair the RAID array. If an EBS volume dies, or the secondary falls too far behind, or whatever, we just delete
the volumes, remove the AWS attributes for the node in the chef node description, and re-run chef-client. It reprovisions new volumes for us from the
latest snapshot in a matter of minutes. For most problems, it’s faster for us to destroy and rebuild than to attempt any sort of repair.
[Figure: end-to-end latency graphs. Top: before PIOPS. Bottom: after PIOPS.]

P-IOPS

And … we use PIOPS. We switched to Provisioned IOPS literally as soon as it was available. As you can see from this graph, it made a *huge*
difference for us.

These are end-to-end latency graphs in CloudWatch, from the point a request enters the ELB until the response goes back out. Note the different Y-axis
scales: the top Y-axis goes up to 2.5, the bottom one only goes up to 0.6.

EBS is awful. It’s bursty, and flaky, and just generally everything you DON’T want in your database hardware. As you can see here in the top graph,
using 4 EBS volumes RAID 10’d, we had EBS spikes all the time. Any time one of the four EBS volumes had any sort of availability event, our end-to-end
latency took a hit. With PIOPS, our average latency dropped by half and went almost completely flat around 100 milliseconds.


So yes. Use PIOPS. Until recently you could only provision 1,000 IOPS per volume, but you can now provision up to 2,000 IOPS per volume.
And they guarantee a variability of less than 0.1%, which is exactly what you want in your database hardware.
Filesystem & misc


                    • Use ext4
                    • Raise file descriptor limits (cat /proc/<pid>/limits to verify)

                    • Sharding. Eventually you must shard.




Misc

Some small, miscellaneous details:

* Remember to raise your file descriptor limits. And test that they are actually getting applied. The best way to do this is to find the pid of your MongoDB
process and type “cat /proc/<pid>/limits”. We had a hard time getting sysvinit scripts to properly apply the increased limits, so we converted to
upstart and have had no issues. I don’t know if Ubuntu no longer supports sysvinit very well, or what.

* We use ext4. Supposedly either ext4 or xfs will work, but I have been scarred by xfs file corruption way too many times to ever consider that. They
say it’s fixed, but I have like xfs PTSD or something.

* Sharding -- at some point you have to shard your data. The mongo built-in sharding didn’t work for us for a variety of reasons I won’t go into here.
We’re doing sharding at the app layer, the goal is to
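
Going back to the file descriptor check in the first item: if you want to automate it rather than eyeball the output of cat, something like this sketch would do. It parses the "Max open files" row of /proc/<pid>/limits on Linux. The function names are invented for the example, and what counts as a high enough limit is up to you.

```python
import re


def _parse_limit(value):
    """Convert a limits column to an int, or None for 'unlimited'."""
    return None if value == "unlimited" else int(value)


def max_open_files(limits_text):
    """Return (soft, hard) open-file limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns are separated by runs of two or more spaces.
            fields = re.split(r"\s{2,}", line.strip())
            return _parse_limit(fields[1]), _parse_limit(fields[2])
    raise ValueError("no 'Max open files' row found")


def read_limits(pid):
    """Read the limits file for a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return handle.read()
```

Usage would be along the lines of `max_open_files(read_limits(pid))` with the pid of the database process, then comparing the soft limit against whatever value your init script was supposed to set.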
Parse runs on MongoDB

                          • DDoS protection and query profiling
                          • Billing and logging analytics
                          • User data




In summary, we are very excited about MongoDB. We love the fact that it fails over seamlessly between Availability Zones during an AZ event. And we
value the fact that its flexibility allows us to build our expertise and tribal knowledge around one primary database product, instead of a dozen different
ones.

In fact, we actually use MongoDB in at least three or four distinct ways. We use it for a high-writes DDoS and query analyzer cluster, where we process
a few hundred thousand writes per minute and expire the data every 10 minutes. We use it for our logging and analytics cluster, where we analyze all
our logs from S3 and generate billing data. And we use it to store all the app data for all of our users and their mobile apps.

Something like Parse wouldn’t even be possible without a NoSQL product as flexible and reliable as Mongo is. We’ve built our business around it, and
we’re very excited about its future.

Also, we’re hiring. See me if you’re interested. :)

Thank you! Any questions?
MongoDB and AWS Best Practices
