#MongoDBTokyo




Deployment Preparedness
Alvin Richards
Technical Director, 10gen
Plan A because there is no Plan B
             http://bit.ly/QlJULZ
Part One

Before you deploy…
Reinventing the wheel
[Diagram: Prototype, Ops Playbook, Test, Capacity Planning, Monitor]
Essentials
• Disable NUMA
• Pick appropriate file-system (xfs, ext4)
• Pick 64-bit O/S
   – Recent Linux kernel, Win2k8R2

• More RAM
   – Spend on RAM not Cores

• Faster Disks
   – SSDs vs. SAN
   – Separate Journal and Data Files
Key things to consider
• Profiling
   – Baseline/Blue print: Understand what should happen
   – Ensure good Index usage

• Monitoring
   – SNMP, munin, Zabbix, cacti, nagios
   – MongoDB Monitoring Service (MMS)

• Sizing
   – Understand Capability (RAM, IOPS)
   – Understand Use Cases + Schema
What is your SLA?
• High Availability?
   – 24x7x365 operation?
   – Limited maintenance window?

• Data Protection?
   – Failure of a Single Node?
   – Failure of a Data Center?

• Disaster Recovery?
   – Manual or automatic failover?
   – Data Center, Region, Continent?
Build & Test your Playbook
• Backups
• Restores (backups are not enough)
• Upgrades
• Replica Set Operations
• Sharding Operations
Part Two

Under the cover…
How to see metrics
• mongostat
• MongoDB plug-ins for
   – munin, Zabbix, cacti, ganglia

• Hosted Services
   – MMS - 10gen
   – Server Density, Cloudkick

• Profiling
Operation Counters
Metrics in detail: opcounters
• Counts:
 Insert, Update, Delete, Query, Commands
• Operation counters are mostly straightforward:
 more is better


• Some operations on a replica set primary are
 accounted for differently on a secondary
• Commands such as getLastError and serverStatus are
 also counted
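Raw opcounters are cumulative totals, so they are easier to read as rates. The sketch below is a hypothetical helper, not part of MongoDB: it assumes two snapshots of the opcounters sub-document from db.serverStatus(), taken a known number of seconds apart. The sample numbers are made up for illustration.

```javascript
// Hypothetical helper: per-second rates between two opcounters snapshots.
// Each snapshot is assumed to be the `opcounters` sub-document of
// db.serverStatus(), captured `seconds` apart.
function opcounterRates(before, after, seconds) {
  const rates = {};
  for (const op of Object.keys(after)) {
    const delta = after[op] - (before[op] || 0);
    // Counters start again from zero when mongod restarts,
    // so a negative delta is reported as unknown (null).
    rates[op] = delta >= 0 ? delta / seconds : null;
  }
  return rates;
}

// Made-up sample values, 60 seconds apart.
const opsAtStart = { insert: 1000, query: 5000, update: 200, delete: 10, getmore: 40, command: 900 };
const opsAtEnd = { insert: 1600, query: 8000, update: 260, delete: 10, getmore: 100, command: 1500 };
const opRates = opcounterRates(opsAtStart, opsAtEnd, 60);
console.log(opRates);
```

Comparing these rates against the baseline from profiling shows whether a change in load is expected or a surprise.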
Resident Memory counter
Metrics in detail: resident memory
• Key metric: to a very high degree, the
 performance of a mongod is a measure of how
 much data fits in RAM.


• If this quantity is stably lower than available
 physical memory, the mongod is likely
 performing well.
• Correlated metrics: page faults, B-Tree misses
Page Faults counter
[Diagram: Collection 1 and Index 1 are mapped into Virtual Address Space 1, which is backed by Physical RAM and Disk. Latency labels: 100 ns and 10,000,000 ns.]
Metrics in detail: page faults
• This measures reads or writes to pages of the data
 files that aren't resident in memory
• If this is persistently non-zero, your data doesn't
 fit in memory.


• Correlated metrics: resident memory, B-Tree
 misses, iostats
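To see why even a small page fault rate matters, here is a rough worked calculation. It assumes the two latencies on the previous slide (100 ns and 10,000,000 ns) stand for an access served from RAM and one served from disk; that reading, and the function name, are assumptions for illustration only.

```javascript
// Assumed latencies, taken from the diagram labels on the previous slide.
const RAM_ACCESS_NS = 100;
const DISK_ACCESS_NS = 10000000;

// Average cost of one access when `faultRatio` (0..1) of accesses
// miss RAM and have to be served from disk.
function averageAccessNs(faultRatio) {
  return (1 - faultRatio) * RAM_ACCESS_NS + faultRatio * DISK_ACCESS_NS;
}

for (const ratio of [0, 0.001, 0.01]) {
  console.log(ratio, Math.round(averageAccessNs(ratio)), "ns");
}
```

Under these assumptions, a 1% fault ratio makes the average access about 1,000 times slower than when everything is in RAM.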
Working Set
> db.blogs.stats()
{
    "ns" : "test.blogs",
    "count" : 1338330,
    "size" : 46915928,                  // size of data
    "avgObjSize" : 35.05557523181876,   // average document size
    "storageSize" : 86092032,           // size on disk (and in memory!)
    "numExtents" : 12,
    "nindexes" : 2,
    "lastExtentSize" : 20872960,
    "paddingFactor" : 1,
    "flags" : 0,
    "totalIndexSize" : 99860480,        // size of all indexes
    "indexSizes" : {                    // size of each index
        "_id_" : 55877632,
        "name_1" : 43982848
    },
    "ok" : 1
}
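The stats output can be turned into a quick sizing check. The sketch below is a hypothetical helper that treats the whole collection plus all of its indexes as the working set, which is a pessimistic assumption: the real working set is often only the frequently accessed part. The RAM sizes are illustrative.

```javascript
// Hypothetical sizing check: does storageSize + totalIndexSize fit in RAM?
// `stats` is assumed to have the shape returned by db.collection.stats().
function fitsInRam(stats, ramBytes) {
  const neededBytes = stats.storageSize + stats.totalIndexSize;
  return { neededBytes: neededBytes, fits: neededBytes <= ramBytes };
}

// Values copied from the db.blogs.stats() output above.
const blogsStats = { storageSize: 86092032, totalIndexSize: 99860480 };

const MIB = 1024 * 1024;
console.log(fitsInRam(blogsStats, 128 * MIB)); // illustrative 128 MiB host
console.log(fitsInRam(blogsStats, 256 * MIB)); // illustrative 256 MiB host
```

Here the indexes are larger than the data itself, so dropping an unused index would shrink the footprint more than trimming documents.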
Lock % counter
Metrics in detail: lock percentage and queues
• By itself, lock % can be misleading: a high lock
 percentage just means that writing is happening.


• But when lock % is high and queued readers or
 writers are non-zero, the mongod is probably at
 its write capacity.


• Correlated metrics: iostats
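The rule above combines two signals, and can be sketched as a simple check. The field names and the 50% threshold are assumptions chosen for illustration; mongostat reports the underlying numbers as the locked % and qr|qw columns, and a sensible threshold depends on your own baseline.

```javascript
// Hypothetical check: high lock % alone is not a problem,
// high lock % together with queued operations probably is.
// `sample` uses made-up field names; the threshold is an assumption.
function atWriteCapacity(sample, lockThresholdPercent = 50) {
  const queued = sample.queuedReaders + sample.queuedWriters;
  return sample.lockPercent >= lockThresholdPercent && queued > 0;
}

console.log(atWriteCapacity({ lockPercent: 90, queuedReaders: 0, queuedWriters: 0 }));
console.log(atWriteCapacity({ lockPercent: 90, queuedReaders: 3, queuedWriters: 5 }));
```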
Log file
Mon Dec 3 15:05:37 [conn81]
getmore scaleout.nodes query: { ts: { $lte: new Date(1354547123142) } }
cursorid:8607875337747748011
ntoreturn:0
keyUpdates:0
numYields: 216
locks(micros) r:615830
nreturned:27055
reslen:4194349
551ms
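Log lines like this one are easier to compare once the counters are pulled out. The parser below is a minimal sketch written against this single sample line; it is not a general parser for mongod logs, and the function name is hypothetical.

```javascript
// Minimal sketch: extract `name:number` pairs and the trailing duration
// from one slow-operation log line.
function parseSlowOp(line) {
  const fields = {};
  // Keys must start with a letter so the timestamp (15:05:37) is skipped.
  const pair = /([A-Za-z]\w*):\s*(\d+)/g;
  let match;
  while ((match = pair.exec(line)) !== null) {
    // Kept as strings: cursor ids are too large for a JavaScript Number.
    fields[match[1]] = match[2];
  }
  const duration = /(\d+)ms\s*$/.exec(line);
  fields.millis = duration ? Number(duration[1]) : null;
  return fields;
}

const sampleLogLine =
  "Mon Dec 3 15:05:37 [conn81] getmore scaleout.nodes " +
  "query: { ts: { $lte: new Date(1354547123142) } } " +
  "cursorid:8607875337747748011 ntoreturn:0 keyUpdates:0 numYields: 216 " +
  "locks(micros) r:615830 nreturned:27055 reslen:4194349 551ms";

const slowOp = parseSlowOp(sampleLogLine);
console.log(slowOp);
```

In this sample the operation returned 27,055 documents (about 4 MB) in 551 ms and yielded 216 times.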
explain, hint
// explain() shows the plan used by the operation
> db.c.find(<query>).explain()


// hint() forces a query to use a specific index
// x_1 is the name of the index from db.c.getIndexes()
> db.c.find( {x:1} ).hint("x_1")
B-Tree Counter
Metrics in detail: B-Tree
• Indicates b-tree accesses including page fault
 service during an index lookup
• If misses are persistently non-zero, your indexes
 don't fit in RAM. (You might need to change or
 drop indexes, or shard your data.)


• Correlated metrics: resident memory, page
 faults, iostats
B-Trees' strengths
• B-Tree indexes are designed for range queries
 over a single dimension


• Think of a compound index on { A, B } as being
 an index on the concatenation of the A and B
 values in documents


• MongoDB can use its indexes for sorting as well
B-Trees' weaknesses
• Range queries on the first field of a compound
 index are suboptimal
• Range queries over multiple dimensions are
 suboptimal
• In both these cases, a suboptimal index might
 be better than nothing, but it is best to see whether
 you can change the problem
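The "concatenation" view of a compound index explains both the strength and the weakness. The sketch below is a simplified model, not MongoDB's actual index implementation or query planner: it sorts hypothetical { a, b } keys the way a compound index would order them, then measures how wide a span of entries lies between the first and last match.

```javascript
// Simplified model of a compound index on { a: 1, b: 1 }:
// entries sorted by a, then by b.
function buildIndex(docs) {
  return docs
    .map((d) => [d.a, d.b])
    .sort((x, y) => x[0] - y[0] || x[1] - y[1]);
}

// Entries between the first and last match, inclusive: a rough stand-in
// for how many keys a single range scan would have to walk.
function scannedSpan(index, matches) {
  const hits = index.map(matches);
  const first = hits.indexOf(true);
  const last = hits.lastIndexOf(true);
  return first === -1 ? 0 : last - first + 1;
}

// Nine made-up documents: a and b each take the values 1, 2, 3.
const sampleDocs = [];
for (let a = 3; a >= 1; a--) {
  for (let b = 3; b >= 1; b--) sampleDocs.push({ a: a, b: b });
}
const abIndex = buildIndex(sampleDocs);

// Equality on a, range on b: the matches sit next to each other.
const equalityThenRange = scannedSpan(abIndex, ([a, b]) => a === 2 && b >= 1 && b <= 2);
// Range on a, equality on b: the matches are scattered across the index.
const rangeThenEquality = scannedSpan(abIndex, ([a, b]) => a >= 1 && a <= 3 && b === 2);

console.log(equalityThenRange, rangeThenEquality);
```

In this model the first query touches a span of 2 entries for 2 results, while the second spans 7 entries for 3 results.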
Indexing dark corners
• Some functionality can't currently always use
 indexes:
   – $where JavaScript clauses
   – $mod, $not, $ne
   – regex

• Negation may be transformed into a range query
   – Index can be used

• Complicated regular expressions scan a whole
 index
Other tricks
Warming the Cache
> db.c.find( {unused_key: 1} ).explain()
> db.c.find( {unused_key: 1} )
   .hint( {random_index:1} )
   .explain()


# cat /data/db/* > /dev/null


// New in 2.2
> db.runCommand( { touch: "blogs",
           data: true, index: true } )
Journal on another disk
• The journal's write load is very different from that
 of the data files
   – journal = append-only
   – data files = randomly accessed

• Putting the journal on a separate disk or RAID
 (e.g., with a symlink) will minimize any seek-time
 related journaling overhead
--directoryperdb
• Allows storage tiering
   – Different access patterns
   – Different Disk Types / Speeds


• Use --directoryperdb
• Add a symlink into the database directory
Dynamically change log level
// Change logging level to get more info


> db.adminCommand({ setParameter: 1, logLevel: 1 })


> db.adminCommand({ setParameter: 1, logLevel: 0 })
Because you now have a
Plan B
           http://bit.ly/QlJULZ
