Performance Tuning and Optimization
Jake Angerman
Sr. Solutions Architect, MongoDB
Agenda
• Definition of terms
• When to do it
• Measurement tools
• Effecting Change
• Examples
These slides and a recording of the presentation will
be available within a day or two.
Performance Tuning vs Optimizing
• Optimizing – Modifying a system to work more efficiently or use
fewer resources
• Performance Tuning – Modifying a system to handle increased load
Development | QA | Production
Premature Optimization
• "There is no doubt that the grail of efficiency leads to
abuse. Programmers waste enormous amounts of time
thinking about, or worrying about, the speed of
noncritical parts of their programs, and these attempts at
efficiency actually have a strong negative impact when
debugging and maintenance are considered. We should
forget about small efficiencies, say about 97% of the
time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that
critical 3%."
- Donald Knuth, 1974
Measurement Tools
Log files, Profiler, Query Optimizer
[Diagram: mongod with its log file, profiler (collection), and query engine]
Explain plan – Query Planner
Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.find({a:1}).explain() // using the old <3.0 syntax
{
"ok": 1,
"queryPlanner": {
"indexFilterSet": false,
"namespace": "test.example",
"parsedQuery": {
"a": {
"$eq": 1
}
},
"plannerVersion": 1,
"rejectedPlans": [ ],
"winningPlan": {
"direction": "forward",
"filter": {
"a": {
"$eq": 1
}
},
"stage": "COLLSCAN"
}
},
"serverInfo": {
"gitVersion": "534b5a3f9d10f00cd27737fbcd951032248b5952",
"host": "Jakes-MacBook-Pro.local",
"port": 27017,
"version": "3.0.1"
}
}
Explain plan – Adding an Index
Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.ensureIndex({a:1})
Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.find({a:1}).explain() // using the old <3.0 syntax
{
"ok": 1,
"queryPlanner": {
"indexFilterSet": false,
"namespace": "test.example",
"parsedQuery": {
"a": {
"$eq": 1
}
},
"plannerVersion": 1,
"rejectedPlans": [ ],
"winningPlan": {
"inputStage": {
"direction": "forward",
"indexBounds": {
"a": [
"[1.0, 1.0]"
]
},
"indexName": "a_1",
"isMultiKey": false,
"keyPattern": {
"a": 1
},
"stage": "IXSCAN"
},
"stage": "FETCH"
}
}
[…]
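The difference between the two winning plans above can also be checked programmatically. The following is a minimal sketch in plain JavaScript, not part of the original slides: planStages and usesCollectionScan are hypothetical helper names, and the sketch assumes a plan with a single child per stage (inputStage), as in the output shown. Plans with several children (inputStages) are not handled.

```javascript
// Collect the stage names of an explain() winning plan, outermost first.
// Assumption: each stage has at most one child, stored under inputStage.
function planStages(winningPlan) {
  const stages = [];
  let node = winningPlan;
  while (node) {
    stages.push(node.stage);
    node = node.inputStage; // undefined at the leaf, which ends the loop
  }
  return stages;
}

// True when the plan reads the whole collection instead of using an index.
function usesCollectionScan(winningPlan) {
  return planStages(winningPlan).includes("COLLSCAN");
}

// Trimmed copies of the two winning plans shown in the slides.
const beforeIndex = {
  stage: "COLLSCAN",
  direction: "forward",
  filter: { a: { $eq: 1 } },
};
const afterIndex = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "a_1", keyPattern: { a: 1 } },
};

console.log(planStages(beforeIndex), usesCollectionScan(beforeIndex));
console.log(planStages(afterIndex), usesCollectionScan(afterIndex));
```

A check like this could run in a test suite against the queryPlanner.winningPlan of each important query, so that a dropped index shows up before production.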
New Explain Syntax in MongoDB 3.0
• Operations such as count, group, remove, and update can now be explained via db.collection.explain()
> db.example.find({a:1}).count().explain() // <3.0
E QUERY TypeError: Object 3 has no method 'explain'
at (shell):1:32
> db.example.explain().find({a:1}).count() // 3.0
• Explain a remove operation without actually removing anything
> db.example.explain().remove({a:1}) // doesn't remove anything
Explain Levels in MongoDB 3.0
• queryPlanner (default level): runs the query planner and chooses
the winning plan without actually executing the query
– Use case: "Which plan will MongoDB choose to run my query?"
• executionStats – runs the query optimizer, then runs the winning
plan to completion
– Use case: "How is my query performing?"
• allPlansExecution – same as executionStats, but returns all the
query plans, not just the winning plan.
– Use case: "I want as much information as possible to diagnose a
slow query."
Explain plan – Query Planner
Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.explain().find({a:1}) // new 3.0 syntax, default level
{
"ok": 1,
"queryPlanner": {
"indexFilterSet": false,
"namespace": "test.example",
"parsedQuery": {
"a": {
"$eq": 1
}
},
"plannerVersion": 1,
"rejectedPlans": [ ],
"winningPlan": {
"inputStage": {
"direction": "forward",
"indexBounds": {
"a": [
"[1.0, 1.0]"
]
},
"indexName": "a_1",
"isMultiKey": false,
"keyPattern": {
"a": 1
},
"stage": "IXSCAN"
},
"stage": "FETCH"
}
}
[…]
queryPlanner (default level): runs
the query planner and chooses the
winning plan without actually
executing the query
Explain plan – Query Optimizer
> db.example.explain("executionStats").find({a:1}) // new 3.0 syntax
{
"executionStats": {
"executionStages": {
"advanced": 3,
"alreadyHasObj": 0,
"docsExamined": 3,
"executionTimeMillisEstimate": 0,
"inputStage": {
"advanced": 3,
"direction": "forward",
"dupsDropped": 0,
"dupsTested": 0,
"executionTimeMillisEstimate": 0,
"indexBounds": {
"a": [
"[1.0, 1.0]"
]
},
"indexName": "a_1",
"invalidates": 0,
"isEOF": 1,
"isMultiKey": false,
"keyPattern": {
"a": 1
},
"keysExamined": 3,
"matchTested": 0,
"nReturned": 3,
"needFetch": 0,
"needTime": 0,
"restoreState": 0,
"saveState": 0,
"seenInvalidated": 0,
"stage": "IXSCAN",
"works": 3
},
"invalidates": 0,
"isEOF": 1,
"nReturned": 3,
"needFetch": 0,
"needTime": 0,
"restoreState": 0,
"saveState": 0,
"stage": "FETCH",
"works": 4
},
"executionSuccess": true,
"executionTimeMillis": 0,
"nReturned": 3,
"totalDocsExamined": 3,
"totalKeysExamined": 3
},
"ok": 1,
"queryPlanner": {
[…]
}
}
executionStats – runs the query optimizer,
then runs the winning plan to completion
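One way to read the executionStats output above is to compare how much work the query did with how much it returned. The sketch below is an illustration only: summarizeExecution is a hypothetical helper, and the input is a trimmed copy of the top-level counters from the slide. Ratios close to 1 suggest the index is selective for this query; much larger ratios suggest wasted scanning.

```javascript
// Summarize the top-level counters of an executionStats document.
// Ratios are null when nothing was returned, to avoid dividing by zero.
function summarizeExecution(stats) {
  const { nReturned, totalKeysExamined, totalDocsExamined, executionTimeMillis } = stats;
  const perReturned = (examined) => (nReturned > 0 ? examined / nReturned : null);
  return {
    nReturned,
    executionTimeMillis,
    keysPerReturned: perReturned(totalKeysExamined),
    docsPerReturned: perReturned(totalDocsExamined),
  };
}

// Top-level counters copied from the executionStats output in the slide.
const slideStats = {
  executionSuccess: true,
  executionTimeMillis: 0,
  nReturned: 3,
  totalDocsExamined: 3,
  totalKeysExamined: 3,
};

const summary = summarizeExecution(slideStats);
console.log(summary);
```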
Profiler
• 1 MB capped collection named system.profile, one per database on each mongod (it is not replicated)
• One document per operation
• Examples:
> db.setProfilingLevel(1) // profile operations slower than slowms (default 100 ms)
> db.setProfilingLevel(1, 20) // profile operations slower than 20 ms
> db.setProfilingLevel(2) // profile all operations regardless of duration
> db.setProfilingLevel(0) // turn off profiling
> db.getProfilingStatus() // display current profiling level
{
"slowms": 100,
"was": 2
}
• In a sharded cluster, you will need to connect to each shard's primary
mongod, not mongos
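Once the profiler has collected documents, a common next step is to group the slow ones by namespace. The sketch below shows that idea on an in-memory array rather than a live system.profile collection: summarizeProfile is a hypothetical helper, and the sample documents are invented for illustration, carrying only the op, ns, and millis fields that profile documents include.

```javascript
// Group profile documents at or above a duration threshold by namespace,
// returning the slowest namespaces first.
function summarizeProfile(profileDocs, minMillis) {
  const byNamespace = new Map();
  for (const doc of profileDocs) {
    if (doc.millis < minMillis) continue;
    const entry = byNamespace.get(doc.ns) || { ns: doc.ns, count: 0, maxMillis: 0 };
    entry.count += 1;
    entry.maxMillis = Math.max(entry.maxMillis, doc.millis);
    byNamespace.set(doc.ns, entry);
  }
  return [...byNamespace.values()].sort((a, b) => b.maxMillis - a.maxMillis);
}

// Hypothetical sample data, shaped like a subset of system.profile fields.
const sampleProfile = [
  { op: "query", ns: "test.docs", millis: 1156 },
  { op: "query", ns: "test.example", millis: 3 },
  { op: "update", ns: "test.docs", millis: 45 },
  { op: "query", ns: "db.users", millis: 15504 },
];

const slowest = summarizeProfile(sampleProfile, 20);
console.log(slowest);
```

Against a real deployment the input would come from a query on system.profile in each database of interest.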
mongod Log Files
Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms
• date and time: Sun Jun 29 06:35:37.646
• thread: [conn2]
• operation: query
• namespace: test.docs
• n… counters: ntoreturn, ntoskip, nscanned, nreturned
• number of yields: numYields
• lock times: locks(micros) r:2145254
• duration: 1156ms
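The anatomy of the log line above can be turned into a small parser. This is a rough sketch that assumes the legacy plain-text log format shown on the slide; parseLegacyLogLine is a hypothetical helper and the regular expressions are tuned only to this example line. Dedicated tools handle many more cases.

```javascript
// Parse one legacy-format mongod slow-query log line into its labeled parts.
// Returns null when the line does not start with the expected header.
function parseLegacyLogLine(line) {
  const head = /^(\w{3} \w{3} +\d+ [\d:.]+) \[(\w+)\] (\w+) (\S+) /.exec(line);
  if (!head) return null;

  // Numeric counters appear as name:value pairs (numYields has a space).
  const counters = {};
  const counterPattern = /\b(ntoreturn|ntoskip|nscanned|nreturned|numYields|keyUpdates|reslen):\s*(\d+)/g;
  for (const [, name, value] of line.matchAll(counterPattern)) {
    counters[name] = Number(value);
  }

  const readLock = /locks\(micros\) r:(\d+)/.exec(line);
  const duration = /(\d+)ms$/.exec(line.trim());

  return {
    timestamp: head[1],
    thread: head[2],
    operation: head[3],
    namespace: head[4],
    counters,
    readLockMicros: readLock ? Number(readLock[1]) : null,
    durationMs: duration ? Number(duration[1]) : null,
  };
}

// The log line from the slide, joined onto one line.
const slideLine =
  'Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", ' +
  'parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 ' +
  "numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms";

const parsed = parseLegacyLogLine(slideLine);
console.log(parsed);
```

Here the parsed counters make the problem visible at a glance: 806381 documents scanned to return 0.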
Parsing Log Files
mtools
• http://github.com/rueckstiess/mtools
• log file analysis for poorly performing queries
– Show me queries that took more than 1000 ms from 6 am to 6 pm:
$ mlogfilter mongodb.log --from 06:00 --to 18:00 --slow 1000 > mongodb-filtered.log
mtools graphs
% mplotqueries --type histogram --group namespace --bucketSize 3600
Command Line tools
• iostat
• dstat
• mongostat
• mongotop
• mongoperf
MMS
• Memory usage
• Opcounters
• Lock percentage
• Queues
• Background flush average
• Replication oplog window and lag
Effecting Change
Process
1. Measure current performance
2. Find the bottleneck (the hard part)
3. Remove the bottleneck
4. Measure again
5. Repeat as needed
What can you change?
• Schema design
• Access patterns
• Indexes
• Instance
• Hardware
Schema Design
• MongoDB schemas are designed in the opposite way from relational schemas!
• Relational Schema:
– normalize data
– write complex queries to join the data
– let the query planner figure out how to make queries efficient
• MongoDB Schema:
– denormalize the data
– create a (potentially complex) schema with prior knowledge of
your actual (not just predicted) query patterns
– write simple queries
Example: Schema Design
Product catalog schema for retailer selling in 20 countries
{
_id: 375,
en_US: { name: …, description: …, <etc…> },
en_GB: { name: …, description: …, <etc…> },
fr_FR: { name: …, description: …, <etc…> },
fr_CA: { name: …, description: …, <etc…> },
de_DE: …,
de_CH: …,
<… and so on for other locales …>
}
Example: Schema Design
• What's good about this schema?
– Each document contains all the data about the product across all possible locales.
– It is the most efficient way to retrieve all translations of a product in a single query (English, French, German, etc).
Example: Schema Design
But that's not how the data was accessed
> db.catalog.find( { _id: 375 }, { en_US: true } );
> db.catalog.find( { _id: 375 }, { fr_FR: true } );
> db.catalog.find( { _id: 375 }, { de_DE: true } );
… and so forth for other locales
The data model did not fit the access pattern.
Example: Schema Design
Why is this inefficient?
Only the locale requested by each query (shown in red on the slide) is being used.
The other locales (shown in blue) take up memory but are not in demand.
{
_id: 375,
en_US: { name: …, description: …, <etc…> },
en_GB: { name: …, description: …, <etc…> },
fr_FR: { name: …, description: …, <etc…> },
fr_CA: { name: …, description: …, <etc…> },
de_DE: …,
de_CH: …,
<… and so on for other locales …>
}
{
_id: 42,
en_US: { name: …, description: …, <etc…> },
en_GB: { name: …, description: …, <etc…> },
fr_FR: { name: …, description: …, <etc…> },
fr_CA: { name: …, description: …, <etc…> },
de_DE: …,
de_CH: …,
<… and so on for other locales …>
}
Example: Schema Design
• Consequences of this schema
– Each document contained 20x more data than
the common use case requires
– Disk IO was too high for the relatively modest
query load on the dataset
– MongoDB lets you request a subset of a
document's contents via projection…
– … but the entire document must be loaded
into RAM to service the request
Example: Schema Design
• Consequences of the schema redesign
– Queries induced minimal memory overhead
– 20x as many distinct products fit in RAM at
once
– Disk IO utilization reduced
– Application latency reduced
{
_id: "375-en_GB",
name: …,
description: …,
<… the rest of the document …>
}
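The redesign above amounts to splitting each multi-locale product into one document per locale, with the locale folded into the _id. A minimal sketch of that transformation follows. splitByLocale is a hypothetical helper, the field values are invented placeholders, and it assumes every field other than _id is a locale subdocument.

```javascript
// Turn one product document holding every locale into one document per
// locale, using "<productId>-<locale>" as the new _id.
// Assumption: every top-level field except _id is a locale subdocument.
function splitByLocale(product) {
  const { _id, ...locales } = product;
  return Object.entries(locales).map(([locale, fields]) => ({
    _id: `${_id}-${locale}`,
    ...fields,
  }));
}

// Hypothetical product with placeholder values for three locales.
const product = {
  _id: 375,
  en_US: { name: "Widget", description: "A widget" },
  en_GB: { name: "Widget", description: "A widget (UK)" },
  fr_FR: { name: "Bidule", description: "Un bidule" },
};

const perLocale = splitByLocale(product);
console.log(perLocale);
```

With this shape the application fetches a single small document by _id, for example "375-en_GB", instead of projecting one locale out of a large one.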
Example: Access Patterns
• Application allowed searches for users by first and/or last name
Tue Jul 1 13:08:29.858 [conn581923] query db.users query: {
$query: {$and: [ { $and: [ { firstName: /((?i)\Qbob\E)/ }, {
lastName: /((?i)\Qjones\E)/ } ] } ] }, $orderby: { lastName:
1 } } ntoreturn:25 ntoskip:0 nscanned:2626282 scanAndOrder:1
keyUpdates:0 numYields: 299 locks(micros) r:30536738
nreturned:14 reslen:8646 15504ms
Example: Access Patterns
• Application was searching for unindexed, case-insensitive, unanchored regular
expressions
• MongoDB is better at indexed, case-sensitive, left-anchored regular expressions
Before:
{
_id: 1,
firstName: "Bob",
lastName: "Jones"
}
After (lowercased copies of the name fields added for searching):
{
_id: 1,
firstName: "Bob",
lastName: "Jones",
fn: "bob",
ln: "jones"
}
> db.users.ensureIndex({ln:1, fn:1})
> db.users.ensureIndex({fn:1, ln:1})
> db.users.find({fn:/^bob/}).sort({ln:1})
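On the application side, the change above needs two pieces: writing the lowercased fn and ln fields, and turning user input into a case-sensitive, left-anchored regular expression. The sketch below illustrates both under stated assumptions. withSearchFields, escapeRegex, and prefixQuery are hypothetical helper names, and plain toLowerCase is assumed to be adequate normalization for the data.

```javascript
// Escape characters that have special meaning in a regular expression,
// so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Add the lowercased search fields to a user document before saving it.
function withSearchFields(user) {
  return {
    ...user,
    fn: user.firstName.toLowerCase(),
    ln: user.lastName.toLowerCase(),
  };
}

// Build a filter with a left-anchored, case-sensitive regex on a
// lowercased field, which is the form an index can serve efficiently.
function prefixQuery(field, input) {
  return { [field]: new RegExp("^" + escapeRegex(input.toLowerCase())) };
}

const stored = withSearchFields({ _id: 1, firstName: "Bob", lastName: "Jones" });
const filter = prefixQuery("fn", "Bob");

console.log(stored);
console.log(filter);
```

The resulting filter has the same shape as the find() shown above and can be passed to it together with the sort on ln.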
Example: Indexing
• Slow Queries in the logs:
Sun Jun 29 06:35:37.646 [conn2] query test.docs query: {
parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1
ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros)
r:2145254 nreturned:0 reslen:20 1156ms
• But there's an index???!!!!
db.system.indexes.find().toArray()
[{
"v" : 1,
"key" : {
"company" : 1,
"employeeId" : 1
},
"ns" : "test.docs",
"name" : "company_1_employeeId_1"
}]
Example: Indexing
• Answer: there needs to be an index on the subdocument's fields
Sun Jun 29 06:35:37.646 [conn2] query test.docs query: {
parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1
ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros)
r:2145254 nreturned:0 reslen:20 1156ms
db.system.indexes.find().toArray()
[{
"v" : 1,
"key" : {
"parent.company" : 1,
"parent.employeeId" : 1
},
"ns" : "test.docs",
"name" :"parent.company_1_parent.employeeId_1"
}]
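The mismatch in this example is purely one of field names: the query filters on "parent.company" while the original index key is "company". A deliberately simplified check for that kind of mistake is sketched below. missingIndexFields is a hypothetical helper that only compares the top-level keys of an equality filter with the index key names. It does not model the query planner.

```javascript
// List the filter fields that do not appear, by exact name, in an index key
// pattern. Dotted paths such as "parent.company" must match literally.
// Assumption: the filter is a flat equality filter with no operators.
function missingIndexFields(filter, keyPattern) {
  const indexed = new Set(Object.keys(keyPattern));
  return Object.keys(filter).filter((field) => !indexed.has(field));
}

// The query and the two index key patterns from the slides.
const slowQuery = { "parent.company": "22794", "parent.employeeId": "83881" };
const originalIndex = { company: 1, employeeId: 1 };
const fixedIndex = { "parent.company": 1, "parent.employeeId": 1 };

console.log(missingIndexFields(slowQuery, originalIndex));
console.log(missingIndexFields(slowQuery, fixedIndex));
```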
Indexing Suggestions
• Create indexes that support your queries!
• Create highly selective indexes
• Don't create unnecessary indexes
• Eliminate duplicate indexes with a compound index, if possible
> db.collection.ensureIndex({A:1, B:1, C:1})
– allows queries using leftmost prefix
• Order compound index fields thusly: equality, sort, then range
– see http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/
• Create indexes that support covered queries
• Prevent collection scans in pre-production environments
$ mongod --notablescan
> db.getSiblingDB("admin").runCommand( { setParameter: 1, notablescan: 1 } )
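Two of the suggestions above, the equality, sort, range ordering and the leftmost-prefix rule, can be expressed as small helpers. This is an illustrative sketch only. buildCompoundIndex and usesLeftmostPrefix are hypothetical names, and the field names status, createdAt, and price are invented for the example.

```javascript
// Build a compound index key pattern in equality, sort, range order.
// sort maps field name to direction (1 or -1); other fields default to 1.
function buildCompoundIndex({ equality = [], sort = {}, range = [] }) {
  const keyPattern = {};
  for (const field of equality) keyPattern[field] = 1;
  for (const [field, direction] of Object.entries(sort)) {
    if (!(field in keyPattern)) keyPattern[field] = direction;
  }
  for (const field of range) {
    if (!(field in keyPattern)) keyPattern[field] = 1;
  }
  return keyPattern;
}

// True when the queried fields are exactly the first N fields of the index,
// in any order, which is the leftmost-prefix case.
function usesLeftmostPrefix(queryFields, keyPattern) {
  const prefix = Object.keys(keyPattern).slice(0, queryFields.length);
  return (
    queryFields.length > 0 &&
    prefix.length === queryFields.length &&
    queryFields.every((field) => prefix.includes(field))
  );
}

// Hypothetical query shape: status equals X, price in a range, newest first.
const keyPattern = buildCompoundIndex({
  equality: ["status"],
  sort: { createdAt: -1 },
  range: ["price"],
});

console.log(keyPattern);
console.log(usesLeftmostPrefix(["A", "B"], { A: 1, B: 1, C: 1 }));
console.log(usesLeftmostPrefix(["B", "C"], { A: 1, B: 1, C: 1 }));
```

The key pattern relies on JavaScript preserving insertion order for string keys, which holds for ordinary field names but not for names that look like integers.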
Example: Hardware
Do's and Don’ts
• Do:
– Read production notes in MongoDB documentation
– Eliminate suspects in the right order (schema,
indexes, operations, instance, hardware)
– Know what is considered "normal" behavior by
monitoring
• Don't:
– confuse symptoms with root causes
– shard a poorly performing system
Webinar: Performance Tuning + Optimization

Webinar: Performance Tuning + Optimization

  • 1. Performance Tuning and Optimization Jake Angerman Sr. Solutions Architect, MongoDB
  • 2. Agenda • Definition of terms • When to do it • Measurement tools • Effecting Change • Examples These slides and a recording of the presentation will be available within a day or two.
  • 3. Performance Tuning vs Optimizing • Optimizing – Modifying a system to work more efficiently or use fewer resources • Performance Tuning – Modifying a system to handle increased load
  • 4. Performance Tuning vs Optimizing • Optimizing – Modifying a system to work more efficiently or use fewer resources • Performance Tuning – Modifying a system to handle increased load Developmen t QA Production
  • 5. Performance Tuning vs Optimizing • Optimizing – Modifying a system to work more efficiently or use fewer resources • Performance Tuning – Modifying a system to handle increased load Developmen t QA Production
  • 6. Performance Tuning vs Optimizing • Optimizing – Modifying a system to work more efficiently or use fewer resources • Performance Tuning – Modifying a system to handle increased load Developmen t QA Production
  • 7. Premature Optimization • "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Donald Knuth, 1974
  • 8. Premature Optimization • "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Donald Knuth, 1974
  • 9. Premature Optimization • "There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." - Donald Knuth, 1974
  • 11. Log files, Profiler, Query Optimizer mongod log file profiler (collection) query engine
  • 12. Explain plan – Query Planner Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.find({a:1}).explain() // using the old <3.0 syntax { "ok": 1, "queryPlanner": { "indexFilterSet": false, "namespace": "test.example", "parsedQuery": { "a": { "$eq": 1 } }, "plannerVersion": 1, "rejectedPlans": [ ], "winningPlan": { "direction": "forward", "filter": { "a": { "$eq": 1 } }, "stage": "COLLSCAN" } }, "serverInfo": { "gitVersion": "534b5a3f9d10f00cd27737fbcd951032248b5952", "host": "Jakes-MacBook-Pro.local", "port": 27017, "version": "3.0.1" } }
  • 13. Explain plan – Adding an Index Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.ensureIndex({a:1}) Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.find({a:1}).explain() // using the old <3.0 syntax { "ok": 1, "queryPlanner": { "indexFilterSet": false, "namespace": "test.example", "parsedQuery": { "a": { "$eq": 1 } }, "plannerVersion": 1, "rejectedPlans": [ ], "winningPlan": { "inputStage": { "direction": "forward", "indexBounds": { "a": [ "[1.0, 1.0]" ] }, "indexName": "a_1", "isMultiKey": false, "keyPattern": { "a": 1 }, "stage": "IXSCAN" }, "stage": "FETCH" } } […]
  • 14. New Explain Syntax in MongoDB 3.0 • count, distinct, group, et al. now have an explain() method > db.example.find({a:1}).count().explain() // <3.0 E QUERY TypeError: Object 3 has no method 'explain' at (shell):1:32 > db.example.explain().find({a:1}).count() // 3.0 • Explain a remove operation without actually removing anything > db.example.explain().remove({a:1}) // doesn't remove anything
  • 15. Explain Levels in MongoDB 3.0 • queryPlanner (default level): runs the query planner and chooses the winning plan without actually executing the query – Use case: "Which plan will MongoDB choose to run my query?" • executionStats – runs the query optimizer, then runs the winning plan to completion – Use case: "How is my query performing?" • allPlansExecution – same as executionStats, but returns all the query plans, not just the winning plan. – Use case: "I want as much information as possible to diagnose a slow query."
  • 16. Explain plan – Query Planner Jakes-MacBook-Pro(mongod-3.0.1)[PRIMARY] test> db.example.explain().find({a:1}) // new 3.0 syntax, default level { "ok": 1, "queryPlanner": { "indexFilterSet": false, "namespace": "test.example", "parsedQuery": { "a": { "$eq": 1 } }, "plannerVersion": 1, "rejectedPlans": [ ], "winningPlan": { "inputStage": { "direction": "forward", "indexBounds": { "a": [ "[1.0, 1.0]" ] }, "indexName": "a_1", "isMultiKey": false, "keyPattern": { "a": 1 }, "stage": "IXSCAN" }, "stage": "FETCH" } } […] queryPlanner (default level): runs the query planner and chooses the winning plan without actually executing the query
  • 17. Explain plan – Query Optimizer > db.example.explain("executionStats").find({a:1}) // new 3.0 syntax { "executionStats": { "executionStages": { "advanced": 3, "alreadyHasObj": 0, "docsExamined": 3, "executionTimeMillisEstimate": 0, "inputStage": { "advanced": 3, "direction": "forward", "dupsDropped": 0, "dupsTested": 0, "executionTimeMillisEstimate": 0, "indexBounds": { "a": [ "[1.0, 1.0]" ] }, "indexName": "a_1", "invalidates": 0, "isEOF": 1, "isMultiKey": false, "keyPattern": { "a": 1 }, "keysExamined": 3, "matchTested": 0, "nReturned": 3, "needFetch": 0, "needTime": 0, "restoreState": 0, "saveState": 0, "seenInvalidated": 0, "stage": "IXSCAN", "works": 3 }, "invalidates": 0, "isEOF": 1, "nReturned": 3, "needFetch": 0, "needTime": 0, "restoreState": 0, "saveState": 0, "stage": "FETCH", "works": 4 }, "executionSuccess": true, "executionTimeMillis": 0, "nReturned": 3, "totalDocsExamined": 3, "totalKeysExamined": 3 }, "ok": 1, "queryPlanner": { […] } } executionStats – runs the query optimizer, then runs the winning plan to completion
  • 18. Profiler • 1MB capped collection named system.profile per database, per replica set • One document per operation • Examples: > db.setProfilingLevel(1) // log all operations greater than 100ms > db.setProfilingLevel(1, 20) // log all operations greater than 20ms > db.setProfilingLevel(2) // log all operations regardless of duration > db.setProfilingLevel(0) // turn off profiling > db.getProfilingStatus() // display current profiling level { "slowms": 100, "was": 2 } • In a sharded cluster, you will need to connect to each shard's primary mongod, not mongos
  • 19. mongod Log Files Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms Fields, left to right: date and time, thread, operation, namespace, n… counters, number of yields, lock times, duration
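To make the field layout concrete, here is a minimal sketch of a parser for the log line above. The function `parseSlowQueryLine` and its output field names are hypothetical, and the regular expressions assume the pre-3.0 log format exactly as it appears on the slide; other versions format these lines differently.

```javascript
// Hypothetical parser for the pre-3.0 mongod log line shown on the slide.
function parseSlowQueryLine(line) {
  // timestamp, [thread], operation, namespace
  const head = line.match(/^(.*?) \[(\w+)\] (\w+) (\S+) /);
  if (!head) return null;
  // Read a "name:123" or "name: 123" counter; null if it is absent.
  const counter = (name) => {
    const m = line.match(new RegExp(name + ":\\s*(\\d+)"));
    return m ? Number(m[1]) : null;
  };
  const readLock = line.match(/locks\(micros\).*?\br:(\d+)/);
  const duration = line.match(/(\d+)ms\s*$/);
  return {
    timestamp: head[1],
    thread: head[2],
    operation: head[3],
    namespace: head[4],
    nscanned: counter("nscanned"),
    nreturned: counter("nreturned"),
    numYields: counter("numYields"),
    readLockMicros: readLock ? Number(readLock[1]) : null,
    millis: duration ? Number(duration[1]) : null
  };
}

const slideLine =
  'Sun Jun 29 06:35:37.646 [conn2] query test.docs query: ' +
  '{ parent.company: "22794", parent.employeeId: "83881" } ' +
  'ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 ' +
  'locks(micros) r:2145254 nreturned:0 reslen:20 1156ms';

console.log(parseSlowQueryLine(slideLine));
```

The pair to look at is `nscanned` against `nreturned`: this query scanned 806381 entries to return 0 documents, which is why it took over a second.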
  • 21. mtools • http://github.com/rueckstiess/mtools • log file analysis for poorly performing queries – Show me queries that took more than 1000 ms from 6 am to 6 pm: $ mlogfilter mongodb.log --from 06:00 --to 18:00 --slow 1000 > mongodb-filtered.log
  • 22. mtools graphs % mplotqueries --type histogram --group namespace --bucketSize 3600
  • 23. Command Line tools • iostat • dstat • mongostat • mongotop • mongoperf
  • 24. MMS • Memory usage • Opcounters • Lock percentage • Queues • Background flush average • Replication oplog window and lag
  • 26. Process 1. Measure current performance 2. Find the bottleneck (the hard part) 3. Remove the bottleneck 4. Measure again 5. Repeat as needed
  • 27. What can you change? • Schema design • Access patterns • Indexes • Instance • Hardware
  • 28. Schema Design • MongoDB schemas are designed the opposite way from relational schemas! • Relational Schema: – normalize data – write complex queries to join the data – let the query planner figure out how to make queries efficient • MongoDB Schema: – denormalize the data – create a (potentially complex) schema with prior knowledge of your actual (not just predicted) query patterns – write simple queries
  • 29. Example: Schema Design Product catalog schema for retailer selling in 20 countries { _id: 375, en_US: { name: …, description: …, <etc…> }, en_GB: { name: …, description: …, <etc…> }, fr_FR: { name: …, description: …, <etc…> }, fr_CA: { name: …, description: …, <etc…> }, de_DE: …, de_CH: …, <… and so on for other locales …> }
  • 30. Example: Schema Design • What's good about this schema? –Each document contains all the data about the product across all possible locales. –It is the most efficient way to retrieve all translations of a product in a single query (English, French, German, etc).
  • 31. Example: Schema Design But that's not how the data was accessed > db.catalog.find( { _id: 375 }, { en_US: true } ); > db.catalog.find( { _id: 375 }, { fr_FR: true } ); > db.catalog.find( { _id: 375 }, { de_DE: true } ); … and so forth for other locales The data model did not fit the access pattern.
  • 32. Example: Schema Design Why is this inefficient? Data in RED are being used. Data in BLUE take up memory but are not in demand. { _id: 375, en_US: { name: …, description: …, <etc…> }, en_GB: { name: …, description: …, <etc…> }, fr_FR: { name: …, description: …, <etc…> }, fr_CA: { name: …, description: …, <etc…> }, de_DE: …, de_CH: …, <… and so on for other locales …> } { _id: 42, en_US: { name: …, description: …, <etc…> }, en_GB: { name: …, description: …, <etc…> }, fr_FR: { name: …, description: …, <etc…> }, fr_CA: { name: …, description: …, <etc…> }, de_DE: …, de_CH: …, <… and so on for other locales …> }
  • 33. Example: Schema Design • Consequences of this schema – Each document contained 20x more data than the common use case requires – Disk IO was too high for the relatively modest query load on the dataset – MongoDB lets you request a subset of a document's contents via projection… – … but the entire document must be loaded into RAM to service the request
  • 34. Example: Schema Design • Consequences of the schema redesign – Queries induced minimal memory overhead – 20x as many distinct products fit in RAM at once – Disk IO utilization reduced – Application latency reduced { _id: "375-en_GB", name: …, description: …, <… the rest of the document …> }
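The redesign can be sketched as a small transformation from one all-locales document to one document per locale. The function name `splitByLocale` and the sample field values are made up for illustration; the only part taken from the slides is the `"<product id>-<locale>"` shape of the new `_id`. The sketch assumes every top-level field other than `_id` is a locale.

```javascript
// Hypothetical migration helper: turn one all-locales product document
// into one document per locale, keyed "<productId>-<locale>".
// Assumes every top-level field except _id is a locale subdocument.
function splitByLocale(product) {
  const docs = [];
  for (const [locale, fields] of Object.entries(product)) {
    if (locale === "_id") continue;
    docs.push(Object.assign({ _id: product._id + "-" + locale }, fields));
  }
  return docs;
}

// Sample values are invented; the slides elide the real content.
const product = {
  _id: 375,
  en_US: { name: "Widget", description: "A gray widget" },
  en_GB: { name: "Widget", description: "A grey widget" },
  fr_FR: { name: "Bidule", description: "Un bidule gris" }
};

const perLocale = splitByLocale(product);
console.log(perLocale);
```

With documents shaped like this, the application can fetch one locale with a single `_id` lookup, and only that locale's data has to be brought into memory.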
  • 35. Example: Access Patterns • Application allowed searches for users by first and/or last name
  • 36. Example: Access Patterns • Application allowed searches for users by first and/or last name Tue Jul 1 13:08:29.858 [conn581923] query db.users query: { $query: {$and: [ { $and: [ { firstName: /((?i)\Qbob\E)/ }, { lastName: /((?i)\Qjones\E)/ } ] } ] }, $orderby: { lastName: 1 } } ntoreturn:25 ntoskip:0 nscanned:2626282 scanAndOrder:1 keyUpdates:0 numYields: 299 locks(micros) r:30536738 nreturned:14 reslen:8646 15504ms
  • 37. Example: Access Patterns • Application was searching for unindexed, case-insensitive, unanchored regular expressions • MongoDB is better at indexed, case-sensitive, left-anchored regular expressions • Before: { _id: 1, firstName: "Bob", lastName: "Jones" } • After, with lowercase copies of the name fields: { _id: 1, firstName: "Bob", lastName: "Jones", fn: "bob", ln: "jones" } > db.users.ensureIndex({ln:1, fn:1}) > db.users.ensureIndex({fn:1, ln:1}) > db.users.find({fn:/^bob/}).sort({ln:1})
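A minimal sketch of the application-side change this implies: store lowercase copies of the names and build case-sensitive, left-anchored regular expressions against them. The helper names (`escapeRegex`, `withSearchFields`, `nameQuery`) are hypothetical; only the `fn` and `ln` field names come from the slide.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return a copy of the user document with lowercase search fields added.
function withSearchFields(user) {
  return Object.assign({}, user, {
    fn: user.firstName.toLowerCase(),
    ln: user.lastName.toLowerCase()
  });
}

// Build a query document of case-sensitive prefix matches on fn and/or ln.
// Either argument may be omitted, matching "first and/or last name".
function nameQuery(first, last) {
  const query = {};
  if (first) query.fn = new RegExp("^" + escapeRegex(first.toLowerCase()));
  if (last) query.ln = new RegExp("^" + escapeRegex(last.toLowerCase()));
  return query;
}

console.log(withSearchFields({ _id: 1, firstName: "Bob", lastName: "Jones" }));
console.log(nameQuery("Bob", "Jones"));
```

Because the search input is lowercased before the regular expression is built, the expression itself needs no case-insensitive flag, and the leading `^` keeps it left-anchored so it can use the indexes on `fn` and `ln`.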
  • 38. Example: Indexing • Slow Queries in the logs: Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms • But there's an index???!!!! db.system.indexes.find().toArray() [{ "v" : 1, "key" : { "company" : 1, "employeeId" : 1 }, "ns" : "test.docs", "name" : "company_1_employeeId_1" }]
  • 39. Example: Indexing • Answer: there needs to be an index on the subdocument's fields Sun Jun 29 06:35:37.646 [conn2] query test.docs query: { parent.company: "22794", parent.employeeId: "83881" } ntoreturn:1 ntoskip:0 nscanned:806381 keyUpdates:0 numYields: 5 locks(micros) r:2145254 nreturned:0 reslen:20 1156ms db.system.indexes.find().toArray() [{ "v" : 1, "key" : { "parent.company" : 1, "parent.employeeId" : 1 }, "ns" : "test.docs", "name" :"parent.company_1_parent.employeeId_1" }]
  • 40. Indexing Suggestions • Create indexes that support your queries! • Create highly selective indexes • Don't create unnecessary indexes • Eliminate duplicate indexes with a compound index, if possible > db.collection.ensureIndex({A:1, B:1, C:1}) – allows queries using leftmost prefix • Order compound index fields thusly: equality, sort, then range – see http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ • Create indexes that support covered queries • Prevent collection scans in pre-production environments $ mongod --notablescan > db.getSiblingDB("admin").runCommand( { setParameter: 1, notablescan: 1 } )
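The leftmost-prefix rule can be illustrated with a small check. This is a deliberately simplified sketch: the function `leadingFieldsMatched` is hypothetical, it only compares field names, and it ignores sorts, range predicates, and everything else the real query planner considers.

```javascript
// Hypothetical check: how many leading fields of a compound index are
// constrained by the query? 0 means the query cannot use the index prefix.
// Relies on JavaScript preserving insertion order for these string keys.
function leadingFieldsMatched(keyPattern, queryFields) {
  const indexFields = Object.keys(keyPattern);
  const wanted = new Set(queryFields);
  let matched = 0;
  while (matched < indexFields.length && wanted.has(indexFields[matched])) {
    matched++;
  }
  return matched;
}

const abc = { A: 1, B: 1, C: 1 };
console.log(leadingFieldsMatched(abc, ["A"]));       // query on A uses the prefix
console.log(leadingFieldsMatched(abc, ["A", "B"]));  // query on A and B uses the prefix
console.log(leadingFieldsMatched(abc, ["B", "C"]));  // no leading field, prefix unusable

// The earlier indexing example: the field names must match exactly,
// including the subdocument path.
const query = ["parent.company", "parent.employeeId"];
console.log(leadingFieldsMatched({ company: 1, employeeId: 1 }, query));
console.log(leadingFieldsMatched({ "parent.company": 1, "parent.employeeId": 1 }, query));
```

The last two calls mirror the slow query from the indexing example: an index on `company` and `employeeId` matches none of the query's dotted field names, while the index on `parent.company` and `parent.employeeId` matches both.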
  • 42. Do's and Don’ts • Do: – Read production notes in MongoDB documentation – Eliminate suspects in the right order (schema, indexes, operations, instance, hardware) – Know what is considered "normal" behavior by monitoring • Don't: – confuse symptoms with root causes – shard a poorly performing system