Ger Hartnett & Alan Spencer 
MongoDB Dublin
Overview 
• Fictional story of a startup using MongoDB & the MEAN stack to build an IoT application
• We’ll take a devops perspective - show you what to watch out for in a framework like MEAN
• Tips you can use to help the development team focus on the right things when close to production
• Questions
  • How many from operations?
  • How many from development?
5 Things we Learned 
• Capacity planning/prototyping is a good idea, but performance is sensitive to sample test data
• The MEAN stack rocks - fast to get started - and the profiler can help you understand what’s under the hood
• Realtime/incremental aggregation works well with IoT workloads - the “MMS approach”
• With NodeJS/Express, the number of app servers becomes the bottleneck before MongoDB
• Performance tuning patterns apply - “bottleneck whack-a-mole” & “slam-dunk-optimization”
Context: IoT & MEAN
Internet of Things 
Big Data => Humongous Data 
“The rise of device oriented development … 
new architectural and workflow challenges 
… distinctly different from … web and 
mobile development so far.” - Morten Bagai
Internet of Things 
• Bosch: “IoT brings root and branch changes to the world of business”
• Richard Kreuter's Webinar, May 2013
• Earlier bootcamp looked at sharding IoT
Photo by jurvetson - Creative Commons Attribution License - http://www.flickr.com/photos/jurvetson/916142
MEAN stack 
MongoDB - the database
Express - web app framework/router
Angular - browser HTML/JS MVC
Node - JavaScript application server
Photo by benmizen - Creative Commons ShareAlike License - http://www.flickr.com/photos/benmizen/9456440635
Learn more about MEAN 
Valeri Karpov - MongoDB Kernel Tools Team 
http://thecodebarbarian.wordpress.com/2013/07/22/introduction-to-the-mean-stack-part-one-setting-up-your-tools/
MEAN.io
http://mean.io
About MongoDB Bootcamp 
We invest in technical new hires
Everyone does “bootcamp”
NYC for 2 weeks - product internals
Then work on a longer project for 3-4 weeks
In our case, we wanted to do a bit of everything: capacity planning, iterating on user stories, with MongoDB as one component
The Application
Location based advertising - IoMT 
[Diagram: three advertisers and one customer]
• IoT example 3 from Richard’s Webinar
User Stories - for the application 
US1 - customer looks for advertisers nearby
US2 - advertiser wants to see how many customers saw an offer
US3 - find hot spots where there are many customers but few advertisers
Photo by consumerist - Creative Commons Attribution License - http://www.flickr.com/photos/consumerist/2158190589
Document / Model / Controller 
Haystack examples sent us in the wrong direction initially

Document

    {
      name: 'Long Hall',
      pos: [-6.265535, 53.3418364],
      kind: 'pub'
    }

Model (advertiser.js)

    var AdvertiserSchema = new Schema({
      name: { type: String, default: '' },
      pos: [Number],
      kind: { type: String, default: 'place' }
    });

Controller (advertisers.js)

    exports.all = function (req, res) {
      // near is [longitude, latitude]; maxDistance comes from the dist query parameter
      var findQuery = {
        near: [Number(req.query.lng), Number(req.query.lat)],
        maxDistance: Number(req.query.dist)
      };
      Advertiser.geoSearch({ kind: 'pub' }, findQuery,
        function (err, advertisers) {
          // error handling
          res.jsonp(advertisers);
        });
    };
CRUD interface & Mongoose 
CRUD interface
Raised & fixed a bug in Mongoose; pull request merged
US1 Initial Measurements 
MongoDB shell scripts
9 advertisers, small area, distance 10km
MongoDB has 5 kinds of geo query and 3 kinds of geo index
geoSearch (haystack) looked much better than the others (our 1st mistake)
TIP: performance is sensitive to test data & query
The good thing about frameworks is…
they do lots of things for developers
…and the bad thing about frameworks?
they do lots of things for developers
To find out what’s happening - debug 
We used Express passport-http to add Basic/Digest auth (client id lookup)
It can be hard to figure out what a framework like Express/Mongoose really does
Tip: mongoose.set('debug', true) - detailed logging
Console

    Mongoose: clients.findOne({ _id: ObjectId("…") })
    Mongoose: advertisers.geoHaystack({…[-6.267765, 53.34087]})
Find out what’s happening - profiler 
Tip: The MongoDB profiler shows the operations really happening on the DB; check them with dev

    db.system.profile.find()
    {"op":"query", "ns":"tings.clients",...
    {"op":"command", "command":{"geoSearch"...
    {"op":"update","ns":"tings.sessions"...

Where did that come from?
Fixing it is not obvious

    exports.all = function (req, res) {
      . . .
      req.session = null;
      res.jsonp(advertisers);
    }

10% performance improvement
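One way to spot surprises such as the sessions update is to count profiler entries by operation and namespace. The sketch below is only an illustration: the sample entries are hypothetical (including the `tings.$cmd` namespace for commands), and `summarizeProfile` is a made-up helper, not part of MongoDB or Mongoose.

```javascript
// Hypothetical sample entries, loosely shaped like the truncated profiler
// output shown on the slide. Real entries carry many more fields.
const sampleProfile = [
  { op: 'query', ns: 'tings.clients' },
  { op: 'command', ns: 'tings.$cmd' },
  { op: 'update', ns: 'tings.sessions' },
  { op: 'query', ns: 'tings.clients' },
  { op: 'command', ns: 'tings.$cmd' },
  { op: 'update', ns: 'tings.sessions' },
];

// Count entries per "op ns" pair so unexpected traffic stands out.
function summarizeProfile(entries) {
  const counts = {};
  for (const entry of entries) {
    const key = `${entry.op} ${entry.ns}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const summary = summarizeProfile(sampleProfile);
console.log(summary);
```

With this sample, an update on `tings.sessions` shows up once per request even though the controller never writes to that collection explicitly, which is the kind of pattern worth raising with the developers.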
Back to the application
US2 means we built on US1 
US1 - customer looks for advertisers nearby
• Need to store customer location
US2 - advertiser wants to see how many customers are nearby
Being a startup, we decided to take a naive, pragmatic approach:
• Store all samples
• US2 aggregates on-demand
Photo by consumerist - Creative Commons Attribution License - http://www.flickr.com/photos/consumerist/2158190589
US2 - Aggregation of Raw Samples 
1 hour of raw samples @ 2k RPS = 7.2M documents
Aggregation on 7.2M raw samples took 1 second on our instances
Significant impact
• Run every 2 seconds, RPS dropped by a factor of 4! (single instance)
[Diagram: raw samples are inserted into Samples; each query aggregates over Samples]
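The numbers on this slide can be checked with a little arithmetic. This is a worked example only; the variable names are illustrative, and the busy fraction is a rough indicator rather than a model of the measured 4x drop.

```javascript
const requestsPerSecond = 2000;         // 2k RPS, from the slide
const secondsPerHour = 60 * 60;
const samplesPerHour = requestsPerSecond * secondsPerHour;

const aggregationSeconds = 1;           // measured aggregation time, from the slide
const aggregationIntervalSeconds = 2;   // aggregation is run every 2 seconds
// Fraction of wall-clock time during which an aggregation is running.
const busyFraction = aggregationSeconds / aggregationIntervalSeconds;

console.log(samplesPerHour); // 7200000
console.log(busyFraction);   // 0.5
```

An aggregation that is running half of the time competes with inserts on a single instance, which is consistent with a large drop in RPS.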
US2 - Pre aggregation 
[Diagram: raw samples are inserted into Samples and also applied as pre-aggregate updates; queries aggregate over the pre-aggregated documents]
An MMS type approach
• Document for advertiser-customer-month
• Using update with multi: true (more on this later)
• Query now only needs to aggregate unique customers
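A pre-aggregated document per advertiser-customer-month could be maintained with an upsert that increments a counter on every sample. The sketch below only builds the filter and update documents; every field name (`advertiser`, `customer`, `month`, `samples`, `lastSeen`) and the helper `buildPreAggUpdate` are assumptions, since the slides do not show the schema. It also does not try to reproduce the multi: true update the slide mentions.

```javascript
// Build the filter, update and options for one incoming sample.
// All field names are hypothetical; the real schema is not shown in the slides.
function buildPreAggUpdate(advertiserId, customerId, when) {
  const month = when.toISOString().slice(0, 7); // "YYYY-MM" in UTC
  return {
    filter: { advertiser: advertiserId, customer: customerId, month: month },
    // $inc bumps the counter in place; $set records the latest sighting.
    update: { $inc: { samples: 1 }, $set: { lastSeen: when } },
    // upsert creates the document the first time this key is seen.
    options: { upsert: true },
  };
}

const spec = buildPreAggUpdate(
  'long-hall',
  'customer-42',
  new Date(Date.UTC(2014, 2, 15, 12, 0, 0))
);
console.log(JSON.stringify(spec));
```

In the mongo shell the three parts would be passed to something like `db.collection.update(filter, update, options)`; the query side then only has to count documents per advertiser and month instead of scanning raw samples.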
US1 measurements revisited 
MongoDB shell scripts
More realistic data - the old measurements used repeated locations
110k advertisers with clusters in DUB and NYC
Performance best for near and nearSphere (2x better than Haystack)
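Clustered test data of this kind can be generated with a few lines of script. The following is a sketch under stated assumptions: the city centres are approximate, the spread of 0.1 degrees is arbitrary, and `generateAdvertisers` is a hypothetical helper, not the script used for the measurements.

```javascript
// Small deterministic PRNG (mulberry32-style) so test runs are repeatable.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Approximate city centres as [lng, lat]; the spread is an assumption.
const CLUSTERS = [
  { name: 'DUB', centre: [-6.26, 53.35] },
  { name: 'NYC', centre: [-74.006, 40.7128] },
];
const SPREAD_DEG = 0.1;

// Generate advertiser documents with distinct positions around each cluster.
function generateAdvertisers(count, seed) {
  const rand = mulberry32(seed);
  const advertisers = [];
  for (let i = 0; i < count; i++) {
    const cluster = CLUSTERS[i % CLUSTERS.length];
    advertisers.push({
      name: `advertiser-${i}`,
      pos: [
        cluster.centre[0] + (rand() - 0.5) * 2 * SPREAD_DEG,
        cluster.centre[1] + (rand() - 0.5) * 2 * SPREAD_DEG,
      ],
      kind: 'pub',
    });
  }
  return advertisers;
}

const advertisers = generateAdvertisers(1000, 1);
const distinctPositions = new Set(advertisers.map((a) => a.pos.join(','))).size;
console.log(advertisers.length, distinctPositions);
```

Checking the number of distinct positions is a quick guard against the repeated-location problem that skewed the first measurements.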
Where does the time go? 
• Express/Mongoose/Node 
• Customer Lookup 
• Find ($near) 
• Save Sample DB 
• Save Sample File 
• Preagg=multiple docs (6) 
• Preagg=multi-update 1 doc
Deployment
[Diagram: Chrome:Postman, NodeLoad, HAproxy, four NodeJS servers, two MongoD instances]
Scaling 
1 - number of Node.JS 
2 - HAproxy 
3 - load gen threads/BW
Pattern: “slam dunk optimization”
[Diagram: the deployment (Chrome:Postman, NodeLoad, HAproxy, NodeJS servers, MongoD instances) with the three bottlenecks marked 1, 2 and 3]
Performance tips 
1. Increase number of Node.JS servers
2. Increase perf of proxy/balancer instance
   • HAproxy more balanced than Amazon ELB
3. Tweak Nodeload (generates/measures REST)
   • Nodeload concurrency 3x Node servers
   • Run Nodeload on same machine as HAproxy
Development recommendation: Postman Chrome extension - generates REST / Basic Auth
Back to the application
US3 Overview 
What are the top 10 hot sales areas? 
• What is an “area”…? 
Requirements 
• Little impact, easy to calculate 
• Approx. regular size
• Optimal approx. distance - “bounding areas”
• Plays nice with sharding
Internals of haystack, 2dsphere? Polygon? MGRS?
US3 - Hot box - Sales, go sell! 
MGRS - Military Grid Reference System
• 4QFJ123678 precision level 100m
Image by Mikael Rittri - Creative Commons ShareAlike License - http://en.wikipedia.org/wiki/File:MGRSgridHawaiiSchemeAARealigned.png
MGRS - But at the poles… 
Image by Mikael Rittri - Creative Commons ShareAlike License - http://en.wikipedia.org/wiki/File:MGRSgridNorthPole.png
Introducing the ‘box’
The “box” - the poor-man’s MGRS 
• Reinvented the sphere
• Long/lat -> box number
• Tailored to specific distance
• Boxes are at least 1km
• Search in current and 8 neighbouring boxes
• Filter outside circle in JS
• Performed relatively well
• Can be used to shard
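The box idea can be sketched as a simple degree grid. This is not the implementation from the talk: the 0.02 degree step, the row-major numbering and all function names are assumptions chosen for illustration. With this step a box is about 2.2 km tall and stays at least 1 km wide up to roughly 60 degrees of latitude, so a 1 km search only needs the current box and its 8 neighbours.

```javascript
const EARTH_RADIUS_KM = 6371;
const STEP_DEG = 0.02; // hypothetical box size in degrees
const COLS = Math.round(360 / STEP_DEG);
const ROWS = Math.round(180 / STEP_DEG);

// Map a position to grid row and column.
function rowCol(lng, lat) {
  return {
    row: Math.min(ROWS - 1, Math.floor((lat + 90) / STEP_DEG)),
    col: Math.floor((lng + 180) / STEP_DEG) % COLS,
  };
}

// Long/lat -> box number (row-major numbering).
function boxOf(lng, lat) {
  const { row, col } = rowCol(lng, lat);
  return row * COLS + col;
}

// Box numbers of the current box and its neighbours.
function neighbourBoxes(lng, lat) {
  const { row, col } = rowCol(lng, lat);
  const boxes = [];
  for (let dr = -1; dr <= 1; dr++) {
    for (let dc = -1; dc <= 1; dc++) {
      const r = row + dr;
      if (r < 0 || r >= ROWS) continue;   // no rows beyond the poles
      const c = (col + dc + COLS) % COLS; // wrap around the antimeridian
      boxes.push(r * COLS + c);
    }
  }
  return boxes;
}

// Great-circle distance between two [lng, lat] points (haversine formula).
function distanceKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b[1] - a[1]);
  const dLng = toRad(b[0] - a[0]);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a[1])) * Math.cos(toRad(b[1])) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Usage: select candidates by box, then filter outside the circle in JS.
const customer = [-6.267765, 53.34087];
const candidates = [
  { name: 'Long Hall', pos: [-6.265535, 53.3418364] },
  { name: 'far away', pos: [-74.006, 40.7128] },
];
const boxes = neighbourBoxes(customer[0], customer[1]);
const inBoxes = candidates.filter((c) => boxes.includes(boxOf(c.pos[0], c.pos[1])));
const nearby = inBoxes.filter((c) => distanceKm(c.pos, customer) <= 1);
console.log(nearby.map((c) => c.name));
```

Storing the box number on each document would turn the first step into an equality or `$in` match on an indexed field, and the same number is a candidate shard key, which is presumably what "can be used to shard" refers to.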
Replication 
Impact of Replication 
Secondary reads
Worked for this app
Beware - don’t try this at home!
Apply the production notes 
• Change from default readahead
• Disable NUMA & THP
• ext4 or XFS
• noatime
Load test workload on different configurations
• Instance Store / EBS (PIOPS)
• SSDs / spinning rust
• AWS instance types
Recap
5 Things we Learned 
• Capacity planning/prototyping is a good idea, but performance is sensitive to sample test data
• The MEAN stack rocks - fast to get started - but the profiler can help you understand what’s under the hood
• Realtime/incremental aggregation works well with IoT workloads - the “MMS approach”
• Performance tuning patterns apply - “bottleneck whack-a-mole” & “slam-dunk-optimization”
• With NodeJS/Express, the number of app servers becomes the bottleneck before MongoDB
Next Steps
Next Steps 
Plan to publish as a blog post series and GitHub project
Check blog.mongodb.org
Continue to explore…
Next Steps - continuation 
Hadoop/YARN for aggregations 
Use “box” to geo-shard 
Try 2.6 bulk updates 
Dynamic angular-google-maps with socket-io 
Implement in another framework (Go/Clojure) to 
load MongoDB with less hardware 
Find balance between batch and pre-aggregation 
(see next slide)
Learn More & Thank You 
Introduction to MEAN - Valeri Karpov
http://thecodebarbarian.wordpress.com/2013/07/22/introduction-to-the-mean-stack-part-one-setting-up-your-tools/
MEAN.io
http://mean.io
Richard Kreuter's webinar - M2M
http://www.mongodb.com/presentations/webinar-realizing-promise-machine-machine-m2m-mongodb
Building MongoDB Into Your Internet of Things
http://blog.mongohq.com/building-mongodb-into-your-internet-of-things-a-tutorial/
Schema design for time series data (MMS)
http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb
MongoDB and the MEAN Stack
