The Importance of
Indexes in MongoDB
How we increased the loading speed of Profile Gliphs and Insights


James Toyer, Lead Software Engineer at Glipho
What is Glipho?

• Social network for text-based content
• Aims to better engage writers and readers
• Original content only
• Not an aggregator
• Automatically share to Facebook, LinkedIn and Twitter
Insights Page

• Load up the gliphs a writer has
• Iterate through, and sum, actions for each gliph
• Load and sum actions for the writer’s profile
• This can be over 100 calls to the database
    ◦ We know it’s inefficient but it does the job for now
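The access pattern above can be sketched roughly as follows. This is a minimal illustration, not Glipho's code: `CountingStore`, `list_all` and the sample identifiers are hypothetical, and the in-memory list stands in for the database so that the number of round trips can be counted.

```python
from collections import Counter


class CountingStore:
    """Hypothetical in-memory stand-in for the actions collection."""

    def __init__(self, actions):
        self._actions = actions  # list of (entity_id, action) tuples
        self.calls = 0

    def list_all(self, entity_id):
        # Each invocation stands in for one database round trip.
        self.calls += 1
        return [action for eid, action in self._actions if eid == entity_id]


def insights(store, profile_id, gliph_ids):
    """Sum actions per gliph, then for the profile: one call per entity."""
    totals = {gid: Counter(store.list_all(gid)) for gid in gliph_ids}
    totals[profile_id] = Counter(store.list_all(profile_id))
    return totals


# Assumed sample data: a writer with 100 gliphs, one view each.
gliph_ids = [f"gliph-{i}" for i in range(100)]
actions = [(gid, "view") for gid in gliph_ids] + [("profile-1", "view")] * 3
store = CountingStore(actions)
totals = insights(store, "profile-1", gliph_ids)
print(store.calls)  # 101
```

With 100 gliphs this makes 101 calls, so any per-call cost is multiplied by roughly the number of gliphs on the page.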
Insights Document Structure

• Timestamp – when the action took place
• EntityId – identifier of the original entity
• ActionType – the type of the entity (probably should be called entity type)
• Action – the actual action that took place
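As a rough sketch, the four fields could be modelled like this. The field names come from the slide; the types and the sample values ("gliph-42", "gliph", "view") are assumptions made for illustration only.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class InsightAction:
    Timestamp: datetime  # when the action took place
    EntityId: str        # identifier of the original entity (type assumed)
    ActionType: str      # the type of the entity, e.g. a gliph or a profile
    Action: str          # the actual action that took place


# Hypothetical sample document.
doc = InsightAction(
    Timestamp=datetime(2013, 1, 1, 12, 0, tzinfo=timezone.utc),
    EntityId="gliph-42",
    ActionType="gliph",
    Action="view",
)
print(sorted(asdict(doc)))
```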
Helpful Error Page
…it’s all gone wrong!
Troubleshooting

• CPU spiking? NO
• Memory high? NO
• Disk IO high? NO
• Are there any actual regular hits happening? NO
• Do you know anything? NO

Crack out the code performance tools…
Pre-Index performance

• 3 passes on each filter page
• Average time for each page to load: 3.9 seconds
• “ListAll” method calls the database
• “ListAll” is called once for each gliph in the database and once for the profile (in this case ~10 times)
• Average time in “ListAll”: 256ms
More Troubleshooting

• Is the code doing obviously stupid things? NO
• Has Linq screwed you over again? NO
• Do you trust the driver? PROBABLY
• Check the database
    ◦ ~400,000 documents (now ~690,000)
    ◦ No indexes
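Why "no indexes" hurts can be sketched in plain Python. This is a simplification under stated assumptions: a dict stands in for the index (MongoDB uses B-tree indexes, not hash maps), the collection is scaled down to 40,000 documents, and the sample data is invented.

```python
from collections import defaultdict


def scan(documents, entity_id):
    """No index: every document is examined to find the matches."""
    examined = 0
    matches = []
    for doc in documents:
        examined += 1
        if doc["EntityId"] == entity_id:
            matches.append(doc)
    return matches, examined


def build_index(documents, field):
    """Map each field value to the positions of the documents holding it."""
    index = defaultdict(list)
    for position, doc in enumerate(documents):
        index[doc[field]].append(position)
    return index


def lookup(documents, index, entity_id):
    """With an index: only the matching documents are touched."""
    positions = index.get(entity_id, [])
    return [documents[p] for p in positions], len(positions)


# Assumed data: 40,000 actions spread evenly over 1,000 entities.
documents = [
    {"EntityId": f"gliph-{i % 1000}", "Action": "view"} for i in range(40_000)
]
index = build_index(documents, "EntityId")
scan_matches, scan_examined = scan(documents, "gliph-42")
index_matches, index_examined = lookup(documents, index, "gliph-42")
print(scan_examined, index_examined)  # 40000 40
```

Both paths return the same documents; the difference is how many had to be examined to find them.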
Know your query

[Screenshots not preserved: GetMongoQuery code; Output]
Index analysis

                            Without action field       With action field
Query structure             [screenshot not preserved] [screenshot not preserved]
Query time before index     334ms                      409ms
Index                       [screenshot not preserved] [screenshot not preserved]
Query time after index      <1ms                       <1ms
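The query and index screenshots for this slide were not preserved, so the following is only a hypothetical reconstruction of the idea. It assumes the two queries filter on the Insights fields (with and without `Action`) plus a `Timestamp` range, and it derives a compound index key order by placing equality fields before range fields, a common guideline for compound indexes. `suggest_index_keys` and all filter values are invented for illustration.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING
RANGE_OPERATORS = ("$gt", "$gte", "$lt", "$lte")


def suggest_index_keys(query_filter):
    """Order index keys: equality-matched fields first, range fields last."""
    equality, ranged = [], []
    for field, condition in query_filter.items():
        is_range = isinstance(condition, dict) and any(
            op in condition for op in RANGE_OPERATORS
        )
        (ranged if is_range else equality).append((field, ASCENDING))
    return equality + ranged


# Hypothetical filters; the real ones came from GetMongoQuery.
without_action = {
    "EntityId": "gliph-42",
    "ActionType": "gliph",
    "Timestamp": {"$gte": "2013-01-01T00:00:00Z"},
}
with_action = {**without_action, "Action": "view"}

print(suggest_index_keys(without_action))
print(suggest_index_keys(with_action))
```

With PyMongo, a key list in this shape can be passed to `collection.create_index(keys)`, and `collection.find(query_filter).explain()` shows whether the planner actually uses the index. Timing the query before and after, as on the slide, is the real test.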
Post-Index performance

                                    Pre-index    Post-index
Page load time                      3.9s         72ms
“ListAll” method execution time     256ms        <0.2ms

• Page load time decreased by 98%
• “ListAll” method execution time decreased by 99%
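The percentages can be checked with a few lines of arithmetic, taking 0.2ms as the post-index “ListAll” figure:

```python
def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


page_load = percent_decrease(3900, 72)  # milliseconds
list_all = percent_decrease(256, 0.2)   # milliseconds
print(f"{page_load:.1f}% {list_all:.1f}%")  # 98.2% 99.9%
```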
Gliph listings for Writers

• Problems:
    ◦ Slow loading
    ◦ Sometimes erroring out
• Reasons:
    ◦ Indexes were no longer accurate
    ◦ Code had changed
• Solution:
    ◦ New indexes
    ◦ Remove old indexes
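One way to spot indexes that no longer match the code is to compare the fields each query filters on with the leading keys of each index. The sketch below is a simplified, hypothetical check: the index names, field names and queries are invented, and it only considers equality filters. The real query planner also accounts for sorts and ranges, so `explain()` output remains the authority.

```python
def supported_prefix(index_keys, equality_fields):
    """Count how many leading index keys the query's equality fields cover."""
    covered = 0
    for key in index_keys:
        if key not in equality_fields:
            break
        covered += 1
    return covered


def review_indexes(indexes, queries):
    """Return (indexes no query uses, queries no index supports)."""
    used = set()
    unsupported = []
    for query_name, fields in queries.items():
        best = max(
            indexes,
            key=lambda name: supported_prefix(indexes[name], fields),
            default=None,
        )
        if best is None or supported_prefix(indexes[best], fields) == 0:
            unsupported.append(query_name)
        else:
            used.add(best)
    return sorted(set(indexes) - used), unsupported


# Hypothetical state after a code change: one stale index, one new query.
indexes = {"by_title": ["Title"], "by_writer": ["WriterId"]}
queries = {
    "writer_listing": {"WriterId", "Published"},
    "drafts": {"Published"},
}
print(review_indexes(indexes, queries))  # (['by_title'], ['drafts'])
```

Here `by_title` is a candidate for removal and `drafts` is a candidate for a new index.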
What did I learn?

• Know exactly what queries are being run
• Don’t do a “best guess” on an index. Test them out
• Don’t “forget” to add indexes
• Ensure your indexes evolve as your queries do
Any Questions?

       james@glipho.com
       glipho.com/james
            @jamestoyer

Editor's Notes

  • #2: Who are you? What are you talking about? Mention how it got recognised. This is a case study… kinda.
  • #3: “Think of it like Twitter for blogs.” You can bring your existing content with you for no cost.
  • #4: Writers are vain and lazy. Time filters. Up to 100 gliphs.
  • #5: Anonymous. 4 important fields for this.
  • #6: Insights page appeared to be taking an age to load. Could be a temporary blip, something that is just being a bit slow. Then a bunch of timeout errors from the page effectively not completing the map-reduce job. Coincidentally, the gliph listing page for a writer started loading really slowly.
  • #7: Use New Relic.
  • #8: Not original figures; ran yesterday. These are averages over 3 x 3 = 9 passes.
  • #9: …obviously is not a healthy combination. My PC = solid state. Production on AWS, even with 8 drives in RAID 10 (as recommended by the MongoDB documentation). MASSIVE FAIL!!!!
  • #10: Can’t just add indexes… don’t know what the queries are. We use Linq. Not as smart as you hope. Use “GetMongoQuery”.
  • #11: These are through the shell.
  • #13: As promised, listings for writers. This was AJAX so less pronounced. Used “GetMongoQuery” again.
  • #14: I know good developers who guess. Reasons for forgetting: prototype to production; do them later; forget from restore.