Bumps and Breezes
Our Journey From RDBMS To MongoDB
Presenters: Ajit Oke and Harsha Undapalli
Organization:
Agenda
Background of Pre-MongoDB (RDBMS) Environment
Why MongoDB?
Evolution of MongoDB Environment
Database – Bumps and Breezes
MongoDB Application Layer – Bumps and Breezes
Benchmarking Results
Next Steps and Help Needed
Our Journey From RDBMS to MongoDB
Pre-MongoDB Environment
• To ensure quality, every Intel product is thoroughly tested, both electrically and functionally, before it reaches the customer.
• Currently, this multi-terabyte test data is managed by an RDBMS-based Decision Support System (DSS).
• Intel CPU products typically have a die attached to a substrate using thousands of tiny balls. When the DSS team received a request to store ball-level test data in the RDBMS, the team started facing performance and storage challenges.
• After a literature review and a proof-of-concept (POC) analysis, the team decided to use MongoDB for ball-level test data.
[Figure: die attached to substrate]
Why MongoDB?
• Query Performance: faster performance with multiple concurrent users
• Scalability: horizontal scalability using sharding
• Open Source: cost-effective solution with community support
• Driver Support: drivers available for C#, Python, etc.
• Flexible Schema: supports dynamic addition of fields in a collection
• Easy Learning: user-friendly documentation and training courses
Evolution of MongoDB Environment
Over the last year and a half, our MongoDB environment has continued to evolve with business needs.
This journey was not easy. We had a few bumps and many breezes along the way.
Breezes - Database
Breezes
Small Install Footprint
• MongoDB binaries have a small footprint.
• The lean installation package makes MongoDB easy to install even on virtual servers, desktops, or laptops.
• Compared with our current RDBMS platform, which takes several gigabytes of storage space, MongoDB is very lightweight in terms of storage needs.
Breezes
Ease of creating Mongo databases and database objects
• Compared to an RDBMS, creating databases and database objects is very easy in MongoDB.
• Index creation is intuitive and similar to an RDBMS.
• The configuration file is in YAML format, organized into sections.
• The concepts are similar to an RDBMS, but without the RDBMS overhead.
Breezes
Data Compression
• Collection stats: db.getCollection('<collection-name>').stats()
• Example: we are getting almost 3x compression through WiredTiger, which will be significant on large-volume databases.
• WiredTiger data compression is built into MongoDB.
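As a rough illustration of how a compression figure like the one above can be derived, the sketch below divides the `size` field (uncompressed data size in bytes) by the `storageSize` field (space allocated on disk) from a collection's stats() output. The sample numbers are hypothetical and are not taken from the presenters' environment.

```python
def compression_ratio(stats: dict) -> float:
    """Return uncompressed data size divided by on-disk storage size.

    `stats` is expected to look like the document returned by
    db.getCollection('<collection-name>').stats().
    """
    size = stats["size"]                 # uncompressed data size, in bytes
    storage_size = stats["storageSize"]  # space allocated on disk, in bytes
    if storage_size <= 0:
        raise ValueError("storageSize must be positive")
    return size / storage_size


# Hypothetical stats values, for illustration only.
sample_stats = {"size": 3_000_000_000, "storageSize": 1_000_000_000}
print(f"compression ratio: {compression_ratio(sample_stats):.1f}x")
```

Note that `storageSize` does not include index storage, so this ratio describes the collection data only.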
Breezes
TTL index
• Eliminates the need for a separate data purge utility: https://docs.mongodb.com/manual/core/index-ttl/
• Command to change the expiration value: https://docs.mongodb.com/manual/reference/command/collMod/#dbcmd.collMod
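To make the expiration change concrete, here is a minimal sketch that builds a collMod command document for an existing TTL index. The collection name `ball_test_data` and the field name `created_at` are hypothetical placeholders, not names from the presentation.

```python
def ttl_collmod_command(collection: str, ttl_field: str,
                        expire_after_seconds: int) -> dict:
    """Build a collMod command that changes a TTL index's expiration.

    The TTL index is identified by its key pattern, assumed here to be a
    single ascending field.
    """
    if expire_after_seconds < 0:
        raise ValueError("expire_after_seconds must be non-negative")
    return {
        "collMod": collection,  # the command name must be the first key
        "index": {
            "keyPattern": {ttl_field: 1},
            "expireAfterSeconds": expire_after_seconds,
        },
    }


# Hypothetical example: keep documents for 90 days.
command = ttl_collmod_command("ball_test_data", "created_at", 90 * 24 * 3600)
print(command)
```

With a driver, a document like this would typically be sent through the driver's generic "run command" call (for example `db.command(command)` in PyMongo).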
Breezes
Sharding capability
• As a database grows in size and usage, MongoDB sharding provides an excellent mechanism for horizontal scaling on commodity hardware.
• In an RDBMS, vertical scalability is achieved by upgrading server hardware, and horizontal scalability is offered through shared-disk storage systems. Both options are very expensive compared to MongoDB sharding.
• Creating a sharded environment was a formidable task initially, but after taking a few MongoDB University online courses and learning from the experiences of the MongoDB user community, we were able to create a foolproof process for shard creation.
• Choosing a hashed shard key on a high-cardinality field gave us excellent balancing, both while loading data and in how the data is stored.
Scalability Technology Shift over the years
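The balancing effect of a hashed key on a high-cardinality field can be illustrated with a small simulation. This is a simplified sketch, not MongoDB's actual mechanism: MongoDB assigns ranges of hashed values to chunks, whereas the code below simply takes a hash modulo the number of shards. The key format `unit-000123` is hypothetical.

```python
import hashlib
from collections import Counter
from typing import Iterable, List


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a key to a shard via a 64-bit slice of its MD5 digest (illustrative)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:8], "big")
    return hashed % num_shards


def distribution(keys: Iterable[str], num_shards: int) -> List[int]:
    """Count how many keys land on each shard."""
    counts = Counter(shard_for_key(k, num_shards) for k in keys)
    return [counts.get(s, 0) for s in range(num_shards)]


# Sequential, high-cardinality keys still spread evenly once hashed.
keys = [f"unit-{i:06d}" for i in range(20_000)]
print(distribution(keys, 4))
```

Because the keys are hashed before placement, monotonically increasing identifiers do not pile up on a single shard during loading.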
Bumps and Key Learnings - Database
Bumps and Key Learnings
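Usage of some config settings
• bindIpAll parameter (net.bindIpAll): binding behavior changed in MongoDB 3.6, where mongod binds to localhost by default
• keyFile parameter (security.keyFile) for authentication
• cacheSize parameter (storage.wiredTiger.engineConfig.cacheSizeGB) for performance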
Bumps and Key Learnings
Replica Set/Sharding with authentication and key file creation
• Creating a key file was a challenge at the time because we could not find clear documentation. Accounts from other users referred to third-party tools for creating a hexadecimal key, etc.
• In the end, we realized that a key file can easily be created with a text editor. Since our servers are well secured behind a firewall and only authorized users can access them, we are OK with the key file containing readable text.
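As an alternative to typing readable text by hand, a key file can also be filled with random characters. The sketch below assumes the constraint described in the MongoDB documentation (6 to 1024 characters from the base64 set) and mirrors the commonly documented approach of base64-encoding 756 random bytes. The function names and the output path are hypothetical.

```python
import base64
import os
import secrets
import stat


def generate_keyfile_text(num_random_bytes: int = 756) -> str:
    """Return random base64 text; 756 bytes encode to 1008 characters."""
    return base64.b64encode(secrets.token_bytes(num_random_bytes)).decode("ascii")


def write_keyfile(path: str, text: str) -> None:
    """Write the key text and restrict the file to owner read-only."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(text + "\n")
    # On Unix-like systems mongod rejects key files readable by group/others.
    os.chmod(path, stat.S_IRUSR)


key_text = generate_keyfile_text()
print(len(key_text))
```

Every member of the replica set or sharded cluster needs a copy of the same file, referenced by the keyFile setting.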
MongoDB Application Layer
Bumps and Breezes
Software Stack in our MongoDB Application
• Software components are developed on the .NET platform in C#, using the MongoDB driver:
  • CSV Importer
  • Custom File Loader
  • Standard Library
• The Standard Library provides reusable functionality: connecting to the Mongo client and Insert/Update/Delete/Find operations on documents in a Mongo collection.
• Loaders are highly configurable in terms of connection options, database, collection, user credentials, and the number of parallel instances to run.
[Figure: Standard Library, Custom File Loader, CSV Importer]
Breezes – Application
Breezes
MongoDB C# driver capabilities: https://docs.mongodb.com/ecosystem/drivers/csharp/
• Easy interface for CRUD operations
• Builders: Filters, Sort, Limit, Skip
• Aggregation framework
• BulkWrite, FindOneAndUpdate
• Mongo client connection options
Breezes
Flexible schema structure: documents can be created dynamically, with different fields, in the same collection.
Breezes
Support for parallel loading
Because MongoDB supports document-level concurrency, we were able to insert several thousand documents simultaneously into the same collection, using parallel loaders with a configurable number of processes.
Load Performance and Load Rate
Number of loader instances | Load time (seconds) | Load rate (GB/hour)
1 | 85 | 12.3
4 | 18 | 52.4
8 | 10.5 | 70
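Taking the load times reported on this slide (85, 18, and 10.5 seconds for 1, 4, and 8 loader instances), the short sketch below works out the speedup and the per-instance efficiency. Only the three timings come from the slide; the function names are illustrative.

```python
from typing import Dict

# Load time in seconds, keyed by number of loader instances (from the slide).
load_times: Dict[int, float] = {1: 85.0, 4: 18.0, 8: 10.5}


def speedups(times: Dict[int, float]) -> Dict[int, float]:
    """Speedup of each configuration relative to a single loader instance."""
    baseline = times[1]
    return {n: baseline / t for n, t in times.items()}


def efficiencies(times: Dict[int, float]) -> Dict[int, float]:
    """Speedup divided by instance count (1.0 means perfectly linear scaling)."""
    return {n: s / n for n, s in speedups(times).items()}


for instances, speedup in speedups(load_times).items():
    eff = efficiencies(load_times)[instances]
    print(f"{instances} instance(s): speedup {speedup:.2f}x, efficiency {eff:.2f}")
```

On these figures the speedup is roughly 4.7x with 4 instances and 8.1x with 8, which is close to linear scaling over the measured range.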
Breezes
• MongoDB Documentation: https://docs.mongodb.com/
• MongoDB University: https://university.mongodb.com/courses
Bumps and Key Learnings – Application
Bumps and Key Learnings
Mongo Import options
• Capture the success/error message from the import command screen.
• Define the fields and their data types in each document (header line vs. no header line).
• Connect to a replica set.
Import with a header line (--headerline):
mongoimport --host svr1:27017 --db MDB_CSV --collection MC_CSV --username readWriteUser --password readwrite$ --authenticationDatabase MDB_CSV --type CSV --columnsHaveTypes --headerline --file E:RawDataLoadCSVtest_0034_WW13.CSV --numInsertionWorkers 4 --ignoreBlanks --parseGrace stop
Import without a header line, using a field file (--fieldFile):
mongoimport --host localhost:27017 --db MDB_CSV --collection MC_CSV --username readWriteUser --password readwrite$ --authenticationDatabase MDB_CSV --type CSV --columnsHaveTypes --fieldFile D:MongoFieldFilesfields.txt --file E:RawDataLoadCSVtest_0034_WW14.CSV --numInsertionWorkers 4 --ignoreBlanks --parseGrace stop
Import into a replica set:
mongoimport --host RS_NAME/svr1:27017,svr2:27017,svr3:27017 --db MDB_CSV --collection MC_CSV --username readWriteUser --password readwrite$ --authenticationDatabase MDB_CSV --type CSV --columnsHaveTypes --fieldFile D:MongoFieldFilesfields.txt --file E:RawDataLoadCSVtest_0034_WW15.CSV --numInsertionWorkers 4 --ignoreBlanks --parseGrace stop
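One way to capture the success or error message from mongoimport is to launch it from a wrapper and collect its output. The sketch below is an assumption about how such a wrapper might look, not the presenters' loader: it builds the argument list from the same flags used in the commands above and returns the exit code with the combined output. The function names are hypothetical.

```python
import subprocess
from typing import List, Optional, Tuple


def build_mongoimport_args(host: str, db: str, collection: str,
                           username: str, password: str, auth_db: str,
                           csv_file: str, field_file: Optional[str] = None,
                           workers: int = 4) -> List[str]:
    """Assemble a mongoimport argument list for a typed CSV import."""
    args = [
        "mongoimport", "--host", host, "--db", db, "--collection", collection,
        "--username", username, "--password", password,
        "--authenticationDatabase", auth_db,
        "--type", "csv", "--columnsHaveTypes",
    ]
    if field_file is None:
        args.append("--headerline")           # field names/types in first row
    else:
        args += ["--fieldFile", field_file]   # field names/types in a file
    args += ["--file", csv_file, "--numInsertionWorkers", str(workers),
             "--ignoreBlanks", "--parseGrace", "stop"]
    return args


def run_import(args: List[str]) -> Tuple[int, str]:
    """Run mongoimport and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr
```

Passing the arguments as a list avoids shell quoting problems, for instance with the spaces after commas in a replica set host list.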
Bumps and Key Learnings
InsertMany vs BulkWrite
InsertMany:
• No easy way to capture the insertion count.
• Internally uses BulkWrite.
• Cannot handle multiple types of operations (Insert/Update/Delete) at once.
BulkWrite:
• Accepts mixed Insert/Update/Delete operations in one call and returns a result object with operation counts.
Bumps and Key Learnings
Various Ways of Replica Set Connection
• Replica set connection through the C# driver
• Programmatically build the connection string:
mongodb://<username>:<password>@<host1:port1>,..,<hostN:portN>/<authDB>?<connectOptions>
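Building the replica set connection string programmatically mostly comes down to joining the host list and percent-encoding the credentials. The sketch below follows the mongodb:// template shown on this slide; the function name is hypothetical, and the sample values reuse the placeholder names from the mongoimport commands earlier in the deck.

```python
from typing import Dict, Optional, Sequence, Tuple
from urllib.parse import quote_plus, urlencode


def build_replica_set_uri(username: str, password: str,
                          hosts: Sequence[Tuple[str, int]], auth_db: str,
                          options: Optional[Dict[str, str]] = None) -> str:
    """Build a mongodb:// URI with percent-encoded credentials."""
    if not hosts:
        raise ValueError("at least one host is required")
    # Reserved characters such as '$', '@' or ':' must be percent-encoded.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    host_list = ",".join(f"{host}:{port}" for host, port in hosts)
    uri = f"mongodb://{credentials}@{host_list}/{quote_plus(auth_db)}"
    if options:
        uri += "?" + urlencode(options)
    return uri


uri = build_replica_set_uri(
    "readWriteUser", "readwrite$",
    [("svr1", 27017), ("svr2", 27017), ("svr3", 27017)],
    "MDB_CSV", {"replicaSet": "RS_NAME"},
)
print(uri)
```

Listing several members lets the driver discover the replica set even when one of the listed hosts is unavailable.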
Data Access Patterns
• Aggregation queries that roll up the ball data: $group, $project, $match
• 3D scatter plots of ball X and Y positions along with ball data values, to identify certain error patterns
• Clustering or commonality analysis of ball data
[Figure: Cluster Distribution. X-axis: cluster ID (1 to 10); y-axis: total units (0 to 30)]
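A roll-up of ball data with $match, $group, and $project might look like the following sketch. The field names (`lot_id`, `unit_id`, `value`) are hypothetical, since the deck does not show the document schema. A plain-Python equivalent is included only to show what the pipeline computes.

```python
from collections import defaultdict
from typing import Dict, List


def build_rollup_pipeline(lot_id: str) -> List[dict]:
    """Aggregation pipeline: per-unit ball count and average value for one lot."""
    return [
        {"$match": {"lot_id": lot_id}},
        {"$group": {
            "_id": "$unit_id",
            "ball_count": {"$sum": 1},
            "avg_value": {"$avg": "$value"},
        }},
        {"$project": {"_id": 0, "unit_id": "$_id",
                      "ball_count": 1, "avg_value": 1}},
    ]


def roll_up_in_memory(docs: List[dict], lot_id: str) -> List[Dict]:
    """Pure-Python equivalent of the pipeline, for illustration and testing."""
    values_by_unit = defaultdict(list)
    for doc in docs:
        if doc["lot_id"] == lot_id:                            # $match
            values_by_unit[doc["unit_id"]].append(doc["value"])  # $group
    return [
        {"unit_id": unit, "ball_count": len(vals),
         "avg_value": sum(vals) / len(vals)}
        for unit, vals in values_by_unit.items()
    ]


# Hypothetical ball-level documents.
sample_docs = [
    {"lot_id": "L1", "unit_id": "U1", "ball_id": 1, "value": 2.0},
    {"lot_id": "L1", "unit_id": "U1", "ball_id": 2, "value": 4.0},
    {"lot_id": "L1", "unit_id": "U2", "ball_id": 1, "value": 5.0},
    {"lot_id": "L2", "unit_id": "U3", "ball_id": 1, "value": 9.0},
]
print(roll_up_in_memory(sample_docs, "L1"))
```

Placing $match first keeps the $group stage working only on the documents of interest, which is also where an index on the matched field can help.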
Benchmarking Results
Performance benchmarks: Multi-User Scalability
Performance on RDBMS vs MongoDB (time in seconds)
Number of concurrent users | RDBMS | MongoDB
2 | 400 | 130
5 | 410 | 131
10 | 450 | 135
20 | 1100 | 150
40 | 1900 | 190
MongoDB handles up to 40 concurrent users within acceptable query performance time.
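To put the benchmark in relative terms, the sketch below computes how many times faster MongoDB was than the RDBMS at each concurrency level. It uses the timings from this slide (RDBMS: 400, 410, 450, 1100, 1900 seconds; MongoDB: 130, 131, 135, 150, 190 seconds, for 2, 5, 10, 20, and 40 concurrent users). The variable and function names are illustrative.

```python
from typing import Dict, List

users: List[int] = [2, 5, 10, 20, 40]
rdbms_seconds: List[float] = [400, 410, 450, 1100, 1900]
mongodb_seconds: List[float] = [130, 131, 135, 150, 190]


def relative_speed(slow: List[float], fast: List[float],
                   labels: List[int]) -> Dict[int, float]:
    """Ratio of slow to fast timings, keyed by concurrency level."""
    return {n: s / f for n, s, f in zip(labels, slow, fast)}


ratios = relative_speed(rdbms_seconds, mongodb_seconds, users)
for n, ratio in ratios.items():
    print(f"{n:>2} users: MongoDB {ratio:.1f}x faster")

# How much each system slows down from 2 to 40 concurrent users.
rdbms_growth = rdbms_seconds[-1] / rdbms_seconds[0]
mongodb_growth = mongodb_seconds[-1] / mongodb_seconds[0]
print(f"RDBMS slowdown: {rdbms_growth:.2f}x, MongoDB slowdown: {mongodb_growth:.2f}x")
```

The gap widens with concurrency: about 3x at 2 users and 10x at 40 users on these figures.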
Query Performance - Data vs Cache Size Correlation
[Figure: DB Query Performance vs Cache Size Correlation. Y-axis: time to query in minutes (0 to 180). X-axis: default cache size vs 100 GB, 200 GB, and 300 GB cache sizes, for 1 TB, 2 TB, and 3 TB database sizes. Series: query for 10K units, 20K units, and 100K units.]
Next Steps and Help Needed
Our Next Steps
MongoDB 4.0
• ACID compliance, including transactions across multiple documents. This will allow us to develop applications and schemas with the advantages of an RDBMS in a NoSQL environment.
• Aggregation pipeline enhancements.
Sharding
• We will explore sharding further, to combine multiple replica-set-based MongoDB deployments into a few very large sharded environments.
Help Needed from MongoDB
• More built-in analytical functions (e.g. percentile, rank, median) in the aggregation pipeline would help with advanced data analysis.
• Row-level security would help with better access control.
• Improved aggregate query performance in sharded environments.


MongoDB World 2018: Bumps and Breezes: Our Journey from RDBMS to MongoDB
