Use Cases and Roadmap

                                   Norberto Leite

                           Senior Solutions Architect, EMEA
                                norberto@10gen.com
                                        @nleite




Thursday, 25 October 12
Agenda

        •Use Cases
        •Roadmap
        •Future




Use Cases




Big Data = MongoDB = Solved

        Content Management    Operational Intelligence   E-Commerce




     User Data Management    High Volume Data Feeds        Mobile




Location Based Service

        •Problem:
                •Location based social networking service needs to scale to
                high number of users and check-ins

        •Solution:
                •Used MongoDB deployed on EC2
                •8 clusters, 40 machines, 15k QPS, 2.3 billion records
                •Auto-sharding and geo-spatial indexing are key

        •Results:
                •To date have scaled to 9m users, 3m check-ins per day,
                750m total check-ins, 20m places, 400k merchants



•Problem:
                •Business needed modern data store for rapid
                development and scale

        •Solution:
                •Used PHP and MongoDB

        •Results:
                •Real-time statistics
                •All data, images, etc. stored together
                •No need for complex migrations
                •Enables very rapid development and growth


•Problem:
                •Deal with massive data volume across all customers

        •Solution:
                •Use MongoDB to replace Google Analytics / Omniture

        •Results:
                •Less than one week to build prototype and POC
                •Rapid deployment of new features




•Problem:
                •Lots of friction with the RDBMS for archive storage
                •Needed a more scalable archive storage database

        •Solution:
                •Keep MySQL for active data (100 million)
                •MongoDB for archive (2 billion)

        •Results:
                •No more ALTER TABLE statements taking over 2 months
                •Sharding fixed vertical scale problem
                •Very happily looking for other ways to use MongoDB


How Telefónica uses MongoDB

        [Figure: architecture diagram. Labels: M2M Event Acquisition,
        Event acquisition, Event notification, Apps, Event Notifier, Portal,
        API, Core, Event Storage, Mng Storage, Mng, Platform, Event Gateway,
        BOSS, Operator Network, MNO1, MNO2, MNOn]

And many others ...




Roadmap




The Evolution of MongoDB

        •1.8 (March ‘11)
                •Journaling
                •Sharding and Replica set enhancements
                •Spherical geo search

        •2.0 (Sept ‘11)
                •Index enhancements to improve size and performance
                •Authentication with sharded clusters
                •Replica Set Enhancements
                •Concurrency improvements

        •2.2 (Aug ‘12)
                •Aggregation Framework
                •Multi-Data Center Deployments
                •Improved Performance and Concurrency

        •2.4 (winter ‘12)

2.2 Release August 2012

        • Concurrency: yielding + db level locking
        • New aggregation framework
        • TTL Collections
        • Improved free list implementation
        • Tag aware sharding
        • Read Preferences

        • http://docs.mongodb.org/manual/release-notes/2.2/




Yielding + DB Locking

        • improved yielding on page fault
        • breaking down the global level lock
           • Lock per Database in 2.2
           • Lock per Collection post 2.2




Aggregation Framework
        • pipeline model (a bit like unix pipes)
        • like a "group by"
              – Operators
                 – $project, $group, $match, $limit, $skip, $unwind, $sort
              – Expressions
                 – Logical Expressions: $and, $not, $or, $cmp ...
                 – Math Expressions: $add, $divide, $mod ...
                 – String Expressions: $strcasecmp, $substr, $toLower ...
                 – Date/Time Expressions: $dayOfMonth, $hour...
                 – Multi-Expressions: $ifNull, $cond

        • Use Cases: Real-time / inline analytics



Example - For each "tag", list the authors
        {
            title : "my tech blog" ,
            author : "bob" ,
            tags : [ "fun" , "good" , "tech" ] ,
        }

        {
            title : "cool tech" ,
            author : "jim" ,
            tags : [ "awesome" , "tech" ] ,
        }




Aggregate Command

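        db.blogs.aggregate([
           // keep only the fields the later stages need
           { $project : { author : 1, tags : 1 } },
           // emit one document per element of the tags array
           { $unwind : "$tags" },
           // group by tag and collect the distinct authors
           { $group : {
              _id : { tags : "$tags" },
              authors : { $addToSet : "$author" }
           } }
        ]);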
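
To make the three stages concrete, here is a minimal sketch in plain JavaScript that mimics $project, $unwind and $group with $addToSet on the two example documents. It does not talk to MongoDB: the `blogs` array and the `authorsByTag` helper are illustrative names, not part of any MongoDB API, and the real server does not guarantee the order of groups or of set elements.

```javascript
// Plain-JavaScript simulation of $project -> $unwind -> $group/$addToSet.
// No MongoDB connection is used; `blogs` stands in for the collection.
const blogs = [
  { title: "my tech blog", author: "bob", tags: ["fun", "good", "tech"] },
  { title: "cool tech", author: "jim", tags: ["awesome", "tech"] },
];

// Hypothetical helper, named for this sketch only.
function authorsByTag(docs) {
  // $project: keep only author and tags
  const projected = docs.map(({ author, tags }) => ({ author, tags }));

  // $unwind: one output document per element of the tags array
  const unwound = projected.flatMap(({ author, tags }) =>
    tags.map((tag) => ({ author, tags: tag }))
  );

  // $group with $addToSet: collect the distinct authors for each tag
  const groups = new Map();
  for (const { author, tags } of unwound) {
    if (!groups.has(tags)) groups.set(tags, new Set());
    groups.get(tags).add(author);
  }
  return [...groups].map(([tag, authors]) => ({
    _id: { tags: tag },
    authors: [...authors],
  }));
}

const result = authorsByTag(blogs);
console.log(JSON.stringify(result, null, 2));
```

In this sketch the "tech" group ends up with both "bob" and "jim", while every other tag has a single author.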




Time To Live (TTL) Collections
        • auto expire data out of a collection
        • index must be on a field with a date datatype
        • single value is evaluated
        • Use Cases: data retention, cache expiration
        db.events.ensureIndex(
          { "timestamp": 1 },
          { expireAfterSeconds: 3600 } )
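
The expiry rule behind the index can be sketched in plain JavaScript. This models only the rule, not the server: `isExpired` is a hypothetical helper, the sample events are made up, and in a real deployment a background task does the deleting, so expired documents may linger for a short while.

```javascript
// Hypothetical helper: models the expiry rule only, not MongoDB's background task.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  // Only date values take part in expiry; other types are never expired here.
  if (!(value instanceof Date)) return false;
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

// Made-up sample data; `now` is fixed so the example is deterministic.
const now = new Date("2012-10-25T12:00:00Z");
const events = [
  { name: "old", timestamp: new Date("2012-10-25T10:30:00Z") },   // 90 minutes old
  { name: "fresh", timestamp: new Date("2012-10-25T11:30:00Z") }, // 30 minutes old
  { name: "not-a-date", timestamp: "2012-10-25T10:30:00Z" },      // string, ignored
];

// Same threshold as the slide: expireAfterSeconds = 3600
const remaining = events.filter((e) => !isExpired(e, "timestamp", 3600, now));
console.log(remaining.map((e) => e.name));
```

With a one-hour threshold, only the 90-minute-old event is dropped. The document whose timestamp is a string stays, which is why the field has to hold a date.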




Tag aware sharding

        • Distribute data based on a Tag
        • Use Cases: Locality for Data by Data Center
        sh.addShardTag("shard0000", "dc-emea")

        sh.addTagRange("mydb.users",
                       { country: "uk"}, { country: "ul"},
                       "dc-emea"
        );

        sh.addTagRange("mydb.users",
                       { country: "by"},{ country: "bz"},
                       "dc-emea"
        );
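
How a shard key value maps to a tag can be sketched in plain JavaScript, assuming each range includes its lower bound and excludes its upper bound. `tagRanges` and `tagFor` are illustrative names, and JavaScript string comparison is only an approximation of how the server orders key values.

```javascript
// Tag ranges as on the slide: lower bound inclusive, upper bound exclusive.
const tagRanges = [
  { min: "uk", max: "ul", tag: "dc-emea" },
  { min: "by", max: "bz", tag: "dc-emea" },
];

// Hypothetical helper: returns the tag for a shard key value, or null if
// the value falls outside every tagged range.
function tagFor(country, ranges) {
  const hit = ranges.find((r) => country >= r.min && country < r.max);
  return hit ? hit.tag : null;
}

console.log(tagFor("uk", tagRanges)); // inside ["uk", "ul")
console.log(tagFor("by", tagRanges)); // inside ["by", "bz")
console.log(tagFor("us", tagRanges)); // outside both ranges
```

The upper bounds "ul" and "bz" are just the next values after "uk" and "by", so each range effectively pins a single country code to the tagged shard.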



Read Preferences

        • Mode
           • PRIMARY, PRIMARY_PREFERRED
           • SECONDARY, SECONDARY_PREFERRED
           • NEAREST
        • Tag Sets
           • Uses Replica Set tags
           • Passed Tag is used to find matching members
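
The interaction between modes and tag sets can be sketched with a simplified model in plain JavaScript. This is not driver code: the `members` list, its `pingMs` values and the `selectMember` helper are assumptions made for illustration, and real drivers apply further rules (for example, choosing among several members within a latency window rather than always the single lowest ping).

```javascript
// Simplified model of read-preference member selection (not driver code).
// The replica set below is made up for the example.
const members = [
  { host: "a", state: "PRIMARY",   tags: { dc: "emea" }, pingMs: 40 },
  { host: "b", state: "SECONDARY", tags: { dc: "emea" }, pingMs: 5 },
  { host: "c", state: "SECONDARY", tags: { dc: "us" },   pingMs: 90 },
];

// A member matches a tag set when it carries every key/value pair in the set.
// An empty tag set matches every member.
const matches = (member, tagSet) =>
  Object.entries(tagSet).every(([k, v]) => member.tags[k] === v);

// Lowest-latency member of a list, or null for an empty list.
const nearest = (list) =>
  list.length ? list.reduce((best, m) => (m.pingMs < best.pingMs ? m : best)) : null;

// Hypothetical helper: picks one member for a mode and an optional tag set.
function selectMember(mode, tagSet = {}) {
  const primary = members.find((m) => m.state === "PRIMARY") || null;
  const secondaries = members.filter(
    (m) => m.state === "SECONDARY" && matches(m, tagSet)
  );
  switch (mode) {
    case "PRIMARY": return primary;
    case "PRIMARY_PREFERRED": return primary || nearest(secondaries);
    case "SECONDARY": return nearest(secondaries);
    case "SECONDARY_PREFERRED": return nearest(secondaries) || primary;
    case "NEAREST": return nearest(members.filter((m) => matches(m, tagSet)));
    default: throw new Error("unknown mode: " + mode);
  }
}

console.log(selectMember("SECONDARY", { dc: "us" }));
console.log(selectMember("SECONDARY_PREFERRED", { dc: "apac" }));
```

In this model, SECONDARY with the tag set { dc: "us" } selects host "c", while SECONDARY_PREFERRED with a tag set that no secondary carries falls back to the primary.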




2.4 Roadmap

        Must
        • Kerberos integration
        • LDAP/AD integration
        Nice To Have
        • Hash Shard Key
        • Background Index Build on Secondaries
        • V8 for Map/Reduce (replaces SpiderMonkey)
        • Geo: intersecting polygons, Geo shard key
        • Agg: $out, more functions, speed improvements



And beyond

        • Full Text Search
        • Collection / Extent level locking
        • Field level security
        • Audit





