MongoDB as a Message Queue

Luke Gotszling
Aol / About.me

Silicon Valley MongoDB User Group
Big Data Week
Palo Alto, CA
April 25, 2012

Prior AMQP Usage

• 3-node RabbitMQ cluster on v1.8; opted to
  forgo disk persistence for better
  performance
• Hard to diagnose the cause of failures at scale




At About.me


• All asynchronous and periodic tasks
• Short lived messages
 • No journalling
• Sharded cluster on v2.0.4 (shard key =
  queue name)
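
As a sketch of the sharded setup above, the two admin commands below enable sharding for a database and shard the message collection on the queue name. `enableSharding` and `shardCollection` are MongoDB admin commands; the database name `queues`, the collection name `messages`, and both helper names are assumptions made for illustration, and `client` is expected to be a pymongo `MongoClient` connected to a mongos router.

```python
def sharding_commands(db_name, collection_name, shard_key="queue"):
    """Build the admin commands that shard a queue collection.

    Returns (command_name, value, extra_fields) tuples in execution order.
    """
    namespace = "%s.%s" % (db_name, collection_name)
    return [
        ("enableSharding", db_name, {}),
        # Shard key = queue name, as on the slide: all messages of a
        # given queue share one shard key value.
        ("shardCollection", namespace, {"key": {shard_key: 1}}),
    ]


def shard_queue_collection(client, db_name="queues", collection_name="messages"):
    """Run the commands against the admin database of a mongos router.

    `client` is assumed to be a pymongo MongoClient; the default names
    are hypothetical.
    """
    for name, value, extra in sharding_commands(db_name, collection_name):
        client.admin.command(name, value, **extra)
```

Keeping the command construction separate from execution makes it possible to inspect what would be sent before touching a cluster.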



Benefits

• Async operations
• Per message (document) atomicity
• Batch processes
• Periodic processes
• Durability / ability to shard
• Operational familiarity


AMQP?

              Direct    Topic                 Fanout
AMQP          Push      Yes                   Yes
Mongo Queue   Poll      Regular expression    Sort of*

* Options include passing a message along with an incrementing key or
multiple declarations. Added to Kombu in v2.1 -- reduces performance for
non-fanout operations due to additional queries

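
The table lists "regular expression" as the Mongo counterpart to topic routing. One way to picture that, as a minimal sketch rather than Kombu's actual implementation, is to translate an AMQP-style topic pattern into a `$regex` filter on the `queue` field. The helper names are hypothetical, and the translation is an approximation: `*` matches exactly one dot-separated word, while `#` here matches one or more characters (real AMQP `#` also matches zero words).

```python
import re


def topic_to_regex(pattern):
    """Translate an AMQP-style topic pattern into an anchored regex string."""
    parts = []
    for word in pattern.split("."):
        if word == "*":
            parts.append(r"[^.]+")   # exactly one dot-separated word
        elif word == "#":
            parts.append(r".+")      # approximation: one or more characters
        else:
            parts.append(re.escape(word))
    return "^" + r"\.".join(parts) + "$"


def topic_filter(pattern):
    """Query document selecting messages whose queue name matches the pattern."""
    return {"queue": {"$regex": topic_to_regex(pattern)}}
```

For example, `topic_filter("email.*")` selects `email.welcome` but not `email.welcome.retry`, and the resulting document could be passed as the `query` of a `findAndModify` call.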
To cap or not to cap
• Capped collections[1]
   • Better performance but limited to a single node[2]
   • FIFO
• Uncapped collections -- rest of this presentation
   • Can shard, lower performance per node
   • FIFO-ish[3], custom ordering available
[1] http://blog.boxedice.com/2011/04/13/queueing-mongodb-using-mongodb/
    http://blog.boxedice.com/2011/09/28/replacing-rabbitmq-with-mongodb/
[2] SERVER-211, SERVER-2654
[3] Only down to 1 second granularity

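
Footnote [3] presumably follows from sorting on `_id`: a default ObjectId starts with a 4-byte creation timestamp in whole seconds, so ids generated by different producers within the same second compare on their remaining bytes rather than on insertion time. The sketch below builds ObjectId-shaped 12-byte values by hand (no driver involved) to show the effect; the producer bytes, counters, and timestamps are made up for illustration.

```python
import struct


def fake_object_id(timestamp, producer_bytes, counter):
    """Build a 12-byte ObjectId-shaped value: 4-byte big-endian seconds,
    5 bytes identifying the producer, 3-byte counter."""
    assert len(producer_bytes) == 5
    return struct.pack(">I", timestamp) + producer_bytes + counter.to_bytes(3, "big")


# Producer B inserts first, producer A a moment later in the same second.
first = fake_object_id(1335312000, b"\xbb" * 5, 1)
second = fake_object_id(1335312000, b"\xaa" * 5, 1)

# Sorting by _id puts A's message ahead of B's even though B was earlier.
same_second = sorted([first, second])

# One second later the timestamp prefix dominates and order is preserved.
later = fake_object_id(1335312001, b"\x00" * 5, 0)
across_seconds = sorted([later, first])
```

Within a single producer the trailing counter keeps ids increasing, so the reordering only shows up between producers.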
Code (mongo)
• Create:
    db.messages.insert( { queue:"email",
                          payload:serialized_data} )

• Consume:
    // atomically fetch and remove the oldest message in the queue
    db.messages.findAndModify( { query:{"queue":"email"},
                                 sort:{"_id":1},
                                 remove:true} )

• Index:
    db.messages.ensureIndex({ queue:1 })
    db.messages.ensureIndex({ queue:1, _id:1})


Code (Python)
• Create:
    self.client.insert({"payload": serialize(message),
                        "queue": queue})

• Consume:
    # findandmodify with remove=True pops the oldest message atomically
    self.client.database.command("findandmodify", "messages",
                                 query={"queue": queue},
                                 sort={"_id": pymongo.ASCENDING},
                                 remove=True)

• Index:
    col.ensure_index([("queue", 1)])
    col.ensure_index([("queue", 1), ("_id", 1)])

  http://packages.python.org/kombu/
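
The calls above use the pymongo 2.x API of the time. A rough equivalent using the collection methods available since pymongo 3 (`insert_one`, `find_one_and_delete`, `create_index`) might look like the sketch below. The function names are hypothetical, `collection` is assumed to be a pymongo `Collection`, JSON stands in for whatever `serialize` does in the original, and only the compound index is created.

```python
import json


def ensure_indexes(collection):
    # Compound index backing the "oldest message in this queue" lookup.
    collection.create_index([("queue", 1), ("_id", 1)])


def publish(collection, queue, message):
    # One document per message; the payload is stored as a JSON string.
    collection.insert_one({"queue": queue, "payload": json.dumps(message)})


def consume(collection, queue):
    """Atomically remove and return the oldest message, or None if empty."""
    doc = collection.find_one_and_delete({"queue": queue}, sort=[("_id", 1)])
    return None if doc is None else json.loads(doc["payload"])
```

Because the functions only rely on three collection methods, they can be exercised against an in-memory stand-in before pointing them at a real cluster.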

Celery Task Creation Benchmarks (Single-Node)

[Chart: tasks created per second (y-axis, 0 to 5600) against concurrency in processes (x-axis, 1 to 5). Series: RabbitMQ v2.7.1; MongoDB (2.0.4) --nojournal; MongoDB (2.0.4) --journal]

celery 2.4.5 / kombu 2.0 / pymongo 2.1 / amqplib 1.0.2 / eventlet 0.9.16

Celery Task Consumption Benchmarks (Single-Node)

[Chart: tasks consumed per second (y-axis, 0 to 2000) against eventlet concurrency (x-axis, 1 to 25). Series: RabbitMQ v2.7.1; MongoDB (2.0.4) --nojournal; MongoDB (2.0.4) --journal]

celery 2.4.5 / kombu 2.0 / pymongo 2.1 / amqplib 1.0.2 / eventlet 0.9.16

Pros
• Familiar technology
• Sharding
• Durability
• Lower operational overhead
• Advanced querying (map/reduce etc...)

Cons
• Not AMQP
• Need to poll
• Performance depends on polling frequency and concurrency
• Message consumption is a locking operation
• Fewer libraries available[1]

[1] Python has kombu, < v2.1 no fanout support but better async task performance

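
Since consumers have to poll, both throughput and idle load hinge on the poll interval. A sketch of a polling consumer with a simple backoff follows, assuming a `fetch` callable that returns the next message or `None` (for example a wrapper around `findAndModify` with `remove: true`). All names and the default intervals are hypothetical.

```python
import time


def poll_queue(fetch, handle, min_interval=0.05, max_interval=1.0,
               max_idle_polls=None, sleep=time.sleep):
    """Call fetch() repeatedly, passing each message to handle().

    Polls back-to-back while messages are available and doubles the wait
    (up to max_interval) after each empty poll. Stops after
    max_idle_polls consecutive empty polls, or never if that is None.
    Returns the number of messages handled.
    """
    handled = 0
    idle_polls = 0
    interval = min_interval
    while max_idle_polls is None or idle_polls < max_idle_polls:
        message = fetch()
        if message is None:
            idle_polls += 1
            sleep(interval)
            interval = min(interval * 2, max_interval)
        else:
            handle(message)
            handled += 1
            idle_polls = 0
            interval = min_interval
    return handled
```

A shorter `min_interval` lowers latency at the cost of more empty queries against the database; the backoff caps that cost when a queue stays empty.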
Don’t Forget To Shard Your Collections!




Questions?

 luke@about.me
 about.me/luke
   @lmgtwit


