SCALING RAILS
The Journey to 200M Notifications

Gustavo Araujo
Software Engineer @ CloudWalk
🎯 Focused on Ruby on Rails and Elixir
💜 Passionate about code quality, observability, and performance
Sharing knowledge through talks and technical content
gustavoaraujo.dev
garaujodev
Application Challenges
➡ High-Volume Notifications (1B last year)
➡ Several Database Writes
➡ Multi-channel (e.g., WhatsApp, SMS, Email, Push)
➡ Unpredictable workloads
The challenges we faced,
and the key lessons learned
CONCEPT OF SCALE
HOW DID WE START?
CHOOSE YOUR
WEBSERVER
Ruby Webservers

WEBrick
⚙ Concurrency Model: Single-threaded + single-process
⚡ Performance: Very low
✅ Best Use Case: Development only
❌ Limitations: Not production-ready; replaced by Puma as the Rails default in Rails 5 and dropped from Ruby's standard library in Ruby 3.0

Passenger
⚙ Concurrency Model: Multi-threaded + multi-process (hybrid)
⚡ Performance: Medium to high
✅ Best Use Case: Easy-to-deploy production apps
❌ Limitations: Advanced features require a paid license

Unicorn
⚙ Concurrency Model: Multi-process (no threads)
⚡ Performance: High for CPU-bound apps
✅ Best Use Case: Apps requiring process-level isolation
❌ Limitations: No concurrency within a worker; not ideal for I/O-bound apps

Puma
⚙ Concurrency Model: Multi-threaded + multi-process (clustered)
⚡ Performance: Highly configurable; scales with threads and workers
✅ Best Use Case: Modern Rails apps needing concurrent I/O handling
❌ Limitations: Requires tuning (threads/workers + DB pool alignment)
Puma

Clustered Mode
✅ Best for multi-core CPUs and CPU-bound applications
❌ Higher memory usage, since each worker is a separate process

Single Mode
✅ Best for I/O-bound apps; lower memory usage
❌ Single process, so it may not fully utilize multi-core CPUs
How Many Workers?
Scenario → Recommendation
🖥 App runs on VM/bare metal → Use 1 worker per available CPU core
📦 App runs in Docker/Kubernetes → Respect the container CPU limit (e.g. cpu_limit = 2 means workers = 2)
🧠 App has high memory usage → Use fewer workers, increase threads
🧪 Unclear limits or mixed load → Start with 2–4 workers and benchmark
How Many Threads?
Factor → Guideline
🔄 I/O-bound app → Use more threads (e.g. 16–32) to handle concurrency
🧮 CPU-bound app → Use fewer threads (e.g. 4–8) to avoid contention
🧵 Low-traffic env (dev/staging) → Use a small range, such as threads 1, 4
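
The worker and thread guidelines above could be wired into a single place that reads them from the environment. The sketch below is a hypothetical helper, not part of Puma: `puma_settings` is an invented name, the environment variable names follow the convention used by Rails' generated `config/puma.rb`, and the fallback of 5 threads is an assumption rather than a figure from the talk.

```ruby
require "etc"

# Hypothetical helper (not a Puma API): derives worker and thread counts
# from environment variables, defaulting to one worker per detected core.
def puma_settings(env = ENV, cores: Etc.nprocessors)
  # Inside a container, set WEB_CONCURRENCY to the CPU limit explicitly,
  # because core detection may report the host's cores on some Ruby versions.
  workers = Integer(env.fetch("WEB_CONCURRENCY", cores))

  # Assumed default of 5 threads when nothing is configured.
  max_threads = Integer(env.fetch("RAILS_MAX_THREADS", 5))
  min_threads = Integer(env.fetch("RAILS_MIN_THREADS", max_threads))
  raise ArgumentError, "min threads exceed max threads" if min_threads > max_threads

  { workers: workers, min_threads: min_threads, max_threads: max_threads }
end

# Example: a pod with a CPU limit of 2 running an I/O-bound app.
settings = puma_settings({ "WEB_CONCURRENCY" => "2", "RAILS_MAX_THREADS" => "16" })
```

In `config/puma.rb`, values like these would feed Puma's `workers` and `threads` settings; keeping the derivation in one function makes it easy to benchmark different combinations.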
DEPLOYMENT STRATEGIES
Platform as a Service (PaaS)
Example: Heroku, fly.io
✅ Easy to get started; built-in autoscaling
❌ Costly to scale; limited control over infrastructure for advanced tuning
Infrastructure as a Service (IaaS)
Example: AWS, GCP, Azure
✅ Full control (Instances, Auto Scaling, Load Balancers)
❌ High availability requires manual setup; resources are provisioned at fixed sizes
Container Orchestration
Example: Kubernetes (can run on AWS, GCP, Azure, or on-premise)
✅ Portability and fine-grained control over resources (CPU, memory, limits per pod)
❌ Steep learning curve: complex concepts (pods, services, volumes)
Kubernetes Scaling Strategies
↔ HPA (Horizontal Pod Autoscaler) → reacts to CPU/memory usage
↕ VPA (Vertical Pod Autoscaler) → adjusts resource requests for individual pods
🧱 Cluster Autoscaler → adds/removes nodes to accommodate workload
Horizontal Pod Autoscaling (HPA)
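
As a rough illustration of how the HPA reacts to CPU or memory usage, the Kubernetes documentation describes the target replica count as ceil(currentReplicas × currentMetricValue / desiredMetricValue), with no change while the ratio stays within a tolerance (0.1 by default). The function below is a sketch of that formula only; the function name and the min/max bounds are illustrative assumptions.

```ruby
# Sketch of the replica calculation described in the Kubernetes HPA docs.
# The name and the default min/max bounds are hypothetical.
def hpa_desired_replicas(current_replicas:, current_metric:, target_metric:,
                         min_replicas: 1, max_replicas: 10, tolerance: 0.1)
  ratio = current_metric.to_f / target_metric

  # Within tolerance of the target, the HPA leaves the replica count alone.
  return current_replicas if (ratio - 1.0).abs <= tolerance

  (current_replicas * ratio).ceil.clamp(min_replicas, max_replicas)
end

# Example: 4 pods averaging 90% CPU against a 60% target.
hpa_desired_replicas(current_replicas: 4, current_metric: 90, target_metric: 60)
```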
Kubernetes Event-driven Autoscaling (KEDA)
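
Event-driven autoscaling sizes the deployment from an external signal, such as the number of pending jobs, instead of CPU or memory. The sketch below is a simplified model of that idea, not KEDA's implementation: it assumes a queue-length metric and a target number of jobs per replica, and every name and bound in it is hypothetical.

```ruby
# Simplified model of queue-driven scaling (not KEDA's code).
# target_per_replica is the backlog one replica is expected to absorb.
def queue_based_replicas(queue_length:, target_per_replica:,
                         min_replicas: 0, max_replicas: 20)
  raise ArgumentError, "target_per_replica must be positive" unless target_per_replica.positive?

  wanted = (queue_length.to_f / target_per_replica).ceil
  # A minimum of 0 lets idle consumers scale away entirely.
  wanted.clamp(min_replicas, max_replicas)
end

# Example: 250 pending notifications, 100 per replica.
queue_based_replicas(queue_length: 250, target_per_replica: 100)
```

This shape suits unpredictable workloads: a burst of queued notifications raises the replica count before CPU usage has had time to climb.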
WHAT WE LEARNED
Common Pitfalls

DB Pool Mismatch
🛠 Cause: THREADS > pool in database.yml
⚠ Impact: Connection timeouts and ActiveRecord errors

Excessive Thread Count
🛠 Cause: Threads set too high without real concurrency need
⚠ Impact: Increased memory usage, thread contention, and degraded app performance

Resource Overcommit
🛠 Cause: Too many workers × threads for available CPU/memory
⚠ Impact: Application instability due to memory exhaustion (OOM) or CPU throttling
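
These pitfalls come down to arithmetic that can be checked before a deploy. The sketch below is a hypothetical pre-flight check: the function name, parameters, and example numbers are assumptions, and it relies on each Puma worker process holding its own ActiveRecord connection pool.

```ruby
# Hypothetical pre-flight check for the pitfalls listed above.
def capacity_warnings(pods:, workers:, threads:, db_pool:, db_max_connections:,
                      worker_memory_mb:, memory_limit_mb:)
  warnings = []

  # DB pool mismatch: every thread may need its own connection.
  if threads > db_pool
    warnings << "threads (#{threads}) exceed the database.yml pool (#{db_pool})"
  end

  # Each worker process has its own pool, so pools multiply across pods.
  total_connections = pods * workers * db_pool
  if total_connections > db_max_connections
    warnings << "up to #{total_connections} connections, database allows #{db_max_connections}"
  end

  # Resource overcommit: workers are separate processes with separate memory.
  needed_mb = workers * worker_memory_mb
  if needed_mb > memory_limit_mb
    warnings << "workers need about #{needed_mb} MB, pod limit is #{memory_limit_mb} MB"
  end

  warnings
end

# Example with illustrative numbers only.
capacity_warnings(pods: 10, workers: 4, threads: 16, db_pool: 5,
                  db_max_connections: 100, worker_memory_mb: 512,
                  memory_limit_mb: 1024)
```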
Database Insights
➡ Read Replicas
➡ Add and maintain proper indexes
➡ Optimize slow queries
➡ Use partitioning
➡ Consider cache strategies
Pro Tips
Use background jobs
Offload heavy work to async queues
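
To make the idea concrete, here is a minimal in-process sketch of offloading work to a queue using only Ruby's standard library. `InlineJobQueue` is a hypothetical class for illustration; a production Rails app would normally use a job backend such as Sidekiq, so that jobs survive restarts and run outside the web process.

```ruby
# In-process stand-in for a job queue (illustrative only).
class InlineJobQueue
  def initialize(workers: 2)
    @queue = Queue.new
    @threads = Array.new(workers) do
      Thread.new do
        # pop returns nil once the queue is closed and drained, ending the loop.
        while (job = @queue.pop)
          job.call
        end
      end
    end
  end

  # Returns immediately; the block runs later on a worker thread.
  def enqueue(&job)
    @queue << job
    nil
  end

  # Stops accepting jobs and waits for the remaining ones to finish.
  def shutdown
    @queue.close
    @threads.each(&:join)
  end
end

# Example: the caller is not blocked while notifications are "sent".
sent = Queue.new
jobs = InlineJobQueue.new(workers: 2)
3.times { |id| jobs.enqueue { sent << "notification-#{id}" } }
jobs.shutdown
```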
Fail fast, retry smart
Use retries with backoff (Sidekiq, Shoryuken, etc.) to avoid overload loops
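
A minimal sketch of retrying with exponential backoff and jitter is shown below. It is a generic illustration, not the retry algorithm of Sidekiq or Shoryuken; `backoff_delay`, `with_retries`, and their defaults are hypothetical.

```ruby
# Exponential backoff with full jitter: the ceiling doubles per attempt up
# to a cap, and the actual delay is a random value below that ceiling.
def backoff_delay(attempt, base: 2.0, cap: 300.0, rng: Random.new)
  ceiling = [cap, base * (2**attempt)].min
  rng.rand * ceiling
end

# Runs the block, retrying on StandardError until max_attempts is reached.
# The sleeper is injectable so callers and tests can avoid real sleeping.
def with_retries(max_attempts: 5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield attempt
  rescue StandardError
    attempt += 1
    raise if attempt >= max_attempts # fail fast once the budget is spent

    sleeper.call(backoff_delay(attempt))
    retry
  end
end
```

The jitter spreads retries out in time, so clients that failed together do not all retry at the same moment and recreate the overload.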
Use observability tools
Datadog, Sentry, NewRelic, AppSignal, Skylight…
Thank you!
gustavoaraujo.dev
garaujodev
We are hiring!
cloudwalk.io/jobs
References
https://www.speedshop.co/2015/07/29/scaling-ruby-apps-to-1000-rpm.html
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
https://keda.sh/docs/2.16/concepts/
https://kubernetes.io/docs/concepts/workloads/autoscaling/
