Scaling Rails: The Journey to 200M Notifications

Software Engineer @ CloudWalk
Gustavo Araujo
gustavoaraujo.dev
garaujodev

🎯 Focused on Ruby on Rails and Elixir
💜 Passionate about code quality, observability, and performance
Sharing knowledge through talks and technical content

Application Challenges
➡ High-volume notifications (1B last year)
➡ Multi-channel (e.g., WhatsApp, SMS, Email, Push)
➡ Several database writes
➡ Unpredictable workloads

The challenges we faced,
and the key lessons learned

Ruby Webservers

| Server | ⚙ Concurrency Model | ⚡ Performance | ✅ Best Use Case | ❌ Limitations |
|---|---|---|---|---|
| WEBrick | Single-threaded + single-process | Very low | Development only | Not production-ready; no longer the Rails default (Puma since Rails 5) |
| Passenger | Multi-threaded + multi-process (hybrid) | Medium to high | Easy-to-deploy production apps | Advanced features require paid license |
| Unicorn | Multi-process (no threads) | High for CPU-bound apps | Apps requiring process-level isolation | No concurrency per worker; not ideal for I/O-bound apps |
| Puma | Multi-threaded + multi-process (clustered) | Highly configurable; scales with threads and workers | Modern Rails apps needing concurrent I/O handling | Requires tuning (threads/workers + DB pool alignment) |

Puma

| Mode | ✅ Best for | ❌ Limitation |
|---|---|---|
| Clustered mode | Multi-core CPUs, CPU-bound applications | Higher memory usage since each worker is a separate process |
| Single mode | I/O-bound apps, lower memory usage | Single process; may not fully utilize multi-core CPUs |

How Many Workers?

| Scenario | Recommendation |
|---|---|
| 🖥 App runs on VM/bare metal | Use 1 worker per available CPU core |
| 📦 App runs in Docker/Kubernetes | Respect the container CPU limit (e.g. cpu_limit = 2 → set workers = 2) |
| 🧠 App has high memory usage | Use fewer workers, increase threads |
| 🧪 Unclear limits or mixed load | Start with 2–4 workers and benchmark |
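
The worker rules of thumb above can be sketched as a small helper. This is an illustration only: `recommended_workers` and its parameters are hypothetical names, not part of Puma, and halving the worker count for memory-heavy apps is an assumption standing in for "use fewer workers".

```ruby
require "etc"

# Hypothetical helper (not part of Puma): applies the rules of thumb above.
#   cpu_limit    - container CPU limit, or nil on a VM / bare metal
#   host_cores   - CPU cores visible to the process
#   memory_heavy - true when each worker has a large memory footprint
def recommended_workers(cpu_limit: nil, host_cores: Etc.nprocessors, memory_heavy: false)
  # Inside a container, the CPU limit wins over the host core count.
  cores = cpu_limit ? cpu_limit.floor : host_cores
  cores = [cores, 1].max # always run at least one worker

  # Assumption: "fewer workers" is modeled as half the cores; benchmark to tune.
  memory_heavy ? [cores / 2, 1].max : cores
end

puts recommended_workers(cpu_limit: 2)                      # container limited to 2 CPUs
puts recommended_workers(host_cores: 8)                     # 8-core VM
puts recommended_workers(host_cores: 8, memory_heavy: true) # memory-hungry app
```

Whatever number comes out is a starting point; the last row of the table still applies, so benchmark before settling on it.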

How Many Threads?

| Factor | Guideline |
|---|---|
| 🔄 I/O-bound app | Use more threads (e.g. 16–32) to handle concurrency |
| 🧮 CPU-bound app | Use fewer threads (e.g. 4–8) to avoid contention |
| 🧵 Low-traffic env (dev/staging) | Use something like threads 1, 4 |
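
One way to wire these guidelines into configuration is to derive the values from environment variables. The sketch below is hypothetical: `puma_settings`, `THREAD_DEFAULTS`, and the `APP_PROFILE` variable are invented for illustration, and the per-profile defaults are simply taken from the low end of the ranges above. `WEB_CONCURRENCY`, `RAILS_MIN_THREADS`, and `RAILS_MAX_THREADS` are the names used in the `config/puma.rb` that Rails generates.

```ruby
# Assumed per-profile defaults as [min_threads, max_threads].
THREAD_DEFAULTS = {
  "io_bound"    => [16, 16],
  "cpu_bound"   => [4, 4],
  "low_traffic" => [1, 4]
}.freeze

# Hypothetical helper: builds Puma settings from an ENV-like hash.
def puma_settings(env = ENV)
  min_default, max_default = THREAD_DEFAULTS.fetch(env.fetch("APP_PROFILE", "low_traffic"))
  max_threads = Integer(env.fetch("RAILS_MAX_THREADS", max_default))
  min_threads = Integer(env.fetch("RAILS_MIN_THREADS", [min_default, max_threads].min))
  raise ArgumentError, "min threads exceed max threads" if min_threads > max_threads

  {
    workers: Integer(env.fetch("WEB_CONCURRENCY", 2)), # assumed default of 2 workers
    min_threads: min_threads,
    max_threads: max_threads
  }
end

# In config/puma.rb the values would be applied with Puma's DSL:
#   settings = puma_settings
#   workers settings[:workers]
#   threads settings[:min_threads], settings[:max_threads]
p puma_settings({ "APP_PROFILE" => "io_bound" })
```

Keeping the numbers in environment variables lets each environment (dev, staging, production) use different values without changing the config file.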

Platform as a Service (PaaS)
Example: Heroku, Fly.io
✅ Easy to get started, built-in autoscaling
❌ Costly to scale; limited control over infrastructure for advanced tuning

Infrastructure as a Service (IaaS)
Example: AWS, GCP, Azure
✅ Full control (Instances, Auto Scaling, Load Balancers)
❌ High availability requires manual setup; fixed resource provisioning

Container Orchestration
Example: Kubernetes (can run on AWS, GCP, Azure, or on-premise)
✅ Portability and fine-grained control over resources (CPU, memory, limits per pod)
❌ Steep learning curve: complex concepts (pods, services, volumes)

Kubernetes Scaling Strategies
↔ HPA (Horizontal Pod Autoscaler) → reacts to CPU/memory usage
↕ VPA (Vertical Pod Autoscaler) → adjusts resource requests for individual pods
🧱 Cluster Autoscaler → adds/removes nodes to accommodate workload

Horizontal Pod Autoscaling (HPA)
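
The scaling rule described in the Kubernetes HPA documentation is desired = ceil(current replicas × current metric / target metric). The sketch below reproduces only that arithmetic. The method name and default bounds are made up for illustration, and the real controller also applies a tolerance band and stabilization windows that are ignored here.

```ruby
# Replica count per the basic HPA formula:
#   desired = ceil(current_replicas * current_metric / target_metric)
# min_replicas / max_replicas mirror the bounds set in an HPA spec;
# the defaults here are arbitrary assumptions.
def desired_replicas(current_replicas:, current_metric:, target_metric:,
                     min_replicas: 1, max_replicas: 10)
  raise ArgumentError, "target_metric must be positive" unless target_metric.positive?

  raw = (current_replicas * current_metric.to_f / target_metric).ceil
  raw.clamp(min_replicas, max_replicas)
end

# 4 pods averaging 90% CPU against a 60% target
puts desired_replicas(current_replicas: 4, current_metric: 90, target_metric: 60)
```

Because the formula is a ratio, pods running below target shrink the deployment just as pods above target grow it, within the configured bounds.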

Kubernetes Event-driven Autoscaling (KEDA)
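
For a notification workload, event-driven scaling usually means sizing the worker pool from queue depth rather than CPU. A rough sketch of that idea, assuming one replica per fixed number of pending jobs; `replicas_for_queue` and all of its parameters are hypothetical and do not correspond to any KEDA API:

```ruby
# Queue-driven replica count: one replica per `target_per_replica` pending
# jobs, bounded by min/max. min_replicas: 0 models scaling to zero when the
# queue is empty. All names and defaults are assumptions for illustration.
def replicas_for_queue(queue_length:, target_per_replica:, min_replicas: 0, max_replicas: 20)
  raise ArgumentError, "target_per_replica must be positive" unless target_per_replica.positive?

  (queue_length.to_f / target_per_replica).ceil.clamp(min_replicas, max_replicas)
end

puts replicas_for_queue(queue_length: 250, target_per_replica: 100)
```

Scaling on queue depth reacts to a burst as soon as jobs pile up, before CPU usage on the existing pods has had time to rise.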

Common Pitfalls

| 🧩 Issue | 🛠 Cause | ⚠ Impact |
|---|---|---|
| DB pool mismatch | THREADS > pool in database.yml | Connection timeouts and ActiveRecord errors |
| Excessive thread count | Threads set too high without real concurrency need | Increased memory usage, thread contention, and degraded app performance |
| Resource overcommit | Too many workers × threads for available CPU/memory | Application instability due to memory exhaustion (OOM) or CPU throttling |
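
Some of these pitfalls can be caught with simple arithmetic before deploying. The check below is a hypothetical sketch: `capacity_warnings` is not a Rails or Puma API, and its inputs are values you would read from `config/puma.rb`, `config/database.yml`, and your platform limits. It relies on the fact that each Puma worker is a separate process with its own connection pool, so one instance can open up to workers × pool connections.

```ruby
# Hypothetical pre-deploy check for the pitfalls in the table above.
def capacity_warnings(workers:, threads:, db_pool:, db_max_connections:, cpu_cores:)
  warnings = []

  # DB pool mismatch: every thread may need its own connection.
  if threads > db_pool
    warnings << "DB pool (#{db_pool}) is smaller than Puma threads (#{threads})"
  end

  # Each worker process holds its own pool.
  total = workers * db_pool
  if total > db_max_connections
    warnings << "#{total} possible DB connections exceed the server limit (#{db_max_connections})"
  end

  # Resource overcommit: more workers than cores to run them.
  warnings << "#{workers} workers exceed #{cpu_cores} CPU cores" if workers > cpu_cores

  warnings
end

puts capacity_warnings(workers: 4, threads: 16, db_pool: 5, db_max_connections: 100, cpu_cores: 2)
```

When several instances share one database, the connection total has to be multiplied by the instance count as well, which this sketch leaves out.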

Database Insights
➡ Read Replicas
➡ Add and maintain proper indexes
➡ Optimize slow queries
➡ Use partitioning
➡ Consider cache strategies

Pro Tips
Use background jobs
Offload heavy work to async queues

Fail fast, retry smart
Use retries with backoff (Sidekiq, Shoryuken, etc.) to avoid overload loops
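
To make "retry smart" concrete, here is a generic sketch of exponential backoff with full jitter. It is not the formula Sidekiq or Shoryuken use internally; `backoff_delay`, `with_retries`, and the `base`/`cap` defaults (in seconds) are assumptions for illustration.

```ruby
# Exponential backoff with full jitter: the ceiling doubles per attempt up to
# `cap`, and the actual delay is drawn uniformly from [0, ceiling] so that
# many failing jobs do not retry at the same instant.
def backoff_delay(attempt, base: 2.0, cap: 300.0, rng: Random.new)
  ceiling = [cap, base * (2**attempt)].min
  rng.rand(0.0..ceiling)
end

# Runs the block, retrying on StandardError up to max_attempts in total.
# After the last attempt the error is re-raised (fail fast, no endless loop).
# `sleeper` is injectable so the wait can be skipped in tests.
def with_retries(max_attempts: 5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield attempt
  rescue StandardError
    attempt += 1
    raise if attempt >= max_attempts
    sleeper.call(backoff_delay(attempt))
    retry
  end
end

# Print a few sample delays (values vary because of the jitter).
3.times { |i| puts backoff_delay(i + 1).round(2) }
```

Job frameworks already ship their own retry schedules, so in practice this logic is mostly relevant for calls made outside a job, such as an HTTP request to a notification provider.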

Use observability tools
Datadog, Sentry, New Relic, AppSignal, Skylight…

Thank you!
gustavoaraujo.dev
garaujodev
We are hiring!
cloudwalk.io/jobs

References
https://www.speedshop.co/2015/07/29/scaling-ruby-apps-to-1000-rpm.html
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
https://keda.sh/docs/2.16/concepts/
https://kubernetes.io/docs/concepts/workloads/autoscaling/
