How to scale (with Ruby on Rails)
George Palmer [email_address] 3dogsbark.com
Overview
  • One server
  • Two servers
  • Scaling the database
  • Scaling the web server
  • User clusters
  • Final architecture
  • Caching
  • Cached architecture
  • Links
  • Questions
How you start out
  • Shared hosting
  • One web server and DB on the same machine
  • Application designed for one machine
  • Volume of traffic will depend on the host
  [Diagram: web server and DB together on one shared host]
Two servers
  • Possibly still shared hosting
  • Web server and DB on different machines
  • Minimal changes to code
  • Volume of traffic will depend on whether you have made it to dedicated machines
  [Diagram: web server and DB on separate machines]
Scaling the database (1)
  • DB setup more suited to read-intensive applications (MySQL replication)
  • Should be on dedicated hosts
  • Minimal changes to code
  [Diagram: web server, master DB, three slaves]
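The replication setup above implies that the application sends writes to the master and spreads reads over the slaves. A minimal sketch of that routing decision, assuming a hypothetical `ReplicatedRouter` class and made-up host names (neither comes from the talk):

```ruby
# Hypothetical router: writes go to the master, reads rotate over the slaves.
class ReplicatedRouter
  WRITE_VERBS = %w[INSERT UPDATE DELETE REPLACE].freeze

  def initialize(master:, slaves:)
    @master = master
    @slaves = slaves
    @next_slave = 0
  end

  # Returns the name of the host that should run the given SQL statement.
  def host_for(sql)
    verb = sql.strip.split(/\s+/, 2).first.to_s.upcase
    # Writes, and all traffic when no slave exists, go to the master.
    return @master if WRITE_VERBS.include?(verb) || @slaves.empty?

    slave = @slaves[@next_slave % @slaves.size]
    @next_slave += 1
    slave
  end
end

# Host names are placeholders for illustration only.
router = ReplicatedRouter.new(master: "db-master",
                              slaves: %w[db-slave1 db-slave2 db-slave3])
puts router.host_for("SELECT * FROM users")
puts router.host_for("UPDATE users SET name = 'Bob'")
```

MySQL replication is asynchronous by default, so a read sent to a slave straight after a write may return stale data; a real router would need a policy for that case.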
Scaling the database (2)
  • DB setup more suited to equal read/write applications (MySQL Cluster)
  • Should be on dedicated hosts
  • Minimal changes to code
  [Diagram: web server, two master DBs forming a MySQL Cluster]
Scaling the web server
  • A web server consists of "worker threads" that process work as it comes in
  [Diagram: web server containing four worker threads, connected to the DB farm]
Load balancing
  • The app server depends on the platform: Rails (Mongrel, FastCGI), PHP, J2EE
  • Some changes to code will be required
  [Diagram: load balancer, three app servers, DB farm]
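A rough sketch of what the load balancer does for the app servers, assuming plain round-robin selection that skips servers marked as down. The `RoundRobinBalancer` class and the backend names are hypothetical; real balancers add health checks, weights and connection handling.

```ruby
# Hypothetical round-robin balancer over a fixed list of app servers.
class RoundRobinBalancer
  def initialize(backends)
    @backends = backends
    @down = {}
    @cursor = 0
  end

  def mark_down(backend)
    @down[backend] = true
  end

  def mark_up(backend)
    @down.delete(backend)
  end

  # Returns the next healthy backend, or nil if every backend is down.
  def next_backend
    @backends.size.times do
      candidate = @backends[@cursor % @backends.size]
      @cursor += 1
      return candidate unless @down[candidate]
    end
    nil
  end
end

balancer = RoundRobinBalancer.new(%w[app1 app2 app3])
balancer.mark_down("app2")
puts 4.times.map { balancer.next_backend }.join(", ")
```

Because consecutive requests from one user can land on different app servers, anything kept in a single server's memory (sessions, for example) has to move to shared storage, which is one source of the code changes the slide mentions.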
The story so far…
  • App servers continue to scale but the database side is somewhat limited…
  [Diagram: load balancer, three app servers, master DB with three slaves]
User Clusters
  • For each user registered on the service, add an entry to a master database detailing where their user data is stored
  • Master database columns: UserID, DB Cluster, basic authorisation details such as username and password, any NLS settings
User Clusters (2)
  • User clusters are themselves one of the two database setups outlined earlier
  • The app server asks the master DB: SELECT * FROM users WHERE username='Bob' AND …
  • The master DB answers: user_id=91732, db_cluster=2
  [Diagram: app server, master DB, User Cluster 1, User Cluster 2]
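The two-step lookup can be sketched as follows: first ask the master directory where a user lives, then read the user's data from that cluster. Everything here is an in-memory stand-in; `UserDirectory`, `ClusterRouter` and the profile fields are hypothetical names used only for illustration.

```ruby
# One row of the master database: which cluster holds this user's data.
DirectoryEntry = Struct.new(:user_id, :db_cluster)

# Stand-in for the master DB, keyed by username.
class UserDirectory
  def initialize
    @by_username = {}
  end

  def register(username, user_id, db_cluster)
    @by_username[username] = DirectoryEntry.new(user_id, db_cluster)
  end

  # Plays the role of the master DB query; raises KeyError for unknown users.
  def locate(username)
    @by_username.fetch(username) { raise KeyError, "unknown user #{username}" }
  end
end

# Resolves a username to its cluster, then reads from that cluster.
class ClusterRouter
  def initialize(directory, clusters)
    @directory = directory
    @clusters = clusters # { cluster number => { user_id => profile } }
  end

  def profile_for(username)
    entry = @directory.locate(username)
    @clusters.fetch(entry.db_cluster).fetch(entry.user_id)
  end
end

directory = UserDirectory.new
directory.register("Bob", 91732, 2)
clusters = { 1 => {}, 2 => { 91732 => { "display_name" => "Bob" } } }
router = ClusterRouter.new(directory, clusters)
puts router.profile_for("Bob")["display_name"]
```

Keeping the directory small (id, cluster, login details) is what lets the master DB stay a single lookup per request while the bulk of the data is spread over the clusters.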
User Clusters (3)
  • ID management becomes an issue
  • Best to use the master DB id as user_id in the user cluster
  • If you let the cluster allocate ids, make sure you use an offset and increment (not plain auto_increment)
  • Other DBs, such as session, must reference a user by id and DB cluster
  • Serious code changes may be required
  • You will want the ability to move users between clusters
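The offset-and-increment idea can be illustrated with a little arithmetic: each cluster starts at its own offset and steps by the number of clusters, so ids never collide. MySQL exposes the same idea through the `auto_increment_increment` and `auto_increment_offset` server variables; the `ClusterIdAllocator` class below is a hypothetical model of the arithmetic only, not of MySQL itself.

```ruby
# Each cluster hands out offset, offset + increment, offset + 2 * increment, ...
class ClusterIdAllocator
  def initialize(offset:, increment:)
    unless (1..increment).cover?(offset)
      raise ArgumentError, "offset must be between 1 and increment"
    end

    @offset = offset
    @increment = increment
    @issued = 0
  end

  def next_id
    id = @offset + @issued * @increment
    @issued += 1
    id
  end
end

# Two clusters, so the increment is 2 and the offsets are 1 and 2.
cluster1 = ClusterIdAllocator.new(offset: 1, increment: 2)
cluster2 = ClusterIdAllocator.new(offset: 2, increment: 2)
ids1 = Array.new(3) { cluster1.next_id }
ids2 = Array.new(3) { cluster2.next_id }
puts "cluster 1: #{ids1.inspect}, cluster 2: #{ids2.inspect}"
```

Choosing an increment larger than the current number of clusters leaves room to add clusters later without renumbering.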
The final architecture
  • As the number of app servers grows, it's a good idea to add a database connection manager (e.g. SQLRelay)
  • Extract session, search and translation databases onto their own machines
  • Use MySQL Cluster (or equivalent) for any critical database
  • In a replication setup you can make a slave a backup master
  • Add an NFS/SAN for static files
The final architecture (2)
  [Diagram: load balancer; App Server 1, App Server 2, … App Server 50; DB connection manager; Master DB (two nodes); Session DB; Search DB; NLS DB; User Cluster 1 and User Cluster 2, each a master with three slaves; NFS/SAN]
Issues
  • The load balancer and the database connection manager are single points of failure (easily solved)
  • 2PC (two-phase commit) is needed for some operations, for example when a user wants to be removed from the search database; 2PC is not supported in Rails
  • Rails doesn't support database switching for a given model
  • Switching can be done explicitly on each request, but that is expensive due to connection establishment overhead
  • A connection manager gets round this, but a proper solution is required (I may write a gem to do this)
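To make the 2PC point concrete, here is a minimal in-memory sketch of a two-phase commit across two databases, such as the user data and the search database. `InMemoryParticipant` and `TwoPhaseCoordinator` are hypothetical names; a real implementation also needs durable logging and recovery after a coordinator crash, which this sketch leaves out.

```ruby
# A participant is any object that responds to prepare, commit and rollback.
class InMemoryParticipant
  attr_reader :name, :state

  def initialize(name, can_prepare: true)
    @name = name
    @can_prepare = can_prepare
    @state = :idle
  end

  # Phase 1 vote: true means "ready to commit".
  def prepare
    @state = @can_prepare ? :prepared : :failed
    @can_prepare
  end

  def commit
    @state = :committed
  end

  def rollback
    @state = :rolled_back
  end
end

class TwoPhaseCoordinator
  def initialize(participants)
    @participants = participants
  end

  # Phase 1: ask everyone to prepare. Phase 2: commit only if all voted yes,
  # otherwise roll back the participants that had already prepared.
  def run
    prepared = []
    @participants.each do |participant|
      break unless participant.prepare
      prepared << participant
    end

    if prepared.size == @participants.size
      @participants.each(&:commit)
      :committed
    else
      prepared.each(&:rollback)
      :rolled_back
    end
  end
end

users = InMemoryParticipant.new("user cluster")
search = InMemoryParticipant.new("search DB", can_prepare: false)
outcome = TwoPhaseCoordinator.new([users, search]).run
puts "#{outcome}: #{users.name}=#{users.state}, #{search.name}=#{search.state}"
```

The point of the protocol is that removing a user either happens in every database or in none, so the search database never keeps a user that the user cluster has already deleted.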
Making the most of your assets
  • In a lot of web applications a huge percentage of the hits are read only, hence the need for caching
  • Squid: a reverse proxy (or web server accelerator)
  • Memcached: a distributed memory caching solution
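Squid
  • Lookup of pages is in memory; storing of files is on disk
  • Can also act as a load balancer
  • Pages can be expired by sending a PURGE request to the proxy (this has to be allowed in the Squid configuration)
  [Diagram: Squid in front of App Server 1, App Server 2, … and the NFS/SAN, with "In cache" and "Not in cache" paths]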
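Squid's cache-eviction mechanism is the non-standard PURGE method, which has to be permitted by an ACL in squid.conf. A hedged sketch of sending one from Ruby follows; the helper names, host, port and path are placeholders, and depending on how the proxy is set up the request may need the full URL or a Host header rather than a bare path.

```ruby
require "net/http"

# Builds a PURGE request for the given path. PURGE is not a standard HTTP
# method, so a generic request object is used instead of Net::HTTP::Get etc.
def build_purge_request(path)
  Net::HTTPGenericRequest.new("PURGE", false, true, path)
end

# Sends the PURGE to the proxy and returns the numeric status code.
# proxy_host and proxy_port are placeholders for your own Squid instance.
def purge(proxy_host, proxy_port, path)
  Net::HTTP.start(proxy_host, proxy_port) do |http|
    http.request(build_purge_request(path)).code.to_i
  end
end

# Only builds the request here; calling purge needs a reachable proxy.
request = build_purge_request("/articles/42")
puts "#{request.method} #{request.path}"
```

A typical place to call something like `purge` is right after the application updates the record behind a cached page.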
Memcached
  • Location of data is independent of the physical machine
  • A really nice, simple API: SET, GET, DELETE
  • In Rails only a few LOC will make a model cached
  • Also useful for tracking cross-machine information, e.g. dodgy user behaviour
  [Diagram: app servers and memcached instances sharing physical machines; data not in memcached is fetched from the DB farm]
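The SET/GET/DELETE calls are usually combined into a read-through pattern: try the cache, fall back to the database on a miss, and delete the entry when the record changes. The sketch below uses an in-memory `FakeMemcached` in place of a real client, and `CachedUserFinder` is a hypothetical name; it illustrates the pattern rather than the Rails plugin the talk has in mind.

```ruby
# Stand-in for a memcached client exposing the three calls named on the slide.
class FakeMemcached
  def initialize
    @store = {}
  end

  def set(key, value)
    @store[key] = value
  end

  def get(key)
    @store[key]
  end

  def delete(key)
    @store.delete(key)
  end
end

class CachedUserFinder
  attr_reader :db_reads

  def initialize(cache, database)
    @cache = cache
    @database = database # Hash of id => record, standing in for the DB farm
    @db_reads = 0
  end

  # GET from the cache first; on a miss read the DB and SET the result.
  def find(id)
    key = "user:#{id}"
    cached = @cache.get(key)
    return cached unless cached.nil?

    @db_reads += 1
    record = @database.fetch(id)
    @cache.set(key, record)
    record
  end

  # Write to the database, then DELETE the stale cache entry.
  def update(id, record)
    @database[id] = record
    @cache.delete("user:#{id}")
  end
end

finder = CachedUserFinder.new(FakeMemcached.new, { 1 => "Bob" })
finder.find(1)
finder.find(1)
puts "DB reads after two finds: #{finder.db_reads}"
```

Deleting on update rather than overwriting keeps the write path simple: the next read repopulates the cache from the database.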
Cached Architecture
  • Introduce Squid: it acts as the load balancer (note there are higher-performing load balancers)
  • Introduce memcached: it can go on every machine that has spare memory
  • Memcached is best suited to application servers, which have high CPU usage but low memory requirements
Cached architecture
  [Diagram: Squid; App Server 1, App Server 2, … App Server 50, with three memcached (MC) instances; DB connection manager; Master DB (two nodes); Session DB; Search DB; NLS DB; User Cluster 1 and User Cluster 2, each a master with three slaves; NFS/SAN]
Cached architecture
  • Wikipedia quotes a cache hit rate of 78% for Squid and 7% for memcached
  • So only 15% of hits actually get to the DB!
  • Performance is a whole new ball game, but we recently gained 15-20% by optimising our Rails configuration
  • But don't get carried away: at some point the cost of the time you spend exceeds the money saved
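The 15% figure is simple subtraction, assuming both hit rates are expressed as a share of all incoming hits. The request rate below is a made-up number used only to show the scale of the saving.

```ruby
squid_hit_rate = 78     # percent of hits answered by Squid
memcached_hit_rate = 7  # percent of hits answered by memcached

# Whatever neither cache answers has to go to the database.
db_share = 100 - squid_hit_rate - memcached_hit_rate
puts "#{db_share}% of hits reach the DB"

# Hypothetical load of 1,000 requests per second, chosen for illustration.
requests_per_second = 1_000
db_requests_per_second = requests_per_second * db_share / 100
puts "#{db_requests_per_second} of #{requests_per_second} requests/s reach the DB"
```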
Cached architecture – 1 machine
  [Diagram: Squid; App Server 1, App Server 2, … App Server 5; DB connection manager; Master DB (two nodes); Session DB; Search DB; NLS DB; User Cluster 1 (a master with three slaves); NFS/SAN; memcached; all on one physical machine]
How far can it go?
For a truly global application with millions of users, in order of ease:
  • Have a cache on each continent
  • Make user clusters based on user location
  • Distribute the clusters physically around the world
  • Introduce app servers on each continent
  • If you must replicate your site globally, use transaction replication software, e.g. GoldenGate
Useful Links
  • http://www.squid-cache.org/
  • http://www.danga.com/memcached/
  • http://sqlrelay.sourceforge.net/
  • http://railsexpress.de/blog/
Questions?






How to scale your web app


Editor's Notes

  • #2: First barcamp Rails but principles applied elsewhere blog