Adrian Hornsby, Technical Evangelist @ AWS
Journey Towards Scaling Your
Application to 10 Million Users
@adhorn
• Technical Evangelist, Developer Advocate,
… Software Engineer
• Own bed in Finland
• Previously:
• Solutions Architect @AWS
• Lead Cloud Architect @Dreambroker
• Director of Engineering, Software Engineer, DevOps, Manager, ... @Hdm
• Researcher @Nokia Research Center
• and a bunch of other stuff.
• Climber, like Ginger shots.
Let’s start from…
The “Must” from Day 1
• High quality code
• Version controlled
• CI/CD pipeline
• Infrastructure as code
• Security at every layer
• Cost conscious
• Test & Monitor everything
• DR procedure
Operational Excellence
What are we building?
AWS Global Infrastructure
16
Regions
44 Availability Zones
Users > 1
K.I.S.S
Amazon Simple Storage Service (S3)
Amazon S3
App 0.0v1
http://example.adhorn.me.s3-website-eu-west-1.amazonaws.com
Simple Static Website
Amazon Route53
• Traffic Flow
• Latency Based Routing
• Geo DNS
• Private DNS for Amazon VPC
• DNS Failover
• Health Checks and Monitoring
• Domain Registration
• CloudFront Zone Apex Support
• S3 Zone Apex Support
• Weighted Round Robin
Highly available and scalable DNS web service.
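To make the "Weighted Round Robin" bullet concrete, here is a minimal sketch of weighted selection in plain Python. It does not call Route 53; the hostnames and the 90/10 split are assumptions chosen for illustration, and a real weighted record set is configured in the DNS service rather than in application code.

```python
import random

def pick_endpoint(weights, rng=random):
    """Pick one endpoint with probability proportional to its weight."""
    endpoints = list(weights)
    return rng.choices(endpoints, weights=[weights[e] for e in endpoints], k=1)[0]

# Hypothetical record set: about 90% of lookups go to the stable stack,
# about 10% to a canary stack.
records = {"stable.example.adhorn.me": 90, "canary.example.adhorn.me": 10}

rng = random.Random(42)  # seeded only so the demo is repeatable
sample = [pick_endpoint(records, rng) for _ in range(10_000)]
canary_share = sample.count("canary.example.adhorn.me") / len(sample)
print(f"canary share: {canary_share:.1%}")
```

Shifting the weights gradually (for example 90/10, then 50/50, then 0/100) is one common way to roll out a new version behind the same name.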
Amazon S3
App 0.0v2
http://poliko.adhorn.me.s3-website-eu-west-1.amazonaws.com
Simple Static Website
Amazon
Route53 http://example.adhorn.me
• Cache content at the edge for
faster delivery
• Lower load on origin
• Dynamic and static content
• Streaming video
• Custom SSL certificates
• Low TTLs
Amazon CloudFront (CDN)
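The caching and TTL bullets can be sketched as a tiny TTL cache in front of an origin. This is an illustration of the idea only, not how CloudFront is implemented; the `EdgeCache` class, the paths, and the TTL values are all assumptions. A TTL of 0 means every request goes back to the origin, which is how the speaker's notes describe handling dynamic content.

```python
import time

class EdgeCache:
    """Toy edge cache: serves a stored copy until its TTL expires."""

    def __init__(self, fetch_from_origin, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._clock = clock
        self._store = {}  # path -> (expires_at, body)

    def get(self, path, ttl):
        now = self._clock()
        hit = self._store.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]              # cache hit: origin is not contacted
        body = self._fetch(path)       # cache miss: go back to the origin
        if ttl > 0:                    # ttl == 0 means "never cache"
            self._store[path] = (now + ttl, body)
        return body

origin_calls = []

def origin(path):
    origin_calls.append(path)
    return f"body of {path}"

cache = EdgeCache(origin)
cache.get("/app.js", ttl=3600)   # static asset: fetched once
cache.get("/app.js", ttl=3600)   # served from the cache
cache.get("/api/cart", ttl=0)    # dynamic content: always fetched
cache.get("/api/cart", ttl=0)
print(origin_calls)
```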
Amazon S3
Amazon
CloudFront
Amazon
Route53
App 0.0v3
Simple Static Website
http://example.adhorn.me
Custom backend
App 0.1v1
Amazon
EC2
instance
Elastic IP
User
Amazon
Route 53
EC2 backend
example.adhorn.me
54.223.92.16
App 0.1v2
Docker
Container
Elastic IP
User
Amazon
Route 53
Containerized backend
54.223.92.16
example.adhorn.me
Managed
API Gateway
cache
Amazon
CloudWatch
API Gateway
Endpoints on
Amazon EC2
Any other publicly
accessible endpoint
AWS Lambda
functions
Amazon
CloudFront
API Gateway
User Amazon
Route 53
App 0.1v3
Serverless backend
Databases
Self-managed: Amazon EC2
Fully managed: Amazon DynamoDB, Amazon RDS
Database options
NoSQL vs SQL
Why start with SQL?
• Easy to change your data access needs
• Established and well-worn technology.
• Lots of existing code, communities, books, and tools.
• You aren’t going to break SQL DBs in your first 10 million
users. No, really, you won’t.*
• Clear patterns to scalability.
*Unless you are doing something SUPER peculiar with the data or you have MASSIVE amounts of it.
…but even then SQL will have a place in your stack.
Why might you need NoSQL?
• Super low-latency applications
• Metadata-driven datasets
• Highly non-relational data
• Need schema-less data constructs*
• Rapid ingest of (unstructured) data (thousands of
records/sec)
• Massive amounts of data (again, in the TB+ range)
*Need != “It’s easier to do dev without schemas”
Application
Elastic IP
Database
User
Amazon
Route 53
App 0.2
Separate the data layer
Separation of content type
Application
Elastic IP
Database
User
Amazon
Route 53
App 0.3
Separate static assets from dynamic content
Amazon S3
Amazon
CloudFront
*.js
*.jpeg
*.mp4
Users > 1000
Availability and Redundancy
Elastic Load Balancer
• Highly available
• Ports 1 - 65535
• Health checks
• Session stickiness
• Monitoring / Logging
• Content-based routing
• Container-based apps
• WebSockets
• HTTP/2
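The "Highly available" and "Health checks" bullets boil down to one behaviour: spread requests across targets and skip the ones that fail their health check. The sketch below shows that behaviour with a round-robin loop. The `LoadBalancer` class and the target names are hypothetical, and a managed load balancer does far more than this (connection draining, stickiness, routing rules).

```python
from itertools import cycle

class LoadBalancer:
    """Toy round-robin balancer that skips unhealthy targets."""

    def __init__(self, targets, health_check):
        self._targets = list(targets)
        self._health_check = health_check
        self._ring = cycle(self._targets)

    def route(self):
        # Try each target at most once per request.
        for _ in range(len(self._targets)):
            target = next(self._ring)
            if self._health_check(target):
                return target
        raise RuntimeError("no healthy targets")

# Hypothetical health state for two web instances.
healthy = {"web-1": True, "web-2": True}
lb = LoadBalancer(healthy, healthy.get)

before_failure = [lb.route() for _ in range(4)]
healthy["web-2"] = False            # simulate an instance failing its check
after_failure = [lb.route() for _ in range(3)]
print(before_failure, after_failure)
```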
Web
Instance
RDS DB Instance
Active (Multi-AZ)
Availability Zone Availability Zone
Web
Instance
RDS DB Instance
Standby (Multi-AZ)
Load
balancer
App 0.4
Available & redundant application
User
Amazon
Route 53
Amazon
CloudFront
Amazon S3
Caching layer
Amazon ElastiCache
• Redis and Memcached Compatible
• Fully Managed
• Easily Scalable
• Transient session data
• Shared state
• High-frequency counters
• Queues
• Leaderboards
• Pub/Sub
• Lists, sets, …
In-memory data store and cache
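One common way to use an in-memory store in front of a database is the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result. The sketch below uses a plain dict in place of Redis or Memcached so it runs anywhere; the `CacheAside` class and the key names are assumptions for illustration.

```python
class CacheAside:
    """Toy cache-aside wrapper; a dict stands in for Redis/Memcached."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}

    def get(self, key):
        if key in self._cache:
            return self._cache[key]     # hit: the database is not touched
        value = self._load(key)         # miss: read from the database
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call this after writing to the database so readers see fresh data.
        self._cache.pop(key, None)

db = {"user:1": {"name": "Ada"}}
db_reads = []

def load(key):
    db_reads.append(key)
    return db[key]

users = CacheAside(load)
users.get("user:1")
users.get("user:1")                     # second read is served from the cache
db["user:1"] = {"name": "Ada L."}
users.invalidate("user:1")
fresh = users.get("user:1")
print(db_reads, fresh)
```

With a real cache you would usually also set an expiry on each entry, so that a missed invalidation cannot serve stale data forever.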
Amazon
ElastiCache
RDS DB Instance
Active (Multi-AZ)
Availability Zone Availability Zone
RDS DB Instance
Standby (Multi-AZ)
ELB
User
Amazon
Route 53
Amazon
CloudFront
Amazon S3
App 0.5
Stateless application
Web
Instance
Web
Instance
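A stateless web tier keeps session data outside the instances, so any instance can serve any request and instances can come and go. Here is a minimal sketch of that idea; the shared dict stands in for ElastiCache, and the `WebInstance` class and session ID are hypothetical.

```python
# Shared session store; in the architecture above this would be ElastiCache.
session_store = {}

class WebInstance:
    """Toy web instance that holds no session state of its own."""

    def __init__(self, name):
        self.name = name

    def add_to_cart(self, session_id, item):
        session = session_store.setdefault(session_id, {"cart": []})
        session["cart"].append(item)
        return self.name

    def view_cart(self, session_id):
        return list(session_store.get(session_id, {"cart": []})["cart"])

web_a = WebInstance("web-a")
web_b = WebInstance("web-b")

sid = "session-123"
web_a.add_to_cart(sid, "rope")      # first request lands on web-a
web_b.add_to_cart(sid, "chalk")     # next request lands on web-b
del web_a                           # scaling in removes an instance
cart = web_b.view_cart(sid)
print(cart)
```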
Auto Scaling
[Figure: weekly traffic pattern, Sunday to Saturday]
Auto Scaling
• Maintain your Amazon EC2 instance availability
• Automatically Scale Up and Down your EC2 Fleet
• Scale based on CPU, Memory or Custom metrics
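Scaling on a metric can be pictured as a proportional rule: if average CPU is 50% above the target, ask for roughly 50% more instances, within the group's limits. The function below is a simplified sketch of that rule. It is an assumption made for illustration, not the algorithm the Auto Scaling service runs internally, and all parameter names are hypothetical.

```python
import math

def desired_capacity(current, metric, target, min_size, max_size):
    """Proportional scaling rule, clamped to the group's size limits."""
    wanted = math.ceil(current * metric / target)
    return max(min_size, min(max_size, wanted))

# Four instances at 90% average CPU with a 60% target: scale out.
scale_out = desired_capacity(current=4, metric=90.0, target=60.0, min_size=2, max_size=10)
# Four instances at 15% average CPU: scale in, but never below the minimum.
scale_in = desired_capacity(current=4, metric=15.0, target=60.0, min_size=2, max_size=10)
print(scale_out, scale_in)
```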
RDS DB Instance
Active (Multi-AZ)
Availability Zone Availability Zone
RDS DB Instance
Standby (Multi-AZ)
ELB
App 0.6
Auto scaling groups
User
Amazon
Route 53
Amazon
CloudFront
Amazon S3
Web
Instances
Web
Instances
ElastiCache
Auto-Scaling group
Users > 100,000
Databases (part 1)
Read / Write Sharding
RDS DB Instance
Read Replica
App
Instance
App
Instance
App
Instance
RDS DB Instance
Master (Multi-AZ)
RDS DB Instance
Read Replica
RDS DB Instance
Read Replica
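With one master and several read replicas, the application has to decide where each statement goes. A minimal sketch, assuming a router that sends `SELECT` statements to replicas in turn and everything else to the master; the `ReplicatedDatabase` class and endpoint names are hypothetical, and real drivers or proxies usually handle this for you.

```python
import itertools

class ReplicatedDatabase:
    """Toy router: reads go to replicas in turn, writes go to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def endpoint_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._replicas)
        return self.master

db = ReplicatedDatabase("master", ["replica-1", "replica-2", "replica-3"])
first_read = db.endpoint_for("SELECT * FROM users WHERE id = 1")
write = db.endpoint_for("UPDATE users SET name = 'Ada' WHERE id = 1")
second_read = db.endpoint_for("select count(*) from users")
print(first_read, write, second_read)
```

Replicas lag slightly behind the master, so a read that must see a write the same user just made is usually sent to the master instead.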
Database Federation
Users
DB
Products
DB
App
Instance
App
Instance
App
Instance
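Federation splits the data by function, so a query that spans two functions can no longer be a single SQL join. The sketch below shows a join done in the application instead. The two dicts stand in for the Users DB and the Products DB, and the records and field names are made up for illustration.

```python
# Hypothetical in-memory stand-ins for two separate databases.
users_db = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
products_db = {
    "p-1": {"title": "Climbing rope", "seller_id": 1},
    "p-2": {"title": "Chalk bag", "seller_id": 2},
}

# The application knows which database owns which function.
DATABASE_FOR = {"users": users_db, "products": products_db}

def query(function, key):
    """Look up one record in the database that owns this function."""
    return DATABASE_FOR[function][key]

def product_with_seller(product_id):
    # Application-level join: two lookups against two databases.
    product = query("products", product_id)
    seller = query("users", product["seller_id"])
    return {"title": product["title"], "seller": seller["name"]}

listing = product_with_seller("p-2")
print(listing)
```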
Database Sharding
User ShardID
002345 A
002346 B
002347 C
002348 B
002349 A
C B A
App
Instance
App
Instance
App
Instance
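The table on this slide is a shard directory: the application looks up which shard holds a user before reading or writing. The sketch below shows that lookup, plus the key-range variant from the speaker's notes (A-H, I-M, N-Z). The dicts stand in for real databases, and the database names and sample record are assumptions.

```python
import bisect

# Directory-based sharding, using the user-to-shard table from the slide.
SHARD_DIRECTORY = {
    "002345": "A", "002346": "B", "002347": "C", "002348": "B", "002349": "A",
}
SHARDS = {"A": {}, "B": {}, "C": {}}  # each dict stands in for one database

def shard_for_user(user_id):
    return SHARDS[SHARD_DIRECTORY[user_id]]

def save_user(user_id, record):
    shard_for_user(user_id)[user_id] = record

def load_user(user_id):
    return shard_for_user(user_id)[user_id]

# Key-range sharding: names A-H, I-M and N-Z each map to one database.
RANGE_UPPER_BOUNDS = ["H", "M", "Z"]
RANGE_DATABASES = ["db-1", "db-2", "db-3"]  # hypothetical names

def database_for_name(name):
    first = name[:1].upper()
    if not "A" <= first <= "Z":
        raise ValueError(f"cannot place name {name!r}")
    return RANGE_DATABASES[bisect.bisect_left(RANGE_UPPER_BOUNDS, first)]

save_user("002346", {"name": "Ilse"})
print(load_user("002346"), database_for_name("Ilse"))
```

A directory makes it easy to move one user to another shard, at the cost of an extra lookup; key ranges need no lookup table but can become unbalanced if names cluster.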
Users > 1,000,000
Asynchronous patterns
Message passing
A
Queue
B
A
Queue
B
Listener
Pub-Sub
SNS, SQS, Redis, RabbitMQ
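Publish/subscribe differs from a plain queue in that every subscriber to a topic receives its own copy of each message. A minimal in-process sketch follows; the `Broker` class and the topic name are hypothetical, and services such as SNS add delivery retries, filtering, and fan-out across machines.

```python
from collections import defaultdict

class Broker:
    """Toy in-process pub/sub broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        handlers = self._subscribers.get(topic, ())
        for handler in handlers:
            handler(message)            # every subscriber gets the message
        return len(handlers)

broker = Broker()
emails, audit_log = [], []
broker.subscribe("user.signed_up", emails.append)
broker.subscribe("user.signed_up", audit_log.append)

delivered = broker.publish("user.signed_up", {"user": "002345"})
ignored = broker.publish("user.deleted", {"user": "002345"})  # no subscribers
print(delivered, ignored)
```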
Async. Architecture (part 1)
Web
Instances
Worker
Instance
Worker
Instance
Queue
API
Instance
API
Instance
API
Instance
API: {DO foo}
PUT JOB: {JobID: 0001, Task: DO foo}
API: {JobID: 0001}
GET JOB: {JobID: 0001, Task: DO foo}
ElastiCache
Result:
{
JobID: 0001,
Result: bar
}
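The request flow on this slide (the API accepts a job, returns a JobID, a worker picks the job up and stores the result) can be sketched with the standard library. Here `queue.Queue` stands in for the queue service and a dict stands in for the ElastiCache result store; the `TASKS` table and the function names are assumptions made for the example.

```python
import queue
import threading

jobs = queue.Queue()   # stands in for SQS or another queue service
results = {}           # stands in for the ElastiCache result store

# Hypothetical task table: maps a task name to the work a worker performs.
TASKS = {"DO foo": lambda: "bar"}

def api_submit(job_id, task):
    """API instance: enqueue the job and return immediately with its ID."""
    jobs.put({"JobID": job_id, "Task": task})
    return {"JobID": job_id}

def api_result(job_id):
    """API instance: return the stored result, or None if not ready yet."""
    return results.get(job_id)

def worker():
    """Worker instance: pull jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            output = TASKS[job["Task"]]()
            results[job["JobID"]] = {"JobID": job["JobID"], "Result": output}
        finally:
            jobs.task_done()

thread = threading.Thread(target=worker)
thread.start()

ticket = api_submit("0001", "DO foo")
jobs.join()                      # demo only: wait until the job is processed
outcome = api_result(ticket["JobID"])
jobs.put(None)                   # tell the worker to stop
thread.join()
print(ticket, outcome)
```

In a real deployment the client would poll for the result or, as in part 2, be notified when it is ready, rather than waiting on the queue.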
Async. Architecture (part 2)
Worker
Instance
Worker
Instance
Queue
API
Instance
API
Instance
API
Instance
ElastiCache
Amazon SNS
Push Notification
User
RDS DB Instance
Active (Multi-AZ)
Availability Zone
Elastic Load
Balancer
Web
Instance
Web
Instance
Amazon
Route 53
User
Amazon S3
Amazon
CloudFront
ElastiCache
Worker
Instance
Worker
Instance
App 0.7
Decoupling
Queue Amazon SNS
Event-driven patterns
Event driven
A B C
Event on B by A triggers C
How Lambda works
S3 event
notifications
DynamoDB
Streams
Kinesis
events
Cognito
events
SNS
events
Custom
events
CloudTrail
events
Lambda
DynamoDB
Kinesis S3
Any custom
Invoked in response to events
- Changes in data
- Changes in state
Redshift
SNS
Access any service,
including your own
Such as…
Lambda functions
CloudWatch
events
Event-driven using Lambda
AWS Lambda:
Resize Images
Users upload photos
S3:
Source Bucket
S3:
Destination Bucket
Triggered on
PUTs
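A Lambda function triggered by S3 receives an event that lists the objects that were written. The handler below is a sketch of the image-resize flow under stated assumptions: the bucket names are hypothetical, and `resize_image` is a placeholder that only computes the destination path, since real resizing would need an imaging library and S3 client calls that are left out here. The event fields used (`Records`, `s3.bucket.name`, `s3.object.key`) follow the S3 notification format, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus

DESTINATION_BUCKET = "example-resized-photos"  # hypothetical bucket name

def resize_image(source_bucket, key, destination_bucket):
    """Placeholder: a real version would download, resize, and upload."""
    return f"s3://{destination_bucket}/thumbnails/{key}"

def handler(event, context):
    """Entry point invoked once per S3 notification event."""
    outputs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys are URL-encoded
        outputs.append(resize_image(bucket, key, DESTINATION_BUCKET))
    return {"resized": outputs}

# Trimmed-down sample event for a local test run.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-uploaded-photos"},
                "object": {"key": "holiday/my+photo.jpg"},
            },
        }
    ]
}
response = handler(sample_event, context=None)
print(response)
```

Writing the output to a different bucket than the source, as the slide shows, avoids the function triggering itself on its own output.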
Micro-Services
Amazon
Route 53
User
Amazon
CloudFront
API Edge Service
Product Listing
Service
Recommendation
Service
Any
Service
Auth.
Service
Databases (part 2)
Specialized Database
NoSQL Graph DB
Database specialization example: Redis
In-memory data structure store, used as a database, cache and message
broker.
Specialized in data structures such as
• strings
• hashes
• lists
• sets
• sorted sets with range queries
• bitmaps
• hyperloglogs
• geospatial indexes with radius queries
Users = 10,000,000
Happy Scaling!


Editor's Notes

  • #6-#9: Invest time to save time
  • #12: We need some basics as a foundation on which to build our knowledge of AWS.
  • #13: Latency regulation
  • #16: Simplicity. Durability. Scalability. Security. Broad integration with other AWS services. Cloud data migration options. Tiered storage classes.
  • #20: CloudFront allows you to cache static content at the CloudFront edge for faster delivery from a local POP to the end user. In other words, your static content gets cached close to the user and is delivered locally, reducing download times for the website overall. There are over 60 CloudFront cache POPs around the world, as we mentioned earlier. CloudFront helps lower load on your origin infrastructure. You can front static content as discussed, and dynamic content as well. For dynamic content, CloudFront proxies and accelerates your connection back to your dynamic origin, and you would set a TTL of 0 on your dynamic content so CloudFront always goes back to the origin to fetch it.
  • #23: This is the most basic setup you would need to serve a web application. Any user would first hit Route 53 for DNS resolution. Behind the DNS service is an EC2 instance running our web app and database on a single server. We will need to attach an Elastic IP so Route 53 can direct traffic to our web stack at that IP address with an A record. To scale this infrastructure, the only real option we have is to get a bigger EC2 instance…
  • #24: Same with Docker (application container…)
  • #27: At AWS there are a lot of different options for running databases. One is to just install pretty much any database you can think of on an EC2 instance, and manage all of it yourself. If you are really comfortable doing DBA-like activities, like backups, patching, security, and tuning, this could be an option for you. Also, if you need something highly specialized or customized and need to manage the hardware to achieve this, again this might be for you. If not, then we have a few options that we think are a better idea. First is Amazon RDS, or Relational Database Service. With RDS you get a managed database instance of either MySQL, Oracle, Postgres or SQL Server, with features such as automated daily backups, simple scaling, patch management, snapshots and restores, high availability, and read replicas, depending on the engine you go with. We also have Aurora in Preview today. Amazon Aurora is a MySQL-compatible relational database that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. Aurora provides up to five times better performance than MySQL at a price point one tenth that of commercial relational databases, while delivering similar performance and availability. Next up we have DynamoDB, a NoSQL database built on top of SSDs. DynamoDB is based on the Dynamo whitepaper published by Amazon.com back in 2007. This whitepaper was considered the grandfather of most modern NoSQL databases like Cassandra. DynamoDB is kind of like a cousin of the original paper, or an evolution of that whitepaper. One of the key concepts of DynamoDB is what we call “Zero Administration”. With DynamoDB the only knobs to tweak are the reads and writes per second you want the DB to be able to perform at. You set it, and it will give you that capacity with query responses averaging in single-digit milliseconds. We’ve had customers with loads such as half a million reads and writes per second without DynamoDB even blinking.
  • #29: So why start with SQL databases? Generally speaking, SQL-based databases are established and well-worn technology. There’s a good chance SQL is older than most people in this room. It has, however, continued to power most of the largest web applications we deal with on a daily basis. There is a lot of existing code, books, tools, communities, and people who know and understand SQL. Some of the newer NoSQL databases might have only a handful of companies using them at scale. People are key here, in addition to all of the other points, as you may need to hire people to manage your database. You also aren’t going to break SQL databases in your first 10 million users. And yes, there is an asterisk here, and we’ll get to that in a second. Lastly, there are a lot of clear patterns for scalability that we’ll discuss throughout this talk. As for my point here at the bottom, I again strongly recommend SQL-based technology unless your application is doing something SUPER weird with the data or you’ll have MASSIVE amounts of it. Even then, SQL will be in your stack.
  • #30: So why else might you need NoSQL? There are definitely use cases where it makes sense to go NoSQL right off the bat. Some examples: super low-latency applications; metadata-driven datasets; highly non-relational data. Going along with the previous one is where you really need schema-less data constructs, and let’s highlight the word NEED here: this isn’t just developers saying it’s easy to make apps without schemas. That’s just laziness. Massive amounts of data, again from the previous slide, in the several-TB range. Rapid ingest of data, where you need to ingest potentially thousands of records per second into a single dataset.
  • #31: So for this scenario today, and based upon our discussion, we’re going to go with RDS and MySQL as our database engine.
  • #33: What’s the biggest problem with this one?
  • #36: Provides additional visibility into the health of the target instances and containers. Performs and reports on health checks on a per-port basis.
  • #37: Next up we need to address the lack of failover and redundancy in our infrastructure. We’re going to do this by adding another web app instance and enabling the Multi-AZ feature of RDS, which will give us a standby instance in a different AZ from the primary. We’re also going to replace our EIP with an Elastic Load Balancer to share the load between our two web instances. Now we have an app that is a bit more scalable and has some fault tolerance built in as well.
  • #39: We could use ElastiCache as a place to store common database query information for content that doesn’t change often, like information on our user, or what is in their cart. We should try to do this as often as possible. So what is ElastiCache? ElastiCache is hosted Memcached or Redis. It speaks the same API as the traditional open source products, so think of this as Memcached or Redis as a service where we manage the clusters for you. You can scale from one to many nodes. This provides very fast single-digit-millisecond latencies as well. Managed: simplifies and offloads the management, monitoring, and operation of in-memory cache environments. Compatible: most client libraries will work with the respective engines they were built for, with no additional changes or tweaking required. Monitored: detailed monitoring statistics for the engine nodes at no extra cost via Amazon CloudWatch. There is no persistence or replication with Memcached. With Redis, you can put a replica in a different AZ, with persistence.
  • #40: We can also move things like session information to ElastiCache. We can also use ElastiCache to store some of our common database query results, which will prevent us from hitting the database too much. This should take load off our DB tier. Removing session state from our web / app tier is also key, as it allows us to scale up and down without losing session information when this horizontal scaling happens. This is called making our tier “stateless”.
  • #42: At this stage, you typically start to have a significant amount of traffic and you can already detect usage patterns. These are amazon.com usage patterns: East and West Coast…
  • #46: CAP theorem
  • #48: Writes and updates. Counters! Not on the DB: use Redis!
  • #49: Database Federation is where we break up the database by function. In our example, we have broken out the Forums DB from the User DB from the Products DB. Of course, cross-functional queries are harder to do, and you may need to do your joins at the application layer for these types of queries. This will reduce our database footprint for a while, and the great thing is that it prevents you from having to shard until much further down the line. This isn’t going to help for single large tables; for those we will need to shard.
  • #50: Sharding is where we break up that single large database into multiple DBs. We might need to do this because of database or table size, or potentially for high write IOPS as well. Here is an example of us breaking up a database with a large table into 3 databases. The slide shows where each userID is located, but the easiest way to describe how this would work is the example where all users with A-H go into one DB, I-M go into another, and N-Z go into the third DB. Typically this is done by key space, and your application has to be aware of where to read from, update, and write to for a particular record. ORM support can help here. This does create operational complexity, so if you can federate first, do that. This can be done with SQL or NoSQL, and DynamoDB does this for you under the covers on the backend as your data size increases and the reads / writes per second scale.
  • #64: When we start getting into the 5M-plus user range, we may start seeing database contention issues on writes to the master. We are going to drill into a couple of techniques to solve these types of issues, including federation and sharding.