Building A Scalable Web Application
When I left the cool confines of the college world for my first foray into the workforce, people were just starting to come to terms with the web. In those days, having a web page, let alone a web application, was not a necessity, and the technology was in its infancy. There was no cloud; companies ran their own data centers or on-premises server closets, and Netscape was the browser of choice.
Fast forward to today, and a business without a web, mobile or some other kind of digital presence is practically unheard of. And if you have a digital presence, especially a web app, you have to take scalability into account.
Why does a web application need to be scalable?
Today almost everyone is on the web, and any application worth its salt is either exclusively on the web or has a significant presence there. This means that if you are developing an application for the web (or already have one), you want it to be able to handle traffic without taking a performance hit, or worse yet, crashing. With that in mind, you have to design your application for scalability, so that the app remains adaptable regardless of load, traffic spikes, outages or disasters.
The Pros of Constructing a Scalable Web App
If you want to handle increased user traffic without fear, minimize maintenance and maximize uptime, your app needs to scale. If you want to remain competitive, provide the best user experience and give your business room to grow, you have to scale. If you want to remain secure, adaptable and responsive in the face of disaster, you have to scale. In fact, I can't think of any good reason not to scale! If you are still on the fence, the rest of this article lays out the case.
So What Is Scalable Web Architecture?
Quite simply, when you are building an application that is expected to handle innumerable requests, you want to avoid single points of failure that can slow down or bring down the entire application. You want the application to adapt dynamically to increased traffic, security compromises or disasters with little or no impact on the end user.
In the simplest of terms, a web application is composed of three tiers: a presentation layer, an application layer and a data layer.
When you are building a scalable web application, the key word is adaptability. For a web application to be adaptable, it must be segmented so the load can be distributed. Early web applications were broken into three tiers: the presentation layer, or UI; the application layer, where the business logic lives; and the data layer, where the databases reside.
Presentation Tier
This is where the application is presented via a browser to your client, whether on a laptop, tablet or mobile device. As the user interface (UI) for your application, this is where front-end validations (HTML, CSS, XML & JavaScript) should occur, and it is normally your first line of defense, with SSL and a login screen.
Application Tier
The web server (Apache, Nginx or IIS) bridges the gap between the presentation layer and the application server (Python, .NET, React, J2EE, etc.). This is typically where the business logic lives, and it is a good spot to do backend validations before handing data to the data tier.
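To make that last point concrete, here is a minimal sketch of application-tier validation before anything is handed to the data tier. The field names and rules are hypothetical; a real application would validate against its own schema.

```python
# A minimal sketch of application-tier (backend) validation.
# The field names and rules below are hypothetical examples.

def validate_order(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the data is safe to persist."""
    errors = []
    if not payload.get("customer_id"):
        errors.append("customer_id is required")
    try:
        quantity = int(payload.get("quantity", 0))
        if quantity <= 0:
            errors.append("quantity must be a positive integer")
    except (TypeError, ValueError):
        errors.append("quantity must be a number")
    return errors

if __name__ == "__main__":
    print(validate_order({"customer_id": "c-42", "quantity": "3"}))  # []
    print(validate_order({"quantity": -1}))                          # two errors
```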
Data Tier
The heart of any application lies with the data. In this layer you have your database servers, NoSQL databases, data lakes, object storage, etc. This is where your application sends and receives the information it needs to function properly, and as such it is the make-or-break layer: if it fails, the entire application is affected.
Scaling
If your application is simple, with average traffic, say a couple of thousand requests a day, you can manage with a 3-tier design, separating presentation logic from business logic and database logic and letting the web server manage the traffic for you. If you hit any limitations, they will probably relate to memory, CPU, network or disk space, and increasing those is called vertical scaling.
Vertical Scaling
With vertical scaling you are essentially throwing more hardware at increased traffic: more memory, more disk space, more CPU, more network capacity. This works in the short term, but there is certainly an upper limit to how much you can grow a single server (or VM instance). For smaller applications, this type of scaling may make sense.
Horizontal Scaling
Horizontal scaling works by scaling out rather than up. If a web server instance is struggling to manage its load, instead of adding memory, you create a new replicated instance of the web server and distribute the network traffic evenly between the two. In theory, there is no limit to the load a horizontally scaled application can handle.
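As a toy illustration of the idea, assuming two replicated instances named web-1 and web-2, requests can simply be rotated between them rather than piling onto one bigger machine:

```python
# A toy round-robin dispatcher across replicated web server instances.
# The instance names are placeholders.
from itertools import cycle

instances = ["web-1", "web-2"]            # replicated web server instances
next_instance = cycle(instances).__next__

def route(request_id: int) -> str:
    target = next_instance()
    return f"request {request_id} -> {target}"

for i in range(4):
    print(route(i))   # alternates between web-1 and web-2
```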
Database Scaling
With relational databases, there is an upper limit to how much memory and disk space you can throw at an instance before performance suffers again, so many database administrators implement database replication or sharding as a way of scaling.
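Sharding in particular splits the data itself across servers. A minimal sketch of the routing piece, with hypothetical shard names, might look like this: each record's key is hashed, and the hash determines which shard holds it.

```python
# A minimal sketch of hash-based shard routing.
# The shard hosts are placeholders.
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]

def shard_for(key: str) -> str:
    # Stable hash so the same key always maps to the same shard
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

print(shard_for("customer-1001"))
print(shard_for("customer-2002"))
```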
How Do I Build a Scalable App?
Well, it comes down to a few simple steps, and they start with choosing the right scaling methods.
What are some of the best scaling methods?
There are really two things to keep in mind when you are building a highly scalable application. The first is that you should prioritize the horizontal approach. There is a ceiling to how much memory, CPU and disk space you can add to one server, compared with having N servers that can be added dynamically in response to traffic if you host your solution in the cloud.
The second is to use a microservice architecture. All major cloud providers support microservices, and having multiple independent modules makes everything easier to build, deploy and scale. In addition, many serverless architectures that support microservices have scalability built in automatically. Here are some other scaling techniques to keep in mind.
Independent Nodes
Separate your application's features and nodes into multiple modules, allowing you to manage each one individually and easily expand the application's functionality while avoiding conflicts between nodes and functions.
Caches
Implement a global cache for each requesting node, which allows your application to find information without overloading the database. Also make use of a Content Delivery Network (CDN), which caches data at edge locations, reducing latency and load times for your application and increasing its overall performance.
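The cache-aside pattern is the usual way this looks in code: check the cache first and only fall through to the database on a miss. In this sketch the fetch_from_database function is a hypothetical stand-in for a real query.

```python
# A minimal cache-aside sketch: consult the cache before the database.
import time

cache: dict[str, tuple[float, str]] = {}
TTL_SECONDS = 60

def fetch_from_database(key: str) -> str:
    return f"value-for-{key}"          # stand-in for a real (slow) query

def get(key: str) -> str:
    entry = cache.get(key)
    if entry and time.time() - entry[0] < TTL_SECONDS:
        return entry[1]                # cache hit: the database is never touched
    value = fetch_from_database(key)   # cache miss: query once, then remember it
    cache[key] = (time.time(), value)
    return value

print(get("user:42"))   # miss, hits the database
print(get("user:42"))   # hit, served from the cache
```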
Load Balancers
Load balancers are what put horizontal scaling to work. Paired with auto scaling, they can bring new nodes, serverless functions or new instances online in response to increasing load and scale everything back down when the load decreases.
Queues
Queues solve the problem of a customer being stuck waiting until your web application finishes processing a request. The request is placed on a queue and processed in the background; the user can do something else and come back later to see whether the application has completed it.
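A minimal in-process sketch of that decoupling, using Python's standard library rather than a real message broker: the request handler enqueues a job and returns immediately, while a background worker drains the queue.

```python
# A minimal sketch of decoupling slow work from the request path with a queue.
import queue
import threading
import time

jobs: queue.Queue = queue.Queue()

def worker() -> None:
    while True:
        job = jobs.get()
        time.sleep(0.1)                      # stand-in for slow processing
        print(f"finished job {job}")
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(job_id: int) -> str:
    jobs.put(job_id)                         # hand the work off
    return f"job {job_id} accepted"          # respond without waiting

print(handle_request(1))
print(handle_request(2))
jobs.join()                                  # only so this demo exits after the work is done
```

In production the in-memory queue would be replaced by a managed service (SQS, RabbitMQ, etc.), but the shape of the code stays the same.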
Indexing
Indexing is important for database management. You can think of an index as the database keeping a lookup structure for certain columns; when a query comes in, the index points it to the right location without the database having to scan the entire table. Used properly, indexes can greatly improve response times; overused, however, they can have the opposite effect, since every write also has to update them.
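A small, self-contained demonstration using SQLite (chosen here only because it ships with Python): after the index is created, the query plan shows the lookup using the index instead of scanning the whole table.

```python
# Demonstrating an index with SQLite's EXPLAIN QUERY PLAN.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(10_000)],
)
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# The plan mentions idx_users_email rather than a full 'SCAN users'
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?",
    ("user5000@example.com",),
).fetchall()
print(plan)
```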
Modern Web Application Design
Modern web applications follow the same basic 3-tier approach of presentation layer, application layer and data layer. The difference is that each layer is further partitioned into microservices, containers, serverless functions and other managed services, all of which have scalability built in.
Since I am most familiar with AWS, let's take a look at a modern web application design on the AWS platform.
Presentation Tier Breakdown
Regardless of the development frameworks (React, React Native, Django, Ruby on Rails, etc.) you use to build your application, you want communication with your user interface to be flexible: able not only to handle increased traffic, but to disperse it across the various tiers of your application while keeping response times small.
For that first leg of the user journey, we make use of AWS Route 53, a highly scalable Domain Name System (DNS) service that routes traffic to our Content Delivery Network (CDN), AWS CloudFront. Coupled together, these highly scalable services can disperse traffic automatically to healthy instances (bypassing busy or unhealthy ones) and cache data at edge locations, ensuring quicker response times.
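If you wire this up programmatically, a hedged sketch with boto3 might look like the following. The hosted zone ID, domain name and distribution domain are placeholders; Z2FDTNDATAQYW2 is the fixed hosted zone ID that CloudFront alias records use.

```python
# A hedged sketch: point a Route 53 record at a CloudFront distribution.
import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="ZEXAMPLE123",            # placeholder hosted zone for your domain
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com",  # placeholder domain
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": "Z2FDTNDATAQYW2",          # CloudFront alias zone
                    "DNSName": "d1234abcd.cloudfront.net",     # placeholder distribution
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    },
)
```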
Capping off the presentation layer is our use of Amazon S3, a highly durable storage service that automatically disperses data across availability zones within a region. We can use this service to effectively store any kind of media that the application utilizes with the knowledge that those objects will be replicated and maintained even if an availability zone goes completely down, thereby ensuring that our application will remain functional in the face of an outage.
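Storing that media is a one-liner with boto3; the bucket name and file paths below are placeholders.

```python
# A minimal sketch of storing application media in S3.
import boto3

s3 = boto3.client("s3")
s3.upload_file("logo.png", "my-app-media-bucket", "images/logo.png")  # placeholder bucket
```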
Application (Logic) Tier Breakdown
In the application layer, we disperse our web server instances across two separate availability zones and use auto scaling and load balancing to detect when traffic hits certain thresholds. When those thresholds are crossed (CPU usage, bandwidth, latency, etc.), the load balancers can trigger the auto scalers to bring new web server instances online while traffic is high and to scale those instances back down to default levels when the traffic lessens. This ensures not only that increased traffic is handled, but also that we are not incurring charges for running instances that aren't doing anything.
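One way to express that rule is a target-tracking scaling policy, sketched below with boto3. The group name, policy name and target value are placeholder assumptions; the policy tells AWS to add instances when the group's average CPU climbs above the target and remove them when it falls.

```python
# A hedged sketch of a target-tracking auto scaling policy.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier-asg",        # placeholder auto scaling group
    PolicyName="keep-cpu-near-50-percent",      # placeholder policy name
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,                    # placeholder target
    },
)
```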
To further add to the flexibility of the application tier, we break these web server instances apart into separate virtual private cloud (VPC) subnets, each with its own internal IP CIDR range (10.0.0.0/24, 10.0.2.0/24, etc.), ensuring that none of the ranges overlap so that new web servers can be brought online without IP conflicts. These subnets are public and sit within a larger VPC that spans both availability zones, including the data tier.
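Checking that those ranges really don't overlap is easy to automate with Python's standard ipaddress module; the CIDRs below mirror the ones mentioned above.

```python
# Verify that subnet CIDR ranges do not overlap.
import ipaddress

subnets = [
    ipaddress.ip_network("10.0.0.0/24"),
    ipaddress.ip_network("10.0.2.0/24"),
]

for a in subnets:
    for b in subnets:
        if a is not b and a.overlaps(b):
            raise ValueError(f"{a} overlaps {b}")
print("no overlapping subnet ranges")
```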
Data Tier Breakdown
To limit exposure and increase security, we create two private subnets, one in each availability zone, to contain the application's database sitting on Amazon RDS (Relational Database Service). With Amazon RDS, we set up database replication so that an exact replica of the primary database is ready to step in and take over application duties should the primary go down for any reason.
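In RDS terms this is a Multi-AZ deployment, which keeps a synchronous standby in a second availability zone and fails over automatically. A hedged boto3 sketch, with placeholder identifiers, credentials and sizes:

```python
# A hedged sketch of provisioning a Multi-AZ RDS instance.
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="app-primary-db",   # placeholder identifier
    Engine="postgres",
    DBInstanceClass="db.t3.medium",          # placeholder size
    AllocatedStorage=100,
    MasterUsername="app_admin",              # placeholder credentials
    MasterUserPassword="change-me-please",
    MultiAZ=True,                            # provision the standby replica
)
```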
These databases are not exposed to the internet, and security groups ensure that connections to them can only come from specific IP ranges and specific ports known only to the application.
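The matching security group rule, sketched below with boto3, admits only the application subnet on the database port. The group ID, CIDR and port are placeholder assumptions.

```python
# A hedged sketch of locking database access down to the application subnet.
import boto3

ec2 = boto3.client("ec2")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",          # placeholder security group
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5432,                    # placeholder database port (PostgreSQL)
        "ToPort": 5432,
        "IpRanges": [{"CidrIp": "10.0.0.0/24", "Description": "app subnet only"}],
    }],
)
```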
Taken together, this ensures not only that the application scales with increased traffic, but also that it remains resilient in the face of disasters, network outages and security breaches, while keeping the cost footprint small.
What now?
The tools and methods I have discussed only scratch the surface of what is possible. Amazon alone provides over 200 services, many of them scalable, to handle whatever workload you can dream up. Other providers, like Azure or Google, have similar offerings with similar architectures, but they all have a few things in common, namely:
With those three points in mind, you are well on your way to designing and implementing a highly scalable and reliable application. Of course, if you need some more guidance, here at Secure Wave Consulting we are more than happy to help.