AWS Auto Scaling Groups (ASG) with Terraform: A Deep Dive (Part 1)
Introduction
AWS Auto Scaling Groups (ASGs) are a foundational component for building scalable and highly available cloud infrastructure. An ASG helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You create collections of EC2 instances, called Auto Scaling groups, and specify the minimum, maximum, and desired capacity for each group.
When you use Auto Scaling groups, you can:
Automatically maintain a specified number of instances even if an instance becomes unhealthy
Dynamically scale your EC2 capacity out or in automatically based on demand
Distribute instances across multiple Availability Zones for high availability
Save costs by removing excess capacity during periods of low demand
Terraform, an Infrastructure as Code (IaC) tool, is a preferred way to set up ASGs over manual configuration in the AWS console because it brings consistency, version control, and automation to your deployments. Why manually configure scaling in the wee hours of the morning when you can define it in code once and reuse it across environments? (There's a DevOps joke that friends don't let friends do ClickOps in production 😄 Terraform helps us avoid that.)
In this guide, we'll explore how to effectively implement Auto Scaling groups using Terraform, covering everything from basic setup to advanced configurations and best practices.
Core Concepts
What is an Auto Scaling Group (ASG)?
An Auto Scaling Group in AWS is a collection of EC2 instances that are treated as a logical grouping for the purposes of automatic scaling and management. You specify the minimum, maximum, and desired capacity for your group. Amazon EC2 Auto Scaling ensures your group never goes below the minimum or above the maximum size, while maintaining the desired capacity you set. This automation allows your application to scale out (add instances) when demand increases and scale in (remove instances) when demand decreases, automatically maintaining application availability while controlling costs. ASGs also distribute instances across multiple Availability Zones for high availability, protecting your applications from failures in a single location.
Key Components of ASGs:
Launch Templates (recommended) or Launch Configurations: A launch template defines the configuration for EC2 instances in your ASG. It includes the Amazon Machine Image (AMI) ID, instance type, key pair, security groups, block device mapping, and other settings needed to launch instances. Launch templates provide more functionality than launch configurations and support versioning.
Instance Types and Purchase Options: You can run specific EC2 instance types (t3.micro, m5.large, etc.) or use a flexible instance type selection with the mixed instances policy. ASGs support both On-Demand and Spot Instances, allowing you to optimize for cost savings while maintaining performance. You can specify the percentage of group capacity to launch as On-Demand vs. Spot Instances.
Elastic Load Balancing Integration: While optional, ASGs commonly work with Elastic Load Balancing to distribute traffic to healthy instances. When you attach a load balancer to your ASG, new instances are automatically registered with the load balancer, and unhealthy instances are deregistered before termination. This ensures seamless traffic handling during scaling events.
Scaling Policies: These define when and how your ASG adjusts capacity. Target Tracking Scaling maintains a specific metric (like average CPU utilization) at your target value. Step Scaling applies different capacity adjustments depending on how far a metric breaches an alarm threshold. Simple Scaling applies a single capacity adjustment when an alarm fires, then waits for a cooldown period before acting again. Scheduled Scaling adjusts capacity at set times for predictable load changes. Predictive Scaling uses machine learning to forecast capacity needs.
Health Checks: ASGs perform health checks to determine if instances are healthy. By default, they use EC2 status checks, but you can also enable Elastic Load Balancing health checks to ensure your instances are not only running but also serving traffic properly.
Lifecycle Hooks: These allow you to perform custom actions when instances launch or terminate, such as installing software or extracting logs before termination.
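To make the lifecycle hook idea concrete, here is a minimal Terraform sketch of a termination hook. It assumes an existing Auto Scaling group whose name is passed in through a hypothetical `asg_name` variable. The hook name and the 5-minute timeout are illustrative values, not settings taken from this guide.

```hcl
# Hypothetical input: the name of an Auto Scaling group that already exists.
variable "asg_name" {
  description = "Name of the Auto Scaling group to attach the hook to"
  type        = string
}

# Pause instances in the Terminating:Wait state so that an external process
# (for example, a script that copies logs off the instance) has time to run.
resource "aws_autoscaling_lifecycle_hook" "extract_logs" {
  name                   = "extract-logs-before-terminate" # illustrative name
  autoscaling_group_name = var.asg_name
  lifecycle_transition   = "autoscaling:EC2_INSTANCE_TERMINATING"

  # Illustrative values: wait up to 5 minutes, then proceed with termination
  # even if nothing completed the lifecycle action.
  heartbeat_timeout = 300
  default_result    = "CONTINUE"
}
```

A launch-time hook would use `autoscaling:EC2_INSTANCE_LAUNCHING` as the transition instead.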
EC2 Auto Scaling vs. Auto Scaling Groups
The terminology can be confusing. Amazon EC2 Auto Scaling is the service that provides scalability for EC2 instances, while an Auto Scaling Group is a specific resource within that service. In simpler terms, EC2 Auto Scaling is the overall service (which includes features for creating groups, defining policies, performing health checks, etc.), and an Auto Scaling group is the actual collection of EC2 instances that you create and manage.
Amazon EC2 Auto Scaling involves two primary components for instance configuration:
Launch Templates (or legacy Launch Configurations): These define the blueprint for your instances—specifying the AMI, instance type, key pair, security groups, and block device mapping for your EC2 instances. Launch templates are more versatile than launch configurations, supporting versioning and offering more features.
Auto Scaling Group: This defines the scaling parameters—specifying the VPC and subnets where instances are launched, the minimum, maximum, and desired capacity, and the scaling policies that determine when to scale the group out or in.
AWS also offers Application Auto Scaling for services beyond EC2, including Amazon ECS, DynamoDB, and Aurora. Amazon EC2 Auto Scaling, by contrast, specifically manages the automatic scaling of EC2 instances through Auto Scaling groups.
It's worth noting that as of January 1, 2023, new instance types are no longer supported in launch configurations, and AWS recommends migrating to launch templates for all new and existing Auto Scaling groups to ensure access to the latest features and instance types.
https://docs.aws.amazon.com/autoscaling/ec2/userguide/launch-configurations.html
Setting Up ASG with Terraform
Infrastructure as Code (IaC) has become essential for managing cloud resources, and Terraform is one of the most popular tools for this purpose. Terraform allows you to define AWS resources like Auto Scaling Groups in declarative configuration files, enabling consistent, repeatable deployments across environments while maintaining version control.
Why Terraform for AWS Auto Scaling Groups?
Terraform provides several advantages for managing ASGs:
Consistency: Define your infrastructure once and deploy it consistently across development, staging, and production
Version control: Track changes to your infrastructure over time using Git or other VCS
Automation: Integrate with CI/CD pipelines for automated deployment
State management: Terraform tracks the current state of your infrastructure, making updates and changes predictable
Provider ecosystem: Easily integrate with other AWS services and third-party providers
For larger organizations or complex infrastructure, Terragrunt serves as an excellent thin wrapper around Terraform that provides additional benefits:
DRY configurations: Reduce repetition in your Terraform code, for example by defining remote state configuration once and reusing it
Multi-account management: Simplify working across multiple AWS accounts and regions
Dependencies: Manage dependencies between Terraform modules more efficiently
Parallel execution: Run Terraform commands in parallel across multiple modules
We'll cover Terragrunt in depth in Part 2, but for now, let's focus on implementing AWS Auto Scaling Groups with vanilla Terraform.
Getting Started with Terraform for ASGs
In the following sections, we'll walk through different Terraform implementations for Auto Scaling Groups, starting with basic setups and progressing to more complex configurations. We'll cover:
Basic ASG Setup (Launch Template + ASG)
ASG with Load Balancer integration
Mixed instances policy for cost optimization
Advanced scaling policies and health checks
Let's start with a basic ASG configuration. (Note: the code in this guide consists of high-level example snippets, not a copy-and-paste walkthrough.)
1. Basic ASG Setup (Launch Template + ASG)
In this simple example, we create a launch template for our EC2 instances and then an Auto Scaling Group that uses that template. We specify the desired capacity and size bounds, along with the subnets for the instances.
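A minimal sketch of such a setup might look like the following. The `ami_id` and `subnet_ids` variables, the resource names, and the `t3.micro` instance type are assumptions made for illustration. The provider version constraint is also an assumption, not a requirement stated in this guide.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed provider version
    }
  }
}

# Hypothetical inputs: supply values that are valid in your own account.
variable "ami_id" {
  description = "AMI ID to launch"
  type        = string
}

variable "subnet_ids" {
  description = "Subnet IDs in at least two Availability Zones"
  type        = list(string)
}

# Blueprint for the instances: AMI, instance type, and tags.
resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = var.ami_id
  instance_type = "t3.micro" # illustrative instance type

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "app-asg-instance"
    }
  }
}

# The group itself: size bounds, desired capacity, and subnets.
resource "aws_autoscaling_group" "app" {
  name_prefix         = "app-asg-"
  min_size            = 1
  max_size            = 3
  desired_capacity    = 1
  vpc_zone_identifier = var.subnet_ids

  # Rely on EC2 status checks only (the default behavior).
  health_check_type = "EC2"

  launch_template {
    id      = aws_launch_template.app.id
    version = "$Latest"
  }
}
```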
This configuration creates a simple Auto Scaling group that maintains between 1 and 3 instances, distributed across multiple Availability Zones for high availability. The launch template defines the instance type, AMI, and other configurations for the EC2 instances.
The ASG will automatically:
Launch new instances if the count falls below the minimum size
Terminate instances if the count exceeds the maximum size
Maintain the desired capacity (1 instance in this example)
Replace unhealthy instances that fail EC2 health checks
Distribute instances across the specified subnets
Next, we'll explore how to integrate this ASG with a load balancer to distribute traffic among the instances.
2. ASG with Load Balancer Integration
In most production scenarios, you'll want to distribute traffic across your Auto Scaling group instances using a load balancer. This setup improves both availability and performance by routing requests to healthy instances and allowing for seamless scaling.
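The following standalone sketch shows one way to wire an Application Load Balancer to an ASG. All variables (`vpc_id`, the subnet lists, `alb_security_group_id`, `launch_template_id`) are hypothetical inputs, and the health check thresholds are illustrative. This sketch attaches the target group inline through `target_group_arns` only and leaves out a separate `aws_autoscaling_attachment` resource.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "vpc_id" { type = string }
variable "public_subnet_ids" { type = list(string) }
variable "private_subnet_ids" { type = list(string) }
variable "alb_security_group_id" { type = string }
variable "launch_template_id" { type = string }

# Entry point for traffic.
resource "aws_lb" "app" {
  name               = "app-alb"
  load_balancer_type = "application"
  internal           = false
  security_groups    = [var.alb_security_group_id]
  subnets            = var.public_subnet_ids
}

# Where the ALB sends traffic, and how it decides an instance is healthy.
resource "aws_lb_target_group" "app" {
  name     = "app-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    path                = "/" # illustrative health check settings
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 30
    timeout             = 5
  }
}

# Listen on port 80 and forward everything to the target group.
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.app.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

resource "aws_autoscaling_group" "web" {
  name_prefix         = "web-asg-"
  min_size            = 1
  max_size            = 3
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids

  # Register instances with the target group and use its health checks.
  target_group_arns         = [aws_lb_target_group.app.arn]
  health_check_type         = "ELB"
  health_check_grace_period = 300

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}
```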
This configuration adds several important components:
Application Load Balancer (ALB): Acts as the entry point for all traffic to your application, distributing requests across healthy instances.
Target Group: Defines where the ALB should route traffic (to which instances and which port).
Listener: Configures the ALB to listen on port 80 and forward traffic to the target group.
ASG Integration: The health check type changes from EC2 to ELB, so the ASG now considers an instance unhealthy if it fails load balancer health checks, not just EC2 status checks. The target_group_arns argument makes the ASG automatically register new instances with the target group. The aws_autoscaling_attachment resource can also connect the ASG to the target group explicitly, but it is redundant with the target_group_arns approach. In practice, pick one: the Terraform AWS provider documentation notes that combining them requires the ASG to ignore changes to target_group_arns.
With this setup, as your ASG scales in or out:
New instances are automatically registered with the load balancer
Unhealthy instances are deregistered before termination
Traffic is distributed across all healthy instances
This provides a resilient, scalable architecture where your application can handle varying loads while maintaining high availability.
3. Mixed Instances Policy for Cost Optimization
One of the most powerful features of Auto Scaling groups is the ability to use multiple instance types and purchase options within a single ASG. This approach can significantly reduce costs while maintaining performance and availability.
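As a rough sketch, a mixed instances policy along these lines could be expressed as shown below. The `subnet_ids` and `launch_template_id` variables are hypothetical inputs, and the size bounds are illustrative. The instance types, the On-Demand base and percentage, the allocation strategy, and the termination policies are chosen to match the description that follows.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "subnet_ids" { type = list(string) }
variable "launch_template_id" { type = string }

resource "aws_autoscaling_group" "mixed" {
  name_prefix         = "mixed-asg-"
  min_size            = 1
  max_size            = 10 # illustrative size bounds
  desired_capacity    = 4
  vpc_zone_identifier = var.subnet_ids

  # Prefer removing instances on old launch templates, then those closest
  # to the end of their billing hour.
  termination_policies = ["OldestLaunchTemplate", "ClosestToNextInstanceHour"]

  mixed_instances_policy {
    instances_distribution {
      # Always keep one On-Demand instance, then split the rest 30/70.
      on_demand_base_capacity                  = 1
      on_demand_percentage_above_base_capacity = 30
      spot_allocation_strategy                 = "price-capacity-optimized"
    }

    launch_template {
      launch_template_specification {
        launch_template_id = var.launch_template_id
        version            = "$Latest"
      }

      # Instance types the group may choose from.
      override { instance_type = "c5.large" }
      override { instance_type = "c5a.large" }
      override { instance_type = "c4.large" }
      override { instance_type = "t3.medium" }
    }
  }
}
```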
This configuration implements a cost-optimized Auto Scaling group by:
Multiple Instance Types: The ASG can launch any of four different instance types (c5.large, c5a.large, c4.large, or t3.medium), giving it flexibility to choose the most cost-effective option available.
Mixed Purchase Options: Sets an On-Demand base capacity of 1 instance (ensuring at least one On-Demand instance) and uses a 30% On-Demand / 70% Spot split for any capacity beyond the base.
Spot Allocation Strategy: The price-capacity-optimized strategy selects Spot Instances from the pools with the lowest price while also considering which pools have the most available capacity.
Termination Policies: OldestLaunchTemplate removes instances running on outdated launch templates first. ClosestToNextInstanceHour saves cost by terminating instances that are closest to their next billing hour.
Benefits of This Approach
Cost Savings: Spot Instances can be 70-90% cheaper than On-Demand, dramatically reducing your infrastructure costs
Reliability: The On-Demand base capacity ensures critical functionality always runs on stable instances
Resilience: Using multiple instance types and the price-capacity-optimized allocation strategy significantly reduces the risk of Spot Instance interruptions
Performance Consistency: By selecting instance types with similar performance characteristics, your application maintains consistent performance regardless of which specific instance type is provisioned
This approach is particularly effective for stateless applications, batch processing workloads, and containerized applications where instances can be easily replaced.
4. Advanced Scaling Policies and Health Checks
Setting up proper scaling policies and health checks is crucial for an Auto Scaling group to efficiently adapt to changing workloads while maintaining application reliability. In this section, we'll implement advanced scaling configurations that respond to various metrics and ensure instance health.
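The two target tracking policies could be sketched roughly as follows. This assumes an existing ASG and ALB target group, passed in through the hypothetical `asg_name` and `alb_resource_label` variables. The use of `estimated_instance_warmup` for the 3-minute warmup is an assumption about how the warmup is configured.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "asg_name" {
  description = "Name of an existing Auto Scaling group"
  type        = string
}

variable "alb_resource_label" {
  description = "ALB and target group ARN suffixes joined by a slash: app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>"
  type        = string
}

# Keep average CPU across the group near 50%.
resource "aws_autoscaling_policy" "cpu_target" {
  name                      = "cpu-target-tracking"
  autoscaling_group_name    = var.asg_name
  policy_type               = "TargetTrackingScaling"
  estimated_instance_warmup = 180 # 3 minutes before new instances count toward metrics

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50
  }
}

# Keep ALB requests per target near 1000.
resource "aws_autoscaling_policy" "requests_target" {
  name                      = "alb-request-count-target-tracking"
  autoscaling_group_name    = var.asg_name
  policy_type               = "TargetTrackingScaling"
  estimated_instance_warmup = 180

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ALBRequestCountPerTarget"
      resource_label         = var.alb_resource_label
    }
    target_value = 1000
  }
}
```

If the load balancer and target group live in the same configuration, the resource label can typically be built from their `arn_suffix` attributes rather than passed in by hand.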
Key Features of This Setup
Multi-Metric Scaling Policies: Target tracking for CPU automatically adjusts capacity to maintain 50% average CPU utilization. Target tracking for request count scales based on ALB request count per target (1000 requests). Step scaling for scale-in provides a more conservative scale-in based on sustained low CPU (30% for 5 minutes).
Instance Warmup: Gives new instances 3 minutes to initialize before their metrics count toward scaling decisions. This prevents premature scaling based on metrics from instances that are still initializing, and it is used instead of cooldown periods for more responsive scaling.
Health Checks: health_check_type = "ELB" uses load balancer health checks to determine instance health, and health_check_grace_period = 300 gives instances 5 minutes to start up before health checking begins.
Scheduled Scaling: Defines a weekly maintenance window during a low-traffic period (2 AM on Sundays). This is useful for predictable workload patterns or maintenance activities.
CloudWatch Alarm: Triggers scale-in when CPU utilization is below 30% for 5 consecutive minutes. This provides a more conservative approach to scaling in than target tracking policies.
Monitoring Dashboard: A CloudWatch dashboard visualizes key metrics, which makes it easier to monitor and troubleshoot the ASG's behavior.
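The step scale-in policy, its CloudWatch alarm, and the scheduled action described above might be sketched like this. The `asg_name` variable is a hypothetical input. The size of the scale-in step and the capacity values in the scheduled action are assumptions, since this guide does not state them. The dashboard is omitted for brevity.

```hcl
# Hypothetical input: the name of an existing Auto Scaling group.
variable "asg_name" { type = string }

# Remove one instance when the low-CPU alarm fires.
resource "aws_autoscaling_policy" "scale_in" {
  name                   = "cpu-low-scale-in"
  autoscaling_group_name = var.asg_name
  policy_type            = "StepScaling"
  adjustment_type        = "ChangeInCapacity"

  step_adjustment {
    scaling_adjustment          = -1 # assumed step size
    metric_interval_upper_bound = 0  # applies to any value below the alarm threshold
  }
}

# CPU below 30% for 5 consecutive 1-minute periods.
# Assumption: detailed monitoring is enabled so 1-minute datapoints exist.
resource "aws_cloudwatch_metric_alarm" "cpu_low" {
  alarm_name          = "asg-cpu-low"
  comparison_operator = "LessThanThreshold"
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  statistic           = "Average"
  period              = 60
  evaluation_periods  = 5
  threshold           = 30
  alarm_actions       = [aws_autoscaling_policy.scale_in.arn]

  dimensions = {
    AutoScalingGroupName = var.asg_name
  }
}

# Weekly action at 2 AM on Sundays (cron format, UTC unless time_zone is set).
resource "aws_autoscaling_schedule" "sunday_maintenance" {
  scheduled_action_name  = "sunday-2am-maintenance"
  autoscaling_group_name = var.asg_name
  recurrence             = "0 2 * * 0"
  min_size               = 1 # assumed capacity values
  max_size               = 3
  desired_capacity       = 1
}
```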
Advanced Usage Tips
Predictive Scaling: For workloads with predictable patterns, consider adding predictive scaling policies:
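A predictive scaling policy could look something like the sketch below. It forecasts from ALB request count, which is an assumption; CPU-based metric pairs are also available. The `asg_name` and `alb_resource_label` variables are hypothetical inputs, and the target value reuses the 1000 requests per target figure from earlier.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "asg_name" { type = string }

variable "alb_resource_label" {
  description = "ALB and target group ARN suffixes joined by a slash"
  type        = string
}

resource "aws_autoscaling_policy" "predictive" {
  name                   = "request-count-predictive"
  autoscaling_group_name = var.asg_name
  policy_type            = "PredictiveScaling"

  predictive_scaling_configuration {
    # Start in forecast-only mode to review predictions before letting them
    # change capacity; switch to "ForecastAndScale" once they look reasonable.
    mode = "ForecastOnly"

    metric_specification {
      target_value = 1000

      predefined_metric_pair_specification {
        predefined_metric_type = "ALBRequestCount"
        resource_label         = var.alb_resource_label
      }
    }
  }
}
```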
For Critical Applications: Consider implementing instance protection to prevent specific instances from being terminated during scale-in events:
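In Terraform, scale-in protection is set at the group level with `protect_from_scale_in`, as in this sketch. The variables and size bounds are hypothetical. Note that this setting protects every newly launched instance in the group. Protecting individual instances is done outside Terraform, through the Auto Scaling SetInstanceProtection API.

```hcl
# Hypothetical inputs for this standalone sketch.
variable "subnet_ids" { type = list(string) }
variable "launch_template_id" { type = string }

resource "aws_autoscaling_group" "critical" {
  name_prefix         = "critical-asg-"
  min_size            = 2 # illustrative size bounds
  max_size            = 6
  vpc_zone_identifier = var.subnet_ids

  # Newly launched instances are protected from termination during scale-in.
  # Scale-in cannot remove protected instances until protection is lifted.
  protect_from_scale_in = true

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}
```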
By implementing these advanced policies and health checks, your Auto Scaling group will be able to efficiently adapt to changing workloads while maintaining application performance and availability.
Conclusion: Embracing Scalability with AWS Auto Scaling Groups and Terraform
In this guide, we've explored the powerful combination of AWS Auto Scaling Groups and Terraform to build infrastructure that adapts to your application's needs while maintaining high availability and cost efficiency. Let's recap what we've covered:
Key Takeaways
ASG Fundamentals: Auto Scaling Groups provide automatic scaling of EC2 instances based on demand, ensuring you maintain the right number of instances to handle your application load without overspending.
Infrastructure as Code: Using Terraform to define ASGs brings consistency, version control, and automation to your deployments—saying goodbye to manual configurations in the AWS console.
Flexible Architecture: We've seen how to implement a range of ASG configurations: 1. Basic ASGs with launch templates, 2. Integration with load balancers for distributed traffic, 3. Mixed instance policies for cost optimization (combining On-Demand and Spot instances), 4. Advanced scaling policies responding to multiple metrics
Best Practices: We covered essential practices like: 1. Using multiple Availability Zones for high availability, 2. Implementing appropriate health checks and grace periods, 3. Setting up effective scaling policies based on application needs, 4. Configuring predictive scaling for workloads with predictable patterns
The Value Proposition
By implementing Auto Scaling Groups with Terraform as demonstrated in this guide, you gain:
Reliability: Your applications become more resilient to failures and traffic spikes
Cost Efficiency: You pay only for the resources you need, when you need them
Operational Excellence: Your infrastructure responds automatically to changing conditions
Developer Productivity: Your team spends less time managing infrastructure and more time building features
What's Next?
As you become comfortable with these patterns, consider exploring:
Enhanced Monitoring: Set up detailed CloudWatch dashboards and alarms for deeper insights
Advanced Deployment Strategies: Implement blue/green deployments using ASG features
Terraform Modules: Create reusable modules for your organization's common ASG patterns
CI/CD Integration: Automate your infrastructure deployments alongside your application code
Remember that Auto Scaling is not just about handling traffic spikes—it's about building self-healing, efficient infrastructure that adapts to your business needs. Whether you're scaling to handle millions of users or simply ensuring your application never goes down at 3 AM, AWS Auto Scaling Groups paired with Terraform provide the foundation you need.
As the cloud continues to evolve, the principles we've covered here will remain relevant: define your infrastructure as code, automate everywhere possible, and let the cloud handle the heavy lifting of scaling and availability.
Happy scaling! 🚀