© 2023, Amazon Web Services, Inc. or its affiliates.
AWS Cost Optimization Workshop
Techniques: EC2 Spot Instances
Speaker Name
Speaker job title / team / company
Date
Agenda
• Benefits of Spot
• Rules of Spot
• Handling Interruptions
• Workload Examples
Amazon EC2 purchase options
On-Demand
Pay for compute capacity by the second with no long-term commitments
Spiky workloads, to define needs
Spot Instances
Spare Amazon EC2 capacity at savings of up to 90% off On-Demand prices
Fault-tolerant, flexible, stateless workloads
Savings Plans and Reserved Instances
Make a 1- or 3-year commitment and receive a significant discount off On-Demand prices
Committed and steady-state usage
Why Spot Instances?
Low, predictable prices
Up to 90% discount over
On-Demand prices
Faster results
Increase throughput up to
10x while staying in budget
Easy to use
Launch through AWS services
(e.g. ECS, Batch, EMR)
or integrated third-parties
Spot Instances are reclaimed with a 2-minute warning and
only when On-Demand needs capacity back—no bidding!
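The "up to 10x throughput while staying in budget" claim follows directly from the discount: at 90% off, the same spend buys ten times as many instance-hours. A minimal arithmetic sketch, where the budget and the On-Demand price are hypothetical numbers chosen only for illustration:

```python
from decimal import Decimal


def instances_within_budget(hourly_budget: str, on_demand_price: str, spot_discount: str) -> int:
    """Return how many instances a fixed hourly budget can run.

    All arguments are decimal strings; spot_discount is a fraction
    (for example "0.90" means 90% off the On-Demand price).
    """
    effective_price = Decimal(on_demand_price) * (1 - Decimal(spot_discount))
    return int(Decimal(hourly_budget) // effective_price)


# Hypothetical figures: a $1.00/hour budget and a $0.10/hour On-Demand price.
on_demand_count = instances_within_budget("1.00", "0.10", "0")
spot_count = instances_within_budget("1.00", "0.10", "0.90")
print(on_demand_count, spot_count)  # 10 100
```

Real Spot discounts vary by instance type, Availability Zone, and time, so 90% is an upper bound rather than a planning figure.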
Spot Instances help you save money and time!
Up to 90% saved monthly
Improved test result response time
With Spot Instances: 30 minutes
Without Spot Instances: two days
Decreased monthly computing cost while increasing power
75% compute cost reduction
More compute power with Spot Instances
Spot placement score
The Spot placement score feature can recommend an AWS Region or Availability Zone based on your Spot capacity
requirements. Spot capacity fluctuates, and you can't be sure that you'll always get the capacity that you need. A Spot
placement score indicates how likely it is that a Spot request will succeed in a Region or Availability Zone.
Benefits
You can use the Spot placement score feature for the following:
• To relocate and scale Spot compute capacity in a different Region, as needed, in response
to increased capacity needs or decreased available capacity in the current Region.
• To identify the optimal Availability Zone in which to run single-Availability Zone
workloads.
• To simulate future Spot capacity needs so that you can pick an optimal Region for the
expansion of your Spot-based workloads.
• To find an optimal combination of instance types to fulfill your Spot capacity needs.
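As a rough sketch of how the scores might be consumed, the snippet below builds a request for the EC2 `GetSpotPlacementScores` API and picks the highest-scoring Region from a response. The helper names, instance types, Regions, and the sample response are all hypothetical; the actual call (shown only in a comment) would go through boto3 and needs AWS credentials.

```python
from typing import Any, Dict, List, Optional


def build_score_request(instance_types: List[str], target_capacity: int,
                        regions: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for the EC2 GetSpotPlacementScores API."""
    return {
        "InstanceTypes": list(instance_types),
        "TargetCapacity": target_capacity,
        "TargetCapacityUnitType": "units",   # count capacity in instances
        "SingleAvailabilityZone": False,     # score whole Regions, not single AZs
        "RegionNames": list(regions),
    }


def best_placement(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the entry with the highest score, or None if there are none."""
    scores = response.get("SpotPlacementScores", [])
    if not scores:
        return None
    return max(scores, key=lambda entry: entry["Score"])


request = build_score_request(["c5.large", "m5.large"], 50, ["us-east-1", "eu-west-1"])
# With credentials configured, the real call would look like:
#   response = boto3.client("ec2").get_spot_placement_scores(**request)
# Hypothetical response used here so the sketch runs offline:
sample_response = {
    "SpotPlacementScores": [
        {"Region": "us-east-1", "Score": 7},
        {"Region": "eu-west-1", "Score": 9},
    ]
}
print(best_placement(sample_response)["Region"])  # eu-west-1
```

A score reflects the likelihood of success at the time of the request, so it is worth re-querying shortly before launching rather than caching the result.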
What happens when AWS reclaims an instance?
Minimal interruptions
Less than 5% of Spot instances were
interrupted in the last 3 months
Alerts Automation
Handling Options
• Terminate
• Stop/Start
• Hibernate
Map to Strategy
EC2 Spot rebalance recommendation
An EC2 Spot Instance rebalance recommendation is a signal that
notifies you when a Spot Instance is at elevated risk of interruption.
The signal can arrive sooner than the two-minute Spot Instance
interruption notice, giving you the opportunity to
proactively manage the Spot Instance.
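Both signals are exposed through the instance metadata service, so an agent on the instance can poll for them. The sketch below only parses an interruption notice and works out how much time is left; the function name and the sample payload are hypothetical, and the HTTP polling itself (including the IMDSv2 session token) is left out so the example runs anywhere.

```python
import json
from datetime import datetime, timezone
from typing import Tuple

# Instance metadata paths, relative to http://169.254.169.254.
# The interruption path returns HTTP 404 until a notice has been issued.
INSTANCE_ACTION_PATH = "/latest/meta-data/spot/instance-action"
REBALANCE_PATH = "/latest/meta-data/events/recommendations/rebalance"


def seconds_until_action(body: str, now: datetime) -> Tuple[str, float]:
    """Parse an interruption notice and return (action, seconds remaining)."""
    notice = json.loads(body)
    action_time = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return notice["action"], (action_time - now).total_seconds()


# Hypothetical payload in the documented shape of an interruption notice.
sample_body = '{"action": "terminate", "time": "2023-01-01T12:02:00Z"}'
sample_now = datetime(2023, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
action, remaining = seconds_until_action(sample_body, sample_now)
print(action, remaining)  # terminate 120.0
```

In practice the remaining time would drive checkpointing or connection draining; a rebalance recommendation can be handled the same way, only earlier and with less urgency.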
Flexibility is key to successful adoption
Be time flexible to account for interruptions and/or location flexible to maximize
application uptime
Instance flexible
• More than one instance type can get the job done
• Instance weighting gives you more flexibility on instance
types
• Multiple instance types are key to resilient clusters
Time flexible
OR
Region flexible
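Instance weighting is easiest to see with numbers. This sketch uses the C5 weights from the speaker notes (1 for c5.large, 2 for c5.xlarge, 4 for the next size up, assumed here to be c5.2xlarge) and shows how many instances of each type would satisfy a desired capacity of 100. The helper name is hypothetical.

```python
import math

# Weight = capacity units one instance contributes (values from the speaker notes).
WEIGHTS = {"c5.large": 1, "c5.xlarge": 2, "c5.2xlarge": 4}


def instances_needed(desired_capacity: int, instance_type: str) -> int:
    """Return how many instances of one type cover the desired capacity."""
    return math.ceil(desired_capacity / WEIGHTS[instance_type])


for instance_type in WEIGHTS:
    print(instance_type, instances_needed(100, instance_type))
# c5.large 100
# c5.xlarge 50
# c5.2xlarge 25
```

A real fleet would usually mix these types, for example 40 c5.large plus 30 c5.xlarge, as long as the weighted total reaches the desired capacity.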
The simple rules of Spot
Spot pricing
Smooth, infrequent changes; no spikes, more predictable
Up to 90% off
Interruptions
Happen when EC2 needs capacity
Spot infrastructure
Is the same as On-Demand and RIs
Diversify
Choose different instance types, sizes, and AZs in a single fleet
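Diversification works because each combination of instance type and Availability Zone is a separate Spot capacity pool. A small sketch of how quickly the number of pools grows; the instance types and AZ names below are hypothetical choices for illustration:

```python
from itertools import product
from typing import List, Tuple


def spot_pools(instance_types: List[str], availability_zones: List[str]) -> List[Tuple[str, str]]:
    """Return every (instance type, Availability Zone) pool a fleet can draw from."""
    return list(product(instance_types, availability_zones))


pools = spot_pools(
    ["c5.large", "m5.large", "r5.large"],
    ["us-east-1a", "us-east-1b", "us-east-1c"],
)
print(len(pools))  # 9
```

With nine pools instead of one, a reclaim in any single pool affects only a fraction of the fleet.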
To optimize Amazon EC2, combine purchase options with
EC2 Auto Scaling
Use Savings Plans or RIs for
known, steady-state workloads
Use On-Demand for new or
stateful spiky workloads
Scale using Spot for
fault-tolerant, flexible,
stateless workloads
Save up to 90% using EC2 Auto Scaling and EC2 Fleet
Reduce cost, optimize performance, and eliminate operational overhead
• Capacity optimized
Prioritizes launching Spot Instances from the pools with the most available
capacity, to lower the chance of interruptions
• Lowest cost
Specify what percentage of your ASG capacity should be fulfilled by On-Demand
and Spot Instances; the ASG prioritizes launching Spot Instances from the
lowest-priced pools
• Price capacity optimized (new, 15 Nov. 2022)
Makes Spot Instance allocation decisions based on both the price and the
capacity availability of Spot Instances
Spot Instances
On-Demand Instances
Reserved Instances
Amazon EC2 Auto Scaling
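In an Auto Scaling group these choices are expressed through a mixed instances policy. The sketch below builds such a policy as a plain dictionary in the shape accepted by the Auto Scaling `CreateAutoScalingGroup` API, and estimates the resulting On-Demand/Spot split. The helper names, template name, and numbers are hypothetical, and the rounding in `capacity_split` is an assumption rather than documented service behavior.

```python
import math
from typing import Any, Dict, Tuple


def mixed_instances_policy(launch_template_name: str, weights: Dict[str, int],
                           on_demand_base: int, on_demand_pct: int) -> Dict[str, Any]:
    """Build a MixedInstancesPolicy using the price-capacity-optimized strategy."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # The API expects WeightedCapacity as a string.
            "Overrides": [
                {"InstanceType": itype, "WeightedCapacity": str(weight)}
                for itype, weight in weights.items()
            ],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


def capacity_split(desired: int, on_demand_base: int, on_demand_pct: int) -> Tuple[int, int]:
    """Estimate (on_demand, spot) capacity; rounding up is an assumption."""
    base = min(on_demand_base, desired)
    above_base = desired - base
    on_demand = base + math.ceil(above_base * on_demand_pct / 100)
    return on_demand, desired - on_demand


policy = mixed_instances_policy("my-template", {"c5.large": 1, "c5.xlarge": 2}, 2, 25)
print(capacity_split(10, 2, 25))  # (4, 6)
```

The dictionary would be passed as the `MixedInstancesPolicy` argument when creating or updating the group; keeping a small On-Demand base is one way to protect a minimum level of capacity from interruptions.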
Amazon EC2 Spot integrations
AWS Batch
Amazon EMR
Amazon Elastic Container Service
AWS CloudFormation
Amazon SageMaker
AWS Elastic Beanstalk
EC2 Auto Scaling
AWS Fargate
Amazon GameLift
Amazon Elastic Kubernetes Service
Lean on Spot for these workloads!
Spot Instances are perfect for fault-tolerant workloads
Big data
HPC
CI/CD
Web services
Containers
Machine Learning
Batch
Containers + Spot = match made in heaven
• Containers are often stateless, fault-tolerant, and a great fit for Spot Instances
• Deploy containerized workloads and easily manage clusters at any scale at a
fraction of the cost with Spot Instances
• Spot Instances can be used with ECS or Kubernetes to run any containerized
workload
Skyscanner is a travel fare aggregator
website and travel metasearch engine
based in Edinburgh, Scotland
“We are currently tracking
74% saving over all regions.”
—Paul Gillespie,
Principal Architect/Tribe Lead
Workload example: Big data
• Spot Instances provide acceleration, scale, and cost savings to run
hyper-scale workloads for data analysis
• Scale to large numbers of parallel nodes via Fleet
• Use Spot Instances with Amazon EMR, Hadoop, or Spark to process
massive amounts of data
Amazon EMR
“A job that took weeks in our data center, due to limited resources, took
hours on Spot thanks to the great parallelism, in a very cost-efficient
price.”
- Shay Asoolin, Sr. Director Development infrastructure, Mobileye
Workload example: CI/CD
• Configure Jenkins with the EC2 Spot plug-in to automatically
scale a fleet of Spot Instances based on the number of CI/CD jobs
• Increase cost savings by leveraging older-generation instances
for CI, as these processes do not require a lot of power for testing
“By using AWS Spot instances, we've been able to save up to 75 percent a
month simply by changing four lines of code. It makes perfect sense for
saving money when you're running continuous integration workloads or
pipeline processing.”
—Matthew Leventi, Lead Engineer, Lyft
Workload example: Web services
• Scale, throughput, and deep cost savings for large-scale web operations
• Launch and manage a collection of diversified Spot Instances across
pools via EC2 Fleet and ASG
• NEW! Include Spot with RIs and On-Demand in a single ASG
Quantcast scales ad services, saves 60% using Amazon EC2 Spot Instances
“As we roll out more infrastructure to AWS, Amazon EC2 Spot Instances are
helping us control costs and scale our systems to meet demand.”
—Leah Blank, Senior Systems Engineer, Quantcast
Amazon EC2 Auto Scaling
Amazon EC2 Fleet
Workload example: HPC
• Accelerate HPC workloads such as genomic sequencing, CFD, and
algorithmic trading by running massively parallel jobs
• Run multiple projects simultaneously; launch and decommission
thousands of nodes
• Spot Auto Scaling groups; F1 (FPGA), eg1 (Elastic GPUs), and Cluster
GPU instances to accelerate processing
Illumina saves nearly $400,000 monthly, speeds genomics analysis using
Spot Instances
“We are able to offer our customers a lower cost, high-performance genomic-analysis
platform, which can help them speed their time to answers.”
—Andy Nelson, Associate Director, Informatics & Cloud Operations, Illumina
Amazon EC2 Fleet
AWS CloudFormation
AWS Batch
Customers across different industries and verticals use Spot
Consumer apps
B2B enterprise tech
Research
Sports, media, & entertainment
Financial services
AdTech & MarTech
Resources
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-best-practices.html
https://aws.amazon.com/blogs/compute/diving-deep-into-ec2-spot-instance-cost-and-operational-practices/
https://aws.amazon.com/blogs/compute/efficiently-scaling-kops-clusters-with-amazon-ec2-spot-instances/
https://aws.amazon.com/ec2/spot/pricing/
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-fleet-allocation-strategy.html
https://aws.amazon.com/blogs/compute/introducing-price-capacity-optimized-allocation-strategy-for-ec2-spot-instances/
Q&A
Thank you!

Cost Optimization - aws and cloud compute

Editor's Notes

  • #2: Many customers know how to react when it comes to cutting down waste. To sustain a cost-optimized environment, however, it's important to consider proactive architectures. This is where services such as Service Catalog come into play. The goal of this presentation is to equip you with a basic understanding of how to implement Service Catalog. Target audience: members of a CCOE.
  • #3: There are four ways to purchase compute. On-Demand: pay-as-you-go, no commitments, best for fluctuating workloads. Reserved Instances: long-term commitments that offer big savings over On-Demand prices; best for always-on workloads. Savings Plans: just like Reserved Instances, but based on a monetary commitment, and the compute can be used across Fargate and EC2. Spot Instances: the same pay-as-you-go pricing as On-Demand, but at up to 90% off; EC2 can reclaim them with a 2-minute warning; best for stateless or fault-tolerant workloads. All four purchasing options use the same underlying EC2 instances and AWS infrastructure across 22 Regions. [Poll] How many of you use Spot Instances?
  • #5: SAY: Here are a few success stories where customers were able to optimize costs by using Spot. Lyft primarily uses Spot to reduce its compute costs: it uses Spot to support its CI/CD pipeline and saves up to 90% monthly. Yelp uses Spot for another key benefit: scale. With cheaper instances, they are able to throw more compute power at their testing requirements, reducing response time from two days to 30 minutes. Illumina takes the best of both worlds. All three of these customers are public references for us, and their stories can be found on our website.
  • #6: Explain how Spot placement score works. References: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-placement-score.html
  • #7: First, let me highlight that more than 95% of the time a Spot Instance is terminated by the client or because the job completes. That is, the client requested a Spot Instance, we fulfilled it, it completed its work and is no longer needed, so the client terminates it. Remember that each Spot pool has its own unique capacity and characteristics, but across our portfolio, 95% of the time clients terminate the Spot Instance. We also offer a tool called Spot Instance Advisor that provides a 30-day rolling average of interruption rates by instance type per Region, which is helpful when configuring your requests. That said, Spot is by nature an interruptible service, so regardless of the frequency, you need to be able to handle interruptions if and when they occur. Logistically, the first thing that happens is that AWS notifies you that we need to reclaim the instance. This is done through either a CloudWatch event or by polling the instance metadata on the local instance. When that warning is received, you have 2 minutes to gracefully terminate the instance and migrate that work elsewhere in your fleet. We have different strategies that you can employ to handle interruptions, and our team is here to help you make the right choices, but they essentially come down to three options. Option one, the most common, is to terminate the instance. In this case you want to checkpoint and bring the instance up elsewhere in the fleet. We can provide sample Lambda scripts to accomplish this. Another option is Stop/Start. This is often used for time-insensitive workloads: we can persist the EBS volume and then re-attach it once the instance becomes available again. The third option is Hibernate. It is implemented much less often, but it is available.
In this case we flush in-memory state to disk, and it then functions much like closing and opening the lid of your laptop. HIGHLIGHT: So, depending on your workload and its requirements, we offer a variety of ways to elegantly manage an interruption should it occur. MENTION: It is not always possible for Amazon EC2 to send the rebalance recommendation signal before the two-minute Spot Instance interruption notice. Therefore, the rebalance recommendation signal can arrive along with the two-minute interruption notice. Rebalance recommendations are made available as a CloudWatch event and as an item in the instance metadata on the Spot Instance. Events are emitted on a best-effort basis. MENTION: You should use at least 10 different instance types to minimize frequent rebalancing notifications. References: https://aws.amazon.com/about-aws/whats-new/2020/11/introducing-ec2-instance-rebalance-recommendation-for-ec2-spot-instances/ https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/rebalance-recommendations.html
  • #8: Being instance flexible is the most important element of a successful ASG. With instance weights you can specify the weight that an instance type contributes to your application's performance. For example, if you are flexible across different instance sizes within the C5 family, you can assign a weight of 1 to c5.large, a weight of 2 to c5.xlarge, a weight of 4 to c5.2xlarge, and so on. If your desired capacity is 100, the ASG can launch either 100 c5.large instances, 50 c5.xlarge instances, or a combination of the two. Instance flexible (time-sensitive workloads): mix instance types with similar capabilities (number of vCPUs, memory). Time flexible (time-insensitive workloads): workloads that require specific instance types but can be flexible on completion times (e.g. batch jobs with no SLA, ML training jobs…). Region flexible: large or very instance-specific workloads, e.g. real-time rendering with a specific g3 instance, can benefit from increased
  • #10: 1/ As your cloud environments and usage scale, optimizing your costs and scaling correctly become essential. 2/ You can combine Savings Plans with On-Demand and Spot Instances, mixing our pricing models to best match your utilization and optimize your performance and costs. 3/ EC2 Fleet and EC2 Auto Scaling allow you to scale across all these purchase models: Spot, On-Demand, RIs, and Savings Plans.
  • #11: SAY: Now that we've covered the key workloads where we see the widest adoption of Spot, let's talk about how to launch Spot Instances. Spot Instances can be launched directly in the EC2 console under Spot Instances, but AWS provides tools called Spot Fleet and EC2 Fleet to help manage your Spot workloads without additional management code. With Spot Fleet you can automatically provision and scale instances across a variety of instance families to meet your needs. In addition, On-Demand Instances can be mixed in if you want a number of instances that are not running on Spot and therefore cannot be interrupted. NEW FEATURE 15 Nov. 2022: Price capacity optimized. https://aws.amazon.com/blogs/compute/introducing-price-capacity-optimized-allocation-strategy-for-ec2-spot-instances/
  • #12: Substituted AWS Thinkbox Deadline with Amazon GameLift. Also mention: AWS Thinkbox Deadline. References: https://docs.aws.amazon.com/gamelift/latest/developerguide/spot-tasks.html https://aws.amazon.com/about-aws/whats-new/2018/04/more-control-of-idle-spot-fleet-instances-with-thinkbox-deadline/
  • #13: The most common use cases where we see successful Spot implementations are: anything containerized; big data frameworks like Apache Spark or Hadoop; batch processing; stateless web services; machine learning (PyTorch, TensorFlow, or jobs that require heavy training); Continuous Integration and Continuous Deployment (CI/CD) with Jenkins; high performance computing (HPC), such as genomics sequencing; and anything fault-tolerant or stateless that can be instance flexible.
  • #14: SAY: Because many workloads running on containers are already stateless and fault-tolerant by design, they are great candidates for Spot. ECS supports native integration with Spot, which can be used to spin up additional cluster nodes as needed and migrate running containers to a newly created ECS instance when an interruption notification is delivered. Skyscanner is one customer that was able to save 74% on their workloads by moving their ECS clusters to Spot.
  • #15: SAY: With discounts of up to 90% off, you can scale your compute environment by up to 10x while staying within budget. Spot integrates directly with EMR (Elastic MapReduce) and can support any Hadoop or Spark cluster to process large amounts of data. Mobileye took advantage of this: by parallelizing their tasks, they were able to do a job that took weeks in their data center in hours on Spot, at a much more cost-efficient price.
  • #16: SAY: CI/CD is a growing use case with customers. Spot provides inexpensive compute environments for developers to build, test, and deploy code changes. There is also a Jenkins plug-in that spins up Spot Instances during automated build and testing. Lyft is an interesting example, as they were able to save 75% a month by changing just four lines of code that launched instances in their Salt module.
  • #17: SAY: Web services is another common use case where we see a lot of usage of Spot Instances, as customers look for inexpensive solutions to maintain high performance during peak traffic periods. A very common architecture here is for customers to run a baseline of RIs for their typical traffic patterns and scale into Spot Instances when traffic spikes. And as we'll talk about shortly, EC2 Fleet and Auto Scaling groups can now support RIs, On-Demand, and Spot in a single call or Auto Scaling group, making this much easier to manage.
  • #18: SAY: HPC (high performance computing) workloads, like big data, can benefit from the scale that Spot Instances provide for workloads such as genomic sequencing, computational fluid dynamics, and risk analytics. With Spot you can run multiple projects simultaneously and reduce the time to answers. A few of our key customers in this space include Illumina, who we discussed earlier, along with NASA, Autodesk, and AdRoll. Spot is integrated with AWS Batch, AWS CloudFormation, and other AWS services.
  • #19: SAY: Spot is used by many leading technology companies in the world across a variety of industry verticals.
  • #20: SAY: Spot is used by many leading technology companies in the world across a variety of industry verticals.