© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Adam Rozumek, Systems Engineer
RIOT GAMES
Standardizing Application
Deployments Using Amazon ECS
INTRODUCTIONS
WHO AM I?
AMAZON ECS & VIDEO GAMES
WHAT TO EXPECT
1. ECS CONSOLIDATION WINS & LESSONS: Adapting existing deployments & infrastructure
2. INFRASTRUCTURE ENCAPSULATION: Object Oriented Development Operations
3. MULTI-GAME MODULAR SERVICE DESIGN: A different kind of scaling
Riot Games: Using Docker and Amazon ECS for Global Game Operations - AWS Summit Seoul 2017
LEAGUE OF LEGENDS STATS (2016)
• More than 100 million monthly active players
• More than 27 million daily active players
• More than 7.5 million peak concurrent players
DATA PRODUCTS & SERVICES
OUR MISSION
Empower teams at Riot to make timely, data-informed products by maintaining a
scalable and reliable data platform
AWS re:Invent 2015 | (GAM303) Riot Games: Migrating Mountains of Big Data to AWS
Sean Maloney
PROBLEM
SCALING TOTAL OWNERSHIP ON AWS
Total ownership
We want to empower developers to:
• Provision their own infrastructure
• Execute their own deployments
• Monitor their own metrics
Resource attribution
• Who owns these EBS volumes?
• What applications depend on these security groups?
• Can these AMIs be deleted?
Security
• Auditing is important, but it’s reactive
• Operational time sink
Security Monkey, AWS Trusted Advisor
Amazon EC2 Container Service
CONTAINERS
STANDARDIZED APPLICATION UNITS… ON AWS
• Dockerfiles capture application dependencies
• Common use cases have great community support
• Profit from our own engineering community
Embrace the abstraction
• Plan for failure at all levels
• Avoid manual intervention whenever possible
• It’s ephemeral all the way down
Scheduling is hard
We need to:
• Quickly and fairly run tasks
• Prevent resource conflicts
• Provide reasonable fault tolerance
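The three scheduling requirements above can be illustrated with a toy placement function. This is a minimal sketch and not how the ECS scheduler works: the `Instance` and `Task` types, the memory-only resource model, and the zone-spreading rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Instance:
    # Hypothetical model of a container host; not an AWS API type.
    name: str
    zone: str
    free_memory_mib: int
    used_ports: Set[int] = field(default_factory=set)
    task_count: int = 0


@dataclass
class Task:
    name: str
    memory_mib: int
    host_port: Optional[int] = None  # None means no fixed host port


def place(task: Task, instances: List[Instance]) -> Optional[Instance]:
    """Pick an instance for the task, or return None if nothing fits."""
    # Prevent resource conflicts: enough free memory and no port clash.
    candidates = [
        i for i in instances
        if i.free_memory_mib >= task.memory_mib
        and (task.host_port is None or task.host_port not in i.used_ports)
    ]
    if not candidates:
        return None
    # Fault tolerance: prefer the zone that currently runs the fewest tasks,
    # then the instance with the fewest tasks; the name breaks ties.
    zone_load: Dict[str, int] = {}
    for i in instances:
        zone_load[i.zone] = zone_load.get(i.zone, 0) + i.task_count
    chosen = min(candidates, key=lambda i: (zone_load[i.zone], i.task_count, i.name))
    chosen.free_memory_mib -= task.memory_mib
    chosen.task_count += 1
    if task.host_port is not None:
        chosen.used_ports.add(task.host_port)
    return chosen
```

Even this toy version has to return `None` when nothing fits, which is the case a real deployment needs to monitor for.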
Provisioning AWS hardware
• Enable total ownership of the AWS resources backing our containers
• Avoid the security, resource attribution, and convention degradation pitfalls
AMAZON ECS
AMAZON EC2 CONTAINER SERVICE
STANDARDIZED APPLICATION UNITS… ON ECS!
• ECS AMI provides all necessary software
• Designed with AWS integration in mind
• Free!
AMAZON EC2 CONTAINER SERVICE
KEY FEATURE TIMELINE
• Nov 2014: ECS announced at re:Invent 2014
• April 2015: ECS generally available
• Dec 2015: ECR & new regions. ECS becomes available in the final missing region we need (Frankfurt) for our globally deployed applications
• May 2016: Service scaling. Automatic task count scaling based on CloudWatch metrics introduced
• July 2016: Task-specific IAM roles. Unlocked a lot of cluster sharing potential
• August 2016: Application Load Balancers. Several key improvements for ECS
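The service scaling entry in the timeline adjusts a service's task count from CloudWatch metrics. The sketch below shows the general shape of a step-based scale-out decision. It is an illustration only: the `STEPS` table, the thresholds, and the function name are hypothetical, and no AWS API is called.

```python
from typing import List, Tuple

# Hypothetical step table: (breach above the alarm threshold, tasks to add).
STEPS: List[Tuple[float, int]] = [(0.0, 1), (15.0, 2), (30.0, 4)]


def scale_out(current: int, metric: float, threshold: float,
              steps: List[Tuple[float, int]], max_tasks: int) -> int:
    """Return the new desired task count for one evaluation of the metric."""
    breach = metric - threshold
    if breach < 0:
        return current  # metric is below the threshold, nothing to do
    adjustment = 0
    for lower_bound, change in steps:
        if breach >= lower_bound:
            adjustment = change  # the last matching step wins
    return min(current + adjustment, max_tasks)
```

With a threshold of 75, a metric of 80 adds one task and a metric of 95 adds two, while `max_tasks` caps the result.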
BEYOND ECS
INFRASTRUCTURE AS CODE
• At scale, orchestrating infrastructure in a consistent, reproducible way is key
Total ownership
PROVISIONING
INFRASTRUCTURE AS OBJECT-ORIENTED CODE
[Diagram: a VPC with a NAT Gateway, an Internet Gateway, a VPN Gateway, route tables, and a Route 53 hosted zone; application and tools subnets with instances across Availability Zones A, B, and C]
CASE STUDIES
ECS CLUSTER
TERRAFORM BUILDING BLOCKS
[Diagram: the VPC layout (NAT Gateway, Internet Gateway, VPN Gateway, route tables, Route 53 hosted zone, application and tools subnets across Availability Zones A, B, and C) with an ECS cluster added: an Auto Scaling group of instances, a launch configuration with user data, an IAM role, security groups, and CloudWatch alarms. A second view groups the resources into Module 1, Module 2, and Module 3.]
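The building-block idea can be sketched in Python rather than Terraform: each block is an object that consumes the block beneath it instead of redefining it. The class names, fields, and resource labels below are hypothetical and do not correspond to the actual modules on the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Network:
    """Networking block: stands in for the VPC, subnets, and gateways."""
    name: str
    subnet_ids: List[str]

    def resources(self) -> List[str]:
        return [f"vpc/{self.name}"] + [f"subnet/{s}" for s in self.subnet_ids]


@dataclass
class Cluster:
    """Cluster block: takes a Network as input rather than creating one."""
    name: str
    network: Network

    def resources(self) -> List[str]:
        own = [
            f"ecs-cluster/{self.name}",
            f"autoscaling-group/{self.name}",
            f"launch-configuration/{self.name}",
            f"iam-role/{self.name}-instance",
        ]
        return self.network.resources() + own


@dataclass
class Service:
    """Service block: takes a Cluster as input."""
    name: str
    cluster: Cluster
    image: str

    def resources(self) -> List[str]:
        own = [
            f"task-definition/{self.name}",
            f"ecs-service/{self.cluster.name}/{self.name}",
            f"iam-role/{self.name}-task",
        ]
        return self.cluster.resources() + own


# Example composition with made-up names.
network = Network("games", ["subnet-a", "subnet-b", "subnet-c"])
cluster = Cluster("shared", network)
chat = Service("chat", cluster, image="example/chat:1.0")
for resource in chat.resources():
    print(resource)
```

A team that only needs a service reuses an existing `Cluster` object, which mirrors how a module takes another module's outputs as its inputs.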
MICROSERVICES
WITHOUT SERVICE ENDPOINTS
[Diagram: an ECS cluster (Auto Scaling group of instances, launch configuration with user data, IAM role, security groups, CloudWatch alarm) and a service (ECS service, task definition, IAM role, CloudWatch alarms)]
MICROSERVICES
WITH SERVICE ENDPOINTS
[Diagram: ECS cluster; service (ECS service, task definition, IAM role, CloudWatch alarms); Application Load Balancer (listeners, target groups, security groups, Route 53); monitoring (CloudWatch alarms, SNS topics)]
PERSISTENT DATA
LOSE ECS HOSTS WITHOUT LOSING DATA
[Diagram: ECS cluster, ECS service, Application Load Balancer (listeners, target groups, security groups, Route 53), and monitoring (CloudWatch alarms, SNS topics), plus an attachment group of EBS volumes, an Elastic Network Interface, and an Elastic IP]
LESSONS LEARNED
LESSONS
WHAT WORKED FOR US
• Break apart your stacks, but don’t overdo it
[Diagram labels: ECS Cluster; ECS Service, ALB, SNS Topics, Route 53, IAM Role, Task Definition, CloudWatch Alarms]
• Be liberal with your cluster provisioning
• Don’t risk resource contention in production
• With good orchestration, additional clusters != additional operational overhead
• Tag everything all of the time
• Keep your tags organized in your Terraform templates
• Have top level variables that get applied to every resource
• Create a tag for every dimension that is useful to your business
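The tagging advice amounts to defining shared tags once and merging them into every resource, then checking that no required dimension is missing. A minimal sketch follows; the tag keys and values are made up, and the helper names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical top-level tags applied to every resource.
COMMON_TAGS: Dict[str, str] = {
    "team": "data-platform",
    "environment": "prod",
    "application": "chat",
}


def tags_for(resource_name: str, common: Dict[str, str],
             extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge shared tags, resource-specific tags, and the Name tag."""
    tags = dict(common)        # copy so the shared dict is never mutated
    tags.update(extra or {})   # resource-specific tags override shared ones
    tags["Name"] = resource_name
    return tags


def missing_tags(tags: Dict[str, str], required: List[str]) -> List[str]:
    """Return the required tag keys that a resource does not carry."""
    return sorted(set(required) - set(tags))
```

Running `missing_tags` over every resource in a plan is one way to answer the earlier attribution questions, such as who owns a given EBS volume.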
• Centralize your logs, for example in Amazon CloudWatch Logs
• Stay up to date with release blogs and application updates
• ECS updates on the AWS blog
• Capture your AMI generation process
• Profile your memory requirements, monitor for scheduling issues
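One way to act on the memory-profiling lesson is to turn observed usage into a reservation and see how many tasks then fit on a host. The sketch below assumes a 20% headroom, rounding to 64 MiB, and a hypothetical host with 7500 MiB available; none of these numbers come from the talk.

```python
import math
from typing import List


def suggested_reservation_mib(samples_mib: List[int], headroom: float = 0.2) -> int:
    """Peak observed usage plus headroom, rounded up to a multiple of 64 MiB."""
    peak = max(samples_mib)
    return int(math.ceil(peak * (1 + headroom) / 64.0)) * 64


def tasks_per_instance(instance_mib: int, reservation_mib: int) -> int:
    """How many copies of the task fit on one instance by memory alone."""
    return instance_mib // reservation_mib


def stranded_mib(instance_mib: int, reservation_mib: int) -> int:
    """Memory left over that is too small to schedule another copy."""
    return instance_mib % reservation_mib


samples = [410, 455, 500]                         # made-up usage samples in MiB
reservation = suggested_reservation_mib(samples)  # 640
print(reservation, tasks_per_instance(7500, reservation), stranded_mib(7500, reservation))
```

A reservation that is too generous strands memory and can leave tasks unplaced even when hosts look idle, which is the scheduling issue worth monitoring for.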
RIOT ENGINEERING BLOG
THESE PROBLEMS AND MORE
Riot engineering: https://engineering.riotgames.com/
How we use data: http://na.leagueoflegends.com/en/tag/insights
Thank you for joining us!
