Deploying Apache Kylin on AWS
And designing a task scheduler for it
Chase Zhang
Strikingly
Outline
Introduction
Strikingly
Analytics Service of Strikingly
Deploy Apache Kylin on AWS
Overview
Containerizing Kylin
Maintenance
Scheduler for Kylin System
Designing Goals
Basic Idea & Implementation
Tasks, Executors and Services
Concurrency and Fault Tolerance
Maintenance and Monitoring
Conclusion
Introduction
Strikingly
https://strikingly.com
https://sxl.com
At Strikingly, we are devoted to providing a convenient, one-stop website-building solution to our customers.
Analytics Service of Strikingly
Version 0 of our analytics service used Google Analytics.
Figure: Google Analytics (labels: Strikingly, Google Analytics, User Pages, User; Register / Get Track ID, Set Track ID, Generate User's website, Collect Page Views Data, Serve User Query)
Version 1 of our analytics service used Keen IO, a third-party service.
Figure: Keen IO (labels: Strikingly, Keen.IO, User Pages, User; View Analytics, Generate User's website, Collect Page Views Data, Serve User Query)
Version 2 of our analytics service combines Keen IO and Apache Kylin.
Figure: Keen IO + Apache Kylin (labels: Strikingly, Keen.IO, Apache Kylin, User Pages, User; View Analytics, Generate User's website, Collect Page Views Data, Serve User Query)
Deploy Apache Kylin on AWS
Overview
Figure: Deploy Apache Kylin on AWS (labels: Query Requests, Application Load Balancer, Target Group, three Kylin (query) nodes and one Kylin (job) node on ECS, Hadoop / Hive / HBase / YARN on EMR, S3, Keen.IO; ports shown: 80, 7070, 33345, 33347)
Containerizing Kylin
Figure: Apache Kylin is “Stateless” (labels: Apache Kylin, Hive, HBase, HDFS, MapReduce / Spark)
Problem
We’d like to:
▶ Deploy Kylin in multiple regions
▶ Customize behavior with environment variables
▶ Build a single Docker image and run it everywhere
Figure: Launching Kylin with customized script (labels: configuration file templates for Hive, HBase, Hadoop, YARN and Kylin; Substitute variables; Configuration Files; Start running)
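The "substitute variables" step in the launch figure could look roughly like the sketch below. It is an illustration in Python, not the launch script from the talk: the template text, the variable names `KYLIN_METADATA_URL` and `KYLIN_SERVER_MODE`, and the helper `render_config` are all assumptions made for this example.

```python
from string import Template

# Hypothetical template fragment; the real templates are not shown in the slides.
KYLIN_PROPERTIES_TEMPLATE = """\
kylin.metadata.url=${KYLIN_METADATA_URL}
kylin.server.mode=${KYLIN_SERVER_MODE}
"""


def render_config(template_text, env):
    """Replace ${VAR} placeholders in template_text with values from env.

    Template.substitute raises KeyError when a placeholder has no value,
    so a misconfigured container fails at start-up rather than running
    with a half-filled configuration file.
    """
    return Template(template_text).substitute(env)


# Example values (assumed); in a container these would come from os.environ.
example_env = {
    "KYLIN_METADATA_URL": "kylin_metadata@hbase",
    "KYLIN_SERVER_MODE": "query",
}
rendered = render_config(KYLIN_PROPERTIES_TEMPLATE, example_env)
```

With this approach one image can serve every region and node role: only the environment variables passed to the container differ.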
Maintenance
Problem
Two problems arise while maintaining this system:
▶ Auto scaling and dynamic ports
▶ Cleanup and backup
Figure: Auto Scale and Dynamic Listening Ports (labels: Query Requests, Application Load Balancer, Target Group, Kylin (query) nodes and a Kylin (job) node on ECS; ports shown: 80, 7070, 33345, 33347, xxxxx)
./bin/metastore.sh backup
./bin/metastore.sh restore
./bin/metastore.sh clean
./bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob
Figure: Clean-up and back-up tools
Solution
A customized task scheduler.
Scheduler for Kylin System
Designing Goals
▶ Customizing task scheduling
▶ Making the system robust and fault-tolerant
▶ Solving both previously mentioned maintenance problems
Basic Idea & Overall Design
The Systemd (Anti-UNIX) philosophy
▶ Scheduler works as a central service
▶ Other components work as RPC services
Figure (untitled): overall design (labels: Scheduler, Target Group, Kylin (query) nodes, Kylin (job), HBase, Hive, DynamoDB, S3, Keen.IO)
Implementation details:
▶ Applying FP [9] and Actor Model [10] ideas
▶ Implemented with Scala [11] and Akka [12]
▶ Interacting with Hadoop components through Java libraries
[9] https://en.wikipedia.org/wiki/Functional_programming
[10] https://en.wikipedia.org/wiki/Actor_model
[11] http://scala-lang.org/
[12] https://akka.io/
Figure: Scheduler’s Actor System (labels: Scheduler, Control Actor, Consistent Hashing Router, Task Actor, Executor, Service; message types: Control Message, Task Message; numbered steps 1 to 3)
Tasks, Executors and Services
▶ A task is an immutable message
▶ Each task has a type, which determines its executor
▶ Executors call services to do the work
▶ Task categories: planning tasks, working tasks, maintaining tasks
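The task, executor and service relationship listed above can be sketched as follows. This is a Python illustration only (the scheduler in the talk is written in Scala with Akka); the `Task` fields, the `KylinService` stand-in and the `make_executors` / `execute` helpers are assumptions for this example, with task type names borrowed from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes the message immutable
class Task:
    task_id: str
    task_type: str
    cube: str


class KylinService:
    """Stand-in for a service wrapping Kylin calls; it only records them."""

    def __init__(self):
        self.calls = []

    def build_segment(self, cube):
        self.calls.append(("build", cube))

    def merge_segments(self, cube):
        self.calls.append(("merge", cube))


def make_executors(kylin):
    """Map each task type to the executor that handles it."""
    return {
        "KylinCubeBuild": lambda task: kylin.build_segment(task.cube),
        "KylinCubeMerge": lambda task: kylin.merge_segments(task.cube),
    }


def execute(task, executors):
    """Dispatch a task to its executor; unknown types raise KeyError."""
    executors[task.task_type](task)


kylin = KylinService()
executors = make_executors(kylin)
execute(Task("t-1", "KylinCubeBuild", "page_views"), executors)
```

Keeping tasks immutable means the same message can be logged, persisted and retried without worrying that an executor changed it along the way.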
Figure: Planning Tasks and Working Tasks (planning tasks: PlanDataRefresh, PlanCubeMaintenance; schedules: Hourly, Daily; working tasks: HiveTableRefresh, KylinCubeBuild, KylinCubeRefresh, KylinCubeMerge; services: Hive Service, Kylin Service, Message Storage Service; planning questions: Need import new data? Need build a new segment? Need refresh old segments? Need fill holes between segments? Need merge segments? Need fill holes in hive table?; note: Hive table has been refreshed, refresh segment)
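One of the planning questions in the figure, "Need fill holes between segments?", amounts to finding uncovered time ranges. A minimal sketch in Python follows, assuming segments are half-open `(start, end)` ranges over integer timestamps; the function name `find_holes` and this representation are assumptions, not the scheduler's actual code.

```python
def find_holes(segments, start, end):
    """Return the half-open ranges inside [start, end) that no segment covers.

    segments is an iterable of (seg_start, seg_end) pairs; they may be
    unsorted and may overlap.
    """
    holes = []
    cursor = start  # everything before cursor is known to be covered
    for seg_start, seg_end in sorted(segments):
        if seg_end <= cursor:
            continue  # segment lies entirely in the covered part
        if seg_start >= end:
            break  # segment starts after the window of interest
        if seg_start > cursor:
            holes.append((cursor, seg_start))
        cursor = max(cursor, seg_end)
    if cursor < end:
        holes.append((cursor, end))
    return holes


# Two built segments with a gap between them and nothing after hour 30.
missing = find_holes([(0, 10), (20, 30)], 0, 40)
```

A planning task could emit one build task per returned range, which also covers segments that were missed while the scheduler was down.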
Figure: Maintaining Tasks (tasks: KylinMetadataBackup, KylinMetadataCleanup, KylinMetadataRestore, KylinHBaseTableCleanup; services: HBase Service, Kylin Service, AWS Service; storage labels: Apache Kylin, S3, KYLIN_XWFQ12, kylin_metadata, kylin-metadata-backups; operations: Update Cache, Get Cube Info, Delete Table, Read Metadata, Delete Row, Write ZIP File, Read ZIP File, Write Table)
Concurrency and Fault Tolerance
Problem
We’d like to execute tasks in order:
▶ Maintaining tasks run exclusively
▶ Tasks of the same cube run exclusively
Solution
Two mechanisms solve this problem:
▶ ReadWriteLock
▶ ConsistentHashingRouter
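The two mechanisms named above can be illustrated with a small Python sketch. It rests on assumptions the slides do not spell out: that tasks are routed by cube name, so tasks of one cube always reach the same worker and run one after another, and that ordinary tasks take the read side of the lock while maintaining tasks take the write side. The classes below are simplified stand-ins, not Akka's router or Java's lock.

```python
import bisect
import hashlib
import threading


class ConsistentHashRing:
    """Routes a key to a worker; the same key always maps to the same worker."""

    def __init__(self, workers, replicas=50):
        # Each worker gets several points on the ring to spread keys evenly.
        ring = sorted(
            (self._hash(f"{w}#{i}"), w) for w in workers for i in range(replicas)
        )
        self._points = [h for h, _ in ring]
        self._workers = [w for _, w in ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def route(self, key):
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._workers[idx]


class ReadWriteLock:
    """Many readers or one writer. Simple version: writers can be starved."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


ring = ConsistentHashRing(["worker-1", "worker-2", "worker-3"])
worker_for_cube = ring.route("page_views_cube")  # hypothetical cube name
```

Under these assumptions the router serializes work per cube without any locking, and the lock is only needed for the rarer case where a maintaining task must wait for all running tasks to finish.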
Problem
We’d like to be fault tolerant:
1. Recovering from failures
2. Filling missed segment gaps
3. Recording history
Solution
We take several measures to solve this problem:
1. Assigning each task a unique ID
2. Persisting each task message, with its progress, to DynamoDB
3. Implementing planning and working tasks carefully so that they are issue-aware
Figure: Concurrency and Message Persistence (labels: ControlActor, Consistent HashingRouter, TaskActor, Executor, TaskMessage; task states: init, running, finish, error; DynamoDB; Acquire Lock, Release Lock)
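The task states in the figure (init, running, finish, error) and the idea of persisting every change can be sketched as below. This is a Python illustration with an in-memory dictionary standing in for DynamoDB; the allowed transitions, the record fields and all function names are assumptions made for the example.

```python
import uuid

# Assumed transitions, read off the state names in the figure.
ALLOWED = {
    "init": {"running"},
    "running": {"finish", "error"},
    "finish": set(),
    "error": set(),
}


class InMemoryTaskStore:
    """Stand-in for a DynamoDB table keyed by task ID."""

    def __init__(self):
        self.items = {}
        self.history = []  # every persisted state change, in order

    def put(self, task_id, record):
        self.items[task_id] = dict(record)
        self.history.append((task_id, record["state"]))


def new_task(store, task_type, cube):
    """Create a task with a unique ID and persist it in state 'init'."""
    task_id = str(uuid.uuid4())
    store.put(task_id, {"type": task_type, "cube": cube, "state": "init"})
    return task_id


def transition(store, task_id, new_state):
    record = store.items[task_id]
    if new_state not in ALLOWED[record["state"]]:
        raise ValueError(f"{record['state']} -> {new_state} is not allowed")
    store.put(task_id, {**record, "state": new_state})


def run_task(store, task_id, work):
    """Run work(), persisting the state before and after it."""
    transition(store, task_id, "running")
    try:
        work()
    except Exception:
        transition(store, task_id, "error")
        return False
    transition(store, task_id, "finish")
    return True


def unfinished(store):
    """Tasks to look at after a restart: created or started, never completed."""
    return [t for t, r in store.items.items() if r["state"] in ("init", "running")]
```

Because each state change is written before moving on, a restarted scheduler can query for unfinished tasks, and the stored records double as task history.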
Maintenance and Monitoring
Problem
We still have two trivial problems to solve:
▶ Performing actions manually
▶ Task monitoring and error notification
How should we design the scheduler's user interface?
Introducing the scheduler Slack bot...
Figure: Scheduler Slack Bot (labels: Event Bus, Control Actor, Consistent Hashing Router, Task Actor, Executor, Service, SlackBot Actor, User Command)
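The figure pairs an event bus with a Slack bot actor. A minimal publish/subscribe sketch in Python shows how the two problems (manual actions and error notification) could share one channel. The topic names `task.error` and `task.request`, the `run <TaskType>` command syntax and the `send` callback are assumptions; no real Slack API is called here.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in list(self._subscribers[topic]):
            handler(event)


class SlackBot:
    """Forwards task errors to chat and turns chat commands into events."""

    def __init__(self, bus, send):
        self._bus = bus
        self._send = send  # callable that delivers a message to the channel
        bus.subscribe("task.error", self._on_error)

    def _on_error(self, event):
        self._send(
            f"Task {event['task_id']} ({event['type']}) failed: {event['reason']}"
        )

    def handle_command(self, text):
        """Handle a command of the assumed form 'run <TaskType>'."""
        parts = text.split()
        if len(parts) == 2 and parts[0] == "run":
            self._bus.publish("task.request", {"type": parts[1]})
            return f"Requested {parts[1]}"
        return "Unknown command"


# Wiring example: collect outgoing messages and requested tasks in lists.
sent, requested = [], []
bus = EventBus()
bot = SlackBot(bus, sent.append)
bus.subscribe("task.request", requested.append)
```

The bot never talks to executors directly; it only publishes and subscribes, so the scheduler core stays unaware of the chat interface.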
Figure: List task status
Figure: List Kylin Job Progress
Conclusion
▶ With Apache Kylin, we’re providing a sub-second web analytics service
▶ With little effort, we managed to deploy Apache Kylin in Docker containers
▶ With the scheduler, we deployed the system on AWS without losing features
▶ We’ve made the system concurrency-safe and robust
Version 3?
But wait, we still have a problem, don’t we?
Figure: Data Transfer Delay of Keen IO (labels: User, Page Views, Keen.IO, S3 North America, S3 Tokyo, Japan, S3 Beijing, China; delays shown: 5 minutes, 10 minutes, 20 minutes)
Figure: Collecting Data with ALB? (labels: User, Page Views, Application Load Balancer, S3 Tokyo, Japan, S3 Beijing, China; delays shown: 5 minutes, 5 minutes)
Thank you!
BTW, we’re still hiring Data Platform Engineers:
1. Writing Scala
2. Working on AWS
3. Working with Apache Kylin
4. Working on our “Project Manhattan”

