Till Rohrmann
trohrmann@apache.org
@stsffap
Dynamic Scaling: How Apache Flink® Adapts to Changing Workloads
Changing Workloads And SLAs
Resource Adaption
[Figure: two plots of Workload and Resources over time]
What Is This Talk About?
• Flink’s approach to dynamic scaling
• Current state with demo
• Outlook on next development steps
Dynamic Scaling
Basic Idea
• Spread work across more workers to decrease the workload per worker
Scaling Stateless Jobs
[Figure: Source → Mapper → Sink topology, scaled up and scaled down]
• Scale up: Deploy new tasks
• Scale down: Cancel running tasks
Scaling Stateful Jobs
• Problem: which state should be assigned to a new task?
State in Apache Flink
Keyed vs. Non-keyed State
Keyed
• State bound to a key
• E.g. keyed UDF and window state
Non-keyed
• State bound to a subtask
• E.g. source state
Repartitioning Keyed State
• Similar to consistent hashing
• Split key space into key groups
• Assign key groups to tasks
[Figure: key space divided into key groups #1 to #4]
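To make the key-group idea concrete, here is a minimal Java sketch of mapping a key to a key group through its hash. It is an illustration only and not Flink's actual implementation: the class name `KeyGroupSketch`, the method `keyGroupFor`, and the choice of four key groups (taken from the figure) are assumptions made for this example.

```java
import java.util.List;

// Simplified sketch of "split the key space into key groups".
// All names here are hypothetical; this is not Flink's code.
final class KeyGroupSketch {

    /** Maps a key to one of numKeyGroups key groups (0-based) via its hash code. */
    static int keyGroupFor(Object key, int numKeyGroups) {
        // floorMod keeps the result in [0, numKeyGroups) even for negative hash codes.
        return Math.floorMod(key.hashCode(), numKeyGroups);
    }

    public static void main(String[] args) {
        int numKeyGroups = 4; // assumed value, matching the four groups in the figure
        for (String key : List.of("alice", "bob", "carol")) {
            int group = keyGroupFor(key, numKeyGroups);
            System.out.println(key + " -> key group #" + (group + 1));
        }
    }
}
```

Because a key's group depends only on the key and the fixed number of key groups, the key keeps its group no matter how many tasks are running; only the placement of whole groups on tasks has to change.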
Repartitioning Keyed State contd.
• Rescaling changes key group assignment
• Maximum parallelism defined by the number of key groups
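The following Java sketch shows how the same key groups could be handed to a different number of tasks, and why the number of key groups caps the parallelism. It assumes each task receives a contiguous range of key groups; the class and method names are hypothetical and the sketch is not meant as the definitive Flink logic.

```java
import java.util.Arrays;

// Sketch: assign key groups to tasks for a given parallelism.
// Names are hypothetical; contiguous ranges are an assumption of this example.
final class KeyGroupAssignmentSketch {

    /** Returns the 0-based task index that owns the given 0-based key group. */
    static int taskFor(int keyGroup, int numKeyGroups, int parallelism) {
        if (parallelism < 1 || parallelism > numKeyGroups) {
            // With more tasks than key groups, some tasks would own no state at all.
            throw new IllegalArgumentException("parallelism must be in [1, numKeyGroups]");
        }
        return keyGroup * parallelism / numKeyGroups;
    }

    /** Returns, for every key group, the index of the task it is assigned to. */
    static int[] assignment(int numKeyGroups, int parallelism) {
        int[] owner = new int[numKeyGroups];
        for (int kg = 0; kg < numKeyGroups; kg++) {
            owner[kg] = taskFor(kg, numKeyGroups, parallelism);
        }
        return owner;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(assignment(4, 2))); // prints [0, 0, 1, 1]
        System.out.println(Arrays.toString(assignment(4, 4))); // prints [0, 1, 2, 3]
    }
}
```

Scaling from two to four tasks moves whole key groups between tasks, so no individual key has to be rehashed.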
Repartitioning Non-keyed State
• User-defined merge and split functions
  • Most general approach
• Breaking non-keyed state up into finer granularity
  • State has to contain multiple entries
  • Automatic repartitioning with respect to that granularity
[Figure: non-keyed state split into entries #1, #2 and #3]
Repartitioning Non-keyed State contd.
• Non-keyed state entries gathered at the job manager
• Repartitioning schemes
  • Repartition & send
  • Union & broadcast
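The two repartitioning schemes can be sketched in Java as follows. This is a hedged illustration: the slides do not say how entries are distributed in the "repartition & send" case, so the round-robin order used here is an assumption, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two redistribution schemes for non-keyed state entries.
// oldState holds one list of entries per old subtask.
final class NonKeyedRedistributionSketch {

    /** Repartition & send: every entry goes to exactly one new subtask (round robin, assumed). */
    static <T> List<List<T>> repartitionAndSend(List<List<T>> oldState, int newParallelism) {
        List<List<T>> result = new ArrayList<>();
        for (int s = 0; s < newParallelism; s++) {
            result.add(new ArrayList<>());
        }
        int next = 0;
        for (List<T> subtaskState : oldState) {
            for (T entry : subtaskState) {
                result.get(next % newParallelism).add(entry);
                next++;
            }
        }
        return result;
    }

    /** Union & broadcast: every new subtask receives the union of all entries. */
    static <T> List<List<T>> unionAndBroadcast(List<List<T>> oldState, int newParallelism) {
        List<T> union = new ArrayList<>();
        for (List<T> subtaskState : oldState) {
            union.addAll(subtaskState);
        }
        List<List<T>> result = new ArrayList<>();
        for (int s = 0; s < newParallelism; s++) {
            result.add(new ArrayList<>(union));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> oldState = List.of(List.of("a", "b"), List.of("c"));
        System.out.println(repartitionAndSend(oldState, 2)); // prints [[a, c], [b]]
        System.out.println(unionAndBroadcast(oldState, 2));  // prints [[a, b, c], [a, b, c]]
    }
}
```

With union & broadcast, each subtask sees every entry and has to decide for itself which ones it is responsible for.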
Example: Kafka Source
partitionId: 1, offset: 42
partitionId: 3, offset: 10
partitionId: 6, offset: 27
• Store offset for each partition
• Individual entries are repartitionable
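As a rough Java illustration of why per-partition offsets are repartitionable, the sketch below takes the three entries from the slide and lets each new subtask pick the partitions it owns. The ownership rule (`partitionId % parallelism`) and all names are assumptions for this example and not the Kafka connector's actual assignment logic.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: redistribute checkpointed Kafka offsets (partitionId -> offset) after rescaling.
// The modulo-based ownership rule is hypothetical.
final class KafkaOffsetRescaleSketch {

    /** Returns the offsets of the partitions owned by one subtask under the given parallelism. */
    static Map<Integer, Long> offsetsFor(Map<Integer, Long> allOffsets, int subtaskIndex, int parallelism) {
        Map<Integer, Long> mine = new TreeMap<>();
        for (Map.Entry<Integer, Long> entry : allOffsets.entrySet()) {
            if (entry.getKey() % parallelism == subtaskIndex) {
                mine.put(entry.getKey(), entry.getValue());
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // The three entries shown on the slide.
        Map<Integer, Long> checkpointed = new TreeMap<>(Map.of(1, 42L, 3, 10L, 6, 27L));
        int newParallelism = 2;
        for (int subtask = 0; subtask < newParallelism; subtask++) {
            System.out.println("subtask " + subtask + " -> " + offsetsFor(checkpointed, subtask, newParallelism));
        }
        // prints:
        // subtask 0 -> {6=27}
        // subtask 1 -> {1=42, 3=10}
    }
}
```

Each entry can move on its own, so a source that previously read three partitions can be split across two subtasks without losing any offset.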
Rescaling: Why is That so Hard?
• Handling of state
• Repartitioning of keyed & non-keyed state
• Unique among open source stream processors, as far as I know
Demo Time
Demo Topology
[Figure: Kafka Source → KeyBy → Counter]
Current State and Next Steps
Current State
• Manual rescaling
  1. Take a savepoint
  2. Stop the job
  3. Restart the job from the savepoint with adjusted parallelism
Next Steps
• Integrate savepoint with stop signal
• Rescaling individual operators without restart
• Dynamic container de-/allocation
  • “Running Flink Everywhere” by Stephan Ewen, 16:45 at Kesselhaus
Auto Scaling Policies
• Latency
• Throughput
• Resource utilization
• Kubernetes on GCE, EC2 and Mesos (marathon-autoscale) already support auto-scaling
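A policy driven by one of these metrics could look roughly like the Java sketch below, which derives a target parallelism from resource utilization. Everything in it is a hypothetical illustration: the class name, the proportional rule, and the parameter values are assumptions, and the slides do not describe any concrete policy.

```java
// Sketch of a utilization-based auto-scaling policy (hypothetical, not part of Flink).
final class UtilizationPolicySketch {

    /**
     * Scales the parallelism proportionally so that the observed utilization
     * moves toward the target, clamped to the range [1, maxParallelism].
     */
    static int targetParallelism(int current, double utilization, double targetUtilization, int maxParallelism) {
        int desired = (int) Math.ceil(current * utilization / targetUtilization);
        return Math.max(1, Math.min(desired, maxParallelism));
    }

    public static void main(String[] args) {
        // Overloaded: 4 tasks at 75% utilization with a 50% target.
        System.out.println(targetParallelism(4, 0.75, 0.5, 128)); // prints 6
        // Underused: 8 tasks at 25% utilization with a 50% target.
        System.out.println(targetParallelism(8, 0.25, 0.5, 128)); // prints 4
    }
}
```

The upper clamp reflects the limit stated earlier in the talk: the parallelism cannot exceed the number of key groups.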
Conclusion
• Scaling of keyed and non-keyed state
• Flink supports manual rescaling with restart (WIP branch: https://github.com/tillrohrmann/flink/tree/partitionable-op-state)
• Future versions might support scaling on the fly and automatic rescaling policies