Confidential + Proprietary
No shard left behind
Dynamic Work Rebalancing
and other adaptive features in
Apache Beam
Malo Denielou (malo@google.com)
Apache Beam is a unified programming model designed to provide efficient and portable data processing pipelines.
Apache Beam
1. The Beam Programming Model
2. SDKs for writing Beam pipelines -- Java/Python/...
3. Runners for existing distributed processing backends
○ Apache Flink
○ Apache Spark
○ Apache Apex
○ Dataflow
○ Direct runner (for testing)
[Architecture diagram: Beam Java, Beam Python, and other-language SDKs feed the Beam Model at pipeline construction; Beam Model Fn runners for Apache Flink, Apache Spark, and Cloud Dataflow handle execution.]
Apache Beam use cases
1. Classic Batch
2. Batch with Fixed Windows
3. Streaming
4. Streaming with Speculative + Late Data
5. Streaming with Retractions
6. Streaming with Sessions
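To make the fixed-windows case concrete, here is a minimal sketch (plain Python, not the Beam SDK; the function name and data shape are illustrative) of assigning timestamped elements to fixed, non-overlapping windows:

```python
from collections import defaultdict

def assign_fixed_windows(events, window_size):
    """Group (timestamp, value) pairs into fixed, non-overlapping windows.

    Each element lands in the window [start, start + window_size) that
    contains its timestamp -- the same idea as Beam's FixedWindows.
    """
    windows = defaultdict(list)
    for timestamp, value in events:
        start = (timestamp // window_size) * window_size
        windows[(start, start + window_size)].append(value)
    return dict(windows)

events = [(1, "a"), (4, "b"), (11, "c")]
print(assign_fixed_windows(events, 10))
# {(0, 10): ['a', 'b'], (10, 20): ['c']}
```

Sessions and speculative/late firings build on the same idea, but with data-dependent window boundaries and triggers.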
Data processing for realistic workloads
[Charts: workload over time.] Streaming pipelines have variable input; batch pipelines have stages of different sizes.
The curse of configuration
[Charts: workload over time.] Over-provisioning resources? Under-provisioning on purpose? Considerable effort is spent finely tuning all the parameters of the jobs.
Ideal case
[Chart: workload over time.] A system that adapts.
The straggler problem in batch
[Chart: per-worker timelines.] Tasks do not finish evenly across the workers.
● Data is not evenly distributed among tasks
● Processing time is uneven between tasks
● Runtime constraints
Effects are cumulative per stage!
Common straggler mitigation techniques
● Split files into equal sizes?
● Pre-emptively over-split?
● Detect slow workers and re-execute?
● Sample the data and split based on partial execution?
All have major costs, but none solves the problem completely.
[Chart: per-worker timelines.]
« The most straightforward way to tune the number of partitions is experimentation: look at the number of partitions in the parent RDD and then keep multiplying that by 1.5 until performance stops improving. »
From [blog]how-to-tune-your-apache-spark-jobs
No amount of upfront heuristic tuning (be it manual or automatic) is enough to guarantee good performance: the system will always hit unpredictable situations at run-time.
A system that's able to dynamically adapt and get out of a bad situation is much more powerful than one that heuristically hopes to avoid getting into it.
Fine-tuning execution parameters goes against having a truly portable and unified programming environment.
Beam abstractions empower runners
A bundle is a group of elements of a PCollection processed and committed together.
APIs (ParDo/DoFn):
• setup()
• startBundle()
• processElement() n times
• finishBundle()
• teardown()
Streaming runner:
• small bundles, low-latency pipelining across stages, at the cost of frequent commits.
Classic batch runner:
• large bundles, fewer large commits, more efficient, long synchronous stages.
Other runner strategies may strike a different balance.
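The lifecycle above can be sketched as a plain-Python simulation (illustrative only, not the Beam SDK classes) of how a runner drives one DoFn instance over several bundles:

```python
class LoggingDoFn:
    """Toy DoFn that records its lifecycle calls, mirroring the
    setup/startBundle/processElement/finishBundle/teardown order."""
    def __init__(self):
        self.calls = []
    def setup(self):
        self.calls.append("setup")
    def start_bundle(self):
        self.calls.append("start_bundle")
    def process_element(self, e):
        self.calls.append(f"process({e})")
    def finish_bundle(self):
        self.calls.append("finish_bundle")
    def teardown(self):
        self.calls.append("teardown")

def run_bundles(dofn, bundles):
    """How a runner drives a DoFn: setup once, bracket each bundle with
    start/finish, teardown once. A streaming runner passes many small
    bundles; a classic batch runner passes a few large ones."""
    dofn.setup()
    for bundle in bundles:
        dofn.start_bundle()
        for element in bundle:
            dofn.process_element(element)
        dofn.finish_bundle()  # commit point: results become durable here
    dofn.teardown()

fn = LoggingDoFn()
run_bundles(fn, [[1, 2], [3]])
```

Shrinking the bundles trades commit overhead for latency, which is exactly the streaming-vs-batch balance described above.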
Beam abstractions empower runners
Efficiency at runner’s discretion
“Read from this source, splitting it 1000 ways”
➔ user decides
“Read from this source”
➔ runner decides
APIs for portable Sources:
• long getEstimatedSize()
• List<Source> splitIntoBundles(size)
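A hypothetical bounded source over an in-memory list (Python sketch of the two Java calls above; names and the bytes-per-record model are assumptions, not the real BoundedSource API) shows how the runner, not the user, picks the split size:

```python
class InMemorySource:
    """Toy bounded source sketching getEstimatedSize() and
    splitIntoBundles(desiredBundleSize)."""
    def __init__(self, records, bytes_per_record=1):
        self.records = records
        self.bytes_per_record = bytes_per_record

    def get_estimated_size(self):
        # Cheap size estimate the runner uses to plan parallelism.
        return len(self.records) * self.bytes_per_record

    def split_into_bundles(self, desired_bundle_size):
        # Carve the records into sub-sources of roughly the requested size.
        per_bundle = max(1, desired_bundle_size // self.bytes_per_record)
        return [InMemorySource(self.records[i:i + per_bundle],
                               self.bytes_per_record)
                for i in range(0, len(self.records), per_bundle)]

src = InMemorySource(list(range(10)), bytes_per_record=1)
parts = src.split_into_bundles(3)  # the runner chooses 3, not the user
```

The runner can call `get_estimated_size()` first, weigh its own constraints, then ask for whatever bundle size fits.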
[The slide above repeats across several builds, adding a Runner/Source diagram in stages: the Runner asks the Source backing gs://logs/* "Size?"; the Source answers "50TB"; the Runner (weighing cluster utilization, quota, bandwidth, throughput, concurrent stages, …) asks it to "Split in chunks of 500GB"; the Source returns a List<Source>.]
Solving the straggler problem: Dynamic Work Rebalancing
[Repeated chart builds: per-worker timelines with legend "Done work", "Active work", "Predicted completion", plus "Now" and "Average" markers. Workers predicted to finish past the average have their remaining work split off and reassigned, pulling overall completion back toward the average.]
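The chart sequence can be reduced to a toy decision rule (an illustrative heuristic only, not Dataflow's actual policy; function name and the 20% threshold are invented for the sketch): extrapolate each worker's completion time from its progress so far, and split whichever workers are predicted to finish well past the average.

```python
def plan_rebalance(progress, now):
    """Pick workers whose remaining work should be split off.

    progress: {worker: (done_fraction, elapsed_seconds)} for its current task.
    Predicted completion is extrapolated from the rate so far.
    """
    predicted = {}
    for worker, (done, elapsed) in progress.items():
        rate = done / elapsed if elapsed else 0.0
        predicted[worker] = now + (1.0 - done) / rate if rate else float("inf")
    average = sum(predicted.values()) / len(predicted)
    # Split any worker predicted to run 20%+ past the average finish time.
    return sorted(w for w, t in predicted.items() if t > 1.2 * average)

progress = {"w1": (0.9, 90), "w2": (0.8, 80), "w3": (0.1, 90)}
stragglers = plan_rebalance(progress, now=100)  # only w3 is far behind
```

The real system repeats this continuously, so each round of splitting produces fresh progress signals for the next.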
Dynamic Work Rebalancing in the wild
A classic MapReduce job (read from Google Cloud Storage, GroupByKey, write to Google Cloud Storage), 400 workers. Dynamic Work Rebalancing disabled to demonstrate stragglers.
X axis: time (total ~20 min.); Y axis: workers
Same job, Dynamic Work Rebalancing enabled.
X axis: time (total ~15 min.); Y axis: workers
Savings
Dynamic Work Rebalancing with Autoscaling
Initial allocation of 80 workers, based on size. Multiple rounds of upsizing, enabled by dynamic work rebalancing. Upscales to 1000 workers.
● tasks are balanced
● no oversplitting or manual tuning
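A purely illustrative sizing rule (not Dataflow's real autoscaling policy; the throughput and target numbers are made up) conveys the flavor of the initial allocation: enough workers to drain the estimated backlog within a target time, clamped to a range.

```python
import math

def workers_needed(backlog_bytes, per_worker_throughput, target_seconds,
                   min_workers=1, max_workers=1000):
    """Toy sizing rule: workers needed to drain the backlog in the target
    time, clamped to [min_workers, max_workers]. Real autoscaling also
    weighs CPU, quota, and the cost of rebalancing work onto new workers."""
    ideal = backlog_bytes / (per_worker_throughput * target_seconds)
    return max(min_workers, min(max_workers, math.ceil(ideal)))

# 50 TB backlog, 10 MB/s per worker, finish in ~1 hour:
n = workers_needed(50e12, 10e6, 3600)
```

Without dynamic work rebalancing, adding workers mid-job would not help, because the remaining work could not be moved onto them.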
Apache Beam enables dynamic adaptation
Beam Source Readers provide simple progress signals, which enable runners to take action based on execution-time characteristics.
All Beam runners can implement Autoscaling and Dynamic Work Rebalancing.
APIs for how much work is pending:
• bounded: double getFractionConsumed()
• unbounded: long getBacklogBytes()
APIs for splitting:
• bounded:
  • Source splitAtFraction(double)
  • int getParallelismRemaining()
• unbounded:
  • Coming soon ...
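For the bounded case, a reader over an offset range (Python sketch of the Java calls above; class and field names are illustrative, not the SDK's) shows how splitting at a fraction hands off the tail of the range, which is the primitive behind dynamic work rebalancing:

```python
class RangeReader:
    """Toy bounded reader over [start, end), sketching
    getFractionConsumed() and splitAtFraction(f)."""
    def __init__(self, start, end):
        self.start, self.end = start, end
        self.position = start  # advances as elements are read

    def get_fraction_consumed(self):
        return (self.position - self.start) / (self.end - self.start)

    def split_at_fraction(self, fraction):
        split = self.start + int(fraction * (self.end - self.start))
        if split <= self.position:           # can't split work already done
            return None
        residual = RangeReader(split, self.end)  # handed to another worker
        self.end = split                         # this reader keeps the head
        return residual

reader = RangeReader(0, 100)
reader.position = 30                      # 30% consumed
residual = reader.split_at_fraction(0.5)  # runner steals the last half
```

After the split, the original reader's remaining work shrank and its consumed fraction jumped, which the runner observes on the next progress report.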
Apache Beam is a unified
programming model
designed to provide
efficient and portable
data processing pipelines.
To learn more
Read our blog posts!
• No shard left behind: Dynamic work rebalancing in Google Cloud Dataflow
https://cloud.google.com/blog/big-data/2016/05/no-shard-left-behind-dynamic-work-rebalancing-in-google-cloud-dataflow
• Comparing Cloud Dataflow autoscaling to Spark and Hadoop
https://cloud.google.com/blog/big-data/2016/03/comparing-cloud-dataflow-autoscaling-to-spark-and-hadoop
Join the Apache Beam community!
https://beam.apache.org/
Flink Forward SF 2017: Malo Deniélou - No shard left behind: Dynamic work rebalancing and Autoscaling in Apache Beam