Introducing Flink on Mesos
Eron Wright – eron.wright@emc.com
DELL EMC
@eronwright
What is Apache Mesos?
• A popular cluster manager (similar to YARN)
• Makes available CPU, memory, & disk resources
• Unique capabilities for storage services
• Emerging as a foundation for data-centric, converged infrastructure
• Provides a programming model for using cluster resources
• A Mesos program is called a “framework”
• Packaged into an open-source distribution called DCOS
• Prescribes best practices related to Mesos frameworks, related services, etc.
Why Flink on Mesos?
• Flink works best on a cluster manager
– Easy to scale each job independently
– Externalize scheduling logic (fairness, quota, …)
– Good job isolation
• Flink can benefit from unique Mesos capabilities
– Disk resources
– Dynamic resource management
– Unique management features (e.g. inverse offers for controlled downscaling & maintenance)
Demo
Flink Master Process
Introduction
Flink Master Process
• The Flink Master Process is:
– The “Application Master” for a single Flink cluster
– A Mesos framework!
• Hosts numerous components:
– Job Manager
– Resource Manager (acts as Mesos scheduler)
– Artifact Server (HTTP server for Mesos fetcher)
• Responsible for TM scaling and recovery
– Handles JobManager scale change requests
– Stores task state in ZooKeeper
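The scaling and recovery duties listed above can be sketched roughly as follows. This is a minimal illustration, not the actual Flink implementation: `TaskStore`, `on_status_update`, and `on_scale_request` are hypothetical names, and an in-memory dictionary stands in for the task state that the slide says is kept in ZooKeeper. The state strings are standard Mesos task states.

```python
# Terminal Mesos task states that call for a replacement in this sketch.
FAILED_STATES = {"TASK_FAILED", "TASK_LOST", "TASK_KILLED"}


class TaskStore:
    """In-memory stand-in for the task state kept in ZooKeeper (hypothetical)."""

    def __init__(self):
        self.tasks = {}   # task id -> last known state
        self.next_id = 0

    def new_task(self):
        task_id = f"taskmanager-{self.next_id}"
        self.next_id += 1
        self.tasks[task_id] = "NEW"
        return task_id


def on_status_update(store, task_id, state):
    """Record a status update; return the id of a replacement task, or None."""
    if task_id not in store.tasks:
        # Unknown or already released task (e.g. after a scale-down): ignore.
        return None
    if state in FAILED_STATES:
        del store.tasks[task_id]
        return store.new_task()
    store.tasks[task_id] = state
    return None


def on_scale_request(store, desired):
    """Add or remove task records so that `desired` TaskManagers exist."""
    added = [store.new_task() for _ in range(desired - len(store.tasks))]
    removed = []
    while len(store.tasks) > desired:
        task_id = next(reversed(store.tasks))  # most recently added record
        del store.tasks[task_id]
        removed.append(task_id)
    return added, removed


store = TaskStore()
added, removed = on_scale_request(store, 2)
on_status_update(store, "taskmanager-0", "TASK_RUNNING")
replacement = on_status_update(store, "taskmanager-1", "TASK_LOST")
```

Keeping the intended set of tasks separate from their observed states is what lets a restarted master pick up where the previous one left off.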
[Diagram: host1, host2; Master (JM, RM, HTTPD); TM, TM; Mesos]
How it Works
Flink Master Process
• Offer handling:
– Uses Netflix Fenzo as an optimizer
– Gathers offers until all tasks launched
• Recovery:
– Stores intentional state in ZooKeeper
– Master uses leader election
– Mesos allows some time for recovery before killing tasks
• Monitoring:
– Detects task failure; launches replacement automatically.
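The "gather offers until all tasks are launched" loop can be sketched as below. This is a simplified first-fit placement used as a stand-in for the optimizer step; it is not Fenzo's algorithm or API. `Offer`, `TaskRequest`, and `assign_tasks` are hypothetical names, and the resource figures are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A simplified resource offer from one host (hypothetical type)."""
    host: str
    cpus: float
    mem_mb: int


@dataclass
class TaskRequest:
    """Resources one TaskManager needs (hypothetical type)."""
    name: str
    cpus: float
    mem_mb: int


def assign_tasks(pending, offers):
    """First-fit placement; returns (launches, still_pending)."""
    launches = []
    remaining = list(pending)
    for offer in offers:
        cpus, mem = offer.cpus, offer.mem_mb
        unplaced = []
        for task in remaining:
            if task.cpus <= cpus and task.mem_mb <= mem:
                launches.append((task.name, offer.host))
                cpus -= task.cpus      # consume the offer's resources
                mem -= task.mem_mb
            else:
                unplaced.append(task)
        remaining = unplaced
    return launches, remaining


# Gather offers round by round until every task has been launched.
pending = [TaskRequest("tm-1", 2.0, 4096), TaskRequest("tm-2", 2.0, 4096)]
offer_rounds = [
    [Offer("host1", 2.0, 4096)],                             # fits one TM
    [Offer("host1", 1.0, 1024), Offer("host2", 4.0, 8192)],  # host2 fits the other
]
launched = []
for offers in offer_rounds:
    if not pending:
        break
    placed, pending = assign_tasks(pending, offers)
    launched.extend(placed)
```

A real optimizer would also weigh constraints and fitness across hosts; the point here is only the loop shape: tasks that do not fit stay pending until a later offer round.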
[Diagram: host1, host2; Master; TM, TM; Mesos. Steps: 1. Register; 2. Resource Offers; 3. Optimize; 4. Launch; 5. Fetch (HTTP); 6. Status update]
Configuration
Flink Master Process (Cont’d)
• Framework Info
– mesos.resourcemanager.framework.secret
– mesos.resourcemanager.framework.principal
– mesos.resourcemanager.framework.role
• Mesos Master Info
– mesos.master: (IP address or ZK lookup info)
– mesos.failover-timeout
Note: no port configuration is necessary; Mesos automatically assigns ports.
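As a rough illustration, the settings above could be written out as `key: value` lines, the layout used by Flink's `flink-conf.yaml`. Only the key names come from the slide. Every value below is a placeholder, the unit of `mesos.failover-timeout` is assumed to be seconds, and `render_flink_conf` is a hypothetical helper.

```python
# Placeholder values throughout; only the key names come from the slide.
mesos_settings = {
    "mesos.master": "zk://zk1.example.com:2181/mesos",  # ZK lookup, or "host:port"
    "mesos.failover-timeout": 600,                       # assumed to be in seconds
    "mesos.resourcemanager.framework.role": "flink",
    "mesos.resourcemanager.framework.principal": "flink-principal",
    "mesos.resourcemanager.framework.secret": "change-me",
}


def render_flink_conf(settings):
    """Render settings as `key: value` lines, one per setting."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


print(render_flink_conf(mesos_settings))
```

No port keys appear, in line with the note that Mesos assigns ports.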
Dispatcher
Introduction
Dispatcher
• A highly available service for launching Flink clusters.
• A Mesos framework!
• Accessed via REST by the CLI
• DCOS compatibility:
– HTTP-based
– Accessible via the Admin Router
– (future) JWT authentication
• Aligned with FLIP-6
[Diagram: host1 to host4, each with cells A to D (1A to 4D); CLI; Dispatcher; two Masters; five TMs; Mesos]
Framework Hierarchy
Dispatcher (Cont’d)
• Nesting of frameworks is a common Mesos pattern. Here, Marathon launches the dispatcher, which launches the Flink Master Process, etc.
• Architecturally, it avoids a dependency on the Marathon API. For example, Aurora could be used here in place of Marathon.
[Diagram: Marathon, Dispatcher, Master, TM, with (Task) labels]
Launching a Session
Dispatcher (Cont’d)
• Use: mesos-session.sh
• CLI uploads files to dispatcher via HTTP
– Flink Configuration
– Supplemental files (--ship)
– Keytabs
– Certificates
• Dispatcher adds additional elements:
– Configuration
› ZooKeeper Namespace
– Flink JAR
– …
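One way to picture the session launch is as a bundle: what the CLI uploads, plus what the dispatcher adds. The sketch below is purely illustrative. The slide does not specify the upload format or REST endpoints, so `build_session_bundle`, the `zookeeper.namespace` key, the namespace pattern, and the `flink-dist.jar` file name are all hypothetical.

```python
import uuid


def build_session_bundle(cli_files, flink_conf, flink_jar="flink-dist.jar"):
    """Combine the CLI's uploads with the elements the dispatcher adds.

    All names here are hypothetical; the slide does not specify the format.
    """
    conf = dict(flink_conf)  # copy, so the caller's configuration is untouched
    # Dispatcher-added configuration: a per-session ZooKeeper namespace.
    # The key name and path pattern are illustrative placeholders.
    conf["zookeeper.namespace"] = f"/flink/session-{uuid.uuid4().hex[:8]}"
    # Dispatcher-added artifact: the Flink JAR, alongside the shipped files.
    artifacts = list(cli_files) + [flink_jar]
    return {"configuration": conf, "artifacts": artifacts}


uploaded_conf = {"taskmanager.numberOfTaskSlots": "2"}
bundle = build_session_bundle(["user.keytab", "truststore.jks"], uploaded_conf)
```

Presumably the per-session namespace keeps the state of different sessions apart, though the slide does not say so explicitly.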
[Diagram: host1 to host4, each with cells A to D (1A to 4D); CLI; Dispatcher; Master; three TMs; HTTP(S) links; Mesos]
Dispatcher Deployment Modes
Dispatcher (Cont’d)
• Dispatcher is usable in two ways
• Remote Mode:
– Recommended for detached execution
• Local Mode:
– Recommended for simple, interactive sessions (e.g. flink shell)
[Diagram: Local Mode (CLI + Dispatcher, Master) and Remote Mode (CLI, HTTP(S), Dispatcher, two Masters)]
Summary
Future Directions
• Dynamic Scaling
– Add/remove Task Managers in response to scale changes over a job’s lifetime
– Support Mesos maintenance procedures (e.g. inverse offers)
• Dispatcher Evolution (FLIP-6)
– Generalize to support all deployment scenarios, unified CLI
– Provide a centralized Web UI (incl. job history)
– Authentication Support (e.g. OAuth 2.0)
• Docker Image Support
– Tracking the “Mesos unified containerizer”
• Mesos Disk Support
– Allocate multiple disks for Task Manager temp space
– Scale up the I/O
Project Status
• Targeted for: Flink 1.2
• Contributors:
– Eron Wright (Dell EMC)
– Maximilian Michels (data Artisans)
• Design Doc:
– Mesos Integration on Google Docs
• JIRAs:
– FLINK-1984 – Integrate Flink with Apache Mesos
• Code:
– https://github.com/EronWright/flink/tree/feature-FLINK-1984-T2