Joining the Scurry of Squirrels:
Contributing to Apache Flink®
@tzulitai
tzulitai@apache.org
Tzu-Li (Gordon) Tai
Who?
▪ Apache Flink Committer
• Streaming connectors (Kafka, Kinesis, Elasticsearch)
▪ Software Engineer at data Artisans
Who?
▪ Definitely not a veteran contributor of Apache Flink ;-)
… my life of Apache Flink contribution ;)
How can I contribute?
The usual first question ...
It never is just about the code ...
Different forms of Contribution
▪ As a user …
• file bug reports
• join discussions / propose new features on the mailing list
• test release candidates
• talk about Flink
▪ As a developer …
• submit & review patches for features and fixes
• help answer user questions
Mailing Lists, JIRA, and Wiki
Know your resources!
▪ These resources are the community, and they are where everything happens
• Mailing lists
• JIRA
• Wiki
Mailing lists
▪ Developer mailing list
• dev@flink.apache.org
• Ask questions about Flink development
• Discuss new Flink features
▪ User mailing list
• user@flink.apache.org
• General questions about using Flink
JIRA board
▪ Issue tracking
• https://issues.apache.org/jira/browse/FLINK
• Where bugs and new features / improvements are filed
• Every code contribution must have a corresponding JIRA issue
• Do NOT ask questions here (→ use the mailing lists)
Wiki
▪ General information about the project
• https://cwiki.apache.org/confluence/display/FLINK
• Mostly relevant for FLIPs (Flink Improvement Proposals)
• Other info somewhat outdated
Apache Flink Patch Submission
Patch Submission Process
▪ Reach community consensus, starting from one of:
• File and discuss directly on JIRA
• Start a discussion on the developer mailing list
• FLIP (Flink Improvement Proposals)
▪ Submit GitHub pull request
▪ Review & refine
▪ Committer merges patch
The Apache Way: Community Over Code
▪ Changes, especially new features, should be agreed upon
by the community
• Applies not just to the developers
• Beneficial to users as a whole
▪ Always seek to reach some level of consensus throughout
the code contribution process
• Avoid needing to go back to feature discussion after coding
Filing a JIRA
▪ Always have a good description of what the issue is
• Bugs - the cause and the proposed solution
• Improvements and features - bootstrap potential discussion with the
implementation you have in mind
▪ Remember to appropriately set “Components”
Assign yourself to a JIRA
▪ Set the “Assignee” field to yourself
• Notifies others that this issue is already in good hands
▪ Simply request JIRA permission on the developer mailing list
FLIP (Flink Improvement Proposals)
▪ Adopted from the Apache Kafka community
▪ Official design documentation for major features
• Serves as the basis of discussion on the dev mailing list
• Allows the community to decide on the best future-proof design
Patch Submission Best Practices
▪ Rebase onto latest master prior to opening pull request
• git pull --rebase origin master
▪ Squash your commits into an appropriate set
▪ Commit message example:
• [FLINK-6025] [cep] Implement skip till next match strategy
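A minimal sketch of checking a commit subject against the shape of the example above. It assumes the convention is `[FLINK-<issue number>] [<component>] <summary>`; the function name `check_commit_subject` and the regular expression are illustrative assumptions, not part of any Flink tooling.

```shell
#!/bin/sh
# Illustrative check (not an official Flink tool): does a commit subject look like
# "[FLINK-<number>] [<component>] <summary>"? The exact pattern is an assumption
# derived from the single example on the slide.
check_commit_subject() {
    if printf '%s\n' "$1" | grep -Eq '^\[FLINK-[0-9]+\] \[[a-z][a-z0-9 ,-]*\] [^ ].*$'; then
        echo "ok"
    else
        echo "bad"
    fi
}

# Subject taken from the slide: prints "ok"
check_commit_subject "[FLINK-6025] [cep] Implement skip till next match strategy"

# Missing issue key and component tag: prints "bad"
check_commit_subject "Implement skip till next match strategy"
```

A function like this could be called from a local `commit-msg` hook to catch a missing issue key before the pull request is opened.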
Patch Submission Best Practices
▪ The pull request title usually follows the primary commit’s message:
• [FLINK-6025] [cep] Implement skip till next match strategy
▪ Address all discussions happening on the pull request
▪ Push follow-up commits
• Retain history of the patch review
• Also write meaningful messages for follow-up commits
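The follow-up commit flow can be tried out in a throwaway repository. The sketch below assumes a local bare repository standing in for a GitHub fork; the remote name `fork`, the branch name, the file name, and the follow-up commit message are all hypothetical.

```shell
#!/bin/sh
# Hypothetical walk-through in a throwaway sandbox; nothing here touches a real fork.
set -e

sandbox=$(mktemp -d)
git init -q --bare "$sandbox/fork.git"      # stands in for your GitHub fork
git init -q "$sandbox/work"
cd "$sandbox/work"
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
git remote add fork "$sandbox/fork.git"

# Initial pull request: a single, already squashed commit on a feature branch.
git checkout -q -b FLINK-6025-skip-strategy
echo "first version" > Strategy.txt
git add Strategy.txt
git commit -q -m "[FLINK-6025] [cep] Implement skip till next match strategy"
git push -q fork FLINK-6025-skip-strategy

# After review: add a new commit instead of amending, so the review history
# stays visible, and push normally (no force push needed).
echo "handle empty patterns" >> Strategy.txt
git commit -q -am "[FLINK-6025] [cep] Address review: handle empty patterns"
git push -q fork FLINK-6025-skip-strategy

# Both the original commit and the follow-up are now on the branch.
commit_count=$(git rev-list --count FLINK-6025-skip-strategy)
echo "commits on branch: $commit_count"
```

Because the follow-up is a separate commit, reviewers can see exactly what changed since their last pass.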
Closing
Final Takeaways
▪ Development of Apache Flink is heavily centered around its
community
• The Apache Way - community over code
• Contributing is never just about writing code
▪ Know the resources, and do not hesitate to approach the community
▪ An extra pair of helping hands is always welcome ;-)
Shoutouts
▪ Flink CEP
• Complex event processing
• Pattern detection for streams
• → If interested, contact
Kostas Kloudas (kkloudas@apache.org)
▪ Flink ML
• Online learning
• Incremental learning
• Model serving
• → If interested, contact
Theodore Vasiloudis (tvas@apache.org)
Thank you!
@tzulitai
@ApacheFlink
@dataArtisans

Flink Forward SF 2017: Tzu-Li (Gordon) Tai - Joining the Scurry of Squirrels: Contributing to Apache Flink
