Robin Moffatt, Principal DevEx Engineer @ Decodable
@rmoff / 19 Mar 2024 / #kafkasummit
🐲 Here be Dragons Stacktraces
Flink SQL for Non-Java Developers
@rmoff / #kafkasummit
@decodableco
Actual footage of a SQL Developer looking at Apache Flink for the first time
What Is Apache Flink?
A Brief History of Flink
or
Back in the time of dinosaurs Hadoop
• Started life as a research project in 2011, called Stratosphere.
• This was the time of MapReduce. Java and Scala were the only way to do this.
Flink is a big project
• Flink
• Stateful Functions
• ML
• Kubernetes Operator
• CDC Connector
• Paimon (incubating)
Capabilities
Connect to Lots of Source and Target Systems
DynamoDB Object stores
Kinesis
Firehose
Kinesis
JDBC
CDC
Stateful and Stateless computations
• Filtering
• Joining
• Transforming
• Aggregations
• Pattern matching
• …and a whole lot more

SELECT * FROM myStream WHERE foo=42

SELECT a.*, b.* FROM myStream a
  INNER JOIN myLookup b ON a.id=b.foo_id

SELECT cost * tax_rate AS total_cost
  FROM myStream

SELECT SUM(order_value) AS total_order_values
  FROM orders

SELECT *
  FROM myStream
  MATCH_RECOGNIZE (
    PARTITION BY id
    ORDER BY user_action_time
    […]
Batch and Streaming
Bounded and Unbounded
How Does Flink Work?
<< magic 🪄 🧙 >>
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/

Flink's SQL Engine: Let's Open the Engine Room!
🗣 Timo Walther
📆 Tuesday
⏰ 5:30pm
🗺 Breakout Room 2
Running Flink
Works on my machine…
$ ./bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host asgard08.
Starting taskexecutor daemon on host asgard08.
$ jps -l
14656 org.apache.flink.runtime.taskexecutor.TaskManagerRunner
14379 org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint
Using Flink
It's not just Java
• PyFlink
  • Added in 1.9.0 in 2019
• Flink SQL
  • Added in 1.5.0 in 2018
Flink SQL
SQL Language Support
• Built on Apache Calcite
• Common Table Expression (CTE) (WITH)
• Set-based operations
• Joins
• Aggregations
• And lots more…
Running Flink SQL
• SQL Client
• SQL Gateway
• REST API
• Hive
• JDBC Driver
• From Java or Python
Flink SQL Client
$ ./bin/sql-client.sh
[ASCII-art Flink squirrel and "Flink SQL Client" banner, marked BETA]
Welcome! Enter 'HELP;' to list all available commands. 'QUIT;' to exit.
Command history file path: /opt/flink/.flink-sql-history
Flink SQL>
DEMO
https://github.com/decodableco/examples/kafka-iceberg
A Few Useful Settings
Runtime Mode
SET 'execution.runtime-mode' = 'streaming';
• streaming [default]
• batch
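One way to picture the difference between the two runtime modes is to take the SUM(order_value) query from the earlier slide and model it by hand. The sketch below is a toy in plain Python, not Flink code: in batch mode the bounded input is read to the end and a single final row comes out, while in streaming mode every arriving row changes the running total and that change is emitted as changelog rows. The +I / -U / +U markers mirror the row kinds Flink shows for an updating result; the function names and sample values are made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def batch_sum(order_values: Iterable[int]) -> List[Tuple[str, int]]:
    """Bounded input: read everything, then emit a single final row."""
    return [("+I", sum(order_values))]


def streaming_sum(order_values: Iterable[int]) -> Iterator[Tuple[str, int]]:
    """Unbounded input: emit changelog rows as each input row arrives."""
    total = None
    for value in order_values:
        if total is None:
            total = value
            yield ("+I", total)  # first result: insert
        else:
            yield ("-U", total)  # retract the previous total
            total += value
            yield ("+U", total)  # emit the updated total


orders = [10, 5, 20]  # hypothetical order_value column
print(batch_sum(orders))
# [('+I', 35)]
print(list(streaming_sum(orders)))
# [('+I', 10), ('-U', 10), ('+U', 15), ('-U', 15), ('+U', 35)]
```

The streaming output is roughly what the changelog result mode displays, whereas the table result mode folds those changes into the current row.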
Result Mode
SET 'sql-client.execution.result-mode' = 'table';
• table [default]
• changelog
• tableau
Colour Scheme
SET 'sql-client.display.color-schema' = 'Chester';
• Because why not?!
Changing the defaults
Setting up a SQL Client initialisation file
• Create a SQL file:
$ cat init.sql
SET 'execution.runtime-mode' = 'batch';
SET 'sql-client.execution.result-mode' = 'tableau';
• Launch SQL Client with the -i flag and pass the file as a parameter:
$ ./bin/sql-client.sh -i init.sql
Submitting SQL as a job
• SQL Client
$ ./bin/sql-client.sh --file ~/my_query.sql
• SQL Gateway
curl --location 'localhost:8083/sessions/42/statements' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --data '{
    "statement": "SELECT * FROM foo;"
  }'
• Application mode support: FLIP-316: Support application mode for SQL Gateway
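For anyone who would rather script the SQL Gateway call than use curl, the same request can be put together with the Python standard library. This is a minimal sketch that only builds the request and prints it, so it runs without a gateway; the host, port, path and session handle 42 are copied from the curl example above, and build_statement_request is a hypothetical helper name. In practice the session handle is whatever the gateway returned when the session was opened.

```python
import json
import urllib.request


def build_statement_request(
    base_url: str, session_handle: str, statement: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits one SQL statement."""
    body = json.dumps({"statement": statement}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/sessions/{session_handle}/statements",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Values below mirror the curl example; they are placeholders, not defaults.
request = build_statement_request("http://localhost:8083", "42", "SELECT * FROM foo;")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a running gateway, passing the request to urllib.request.urlopen would send it.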
Some of the Gnarly Stuff
The Joy of JARs
• For each connector, format, and catalog you need to install dependencies.
• All of these are available as JARs (Java ARchive)
This might jar a bit…
Could not execute SQL statement.
Reason: java.lang.ClassNotFoundException

Could not find any factory for identifier 'hive' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath.

org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3'
Finding JARs
• Usually the docs will tell you which JAR you need.
• JARs are very specific to the versions of the tools that you're using.
$ tree /opt/flink/lib
├── aws
│   ├── aws-java-sdk-bundle-1.12.648.jar
│   └── hadoop-aws-3.3.4.jar
├── flink-cep-1.18.1.jar
├── flink-parquet_2.12-1.18.1.jar
├── flink-table-runtime-1.18.1.jar
├── hadoop
│   ├── commons-configuration2-2.1.1.jar
│   ├── commons-logging-1.1.3.jar
│   ├── hadoop-auth-3.3.4.jar
├── hive
│   └── flink-sql-connector-hive-3.1.3_2.12-1.18.1.jar
├── iceberg
│   └── iceberg-flink-runtime-1.18-1.5.0.jar
├── kafka
│   └── flink-sql-connector-kafka-3.1.0-1.18.jar
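Because the Flink version is usually baked into the JAR file name, a quick sanity check on the lib folder can catch a stale connector before it turns into a stacktrace. The sketch below is a rough heuristic under stated assumptions, not an official tool: it only looks at file names containing "flink", it assumes the Flink minor version (here 1.18) appears somewhere in the name as in the listing above, and the 1.17 JAR in the sample list is a made-up stale entry added for illustration. The docs for each connector remain the authority on which JAR goes with which version.

```python
import re
from typing import List


def mismatched_flink_jars(jar_names: List[str], flink_minor: str) -> List[str]:
    """Return Flink-related JARs whose file name does not mention flink_minor."""
    # Match the minor version as a whole version token, so "1.18" matches
    # "1.18" and "1.18.1" but not "1.180" or "2.1.18".
    pattern = re.compile(r"(?<![\d.])" + re.escape(flink_minor) + r"(?!\d)")
    return [
        name
        for name in jar_names
        if "flink" in name and not pattern.search(name)
    ]


jars = [
    "flink-cep-1.18.1.jar",
    "flink-sql-connector-hive-3.1.3_2.12-1.18.1.jar",
    "iceberg-flink-runtime-1.18-1.5.0.jar",
    "flink-sql-connector-kafka-3.1.0-1.18.jar",
    "hadoop-aws-3.3.4.jar",                      # not Flink-specific, skipped
    "flink-sql-connector-kafka-3.0.2-1.17.jar",  # hypothetical stale JAR
]
print(mismatched_flink_jars(jars, "1.18"))
# ['flink-sql-connector-kafka-3.0.2-1.17.jar']
```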
Don't forget to restart!
Tables, Connectors, and Catalogs
Tables
CREATE TABLE t_k_orders
(
    orderid         STRING,
    customerid      STRING,
    ordernumber     INT,
    product         STRING,
    discountpercent INT
) WITH (
    'connector' = 'kafka',
    'topic' = 'orders',
    'properties.bootstrap.servers' = 'broker:29092',
    'scan.startup.mode' = 'earliest-offset',
    'format' = 'json'
);
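The statement above has two halves: a column list, and a WITH clause that is simply a set of key/value options handed to the connector. If the same table needs to be declared against different brokers or topics, one option is to generate the DDL from those two pieces. The helper below is a hypothetical sketch in plain Python (render_create_table is not part of Flink or PyFlink); it only renders the text, which could then be saved to a file and run with the SQL Client.

```python
from typing import Dict


def render_create_table(name: str, columns: Dict[str, str], options: Dict[str, str]) -> str:
    """Render a CREATE TABLE statement with a WITH (...) options clause."""

    def quote(text: str) -> str:
        # SQL string literals escape a single quote by doubling it.
        return "'" + text.replace("'", "''") + "'"

    column_lines = ",\n".join(f"    {col} {sql_type}" for col, sql_type in columns.items())
    option_lines = ",\n".join(f"    {quote(k)} = {quote(v)}" for k, v in options.items())
    return f"CREATE TABLE {name} (\n{column_lines}\n) WITH (\n{option_lines}\n);"


ddl = render_create_table(
    "t_k_orders",
    {"orderid": "STRING", "customerid": "STRING", "ordernumber": "INT",
     "product": "STRING", "discountpercent": "INT"},
    {"connector": "kafka", "topic": "orders",
     "properties.bootstrap.servers": "broker:29092",
     "scan.startup.mode": "earliest-offset", "format": "json"},
)
print(ddl)
```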
This used to be simple
• The data and information about the data were all stored in the database
• Information Schema
• System Catalog
• Data Dictionary Views

Now it's not so simple
Flink Catalogs
CREATE CATALOG c_hive WITH (
    'type' = 'hive',
    'hive-conf-dir' = './conf');

table.catalog-store.kind: file
table.catalog-store.file.path: ./conf/catalogs
DEMO
https://github.com/decodableco/examples/kafka-iceberg
In Conclusion…
Flink SQL is Fun!
But there's a bit of a learning curve
• Run ad-hoc queries with the SQL Client
• Understand JAR dependencies for connectors, catalogs, formats, etc.
• Don't be put off by the docs - there is SQL content there if you look hard enough
decodable.co/blog
#EOF