Stream Processing Airport Data
Sönke Liebau – Co-Founder and Partner @ OpenCore
October 17th 2018
Serving the Real-Time Data Needs of an Airport with Kafka
Streams and KSQL
Who Am I?
• Partner & Co-Founder at OpenCore
• Small consulting company with a Big Data & Open Source focus
• First production Kafka deployment in 2014
Website: www.opencore.com
soenke.liebau@opencore.com
https://www.linkedin.com/in/soenkeliebau/
@soenkeliebau
Kafka Streams & KSQL
What Is Kafka Streams?
“The easiest way to write mission-critical real-time applications and microservices”
“Kafka Streams is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters.”
Source: https://kafka.apache.org/20/documentation/streams/
What Is KSQL?
Confluent KSQL is the open source, streaming SQL engine that enables real-time data processing against Apache Kafka®
Source: https://www.confluent.io/product/ksql/
© 2018 OpenCore GmbH & Co. KG 17
Kafka Streams In The Ecosystem
[Diagram labels: Sources, Kafka Connect, Kafka Streams Jobs, Kafka Connect, Destinations]
The Big Difference
Using Kafka Streams
// Serdes for String keys/values and Long counts
final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();
final StreamsBuilder builder = new StreamsBuilder();
// Read the input topic as a stream of text lines
KStream<String, String> textLines = builder.stream("streams-plaintext-input",
    Consumed.with(stringSerde, stringSerde));
// Split each line into lowercase words, group by word, and count occurrences
KTable<String, Long> wordCounts = textLines
    .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
    .groupBy((key, value) -> value)
    .count();
// Write the running counts to the output topic
wordCounts.toStream().to("streams-wordcount-output", Produced.with(stringSerde, longSerde));
Source: https://kafka.apache.org/20/documentation/streams/quickstart
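To see what the word count topology computes without a running Kafka cluster, the same split, group, and count steps can be sketched in plain Java. This is only an illustration of the logic: the class name `WordCountSketch` and the sample lines are made up for this example, and there is no Kafka involved.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kafka Streams: mirrors the
// flatMapValues -> groupBy -> count steps on an in-memory list of lines.
class WordCountSketch {

    // Lowercases each line, splits it on runs of non-word characters ("\\W+"),
    // and counts how often each resulting token occurs.
    static Map<String, Long> countWords(Iterable<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(
            Arrays.asList("all streams lead to kafka", "hello kafka streams")));
    }
}
```

Unlike the KTable, which keeps emitting updated counts as new records arrive, this sketch only returns the final totals for a fixed input.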
Using KSQL
[Diagram labels: REST Interface, CLI, REST Client]
SELECT *
FROM security_in
WHERE status='success'
AND terminal='t1';
Running A KSQL Statement
The Competition
When To Use Which?
Kafka Streams
• Offers lower level access
• More data formats supported
• Queryable state
• Problems that cannot be expressed in SQL
KSQL
• Easier for people used to SQL
• No need for additional orchestration
• Data exploration
Our Airport
A Few Facts Up Front
• A lot of independent data sources
  • Airline ticketing
  • Baggage transport system
  • Passenger counting
  • Retail
  • Radar
  • Weather
  • …
• Spread over multiple companies
• Many legacy interfaces
Integrations
[Diagram labels: Operations Database, External System (×6)]
Isolated Islands Of Data
• A lot of isolated data stores to provide data for necessary solutions
• Spiderweb of integrations
• Operational DB needs to push data to a lot of systems
• Many different formats
The Dream
[Diagram labels: Weird binary source, XML source, …, Raw, Source, Processed, Stream Processing, REST, Destination (×3)]
Ingest Transformation - Kafka Streams
// Read records in the proprietary format and write the same objects back out as Avro
StreamsBuilder builder = new StreamsBuilder();
Serde<ProprietaryObject> weirdFormatSerde = new ProprietaryWeirdFormatSerde();
Serde<ProprietaryObject> avroSerde = new ProprietaryAvroSerde();
builder.stream("proprietary_input_topic",
        Consumed.with(
            Serdes.String(),
            weirdFormatSerde))
    .to("avro_output_topic",
        Produced.with(
            Serdes.String(),
            avroSerde));
Ingest Transformation - KSQL
ksql> CREATE STREAM source (uid INT, name VARCHAR)
      WITH (KAFKA_TOPIC='mysql_users', VALUE_FORMAT='JSON');
ksql> CREATE STREAM target_avro
      WITH (VALUE_FORMAT='AVRO', KAFKA_TOPIC='mysql_users_avro')
      AS SELECT * FROM source;
Source: https://gist.github.com/rmoff/165b05e4554c41719b71f1a47ee7b113
Stream Processing
• Stream processing jobs read the converted Avro topics and create enriched topics, alerts, … by
  • Joining streams
  • Aggregating streams
  • Filtering or alerting on streams
  • …
DISCLAIMER
Gate Changes
• Gate changes can be based on different information
  • Delays of the incoming flight
  • Changes on other outgoing flights
  • …
• Join relevant streams and publish change events that are consumed by
  • Apps
  • Gate monitors
  • Departure boards
  • …
Passenger Count
• Join the stream of tickets scanned before the line to the security check with the camera count of passengers leaving the security check to estimate the number of waiting passengers
• Change routing of passengers (physical: signs change; digital: different routing in the app)
• Also consumed by
  • Monitors to display predicted wait time
  • App to display predicted wait time
  • Prediction systems to feed models for capacity planning
  • Models to predict whether a passenger might miss their flight -> reroute to priority lane
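The estimate described above boils down to "scanned in minus counted out". A minimal plain-Java sketch of that arithmetic follows; the class `SecurityQueueEstimator` and its method names are hypothetical, and a real deployment would keep these counters in a stream processing job rather than in a single object.

```java
// Hypothetical helper, not from the talk: tracks both inputs and derives
// the number of passengers currently waiting at security.
class SecurityQueueEstimator {
    private long scannedIn;   // boarding passes scanned before the security line
    private long countedOut;  // passengers counted by the camera when leaving security

    // Called once per scanned ticket.
    void onTicketScanned() {
        scannedIn++;
    }

    // Called for each camera reading with the number of passengers seen leaving.
    void onCameraCount(long passengers) {
        countedOut += passengers;
    }

    // Estimated queue length; clamped at zero because camera counts are
    // approximate and may overshoot the number of scanned tickets.
    long estimatedWaiting() {
        return Math.max(0, scannedIn - countedOut);
    }
}
```

Clamping at zero is a design choice made for this sketch so that downstream displays never show a negative queue length.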
Wait Time
• Calculate how long a passenger took to clear the security checkpoint by joining the time they scanned their boarding pass with the time they are first spotted by an iBeacon beyond security
• Push offers based on wait time and flight time
  • Long wait, lots of time until take-off -> free coffee or sandwich
  • Long wait, short time until take-off -> duty free voucher
  • …
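The offer rules above can be written as a small decision function. The sketch below is an illustration only: the class `WaitTimeOffers` and both thresholds (30 minutes for a "long wait", 90 minutes for "lots of time") are assumptions made for this example, since the slides do not give concrete values.

```java
import java.time.Duration;

// Hypothetical decision rule; thresholds are assumed, not taken from the talk.
class WaitTimeOffers {
    enum Offer { NONE, FREE_COFFEE_OR_SANDWICH, DUTY_FREE_VOUCHER }

    static final Duration LONG_WAIT = Duration.ofMinutes(30);       // assumed
    static final Duration PLENTY_OF_TIME = Duration.ofMinutes(90);  // assumed

    // Picks an offer from the time spent at security and the time left until take-off.
    static Offer chooseOffer(Duration waitAtSecurity, Duration untilTakeOff) {
        if (waitAtSecurity.compareTo(LONG_WAIT) < 0) {
            return Offer.NONE; // short wait: no compensation
        }
        return untilTakeOff.compareTo(PLENTY_OF_TIME) >= 0
            ? Offer.FREE_COFFEE_OR_SANDWICH  // long wait, lots of time left
            : Offer.DUTY_FREE_VOUCHER;       // long wait, little time left
    }
}
```

For example, `chooseOffer(Duration.ofMinutes(45), Duration.ofMinutes(120))` yields `FREE_COFFEE_OR_SANDWICH` under these assumed thresholds.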
Baggage Notification
• Baggage containers are scanned when they are loaded/unloaded
• By joining this with data from the baggage sorter, passengers could receive push notifications when their luggage is loaded/unloaded into/from the plane
Arrival At Gate
• There are complex models running to estimate when the plane will arrive at the gate after it has landed
  • Based on ground radar data
• Can be used to
  • Predict whether the following flight might be delayed
  • Coordinate cleaning crews
  • Coordinate refueling
  • Feed into gate change decisions
An Example Flow
{"boardingpass_id":"123",
"passenger":"smith",
"flight_number":"LH454",
"checked_bags":1}
{"boardingpass_id":"123",
"security_area":"t1_2",
"status":"success"}
{"security_area":"t1_2",
"count":"1"}
{"passenger":"smith",
"beacon_id":"t1_b123"}
{"boardingpass_id":"123",
"item_group":"cigarettes"}
{"boardingpass_id":"123",
"status":"success"}
{"flight_no":"LH454",
"runway":"1north"}
{"old_gate":"a12",
"new_gate":"e50"}
Check-In Event
{"boardingpass_id":"123",
"passenger":"smith",
"flight_number":"LH454",
"terminal":"terminal1"}
check_in_count
CREATE TABLE check_in_count
AS SELECT terminal, count(terminal)
FROM check_in
WINDOW TUMBLING (SIZE 24 HOURS)
GROUP BY terminal;
check_in
What is it good for?
• Early warning for security capacity
• "Don't dawdle" warning based on security queues
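A tumbling window assigns every event to exactly one fixed-size, non-overlapping time bucket. The plain-Java sketch below shows that bucketing for a per-key count; `TumblingWindowCounter` is a hypothetical class written for this illustration, it assumes non-negative epoch-millisecond timestamps, and it never expires old windows the way a real stream processor would.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory illustration of a windowed count per key.
class TumblingWindowCounter {
    private final long windowSizeMs;
    private final Map<String, Long> counts = new HashMap<>();

    TumblingWindowCounter(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Start of the window that contains the timestamp; windows are aligned
    // to the epoch, so they begin at multiples of the window size.
    long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Adds one event for the key and returns the updated count for the
    // (key, window) pair the event falls into.
    long record(String key, long timestampMs) {
        String bucket = key + "@" + windowStart(timestampMs);
        return counts.merge(bucket, 1L, Long::sum);
    }
}
```

With a 24-hour window size, two check-ins at the same terminal on the same day increment one counter, while a check-in on the next day starts a new one.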
Passenger Enters Security Area
{"boardingpass_id":"123",
"security_area":"t1_2",
"status" : "success"}
security_in_count
CREATE TABLE security_in_count
AS SELECT
security_area,
count(security_area)
FROM security_in
WINDOW TUMBLING (SIZE 24 hour)
WHERE status='success'
GROUP BY security_area;
security_in
What is it good for?
• Monitor for failed attempts
• Passenger routing to security
• Unload baggage of late passengers
• …
time_to_security
SELECT
s.boardingpass_id, s.rowtime - c.rowtime
as time_to_security
FROM security_in s
LEFT JOIN check_in c WITHIN 1 HOUR
ON s.boardingpass_id=c.boardingpass_id;
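The time_to_security join pairs a security event with the check-in for the same boarding pass if the two lie within one hour of each other. A simplified plain-Java sketch of that idea follows. `TimeToSecurityJoin` is a hypothetical class for illustration; it remembers only the latest check-in per boarding pass, never expires state, and returns an empty result where a LEFT JOIN would emit a row with nulls.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical in-memory illustration of a windowed stream-stream join.
class TimeToSecurityJoin {
    static final long WITHIN_MS = 60L * 60 * 1000; // join window: one hour

    private final Map<String, Long> checkInTimes = new HashMap<>();

    // Remembers when a boarding pass was checked in.
    void onCheckIn(String boardingPassId, long timestampMs) {
        checkInTimes.put(boardingPassId, timestampMs);
    }

    // Returns the milliseconds between check-in and entering security,
    // or empty if there is no check-in within the join window.
    OptionalLong onSecurityIn(String boardingPassId, long timestampMs) {
        Long checkIn = checkInTimes.get(boardingPassId);
        if (checkIn == null || Math.abs(timestampMs - checkIn) > WITHIN_MS) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(timestampMs - checkIn);
    }
}
```

A passenger who checks in at t = 1,000 ms and enters security at t = 601,000 ms gets a time to security of 600,000 ms, that is, ten minutes.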
Passenger Leaves Security Area
{"security_area":"t1_2",
"count":"1"}
security_out security_out_count
CREATE TABLE security_out_count
AS SELECT security_area, sum(count)
FROM security_out
WINDOW TUMBLING (SIZE 24 hour)
GROUP BY security_area;
security_in_count
current_count
What is it good for?
• Capacity planning
• Wait time prediction
• Passenger routing (apps & physical)
• Alerting on late passengers checking in
• …
SELECT
i.security_area AS security_area,
i.KSQL_COL_1 AS entry,
o.KSQL_COL_1 AS exit,
i.KSQL_COL_1 - o.KSQL_COL_1 AS waiting
FROM security_in_count i
INNER JOIN security_out_count o
ON i.security_area=o.security_area;
Passenger Located Via iBeacon
{"passenger":"smith",
"beacon_id":"t1_b123"}
security_duration
security_in
dutyfree_joined
CREATE STREAM dutyfree_joined
AS SELECT s.boardingpass_id, d.passenger
FROM dutyfree_in d
LEFT JOIN security_in s WITHIN 1 HOURS
ON s.passenger=d.passenger;
dutyfree_in
SELECT
d.boardingpass_id,
d.d_passenger,
d.rowtime - s.rowtime as
time_in_security
FROM dutyfree_in_with_bc d
LEFT JOIN security_in s WITHIN 1 HOUR
ON d.boardingpass_id=s.boardingpass_id;
What is it good for?
• Refining wait time prediction
• Targeted questionnaire (find reasons for outliers)
• Vouchers for huge delays
• …
Purchase Event
{"boardingpass_id":"123",
"item_group":"cigarettes"}
flight_information
check_in
dutyfree_joined
What is it good for?
• Retail models
• Route to smoking area nearest to gate
• Advise of walk time if time is tight
• …
dutyfree_purchase
CREATE STREAM dutyfree_joined
AS SELECT
c.boardingpass_id,
c.passenger,
p.purchase_type,
f.gate
FROM dutyfree_purchase p
LEFT JOIN check_in c WITHIN 1 HOURS
ON c.passenger=p.passenger
LEFT JOIN flight_information f
WITHIN 1 HOURS
ON f.flight_number = c.flight_number;
Gate Change
expected_gate_arrival
notifications
expected_gate_departure
CREATE STREAM gate_wait_time
AS SELECT
a.flight,
d.departure_time - a.arrival_time as wait_time
FROM expected_gate_arrival a
INNER JOIN expected_gate_departure d WITHIN 1 HOURS
ON a.gate=d.gate;
gate_wait_time
gate_change
CREATE STREAM gate_change
AS SELECT
flight
FROM gate_wait_time
WHERE wait_time > 600000;
CREATE STREAM notifications
AS SELECT f.passenger
FROM gate_change g
LEFT JOIN flight_information f
WITHIN 1 HOURS ON f.gate=g.gate;
flight_information
Passenger Boards Plane
{"boardingpass_id":"123",
"status":"success"}
gate_in
What is it good for?
• Alert on bags without matching passengers
• Trigger unloading based on related events
• Gate closed
• Time based
• …
bags_joined
check_in baggage_loaded
CREATE STREAM bag_join
AS SELECT
c.passenger,
c.bags
FROM gate_in g
LEFT JOIN check_in c
WITHIN 1 HOURS
ON c.boardingpass_id=g.boardingpass_id
LEFT JOIN baggage_loaded b
WITHIN 1 HOURS
ON b.bag_id = c.bag_id;
Thank You!

