Talk 1: Flink SQL For Continuous SQL/ETL/Apps
Talk 2: Apache NiFi DevOps
Timothy Spann - Principal DataFlow Field Engineer
25-March-2021
https://www.meetup.com/futureofdata-princeton/
@PaasDev
© 2021 Cloudera, Inc. All rights reserved. 2
Upcoming Events
● 7/April/2021 - CDF - Ask the Experts
● 15/April/2021 - Real-time Streaming Pipelines with FLaNK DataConLA
● 21/April/2021 - Emerging Tech Day
● 22/April/2021 - Demo Jam NiFi
● 27/April/2021 - Developer Week Europe
● 06/May/2021 - Continuous SQL with SQL Stream Builder
● https://www.meetup.com/pro/futureofdata/
● https://www.linkedin.com/pulse/2021-schedule-tim-spann/
● https://www.meetup.com/futureofdata-newyork
Welcome to Future of Data - Virtual - 25/March/2021
@PaasDev
https://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-newyork/
https://www.meetup.com/futureofdata-philadelphia/
From Big Data to AI to Streaming to Containers to
Cloud to Analytics to Cloud Storage to Fast Data to
Machine Learning to Microservices to ...
https://github.com/tspannhw
https://www.datainmotion.dev/
Simplifying the User Experience
Demo Scenario
[Diagram labels: REST Data, Collect, Transform, Scalable collection, Scalable transformation, Event Stream, Publish / Analyze, Alerting, Analyze, Cloud Data Stores, Cloudera Data Platform]
End to End Demo Pipeline
[Diagram labels: Enterprise sources, Clickstream, Market data, Machine logs, Social, Weather, Stocks, Errors, Aggregates, Alerts, ETL, Analytics, Streaming SQL]
Weather Streaming Pipeline
[Diagram labels: Sources, Weather, Climate, Pollution, Sensors, Aggregates, SQL, Analytics]
https://github.com/tspannhw/SmartStocks
https://www.datainmotion.dev/2021/01/flank-real-time-transit-information-for.html
Flink SQL Examples
INSERT INTO weathernj
SELECT CAST(`location` AS STRING) `location`,
       CAST(station_id AS STRING) station_id, latitude, longitude,
       CAST(observation_time AS STRING) observation_time,
       CAST(weather AS STRING) weather,
       CAST(temperature_string AS STRING) temperature_string,
       temp_f, temp_c, relative_humidity,
       CAST(wind_string AS STRING) wind_string,
       CAST(wind_dir AS STRING) wind_dir,
       wind_degrees, wind_mph, wind_kt, pressure_in,
       CAST(dewpoint_string AS STRING) dewpoint_string, dewpoint_f, dewpoint_c
FROM weather
WHERE `location` IS NOT NULL
  AND `location` <> 'null'
  AND TRIM(`location`) <> ''
  AND `location` LIKE '%NJ';
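To make the WHERE clause of the weathernj query easier to reason about, here is a minimal Python sketch that applies the same four checks to a single value. The function name `keeps_row` and the sample values are hypothetical; this only illustrates the filter logic and is not how Flink evaluates it.

```python
from typing import Optional


def keeps_row(location: Optional[str]) -> bool:
    """Mirror the four WHERE conditions of the weathernj query (illustrative only)."""
    if location is None:             # `location` is not null
        return False
    if location == "null":           # `location` <> 'null' (the literal string)
        return False
    if location.strip(" ") == "":    # trim(`location`) <> ''
        return False
    return location.endswith("NJ")   # `location` like '%NJ'


# Hypothetical sample values, not taken from the real weather feed.
samples = ["Newark, NJ", "Albany, NY", "null", "   ", None]
kept = [s for s in samples if keeps_row(s)]
print(kept)  # ['Newark, NJ']
```

Only values that end in NJ survive, which is why the target table is named weathernj.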
Flink SQL Examples
INSERT INTO global_sensor_events
SELECT scada.uuid, scada.systemtime,
       scada.temperaturef,
       scada.pressure, scada.humidity,
       scada.lux, scada.proximity,
       scada.oxidising,
       scada.reducing, scada.nh3,
       scada.gasko, energy.`current`,
       energy.voltage, energy.`power`,
       energy.`total`, energy.fanstatus
FROM energy, scada
WHERE scada.systemtime >= energy.systemtime;
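The join above has no equality key: every scada row is paired with every energy row whose systemtime is the same or earlier. A small Python sketch of that pairing on in-memory lists follows. The rows, the `voltage` values, and the timestamp format (fixed-width strings that sort correctly as text) are assumptions for illustration only.

```python
# Hypothetical sample rows; the real tables carry many more columns.
energy = [
    {"systemtime": "2021-03-25 10:00:00", "voltage": 120},
    {"systemtime": "2021-03-25 10:05:00", "voltage": 121},
]
scada = [
    {"uuid": "a", "systemtime": "2021-03-25 10:01:00"},
    {"uuid": "b", "systemtime": "2021-03-25 10:06:00"},
]


def join_sensor_events(energy_rows, scada_rows):
    """Cross join filtered by scada.systemtime >= energy.systemtime."""
    return [
        {"uuid": s["uuid"], "systemtime": s["systemtime"], "voltage": e["voltage"]}
        for e in energy_rows
        for s in scada_rows
        if s["systemtime"] >= e["systemtime"]
    ]


for row in join_sensor_events(energy, scada):
    print(row["uuid"], row["voltage"])
```

Two rows on each side already produce three output rows, so the result grows quickly. A regular join like this typically keeps both inputs in state in a streaming job, so it is worth bounding it with a time window or state retention where that fits the use case.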
Apache NiFi DevOps
https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#nifi_CLI
Apache NiFi DevOps
Make sure you check out Apache NiFi 1.13.2; it has some great new features.
https://www.datainmotion.dev/2021/02/new-features-of-apache-nifi-1130.html
Today we will discuss DevOps for NiFi using the NiFi CLI, the REST API, and NiPyAPI.
The toolkit also has some cool tools, such as diagnostics and threaddump.
https://www.datainmotion.dev/2019/11/nifi-toolkit-cli-for-nifi-110.html
Apache NiFi REST API
https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
{"clusterSummary":{"connectedNodes":"1 / 1","connectedNodeCount":1,"totalNodeCount":1,"connectedToCluster":true,"clustered":true}}
/nifi-api/flow/cluster/summary
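A cluster summary response like the one on the REST API slide is easy to check from a script. This is a minimal sketch that parses the JSON and reports whether every node is connected. The function name is hypothetical, and the sample payload is the one from the slide.

```python
import json


def cluster_is_fully_connected(payload: str) -> bool:
    """Return True when this node is connected and no cluster node is missing."""
    summary = json.loads(payload)["clusterSummary"]
    return bool(
        summary["connectedToCluster"]
        and summary["connectedNodeCount"] == summary["totalNodeCount"]
    )


# Response body copied from the slide (single-node cluster).
sample = (
    '{"clusterSummary":{"connectedNodes":"1 / 1","connectedNodeCount":1,'
    '"totalNodeCount":1,"connectedToCluster":true,"clustered":true}}'
)
print(cluster_is_fully_connected(sample))  # True
```

A check like this could feed an alert or gate a deployment step, depending on how the cluster is monitored.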
Apache NiFi REST API - Examples
http://localhost:8080/nifi-api/resources
http://localhost:8080/nifi-api/flow/cluster/summary
http://localhost:8080/nifi-api/flow/config
http://localhost:8080/nifi-api/flow/controller/bulletins
http://localhost:8080/nifi-api/flow/history/?offset=1&count=100
http://localhost:8080/nifi-api/flow/processor-types
http://localhost:8080/nifi-api/flow/reporting-tasks
http://localhost:8080/nifi-api/flow/status
nifi-api/system-diagnostics
nifi-api/flow/controller/bulletins
nifi-api/flow/process-groups/root
nifi-api/flow/process-groups/root/controller-services
nifi-api/flow/process-groups/root/status
nifi-api/flow/process-groups/7a01d441-0164-1000-ec7a-54109819f084
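The endpoints listed on this slide can be called from any HTTP client. Here is a minimal Python sketch using only the standard library. It assumes an unsecured NiFi at http://localhost:8080, as in the slide URLs; a secured cluster would need HTTPS and authentication, which this sketch does not cover. The helper names `nifi_url` and `get_json` are hypothetical.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"  # assumption: unsecured NiFi, as on the slide


def nifi_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an API path, with or without a leading slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """GET an endpoint and decode the JSON body (needs a running NiFi)."""
    with urlopen(nifi_url(path, base_url), timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no server; get_json() does.
print(nifi_url("nifi-api/system-diagnostics"))
```

With a running instance, `get_json("nifi-api/flow/status")` would return the decoded status document for further checks.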
Apache NiFi DevOps
Some of this is a bit tricky. I will show you what we can do with DataFlow Experience in a future meetup. Time to hit the command line.
cli.sh nifi pg-enable-services -u http://edge2ai-1.dim.local:8080 --processGroupId root
Apache NiFi CLI
nifi pg-list -u http://edge2ai-1.dim.local:8080
CLI Tips
You can run the CLI interactively or one command at a time.
For output:
● -ot simple
● -ot json
You can loop calls with shell scripts, DevOps tools, NiFi, and more.
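As one way to loop CLI calls from a script, here is a minimal Python sketch that builds the argument list for each command and runs it with `subprocess`. It assumes `cli.sh` is on the PATH and that NiFi is at http://localhost:8080; the helper names are hypothetical. Only flags that appear on these slides are used (`-u`, `-ot`).

```python
import subprocess
from typing import List, Optional


def build_cli_command(cli: str, group: str, command: str, base_url: str,
                      output_type: str = "json",
                      extra: Optional[List[str]] = None) -> List[str]:
    """Assemble one CLI invocation, e.g. cli.sh nifi pg-list -u <url> -ot json."""
    args = [cli, group, command, "-u", base_url, "-ot", output_type]
    args.extend(extra or [])
    return args


def run_cli(args: List[str], timeout: float = 60.0) -> str:
    """Run the command and return stdout; raises if the CLI exits non-zero."""
    result = subprocess.run(args, capture_output=True, text=True,
                            check=True, timeout=timeout)
    return result.stdout


# Print the commands that a loop would run (no NiFi needed for this part).
commands = [
    build_cli_command("cli.sh", "nifi", name, "http://localhost:8080")
    for name in ("cluster-summary", "pg-list")
]
for args in commands:
    print(" ".join(args))
```

Passing each list to `run_cli` executes it; with `-ot json` the output can then be parsed with `json.loads`.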
CLI Overall Commands
nifi cluster-summary -u http://localhost:8080
nifi get-services -u http://localhost:8080
nifi get-reporting-task --baseUrl http://edge2ai-1.dim.local:8080 -verbose --reportingTaskId 07914d9f-1ce3-1174-0000-000039db6547 -ot json
nifi get-reporting-tasks --baseUrl http://edge2ai-1.dim.local:8080 -verbose
https://github.com/tspannhw/CloudDemo2021
CLI Process Groups Commands
nifi pg-list -u http://localhost:8080
nifi pg-enable-services -u http://edge2ai-1.dim.local:8080 --processGroupId root
nifi pg-status -u http://localhost:8080 --processGroupId 6608ff51-89bb-3d66-4caf-f86ecaea950d
nifi pg-status -u http://localhost:8080 --processGroupId root
nifi pg-get-services -u http://localhost:8080 --processGroupId root
nifi pg-enable-services -u http://localhost:8080 --processGroupId root
nifi pg-start -u http://edge2ai-1.dim.local:8080 -pgid 2c1860b3-7f21-36f4-a0b8-b415c652fc62
CLI Parameter Contexts Commands
nifi list-param-contexts -u http://edge2ai-1.dim.local:8080 -verbose
nifi pg-set-param-context -u http://edge2ai-1.dim.local:8080 -verbose -pgid 2c1860b3-7f21-36f4-a0b8-b415c652fc62 -pcid 39f0f296-0177-1000-ffff-ffffdccb6d90
nifi export-param-context -u http://localhost:8080 -verbose --paramContextId 8067d863-016e-1000-f0f7-265210d3e7dc
nifi import-param-context -u http://localhost:8080 -i $f
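The `$f` in the import command suggests it runs inside a loop over exported files. A rough Python sketch of that loop follows: it builds one `import-param-context` command per file. The file names and the `cli.sh` path are hypothetical, and the commands are only printed here, not executed.

```python
from typing import Iterable, List


def import_commands(files: Iterable[str], base_url: str,
                    cli: str = "cli.sh") -> List[List[str]]:
    """Build one import-param-context command per exported JSON file."""
    return [
        [cli, "nifi", "import-param-context", "-u", base_url, "-i", name]
        for name in sorted(files)
    ]


# Hypothetical export files; in practice these come from export-param-context.
exported = ["weather-context.json", "iot-context.json"]
for args in import_commands(exported, "http://localhost:8080"):
    print(" ".join(args))
```

Each printed line matches the shape of the import command on the slide, with `$f` replaced by a concrete file name.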
CLI Registry Commands
registry list-buckets -u http://edge2ai-1.dim.local:18080 -verbose
registry create-flow -verbose -u http://server:18080 -b 250a5ae5-ced8-4f4e-8b3b-01eb9d47a0d9 --flowName iotFlow
registry import-flow-version -verbose -u http://server:18080 -f a5a4ac59-9aeb-416e-937f-e601ca8beba9 -i iot-1.json
registry list-flows -u http://server:18080 -b 250a5ae5-ced8-4f4e-8b3b-01eb9d47a0d9
registry list-flows -bucketId 36cb79a4-f735-4f77-ba55-606718a9c3c9 -u http://localhost:18080
registry export-flow-version -f 5ebc2183-954e-4887-a28c-9d0ee54a02ed -o rainbow.json -ot json
https://dzone.com/articles/devops-for-apache-nifi-17-and-more
Apache NiFi DevOps
● https://github.com/tspannhw/CloudDemo2021
● https://www.datainmotion.dev/2021/01/automating-starting-services-in-apache.html
● https://github.com/tspannhw/EverythingApacheNiFi
● https://github.com/tspannhw/BackupRegistry
● https://dev.to/tspannhw/backup-and-restore-nifi-registry-templates-14m
● https://dev.to/tspannhw/using-nifi-cli-to-restore-nifi-flows-from-backups-18p9
● https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#nifi_CLI
● https://nipyapi.readthedocs.io/en/latest/
● https://www.datainmotion.dev/2019/04/simple-apache-nifi-operations-dashboard.html
● https://www.datainmotion.dev/2020/09/devops-working-with-parameter-contexts.html
CLOUDERA DATAFLOW DATA-IN-MOTION PLATFORM
CLOUDERA FLOW AND EDGE MANAGEMENT
Enable easy ingestion, routing, management and delivery of any data anywhere (edge, cloud, data center) to any downstream system with built-in end-to-end security and provenance.
Advanced tooling to industrialize flow development (Flow Development Life Cycle)
ACQUIRE
• Over 300 Prebuilt Processors
• Easy to build your own
• Parse, Enrich & Apply Schema
• Filter, Split, Merge & Route
• Throttle & Backpressure
FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
PROCESS
HASH, MERGE, EXTRACT, DUPLICATE, SPLIT, ENCRYPT, TALL, EVALUATE, EXECUTE, GEOENRICH, SCAN, REPLACE, TRANSLATE, CONVERT, ROUTE TEXT, ROUTE CONTENT, ROUTE CONTEXT, ROUTE RATE, DISTRIBUTE LOAD
DELIVER
• Guaranteed Delivery
• Full data provenance from acquisition to delivery
• Diverse, Non-Traditional Sources
• Eco-system integration
FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
THANK YOU
