Cloud Tools Guidance
Authors: Timothy Spann, Michael Kohs, George Vetticaden
Date: 04/19/2023
Last Updated: 5/01/2023
Notice
This document assumes that you have registered for an account, activated it, and logged into
the CDP Sandbox. It is intended only for authorized users who have attended the webinar and
read the training materials.
A short guide and references are listed here.
THIS IS NOT FOR USE WITH THE FIRST TWO
TUTORIALS. THIS IS FOR BUILDING ASSETS FOR
YOUR OWN NEW FLOWS.
1. How To Build Data Assets
1.1 Create a Kafka Topic
1. Navigate to Data Hub Clusters
2. Navigate to the oss-kafka-demo cluster
3. Navigate to Streams Messaging Manager
Info: Streams Messaging Manager (SMM) is a tool for working with Apache Kafka.
4. You are now in SMM.
5. Click the round Topics icon (third from the top).
6. You are now in the Topic browser.
7. Click Add New to build a new topic.
8. Enter the name of your topic prefixed with your Workload User Name:
<yourusername>_yournewtopic, ex: tim_yournewtopic.
9. For settings, create the topic with 3 partitions, cleanup.policy: delete, and
availability: maximum.
Congratulations! You have built a new topic.
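If you want to test your new topic from code rather than SMM, below is a minimal Python sketch using kafka-python. The broker endpoint is a placeholder (copy the oss-kafka-demo broker addresses from the Data Hub page), and the SASL_SSL/PLAIN settings with your Workload User Name and password are assumptions to verify for your environment.

```python
# A minimal sketch of producing a test record to your new topic with kafka-python.
# The broker hostname is a placeholder and the SASL_SSL/PLAIN credentials are
# assumptions -- use your own CDP Workload User Name and password.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["<oss-kafka-demo-broker>:9093"],  # placeholder endpoint
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username="<workload_user>",
    sasl_plain_password="<workload_password>",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Topic name follows the guide's convention: <workload user name>_<topic>.
producer.send("tim_yournewtopic", {"hello": "sandbox"})
producer.flush()
```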
1.2 Create a Schema If You Need One (Not Required for Using Kafka Topics or the Tutorials)
1. Navigate to Schema Registry from the Kafka Data Hub.
2. You will see existing schemas.
3. Click the white plus sign in the gray hexagon to create a new schema.
4. You can now add a new schema by entering a unique name starting with your Workload
User Name (ex: tim), followed by a short description and then the schema text. If you
need examples, see the GitHub list at the end of this guide.
5. Click Save and you have a new schema. If there are errors, they will be displayed so
you can fix them. For more help, see Schema Registry Documentation and Schema
Registry Public Cloud.
Congratulations! You have built a new schema. Start using it in your DataFlow application.
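If you also want to work with the registry programmatically, the sketch below lists registered schemas over the Schema Registry REST API with the requests library. The base URL follows the https://<schema-registry-host>:7790/api/v1 pattern shown later in this guide; the /schemaregistry/schemas path, basic auth with workload credentials, and the response shape are assumptions to confirm against your registry.

```python
# A minimal sketch of listing schemas through the Schema Registry REST API.
# The host is a placeholder; the endpoint path, basic auth, and response layout
# are assumptions -- adjust to your environment.
import requests

base_url = "https://<schema-registry-host>:7790/api/v1"  # placeholder host

resp = requests.get(
    f"{base_url}/schemaregistry/schemas",
    auth=("<workload_user>", "<workload_password>"),  # assumption: basic auth accepted
    verify=True,
)
resp.raise_for_status()

# Assumed response shape: {"entities": [{"schemaMetadata": {"name": ...}, ...}]}
for entity in resp.json().get("entities", []):
    print(entity["schemaMetadata"]["name"])
```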
1.3 Create an Apache Iceberg Table
1. Navigate to oss-kudu-demo from the Data Hubs list
2. Navigate to Hue from the Kudu Data Hub.
3. Inside Hue you can now create your table.
4. Navigate to your database, which was created for you.
Info: The database name is derived from your email address: all special characters are
replaced with underscores and _db is appended. A Ranger policy is created to limit access
to your user and members of the admin group. For example, an address like
john.doe@example.com would become john_doe_example_com_db.
5. Create your Apache Iceberg table; it must be prefixed with your Workload User Name
(userid).
CREATE TABLE <<userid>>_syslog_critical_archive
(priority int, severity int, facility int, version int, event_timestamp bigint,
hostname string, body string, appName string, procid string, messageid string,
structureddata struct<sdid:struct<eventid:string,eventsource:string,iut:string>>)
STORED BY ICEBERG
6. Your table is created in s3a://oss-uat2/iceberg/
7. Once you have sent data to your table, you can query it.
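The documented path is to run your queries in Hue, but if you prefer to query the table programmatically, here is a minimal sketch using impyla. The coordinator hostname, port, HTTP path, and LDAP settings below are placeholders and assumptions; take the real Impala endpoint details from the oss-kudu-demo Data Hub before running.

```python
# A minimal sketch of querying the new Iceberg table with impyla instead of Hue.
# Host, port, http_path, and the LDAP auth settings are assumptions/placeholders.
from impala.dbapi import connect

conn = connect(
    host="<impala-coordinator-host>",   # placeholder
    port=443,                           # assumption: Impala exposed over HTTPS
    use_http_transport=True,
    http_path="cliservice",             # assumption: default hs2-http path
    use_ssl=True,
    auth_mechanism="LDAP",
    user="<workload_user>",
    password="<workload_password>",
)

cursor = conn.cursor()
# Replace <<userid>> with your Workload User Name, as in the CREATE TABLE above.
cursor.execute(
    "SELECT hostname, severity, body "
    "FROM <<userid>>_syslog_critical_archive "
    "ORDER BY event_timestamp DESC LIMIT 10"
)
for row in cursor.fetchall():
    print(row)
```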
Additional Documentation
● Create a Table
● Query a Table
● Apache Iceberg Table Properties
2. Streaming Data Sets Available for Apps
The following Kafka topics are being populated with streaming data for you.
These come from the read-only Kafka cluster.
Navigate to the Data Hub Clusters.
Click on oss-kafka-datagen.
Click Schema Registry.
Click Streams Messaging Manager.
Use these brokers to connect to them:
Brokers:
oss-kafka-datagen-corebroker1.oss-demo.qsm5-opic.cloudera.site:9093,
oss-kafka-datagen-corebroker0.oss-demo.qsm5-opic.cloudera.site:9093,
oss-kafka-datagen-corebroker2.oss-demo.qsm5-opic.cloudera.site:9093
Use this URL for the Schema Registry:
https://#{Schema2}:7790/api/v1
Schema Registry parameter hostname (Schema2):
oss-kafka-datagen-master0.oss-demo.qsm5-opic.cloudera.site
To view schemas in the Schema Registry, click the Schema Registry icon from the Data Hub:
https://oss-kafka-datagen-gateway.oss-demo.qsm5-opic.cloudera.site/oss-kafka-datagen/cdp-proxy/schema-registry/ui/#/
Schemas:
https://github.com/tspannhw/FLaNK-DataFlows/tree/main/schemas
Group ID: yourid_cdf
Customers (customer)
Example Row
{"first_name":"Charley","last_name":"Farrell","age":19,"city":"Sawaynside","country":"Guinea","em
ail":"keven.herzog@hotmail.com","phone_number":"312-269-6619"}
IP Tables (ip_address)
Example Row
{"source_ip":"216.25.204.241","dest_port":219,"tcp_flags_ack":0,"tcp_flags_reset":0,"ts":"2023-0
4-20 15:26:45.517"}
Orders (orders)
Example Row
{"order_id":84170282,"city":"Wintheiserton","street_address":"80206 Caroyln
Lakes","amount":29,"order_time":"2023-04-20 13:25:06.097","order_status":"DELIVERED"}
Plants (plant)
Example Row
{"plant_id":829,"city":"Lake Gerald","lat":"39.568679","lon":"-151.64497","country":"Eritrea"}
Sensors (sensor)
Example Row
{"sensor_id":264,"timestamp_of_production":"2023-04-20 18:28:42.751"}
Sensor Data (sensor_data)
Example Row
{"sensor_id":250,"timestamp_of_production":"2023-04-20 18:42:04.847","sensor_value":-72}
Weather (weather)
Example Row
{"city":"New Ernesto","temp_c":21,"description":"Sleet"}
Transactions (transactions)
Example Row
{"sender_id":40816,"receiver_id":96057,"amount":557,"execution_date":"2023-04-20
16:15:30.744","currency":"UYU"}
These are realistic generated data sources, available from read-only Kafka topics, that any
developer in the sandbox can consume.
Make sure you name your Kafka consumer <Workload User Name>_<some name>.
Ex: tim_customerdata_reader
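For example, a minimal kafka-python consumer for the customer topic might look like the sketch below. The broker list comes from the Brokers section above; the SASL_SSL/PLAIN settings with your workload user name and password are assumptions to verify for the sandbox.

```python
# A minimal sketch of consuming the read-only "customer" topic with kafka-python.
# Brokers are taken from this guide; the SASL_SSL/PLAIN credentials are assumptions --
# use your own CDP Workload User Name and password.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "customer",
    bootstrap_servers=[
        "oss-kafka-datagen-corebroker0.oss-demo.qsm5-opic.cloudera.site:9093",
        "oss-kafka-datagen-corebroker1.oss-demo.qsm5-opic.cloudera.site:9093",
        "oss-kafka-datagen-corebroker2.oss-demo.qsm5-opic.cloudera.site:9093",
    ],
    group_id="tim_customerdata_reader",   # prefix the group with your Workload User Name
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",               # assumption: PLAIN with workload credentials
    sasl_plain_username="<workload_user>",
    sasl_plain_password="<workload_password>",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    customer = message.value
    print(customer["first_name"], customer["city"])
```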
3. Bring Your Own Data (Public Only)
● Data is visible to and downloadable by everyone, so make sure it is safe, free, open, public data.
● Public REST feeds are a good choice (see the fetch sketch after this list)
○ Wikipedia:
https://docs.cloudera.com/dataflow/cloud/flow-designer-beginners-guide-readyflow/topics/cdf-flow-designer-getting-started-readyflow.html
○ https://gbfs.citibikenyc.com/gbfs/en/station_status.json
○ https://travel.state.gov/_res/rss/TAsTWs.xml
○ https://www.njtransit.com/rss/BusAdvisories_feed.xml
○ https://www.njtransit.com/rss/RailAdvisories_feed.xml
○ https://www.njtransit.com/rss/LightRailAdvisories_feed.xml
○ https://www.njtransit.com/rss/CustomerNotices_feed.xml
○ https://w1.weather.gov/xml/current_obs/all_xml.zip
○ https://dailymed.nlm.nih.gov/dailymed/services/v2/spls.json?page=1&pagesize=100
○ https://dailymed.nlm.nih.gov/dailymed/services/v2/drugnames.json?pagesize=100
○ https://dailymed.nlm.nih.gov/dailymed/rss.cfm
● Generic data files
○ https://aws.amazon.com/data-exchange
● Simulators
○ Use external data simulators via REST
○ Use GenerateFlowFile; see:
https://www.datainmotion.dev/2019/04/integration-testing-for-apache-nifi.html
● Schemas, Data Sources and Examples
○ https://github.com/tspannhw/FLaNK-AllTheStreams/
○ https://github.com/tspannhw/FLaNK-DataFlows
○ https://github.com/tspannhw/FLaNK-TravelAdvisory/
○ https://github.com/tspannhw/FLiP-Current22-LetsMonitorAllTheThings
○ https://github.com/tspannhw/create-nifi-kafka-flink-apps
○ https://www.datainmotion.dev/2021/01/flank-real-time-transit-information-for.html
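As a quick sanity check before wiring one of these feeds into a flow, you can pull it with a short Python sketch like the one below. It fetches the Citi Bike GBFS station_status.json feed listed above; the response layout follows the public GBFS spec.

```python
# A minimal sketch, using the requests library, that fetches one of the public REST
# feeds listed above (Citi Bike GBFS station status) so you can inspect the payload
# before pointing a NiFi/DataFlow processor such as InvokeHTTP at the same URL.
import requests

url = "https://gbfs.citibikenyc.com/gbfs/en/station_status.json"
resp = requests.get(url, timeout=30)
resp.raise_for_status()

payload = resp.json()
stations = payload["data"]["stations"]  # GBFS wraps station records under data.stations
print(f"{len(stations)} stations reported")
print(stations[0])  # one station record, e.g. bikes and docks available
```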