Looking at the New Features of Apache NiFi
Timothy Spann
Principal Developer Advocate
Sunday October 8, 2023
4:10 PM - 4:50 PM
Room 102
Slides, Code, Articles and More…
FLaNK Stack
Tim Spann
@PaasDev // Blog: www.datainmotion.dev
Principal Developer Advocate.
Princeton Future of Data Meetup.
ex-Pivotal, ex-Hortonworks, ex-StreamNative, ex-PwC
https://medium.com/@tspann
https://github.com/tspannhw
Apache NiFi x Apache Kafka x Apache Flink
© 2023 Cloudera, Inc. All rights reserved.
Future of Data - New York + Princeton + Virtual
@PaasDev
https://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-newyork/
From Big Data to AI to Streaming to Containers to
Cloud to Analytics to Cloud Storage to Fast Data to
Machine Learning to Microservices to ...
FLaNK Stack Weekly
This week in Apache NiFi, Apache Flink, Apache
Kafka, Apache Spark, Apache Iceberg, Python, Java,
AI, ML, LLM and Open Source friends.
https://bit.ly/32dAJft
My Talk List
Utilizing Real-Time Transit Data for Travel Optimization
Let’s Monitor the Conditions at the Conference
Agenda
Apache NiFi has a lot of new features, processors and best practices that have arrived
in the last year or so.
I will walk through building flows using the latest tips, techniques and processors.
I will change a number of data flows utilizing the latest NiFi version and point out
gotchas and some never-dos. The deck will act as a take-away with notes, tips and
guides to what we covered.
===> Any NiFi 1.23+ or in-progress 2.0 features people want to see?
Records
New Excel record reader (ExcelReader)
AmazonGlueSchemaRegistry
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12353320
New to 2023 Processors
GenerateRecord
GetAsanaObject
PutSalesforceObject
QuerySalesforceObject
PutIoTDBRecord
QueryIoTDBRecord
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316020&version=12353320
ListGoogleDrive
FetchGoogleDrive
PutGoogleDrive
PutBoxFile
ListBoxFile
FetchBoxFile
PutDropbox
DecryptContent
DecryptContentCompatibility
New to 2023 Processors
ExtractRecordSchema
RemoveRecordField
VerifyContentMAC
TriggerHiveMetaStoreEvent
“count” function added to RecordPath
AWS ML Service Processors
https://github.com/tspannhw/FLaNK-AWSML
AWS Translate
Deprecating for Removal
Deprecate Lua and Ruby Script Engines
Deprecate ECMAScript Script Engine
Deprecate the Ambari Reporting Task
Deprecate Kafka 1.x components and 2.0 components
XML Templates
Variables
See:
https://cwiki.apache.org/confluence/display/NIFI/Deprecated+Components+and+Features
Start Using
ExecuteStateless -> run your stateless flows right in a regular NiFi cluster
Parameters
JSON Flow Serialization
Records everywhere
https://medium.com/cloudera-inc/getting-ready-for-apache-nifi-2-0-5a5e6a67f450
NiFi 2.0 Coming
● Python Integration
● Parameters
● JDK 17, maybe JDK 21+
● JSON Flow Serialization
● Rules Engine for Development Assistance
● Run Process Group as Stateless
● flow.json.gz
https://cwiki.apache.org/confluence/display/NIFI/NiFi+2.0+Release+Goals
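Because the flow is serialized as gzipped JSON (flow.json.gz), it can be inspected with ordinary tooling. The following is a minimal sketch, not an official NiFi utility. It assumes the root process group is stored under a "rootGroup" key and that each group carries "processors" and "processGroups" lists; those key names, and the function name count_processor_types, are assumptions to verify against your own file.

```python
import gzip
import json
from collections import Counter


def count_processor_types(flow_path):
    """Count processors by type across all nested process groups."""
    with gzip.open(flow_path, "rt", encoding="utf-8") as handle:
        flow = json.load(handle)

    counts = Counter()
    # Assumption: the root process group sits under "rootGroup".
    pending = [flow.get("rootGroup", {})]
    while pending:
        group = pending.pop()
        # Assumption: each group lists its processors and child groups
        # under "processors" and "processGroups".
        for processor in group.get("processors", []):
            counts[processor.get("type", "unknown")] += 1
        pending.extend(group.get("processGroups", []))
    return counts
```

Pointing this at a copy of the flow file, for example count_processor_types("conf/flow.json.gz"), gives a quick inventory of which processor types a flow uses, which can help when checking for deprecated components before a 2.0 upgrade.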
Thanks to Pierre!
Python as First Class (NIFI-11241)
Graphical UI with custom Python based extensions
New in NiFi 2.0
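To give a feel for what a custom Python extension might look like, here is a minimal sketch based on the FlowFileTransform style of the NiFi 2.0 Python API. The processor name AddWordCount and its behavior are hypothetical, and the fallback classes in the except branch are stand-ins added only so the sketch can be run outside NiFi; they are not part of the real API. Details may differ between 2.0 milestone releases, so check the Python developer guide for your version.

```python
import json

try:
    # Provided by the NiFi 2.0 Python extension runtime.
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
except ImportError:
    # Stand-ins (assumption, not the real API) so the sketch runs outside NiFi.
    class FlowFileTransform:
        pass

    class FlowFileTransformResult:
        def __init__(self, relationship, attributes=None, contents=None):
            self.relationship = relationship
            self.attributes = attributes
            self.contents = contents


class AddWordCount(FlowFileTransform):
    """Hypothetical processor: wraps text content in JSON with a word count."""

    class Java:
        implements = ["org.apache.nifi.python.processor.FlowFileTransform"]

    class ProcessorDetails:
        version = "0.0.1-SNAPSHOT"
        description = "Wraps incoming text in a JSON document with a word count."
        tags = ["example", "python"]

    def __init__(self, **kwargs):
        pass

    def transform(self, context, flowfile):
        # Read the FlowFile content, build a small JSON payload, route to success.
        text = flowfile.getContentsAsBytes().decode("utf-8")
        payload = {"text": text, "word_count": len(text.split())}
        return FlowFileTransformResult(
            relationship="success",
            contents=json.dumps(payload),
            attributes={"mime.type": "application/json"},
        )
```

Inside NiFi, a file like this would be placed in the configured Python extensions directory and the processor would then be available on the canvas like any other; outside NiFi, the transform method can be exercised with a small fake FlowFile object.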
Apache NiFi in a few numbers
A very active project with a dynamic community & comparison with ACEU 2019
2800+ members on the Slack channel (535+ - 4 years ago)
475+ contributors on GitHub across the repositories (260+ - 4 years ago)
65 committers in the Apache NiFi community (45 - 4 years ago)
Apache NiFi 1.23.2 is the latest release, NiFi 2.0 coming soon (NiFi 1.10 - 4 years ago)
14M+ Docker pulls of the Apache NiFi image (1M+ - 4 years ago)
NiFi Deploy Options from Open Source to Managed
Cloudera Edge Flow Manager (Command & Control of MiNiFi Agents)
MiNiFi C++ (small footprint)
MiNiFi Java (headless version of NiFi)
NiFi Registry
Cloudera NiFi for Kafka Connect
NiFi in Cloudera DataFlow Functions
Cloudera DataFlow
Stateless NiFi
NiFi 2.0 is coming…
https://medium.com/cloudera-inc/getting-ready-for-apache-nifi-2-0-5a5e6a67f450
- First-class citizen Python API
- Rules Engine
- NiFi Stateless at Process Group level
- Java 21 (virtual threads, perf improvements, etc)
https://medium.com/@george.vetticaden/accelerating-ai-data-pipelines-building-an-evernote-chatbot-with-apache-nifi-2-0-and-generative-ai-9d977466ff4c
Closing the gap between data engineers and data scientists…
- Export documentation (SharePoint, OCR) to build the knowledge base powering your chatbot
- Scrape the internet (Sitemap) to build the knowledge base powering your chatbot
- Real-time streaming ingest of Slack to build the knowledge base powering your chatbot
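As a rough illustration of the sitemap step above, the sketch below extracts page URLs from a sitemap document so they could be fetched and added to a knowledge base. It is plain standard-library Python, not the flow from the linked article; the function name sitemap_urls is hypothetical, and the only outside detail relied on is the standard sitemaps.org XML namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL found in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

In a NiFi flow, logic like this could live in a custom Python processor that receives the sitemap as FlowFile content and emits the URLs for downstream fetching.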
DEMO
THANK YOU
