Edge to AI: Analytics from Edge to Cloud with
Efficient Movement of Machine Data
TIMOTHY SPANN, Sr. SE, Cloudera
© Cloudera, Inc. All rights reserved.
DISCLAIMER
The information in this document is proprietary to Cloudera. No part of this document may be reproduced,
copied or transmitted in any form for any purpose without the express prior written permission of Cloudera.
This document is a preliminary version and not subject to your license agreement or any other agreement
with Cloudera. This document contains only intended strategies, developments and functionalities of
Cloudera products and is not intended to be binding upon Cloudera to any particular course of business,
product strategy and/or development. Please note that this document is subject to change and may be
changed by Cloudera at any time without notice.
Cloudera assumes no responsibility for errors or omissions in this document. Cloudera does not warrant
the accuracy or completeness of the information, text, graphics, links or other items contained within this
material. This document is provided without a warranty of any kind, either express or implied, including but
not limited to the implied warranties of merchantability, fitness for a particular purpose or non-infringement.
Cloudera shall have no liability for damages of any kind including without limitation direct, special, indirect
or consequential damages that may result from the use of these materials. The limitation shall not apply in
cases of gross negligence.
MACHINE LEARNING PHASES
Where to Connect to Apache NiFi
Speed of Data   Model Training       Model Scoring   Use Case
Batch           Batch                Batch           Batch Reporting, Analytics, Applications
                                     Online          DS Applications/Interactive Dashboards
Streaming                            In-stream       Streaming Applications
                Incremental/Online   In-stream       Streaming Applications
Training, Scoring and Monitoring
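The training and scoring modes named on this slide can be contrasted in a few lines. This is a minimal sketch, not code from the deck: the "model" is a hypothetical running mean, and every function name below is invented for illustration.

```python
from typing import Iterable, Iterator, List


def score(value: float, mean: float) -> float:
    """Hypothetical model: how far a reading is from the learned mean."""
    return value - mean


def train_batch(history: List[float]) -> float:
    """Batch training: fit once over the full, stored data set."""
    return sum(history) / len(history)


def score_batch(values: List[float], mean: float) -> List[float]:
    """Batch scoring: score a whole stored data set in one pass."""
    return [score(v, mean) for v in values]


def score_in_stream(values: Iterable[float], mean: float) -> Iterator[float]:
    """In-stream scoring: score each event as it arrives, model stays fixed."""
    for v in values:
        yield score(v, mean)


def train_and_score_online(values: Iterable[float]) -> Iterator[float]:
    """Incremental/online training: score each event, then update the model."""
    mean, count = 0.0, 0
    for v in values:
        yield score(v, mean)          # score with the model seen so far
        count += 1
        mean += (v - mean) / count    # incremental mean update


mean = train_batch([1.0, 2.0, 3.0])
print(score_batch([2.0, 4.0], mean))
print(list(score_in_stream(iter([2.0, 4.0]), mean)))
print(list(train_and_score_online([2.0, 4.0])))
```

The batch and in-stream scorers share one fixed model and differ only in when records are scored. The online variant is the only one whose model changes between events.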
INTRODUCING MODELS
Machine learning models as one-click microservices (REST APIs)
Model APIs made easy!
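1. Choose Python/R file, e.g. score.py
2. Choose function, e.g. forecast
       import pickle

       # Load the trained model once, when the file is first run
       with open('model.pk', 'rb') as f:
           model = pickle.load(f)

       def forecast(data):
           # Return the model's prediction for the given input
           return model.predict(data)
3. Choose resources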
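The load-once, score-per-call pattern in step 2 can be tried locally before deploying anything. The sketch below is an illustration under stated assumptions, not the deck's code: the pickled "model" is a hypothetical dictionary of coefficients standing in for a fitted estimator, the file is written to a temporary directory, and the argument is assumed to be a dictionary such as {"x": 3}.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: plain coefficients, so the
# sketch needs no ML library. A real model.pk would hold a fitted estimator.
coefficients = {"slope": 2.0, "intercept": 1.0}

model_path = os.path.join(tempfile.mkdtemp(), "model.pk")
with open(model_path, "wb") as f:
    pickle.dump(coefficients, f)

# Load once at file level, so every call to forecast() reuses the same object.
with open(model_path, "rb") as f:
    model = pickle.load(f)


def forecast(data):
    """Score one request; `data` is assumed to be a dict like {"x": 3}."""
    x = float(data["x"])
    return {"prediction": model["slope"] * x + model["intercept"]}


# Test the function with an argument before exposing it as a model.
result = forecast({"x": 3})
print(result)
```

Loading the pickle at file level rather than inside the function keeps the per-request cost to the prediction itself.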
CLOUDERA DATA SCIENCE WORKBENCH
Select a Project, Create a Session, Load Libraries and Data
Load a File and Run It
Install Python Libraries for Python 2 or Python 3
Test your function with an argument
Create a model from that file and function
List All the Models
Deploy the Model
Check Out the Build
Test the Model
Validate the Model Results
Monitor the Running Models
Invoke the Model from Apache NiFi in Flow
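Because a deployed model is exposed as a REST API, invoking it from a flow comes down to an HTTP POST with a JSON body. The sketch below shows roughly what such a call looks like in Python. Several things are assumptions rather than facts from the deck: the endpoint URL, the access key, the argument names, and the envelope of an "accessKey" field plus a "request" object. Take the real values and body shape from your own deployment. The request is built but not sent, so the sketch runs without a live server.

```python
import json
import urllib.request


def build_model_request(endpoint_url, access_key, arguments):
    """Build (but do not send) a POST request for a deployed model.

    The {"accessKey": ..., "request": ...} envelope is an assumption here.
    """
    body = json.dumps({"accessKey": access_key, "request": arguments})
    return urllib.request.Request(
        url=endpoint_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_model_request(
    endpoint_url="https://cdsw.example.com/model-endpoint",
    access_key="REPLACE_WITH_ACCESS_KEY",
    arguments={"url": "https://example.com/cat.jpg"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# Sending is left commented out because it needs a live deployment:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.loads(response.read()))
```

In a NiFi flow the same method, URL, content type and body would be configured on an HTTP-invoking processor instead of being written as code.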
Query Results of Classification in Flow
{ "class1": "cat", "cpu": 38.3, "end": "1549672761.1262221",
"host": "gluoncv-apache-mxnet-29-50-7fb5cfc5b9-sx6dg", "memory": 14.9,
"pct1": "98.15670800000001",
"shape": "(1, 3, 566, 512)", "systemtime": "02/09/2019 00:39:21",
"te": "3.380652666091919"
}
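Several numeric fields in the record above arrive as strings, so a consumer usually converts them before filtering or storing. The following is a minimal sketch of that step. The field meanings are assumptions inferred from the values (pct1 as a confidence percentage, end as a Unix timestamp, te as an elapsed time), and the 90.0 threshold is a hypothetical choice.

```python
import ast
import json
from datetime import datetime, timezone

# The classification record shown on the slide.
raw = """
{ "class1": "cat", "cpu": 38.3, "end": "1549672761.1262221",
  "host": "gluoncv-apache-mxnet-29-50-7fb5cfc5b9-sx6dg", "memory": 14.9,
  "pct1": "98.15670800000001",
  "shape": "(1, 3, 566, 512)", "systemtime": "02/09/2019 00:39:21",
  "te": "3.380652666091919"
}
"""

record = json.loads(raw)

label = record["class1"]
confidence = float(record["pct1"])         # assumed: top-1 confidence, in percent
te_value = float(record["te"])             # assumed: elapsed time for the call
shape = ast.literal_eval(record["shape"])  # safely parse the tuple literal
ended_at = datetime.fromtimestamp(float(record["end"]), tz=timezone.utc)

# Hypothetical routing rule: keep only confident classifications.
is_confident = confidence >= 90.0

print(label, confidence, shape, ended_at.isoformat(), is_confident)
```

Using ast.literal_eval rather than eval keeps the shape parsing limited to Python literals, which matters when the record comes from a network call.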
CLOUDERA DATA-IN-MOTION (APACHE NIFI)
References
● https://blog.cloudera.com/blog/2019/02/integrating-machine-learning-models-into-your-big-data-pipelines-in-real-time-with-no-coding/
● https://community.hortonworks.com/articles/239961/using-cloudera-data-science-workbench-with-apache.html
● https://community.hortonworks.com/content/kbentry/239858/integrating-machine-learning-models-into-your-big.html
● https://github.com/tspannhw/nifi-cdsw-gluoncv