Hortonworks Industrial Data Platform
IIoT & Predictive Analytics in the Energy
Industry
Kenneth Smith – General Manager, Energy
@KennethSmith99
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Internet of Things Data Sources
What generates data?
• User Generated Content (Web & Mobile)
– Twitter, Facebook, Snapchat, YouTube
– Clickstream, Ads, User Engagement
– Payments: PayPal, Venmo
• Internet of Anything (IoAT)
– Wind Turbines, Oil Rigs, Cars
– Weather Stations, Smart Grids
– RFID Tags, Beacons, Wearables
Industrial IoT Market Opportunity Estimates
“In other words, the industrial internet will be worth more than twice the consumer internet”
https://www.forbes.com/sites/louiscolumbus/2016/11/27/roundup-of-internet-of-things-forecasts-and-market-estimates-2016/#ad68a67292d5
Why Hortonworks for IIoT?
• Technology to deliver the only end-to-end OPEN SOURCE IoT data platform for “industrials”.
• It’s not just about time-series data; it’s the ability to collect, manage, and analyze all pertinent structured & unstructured data sets related to an industrial asset, operation, process, piece of equipment, etc., in addition to time-series.
• Open Connected Data Platforms enable OT/IT/ET convergence to build descriptive, predictive, & prescriptive applications.
• An open source IIoT platform allows operators to maintain control over their data and analytics vs. a “closed” OEM’s IIoT product telling them when their own equipment needs replacing.
• An open IIoT platform is applicable across all asset-intensive industries with “moving metal”: oil & gas, utilities, mining, manufacturing, automotive, transportation, agriculture, etc.
• “Data is not a competitive advantage. It’s the algorithms you build to analyze your data that will differentiate you from your competitors.”
Is the Energy Industry Ready to Embrace an Open Model?
http://www.lockheedmartin.com/us/news/press-releases/2016/january/160114-mst-us-exxonmobil-awards-lockheed-martin-next-generation-refining-and-chemical-facility-automation-system-contract.html
ExxonMobil representatives express frustration when observing step change improvements in adjacent industries enabled by open technologies. Those adjacent industries have deployed significantly higher function software that has lowered lifecycle cost and delivered higher return on investment.
The explosive growth of technologies driven by the Internet of Things (IoT), including cloud computing, mobile computing, embedded computing, and consumer electronics, makes it obvious that the mainstream industrial automation industry can deliver more value with the adoption of an open, multi-vendor platform approach.
http://www.automation.com/automation-news/article/exxonmobil-to-build-next-generation-multi-vendor-automation-architecture
Industrial Processes Create Large Amounts of Data
Always-on, always-connected devices generate a constant stream of data related to the operations of industrial businesses.
These datasets contain:
• What happened?
• Why something happened or not?
• Quantification of events
These datasets go by many names:
• “SCADA Data”
• “Control System Data”
• “Historian Data”
• “Machine Data”
• “Measurement Logs”
How are my …
People?
Processes?
Equipment?
Why Collect this Data with Hadoop?
• Scale and Flexibility
– Linear & low cost scale out of storage & processing
– Bring compute to data
– Multi-tenant environment that allows multiple modes of simultaneous ingestion & interaction
• Increased value of data
– Reduce the friction of data access: “everything is accessed in one place”
– Simplified or new analytic applications
• Democratize data
– Single point of access
– Simplified access & security controls
[Diagram labels. Data system sources: SCADA, ERP, EPM. Platform layers: Governance & Integration, Security, Operations, Data Access, Data Management. Applications: Business Analytics, Advanced Process Control, Operations Planning Suites. Image source: "© Siemens AG 2015, All rights reserved"]
My Industrial Dataset is in Hadoop! Now what?
Uses for Industrial Datasets
• Condition Based Monitoring
• Single View of an Asset
• Dashboards & Mobile Applications
• Statistics & Predictive Analytics
• Event Based Surveillance
• Remote Operations Support
Hortonworks Industrial Data Analytics Platform – In Practice
[Architecture diagram: Field Data Capture feeding the Office or Datacenter]
Data sources (Structured / Unstructured Data Sets)
• OPC UA/DA, WITSML
• Video, Audio
• Commodity Market
• Weather, Environmental
• Social Media
• IoT, Machine Data, Historians
Field Data Capture
• Location 1: NiFi
• Location n: NiFi
Data Center (Data Ingestion Framework)
• Central HDF Cluster: NiFi, Kafka, Storm (Streaming Options)
• Central HDP Cluster: Hive, HBase, Solr, YARN, HDFS
• End users
DATA IN MOTION – HDF
HDF Edge (MiNiFi + NiFi)
• Reliable collection
• Small footprint
• Edge processing
• Data provenance
• Integrates with core policies
HDF Core (NiFi with Streaming)
• Processing at larger scale
• Distributed stream processing
DATA AT REST – HDP
HDP
• Security and data governance
• Monitoring, management, operations
• Applications
• Analytics
Hortonworks Connected Data Platforms in Energy & Utilities
[Figure: smart grid diagram with callouts: Predictive Maintenance; Fraud Detection; External Sources (Weather, Social Media, GPS, etc.); Single View of Customer]
Source: https://www.cm-collaborative-tech.com/wp-content/uploads/2016/11/Smart-grid-A-1.jpg
Time Series Analytics for Power Generation Anomaly Detection
• Two-week engagement with no direct knowledge of existing systems
• Within two days, the team narrowed the problem from 5,000 potential causes to 19 using standard data science algorithms
• The company investigated the findings and found a valve installed backwards, which was causing the plant to shut down
• The plant failure hasn’t occurred since, saving millions of dollars in unplanned shutdowns
• VP of Engineering: “I never thought we would see a solution like this”
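The slide does not say which algorithms were used. Purely as an illustration, one common way to narrow thousands of candidate sensor tags to a short list is to score each tag by how far its readings before a shutdown drift from its normal baseline, then keep the highest scores. The tag names, readings, and the `rank_suspect_tags` helper below are all hypothetical.

```python
from statistics import mean, stdev

def rank_suspect_tags(baseline, pre_failure, top_n=3):
    """Rank tags by how far pre-failure readings drift from baseline.

    baseline / pre_failure: dict mapping tag name -> list of readings.
    Returns the top_n (tag, score) pairs, highest score first.
    """
    scores = {}
    for tag, normal in baseline.items():
        mu = mean(normal)
        sigma = stdev(normal)
        if sigma == 0:
            continue  # constant tag: a z-score is undefined, skip it
        recent = pre_failure.get(tag, [])
        if not recent:
            continue  # no pre-failure data for this tag
        # Absolute z-score of the recent mean relative to the baseline
        scores[tag] = abs(mean(recent) - mu) / sigma
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]

# Hypothetical readings for three tags
baseline = {
    "valve_17_flow": [10.0, 10.2, 9.8, 10.1, 9.9],
    "pump_3_temp": [70.0, 71.0, 69.5, 70.5, 70.2],
    "fan_2_rpm": [1200.0, 1195.0, 1205.0, 1198.0, 1202.0],
}
pre_failure = {
    "valve_17_flow": [4.0, 3.8, 4.1],
    "pump_3_temp": [70.4, 70.1, 70.6],
    "fan_2_rpm": [1201.0, 1199.0, 1203.0],
}
suspects = rank_suspect_tags(baseline, pre_failure, top_n=2)
```

A ranking like this only produces candidates; as on the slide, engineers still have to investigate the short list to find the physical cause.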
Vertically Integrated Utility’s Data Journey
Accelerating Revenue Protection with an Open Analytics Platform
• One of the largest electric power holding companies in the US, supplying electricity to approximately 7.4 million customers and operating natural gas distribution services for more than 1.5 million customers.
• Revenue Protection Use Case: Protect revenue from theft, malfunctioning meters, and misconfigured meters.
• Why HDP: The only cost-effective platform able to do parallel / multi-node analytics on large data sets.
• Currently loaded: 200 billion rows of meter data across 80 nodes of HDP, growing to 1.4 trillion rows by 2020 from all of their service areas.
• Previous energy theft data science process: A predictive model was run on a laptop once per week for 10K accounts at a time and produced 100 leads weekly for investigation. At that rate, it would have taken 6 months to process one state’s data (all states/enterprise data would take much longer).
• Current process: With HDP, the run time to analyze one state’s data has been reduced from 6 months to less than an hour, producing theft leads from the entire data set in minutes.
• Expected realized business value from the Revenue Protection use case: tens of millions of dollars by 2020.
• Other use cases include predictive equipment maintenance on time-series data ingested from OSIsoft PI and a “Next Best Action” program for cross-selling opportunities on goods and services.
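The slide does not describe the predictive model itself. As a rough sketch of what a revenue-protection lead can look like, the example below flags meters whose recent average daily usage has dropped sharply against their own history, which may indicate theft, a malfunction, or a misconfiguration. The `theft_leads` function, the 50% threshold, and the meter data are assumptions made for illustration.

```python
def theft_leads(usage_by_meter, recent_days=7, drop_ratio=0.5):
    """Return meter IDs whose recent average daily usage fell below
    drop_ratio times their earlier average (a lead, not a verdict)."""
    leads = []
    for meter_id, daily_kwh in usage_by_meter.items():
        if len(daily_kwh) <= recent_days:
            continue  # not enough history to compare against
        history = daily_kwh[:-recent_days]
        recent = daily_kwh[-recent_days:]
        hist_avg = sum(history) / len(history)
        recent_avg = sum(recent) / len(recent)
        if hist_avg > 0 and recent_avg < drop_ratio * hist_avg:
            leads.append(meter_id)
    return sorted(leads)

# Hypothetical daily kWh readings per meter
usage = {
    "meter_A": [30.0] * 21 + [29.0] * 7,  # stable usage
    "meter_B": [28.0] * 21 + [9.0] * 7,   # sharp drop in the last week
    "meter_C": [25.0] * 5,                # too little history
}
leads = theft_leads(usage)
```

Because each meter is scored independently, a check of this shape can be split across many nodes, which fits the slide's point about parallel, multi-node analytics on the full data set.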
Utility Big Data Journey – Crawl, Walk, Run
Arrears & Credit Collections*
• Better identify customers before they go into arrears so that payment plans can start
Revenue Protection*
• More quickly identify theft, malfunctioning meters, and misconfigured meters across the entire customer base with HDP
• Estimated business value: millions of dollars in previously unrecognized revenue
360 Degree View of Customer*
• Aggregate customer data across the enterprise: usage, billing, profile info, surveys, call center logs, order history, social media sentiment, etc.
• Develop customer segmentation models & KPIs to improve customer service, reduce call center volumes/times, feed next best action programs, etc.
Predictive Maintenance*
• Ingest time-series data from control systems and previous maintenance records to identify patterns in malfunctioning equipment
• Shift from time-based maintenance to condition-based maintenance to prioritize and optimize maintenance resources and operations
Outage Detection & Prevention*
• Identify outages in real time, notify customers of outages, and reduce time to resolution
• Better forecasting models = lower service costs, reduced truck rolls, increased revenue, and higher customer satisfaction
• Start Small, Think Big
• Improve Top-line and bottom-line revenue
• Develop In-house talent
*AMI data is the foundation for both T&D and customer-focused use-cases.
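To make the Predictive Maintenance step concrete, here is a minimal sketch contrasting a time-based policy with a condition-based one. It assumes a single condition indicator (vibration) with an alarm limit per asset; the `Asset` fields, thresholds, and fleet data are hypothetical and not taken from the deck.

```python
from dataclasses import dataclass

@dataclass
class Asset:
    name: str
    days_since_service: int
    vibration_mm_s: float   # hypothetical condition indicator
    vibration_limit: float  # hypothetical alarm threshold

def time_based_due(assets, interval_days=180):
    """Time-based policy: service anything past a fixed interval."""
    return [a.name for a in assets if a.days_since_service >= interval_days]

def condition_based_queue(assets, warn_fraction=0.8):
    """Condition-based policy: service assets whose indicator is near or
    above its limit, most severe first."""
    at_risk = [a for a in assets
               if a.vibration_mm_s >= warn_fraction * a.vibration_limit]
    at_risk.sort(key=lambda a: a.vibration_mm_s / a.vibration_limit,
                 reverse=True)
    return [a.name for a in at_risk]

fleet = [
    Asset("transformer_1", 200, 2.0, 7.0),  # overdue by calendar, healthy
    Asset("pump_7", 40, 6.5, 7.0),          # recently serviced, degrading
    Asset("breaker_4", 90, 7.4, 7.0),       # already over its limit
]
due_by_calendar = time_based_due(fleet)
due_by_condition = condition_based_queue(fleet)
```

In this toy fleet the calendar policy would send a crew to the one healthy asset, while the condition-based queue puts the two degrading assets first.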
Manufacturing Data Lake for Global Operations
Capabilities
• Capture new and break down existing operational data silos
• Democratize data access to a wider audience
• Flexible architecture to incorporate the latest Apache open source/3rd party/customer innovations
• Foster community
“Not an ops historian but an enterprise historian of ALL PROCESS DATA”
Design
• Embedded analytics and visualizations
• Embedded open source graphical data ingest
• Proven at scale – 1M tags / minute
• Comply with existing security, governance and operations
• Built for extension
Develop: Using HDP and HDF for Industrial IoT
Hortonworks Customer
• Schlumberger Drilling Technology
Application Area
• Real-Time Drilling Data Delivery – WITSML
A Few Requirements
• Provenance – knowing where the data came from is crucial (and often missing) to real-time decision making, especially when dealing with 3,000 wells per month
• Visualization – the ability to visualize the data flow at a granular level aids in troubleshooting and operational understanding
• Reduced overhead leveraging NiFi vs. the previously built custom-coded solution
http://www.slideshare.net/HadoopSummit/from-zero-to-data-flow-in-hours-with-apache-nifi-64032731
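The provenance requirement above can be pictured as a lineage trail that travels with each record. The sketch below is not NiFi's API; it is a hypothetical, standard-library illustration in which every processing step appends an event (step name, source, content hash, timestamp) to the record's lineage.

```python
import hashlib
import json
from datetime import datetime, timezone

def with_provenance(payload, source, step, lineage=None):
    """Wrap a payload with an append-only list of provenance events."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    event = {
        "step": step,
        "source": source,
        "sha256": hashlib.sha256(body).hexdigest(),  # fingerprint of content
        "at": datetime.now(timezone.utc).isoformat(),
    }
    # Copy the earlier lineage so previous records are never mutated
    return {"payload": payload, "lineage": list(lineage or []) + [event]}

# Hypothetical drilling reading passing through two steps
raw = with_provenance({"well": "W-001", "depth_m": 1520.5},
                      "rig-sensor", "ingest")
converted = dict(raw["payload"],
                 depth_ft=round(raw["payload"]["depth_m"] * 3.28084, 1))
out = with_provenance(converted, "edge-gateway", "unit-conversion",
                      raw["lineage"])
```

Reading `out["lineage"]` from first to last entry answers where a value came from and which steps changed it, which is the question the slide says is often left unanswered.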
Real-time Remote Surveillance
Requirement – A New Business Model:
• Fluid and flexible data platforms that can quickly integrate raw data and deliver actionable intelligence to people and processes
• Ability to operate when network connectivity with a data center or the shore is intermittent, high-latency, and low-bandwidth
• Analyze large volumes of data and avoid data being stranded and out of reach for analysts and support teams
• Move away from an operations posture of reacting and suffering from unnecessary downtime, equipment failures, efficiency losses, and safety risks
• Increase the collective expertise available to support safer and more efficient operations
Solution and Outcomes – New Sources of Value:
• HDF aggregates, prioritizes, compresses, and encrypts control system data before sending it over a 64 KB/sec satellite link to the data center in real time
• Data from top drives, BOPs, and other equipment is in HDP, and every data consumer, from data scientists to BI users, can be served from their tool of choice
• Key data consumption patterns enabled include KPI dashboards, condition-based monitoring and maintenance, event-based surveillance, and traditional BI reporting, ensuring safer, more efficient operations
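The aggregate, prioritize, and compress steps described above can be sketched with the Python standard library. This is an illustration of the order of operations, not how HDF implements them. Encryption is left out because the standard library has no general-purpose cipher, the tag names and priority ranking are hypothetical, and "64 KB/sec" is read here as kilobytes per second.

```python
import json
import zlib

LINK_BYTES_PER_SEC = 64 * 1024  # assumption: 64 kilobytes per second

def aggregate(readings):
    """Collapse raw (tag, value) readings into one mean value per tag."""
    sums = {}
    for tag, value in readings:
        total, count = sums.get(tag, (0.0, 0))
        sums[tag] = (total + value, count + 1)
    return {tag: total / count for tag, (total, count) in sums.items()}

def prepare_batch(readings, priority):
    """Aggregate, order by priority (lower number first), then compress."""
    summary = aggregate(readings)
    ordered = sorted(summary.items(), key=lambda kv: priority.get(kv[0], 99))
    raw = json.dumps(ordered).encode("utf-8")
    packed = zlib.compress(raw, 9)
    return ordered, raw, packed

# Hypothetical readings: 300 samples each from two pieces of equipment
readings = [("top_drive_rpm", 120.0 + i % 5) for i in range(300)]
readings += [("bop_pressure", 5000.0 + i % 3) for i in range(300)]
priority = {"bop_pressure": 0, "top_drive_rpm": 1}  # hypothetical ranking

ordered, raw, packed = prepare_batch(readings, priority)
seconds_on_link = len(packed) / LINK_BYTES_PER_SEC
```

Aggregation does most of the work here, reducing 600 readings to two values before anything is sent. For payloads this small, compression may not shrink the data further; it matters more for larger batches.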
Open source
Open source is a way to enable a group of collaborative people to further their individual interests while contributing back to the community for the common good.
Hortonworks Open Connected Data Platforms for IoT and Predictive Big Data Analytics for the Energy Industry
