Bigdataeverywhere 2016 Alex Bates, CTO, Mtell
Operationalizing Apache Spark for the IoT
Operationalizing the IoT with Mtell and Apache Spark
Smart Machine
https://vimeo.com/145543808
The Collection of:
Sensors
Sensor Networks
Smart Machines
Computer Power
Analytics
People
One element of the Industrial Internet of Things
Solving problems that were previously unsolvable
The Internet of Things
Network of physical objects or “things” embedded with electronics, software, sensors, and network connectivity, which enables these objects to collect and exchange data.
4 billion connected people
25+ million apps
25+ billion devices
50 trillion GB of data
Use Cases
remote monitoring: SMEs manage many machines
condition monitoring: detect problems early, while they are small
share learning: learn on one, transfer to many
maintenance
sensor streams
leverages existing infrastructure: plant historian sensor data streams and EAM system
connect, import, combine, analyze, present
prescriptive action
Information Flow
Levels of Data
Library: Failure Signatures, PM Repository, Failure Code Hierarchy
Benchmark Statistics: Failure Rates, MTBF/MTTF/MTBR/MTBM, Average Repair Cost, Performance Data
Equipment Metadata: Equipment Type, ISO 14224 Structure, Sensor Templates
Raw Data: Sensor Data, Maintenance Work Orders, Operational Events, Crowdsourced Info
global equipment knowledge base
Machine Genome Project
WAN
Evolution to Tier 2
IoT and Big Data
Comet 67P vs Los Angeles
Scale
Data Points/Yr
Single rig: 315 B
100 rigs: 31.5 T
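The rig figures above are consistent with a sampling rate of roughly 10,000 data points per second per rig. That rate is inferred from the numbers and is not stated on the slide. A quick arithmetic check, as a sketch under that assumption:

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds in a non-leap year

# Assumption: rate inferred from the slide's 315 B points/yr figure.
POINTS_PER_SECOND_PER_RIG = 10_000


def points_per_year(rigs: int, rate: int = POINTS_PER_SECOND_PER_RIG) -> int:
    """Total data points generated per year by `rigs` rigs at `rate` points/sec."""
    return rigs * rate * SECONDS_PER_YEAR


single = points_per_year(1)
fleet = points_per_year(100)
print(f"single rig: {single / 1e9:.1f} B points/yr")   # 315.4 B
print(f"100 rigs:   {fleet / 1e12:.1f} T points/yr")   # 31.5 T
```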
Mtell and OpenTSDB
Optimized for Time Series Data Access
Querying sensor data by date range plus other filters
MapR OpenTSDB Optimization
• Blob Ingestion – 100x faster
– Instead of inserting each point, buffer data in memory and insert a blob containing the batch
– Move the blob maker upstream of insertion into the storage tier
– Fewer writes to disk (once per blob instead of once per point) and reduced data size (the blob compresses raw data)
– A 10-node MapR cluster achieved 100 million points/sec ingestion (10 million points/sec/node)
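The blob idea can be illustrated with a small sketch. This is not the MapR or OpenTSDB implementation. The `BlobBuffer` class, the `write_blob` hook, and the 16-byte point encoding are all hypothetical, chosen only to show buffering in memory and writing one compressed blob per batch.

```python
import struct
import zlib
from typing import Callable, List, Tuple

Point = Tuple[int, float]  # (unix timestamp in seconds, sensor value)


class BlobBuffer:
    """Buffers points in memory and writes each batch as one compressed blob."""

    def __init__(self, write_blob: Callable[[bytes], None], max_points: int = 1000):
        self._write_blob = write_blob   # hypothetical storage-tier write hook
        self._max_points = max_points
        self._points: List[Point] = []

    def add(self, timestamp: int, value: float) -> None:
        self._points.append((timestamp, value))
        if len(self._points) >= self._max_points:
            self.flush()

    def flush(self) -> None:
        if not self._points:
            return
        # Pack each point as little-endian int64 + float64 (16 bytes), then compress.
        raw = b"".join(struct.pack("<qd", ts, v) for ts, v in self._points)
        self._write_blob(zlib.compress(raw))
        self._points.clear()


def decode_blob(blob: bytes) -> List[Point]:
    """Inverse of the encoding above: decompress and unpack 16-byte records."""
    raw = zlib.decompress(blob)
    return [struct.unpack_from("<qd", raw, off) for off in range(0, len(raw), 16)]


writes: List[bytes] = []                      # stands in for the storage tier
buf = BlobBuffer(writes.append, max_points=1000)
for i in range(2500):
    buf.add(1_450_000_000 + i, 20.0 + (i % 10) * 0.1)
buf.flush()                                   # write the final partial batch
print(len(writes))                            # 3 blob writes instead of 2500 point writes
```

The batch size trades write amplification against how much data is at risk in memory if the process dies before a flush.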
Analytics in Maintenance
DESCRIPTIVE: What has happened?
PREDICTIVE: What will happen?
PRESCRIPTIVE: What should we do?
Explanation from Gartner
Types of Agents
Failure Agent: learns a precise, specific failure signature and performs live monitoring, providing early warnings of recurrences
Anomaly Agent: learns baseline normal and performs live monitoring to expose abnormal operations; updates as conditions change
Hidden Failure Agent: finds unrecorded failure patterns in training data and excludes suspect data from baseline normal conditions
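The slides do not say how Mtell's agents are implemented. As a rough illustration of the two monitoring roles only, the sketch below swaps in two simple techniques: a per-sensor z-score baseline for the anomaly agent and a Euclidean-distance template match for the failure agent. The class names, thresholds, and sample values are all hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev
from typing import List, Sequence

Reading = Sequence[float]  # one value per sensor at a single time step


class AnomalyAgent:
    """Learns a per-sensor baseline and flags readings far from it."""

    def __init__(self, training: List[Reading], z_limit: float = 3.0):
        columns = list(zip(*training))
        self.means = [mean(c) for c in columns]
        self.stds = [pstdev(c) or 1.0 for c in columns]  # avoid division by zero
        self.z_limit = z_limit

    def check(self, reading: Reading) -> bool:
        return any(abs(x - m) / s > self.z_limit
                   for x, m, s in zip(reading, self.means, self.stds))


class FailureAgent:
    """Holds one failure signature and flags windows that closely match it."""

    def __init__(self, name: str, signature: List[Reading], tolerance: float):
        self.name = name
        self.signature = signature
        self.tolerance = tolerance

    def check(self, window: List[Reading]) -> bool:
        if len(window) != len(self.signature):
            return False
        dist = sqrt(sum((a - b) ** 2
                        for w, s in zip(window, self.signature)
                        for a, b in zip(w, s)))
        return dist <= self.tolerance


# Hypothetical (temperature, vibration) readings during normal operation.
normal = [(50.0 + i % 3, 1.0 + 0.1 * (i % 2)) for i in range(30)]
anomaly_agent = AnomalyAgent(normal)
bearing_agent = FailureAgent("bearing failure",
                             signature=[(55.0, 1.5), (58.0, 2.0), (62.0, 2.8)],
                             tolerance=1.0)

live_window = [(55.2, 1.4), (58.1, 2.1), (61.7, 2.9)]
print(anomaly_agent.check(live_window[-1]))  # True: far outside the learned baseline
print(bearing_agent.check(live_window))      # True: close to the stored signature
```

The hidden failure agent is not covered here, since the slide gives too little detail to sketch it responsibly.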
Many Agents Per Asset
anomaly agent: knows all learned patterns matching all normal operating states
failure agent 001: knows precise signature of patterns leading to bearing failure
failure agent 002: knows precise signature of patterns leading to drive coupling failure
failure agents 003+: many other agents, each assigned to detect exact failure patterns
Agents – Retained Knowledge
Mtell uses agents for individual machines, created by you …in minutes
…each one holds the precise multi-dimensional/temporal data pattern of a machine in a specific operating mode
capture worker experiences & actual measured sensor data
one single job: …constantly monitor for that exact pattern & sound off “alarm bells”
Find Degradation Earlier
Multi-variate
Temporal &
multi-variate
Detect complex failure patterns that cannot be detected by humans or other technologies, or seen in any single variable trend
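A small example shows why a pattern can be invisible in any single variable trend. The sketch below uses the Mahalanobis distance on two correlated channels. This is a standard multivariate technique chosen for illustration, not a method named in the slides, and the data are synthetic.

```python
from statistics import mean
from typing import List, Tuple

Pair = Tuple[float, float]


def fit(data: List[Pair]):
    """Return means and the 2x2 covariance entries of two sensor channels."""
    xs, ys = zip(*data)
    mx, my = mean(xs), mean(ys)
    n = len(data)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in data) / n
    return mx, my, sxx, syy, sxy


def z_scores(p: Pair, model) -> Pair:
    """Per-channel z-scores, i.e. what a single-variable check would see."""
    mx, my, sxx, syy, _ = model
    return (p[0] - mx) / sxx ** 0.5, (p[1] - my) / syy ** 0.5


def mahalanobis_sq(p: Pair, model) -> float:
    """Squared Mahalanobis distance of p from the fitted baseline."""
    mx, my, sxx, syy, sxy = model
    dx, dy = p[0] - mx, p[1] - my
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


# Synthetic baseline: the second channel tracks the first with small wobble.
baseline = [(float(i), i + (0.5, -0.5, 0.0)[i % 3]) for i in range(21)]
model = fit(baseline)

suspect = (5.0, 15.0)   # each value is ordinary on its own, the pairing is not
THRESHOLD = 9.21        # about the 99th percentile of chi-square with 2 d.o.f.

print(z_scores(suspect, model))                    # both well inside +/-3
print(mahalanobis_sq(suspect, model) > THRESHOLD)  # True
```

Both channels stay within their usual ranges, so neither trend line looks alarming. Only the joint view shows that the usual relationship between them has broken.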
Platform for IoT Analytics
Equipment Sets & Taxonomy
make & model
operating context
population analysis
Equipment
asset hierarchy
sync from EAM
Sensor Mapping
data streams
Live Agents
rules
maintenance scheduling
machine learning
M2M
population learning
transfer learning
Performance
usage, states
wear, fatigue
efficacy
benchmarking
Platform Functions
advanced analytics
analyst automation
population learning
deep learning
fleet benchmarking
reservoir
signature library
sensor data store
cloud sync
fault/eff. signatures
signature search
operating center mgmt.
incident response
immersive visualization
adaptive feedback
knowledge capture
intelligent signal processing
audit automation
instrument reliability
derived signal mgt.
interpolation
global equipment taxonomy
eqmt model catalog
Industry op. context
fleet sensor templates
sensor groups
Transfer Learning Signature Library
Template Signature
Pump 02
Pump 01
Library of Known Failure Signatures
Time-Series Sensor Data
A, B, C
Mtell and Spark
OpenTSDB
Spark RDD
RDD – Resilient Distributed Dataset
• Read-only collection partitioned across a set of machines
• Can be rebuilt if a partition is lost
• Enables Spark to outperform Hadoop 10x on iterative machine learning jobs
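The first two RDD properties can be illustrated without Spark. The toy class below is plain Python, not the Spark API. `ToyRDD` and its methods are hypothetical names, and it only mimics the idea that a partition is recomputed from its lineage when it is lost.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Partitioned collection that records how to rebuild each partition."""

    def __init__(self, compute: Callable[[int], List], num_partitions: int):
        self._compute = compute             # lineage: how to produce partition i
        self._num_partitions = num_partitions
        self._cache: Dict[int, List] = {}   # partitions currently materialized

    @classmethod
    def from_list(cls, data: List, num_partitions: int) -> "ToyRDD":
        chunks = [data[i::num_partitions] for i in range(num_partitions)]
        return cls(lambda i: list(chunks[i]), num_partitions)

    def map(self, fn: Callable) -> "ToyRDD":
        # Transformations never modify the parent; they return a new ToyRDD.
        parent = self
        return ToyRDD(lambda i: [fn(x) for x in parent.partition(i)],
                      self._num_partitions)

    def partition(self, i: int) -> List:
        if i not in self._cache:            # missing or lost: recompute from lineage
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i: int) -> None:
        self._cache.pop(i, None)            # simulate losing a node's data

    def collect(self) -> List:
        return [x for i in range(self._num_partitions) for x in self.partition(i)]


readings = ToyRDD.from_list(list(range(10)), num_partitions=3)
scaled = readings.map(lambda v: v * 0.5)
before = scaled.collect()
scaled.lose_partition(1)                    # partition 1 disappears
after = scaled.collect()                    # rebuilt on demand from the parent
print(after == before)                      # True
```

Keeping materialized partitions in memory between passes is also what makes repeated passes over the same data cheap, which is the property the iterative machine learning claim relies on.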
Query data via HTTP
Over time, builds an RDD/DataFrame distributed across nodes
Aim: query the database only once
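A minimal sketch of the "query over HTTP, but only once" idea follows. It builds a URL for OpenTSDB's `/api/query` endpoint and memoizes results per URL. The metric name, tag, host, and the `QueryCache` helper are hypothetical, and the slides do not show how Mtell actually wires this into Spark.

```python
import json
from typing import Callable, Dict, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, metric: str, start: str,
                    end: Optional[str] = None,
                    tags: Optional[Dict[str, str]] = None,
                    aggregator: str = "avg") -> str:
    """Build an OpenTSDB /api/query URL for one metric and time range."""
    m = f"{aggregator}:{metric}"
    if tags:
        m += "{" + ",".join(f"{k}={v}" for k, v in sorted(tags.items())) + "}"
    params = {"start": start, "m": m}
    if end:
        params["end"] = end
    return f"{base_url.rstrip('/')}/api/query?{urlencode(params)}"


def http_fetch(url: str):
    """Fetch and decode a JSON response (needs a reachable OpenTSDB server)."""
    with urlopen(url) as resp:
        return json.load(resp)


class QueryCache:
    """Remembers results so each distinct query hits the database only once."""

    def __init__(self, fetch: Callable[[str], object] = http_fetch):
        self._fetch = fetch
        self._results: Dict[str, object] = {}

    def get(self, url: str):
        if url not in self._results:
            self._results[url] = self._fetch(url)
        return self._results[url]


# Usage with a stub in place of a real server, so the sketch runs offline.
calls = []

def fake_fetch(url):
    calls.append(url)
    return [{"metric": "pump.vibration", "dps": {}}]

cache = QueryCache(fake_fetch)
url = build_query_url("http://localhost:4242", "pump.vibration",
                      start="2016/01/01-00:00:00", tags={"asset": "pump01"})
cache.get(url)
cache.get(url)
print(len(calls))  # 1: the second request is served from memory
```

In Spark itself, the comparable step would typically be persisting the RDD or DataFrame after the first load so later jobs reuse it instead of going back to the database.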
Distributed data storage system
Stores high-precision data points
Scales almost linearly
But lacks analytics
Mtell REST API
Any request
Any client
Flexible/scalable
Link to any Python-based machine learning libraries
Human friendly
Spark Integration
Automated & Self-Improving
Previously Learned Normal
Known Failure Signature: prescriptive maintenance well in advance, fixing a small problem before it’s a big one
Learn New Operating State
Learn New Failure Signature: search deeper & improve
7-day anomaly alert
30+ day failure signature alert
Predictive / Prescriptive Analytics
Maintenance costs decrease dramatically
Machines last longer
Net output increases dramatically
Critical assets stop breaking down
Alex Bates
ABates@Mtell.com
