Predicting failures on complex machines by Ion Marqués at Big Data Spain 2015
Predicting failures on complex machines
Ion Marqués
OUTLINE
NEM Solutions provides complete management solutions to businesses
responsible for the operation and maintenance of multi-system assets.
Nowadays, we have clients with thousands of assets, generating a massive
volume of data.
What we’ll see in the following 15 minutes:
1. The clients’ needs
2. Our approach
3. The solution’s overview
4. The engine: the core of the solution
5. How we did it and what we learned
DEMAND FOR EFFICIENT AND SUSTAINABLE TRANSPORTATION SYSTEMS.
HIGH SPEED & URBAN TRANSPORTATION NEEDS ON THE RISE.
INCREASING ENERGY NEEDS. ON & OFF SHORE RENEWABLES GROWING.
NEED FOR PRODUCTIVITY, RELIABILITY AND CONTINUOUS IMPROVEMENT.
THE CLIENTS’ NEEDS
[Diagram labels: Reactive approach; The business under control; Avoid surprises; The unexpected happens; Business plan fails; Business & knowledge]
A.U.R.A: ARTIFICIAL IMMUNE SYSTEM
[Diagram labels: Normality model definition; Normality model vs. real-time data = failure symptoms; Future projection; From data; Knowledge generation]
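One way to read this slide: a normality model is defined from healthy data, and live readings that deviate from it are treated as failure symptoms. The sketch below is only an illustration under strong assumptions. It uses a mean and standard deviation as the "normality model" and a 3-sigma threshold. The slides do not say which model A.U.R.A actually uses, and the function names and temperature values are hypothetical.

```python
from statistics import mean, stdev


def fit_normality_model(healthy_readings):
    """Summarise healthy behaviour as (mean, standard deviation).

    Assumption: a single variable and a Gaussian-like spread.
    """
    sigma = stdev(healthy_readings)
    if sigma == 0:
        raise ValueError("healthy readings have no spread; cannot score deviations")
    return mean(healthy_readings), sigma


def failure_symptoms(model, live_readings, threshold=3.0):
    """Return (index, value, z_score) for readings outside the normal band."""
    mu, sigma = model
    symptoms = []
    for i, value in enumerate(live_readings):
        z = (value - mu) / sigma
        if abs(z) > threshold:
            symptoms.append((i, value, round(z, 2)))
    return symptoms


# Hypothetical bearing temperatures in degrees Celsius.
healthy = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 70.3, 69.7]
model = fit_normality_model(healthy)

live = [70.0, 70.5, 73.9, 70.1]
for index, value, z in failure_symptoms(model, live):
    print(f"reading {index} = {value} deviates from normality (z = {z})")
```

Only the third live reading falls outside the band here; the others stay well within three standard deviations of the healthy mean.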
OUR BIG DATA SOLUTION
THE WORKFLOW: 1st APPROACH
• We translate the calculations to a topology.
• Each topology node is a computational unit, e.g. arithmetic operations,
symptom calculations, machine learning algorithm tests, …
• Each node is a Storm bolt. We had around 160 bolts, each doing one task.
• One “master” spout.
• If a bolt fails, all the data must be re-emitted!
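The cost of that last point can be illustrated with a toy simulation. This is not Storm code: it assumes, for simplicity, that the 160 single-task bolts form one linear chain fed by a single source, and that any failure replays the record from the start. The helper names (`run_chain`, `make_flaky`) are hypothetical.

```python
def make_flaky(fail_times):
    """Build a bolt-like function that fails `fail_times` times, then works."""
    state = {"left": fail_times}

    def bolt(x):
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("transient failure")
        return x + 1

    return bolt


def run_chain(bolts, record, max_attempts=3):
    """Push one record through every bolt; on any failure, replay from the start.

    Returns (result, attempts_used, total_bolt_executions).
    """
    executions = 0
    for attempt in range(1, max_attempts + 1):
        value = record
        try:
            for bolt in bolts:
                executions += 1
                value = bolt(value)
            return value, attempt, executions
        except RuntimeError:
            continue  # the single source re-emits the whole record
    raise RuntimeError(f"record failed after {max_attempts} attempts")


# 160 one-task bolts; the 150th fails once.
bolts = [lambda x: x + 1] * 149 + [make_flaky(1)] + [lambda x: x + 1] * 10
value, attempts, executions = run_chain(bolts, 0)
print(value, attempts, executions)  # 160 2 310
```

A single transient failure near the end of the chain nearly doubles the work for that record: 150 executions are thrown away before the 160 that succeed.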
THE WORKFLOW: 2nd APPROACH
• We translate the calculations to a topology.
• Each topology node is a computational unit, e.g. arithmetic operations,
symptom calculations, machine learning algorithm tests, …
• Each node is a Storm bolt. We had around 160 bolts, each doing one task.
• One spout per variable.
• Too much communication for our case.
• Not efficient enough.
THE WORKFLOW: CURRENT APPROACH
• We translate the calculations to a simple topology.
• Non-codependent tasks are grouped into computational units.
• We have a few nodes, assigning one executor per task.
• Same parallelization.
• Less communication.
• Adapted to small clusters.
• Better performance.
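The slides do not show how tasks are grouped. One plausible reading, sketched below as an assumption, is to layer the dependency graph so that tasks with no dependencies on each other share a computational unit. The task names and the `group_independent_tasks` helper are hypothetical, and every task must appear as a key in the input mapping.

```python
def group_independent_tasks(dependencies):
    """Group tasks into layers; tasks in one layer never depend on each other.

    `dependencies` maps each task name to the set of tasks it needs first.
    """
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    layers = []
    while remaining:
        # Tasks whose prerequisites are all done can run side by side.
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle or unknown task detected")
        layers.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return layers


# Hypothetical calculations for one asset.
dependencies = {
    "read_temperature": set(),
    "read_vibration": set(),
    "smooth_temperature": {"read_temperature"},
    "smooth_vibration": {"read_vibration"},
    "symptom_score": {"smooth_temperature", "smooth_vibration"},
}

for unit, tasks in enumerate(group_independent_tasks(dependencies), start=1):
    print(f"unit {unit}: {tasks}")
```

Here five single-task nodes collapse into three units, so data crosses a unit boundary twice instead of once per task-to-task link, while the tasks inside each unit can still run in parallel.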
CONCLUSION
WE HAD:
• The knowledge about the industries’ needs.
• The machine learning methodologies to extract useful information.
• A successful non-scalable product.
WE NEEDED:
• The means to make that product capable of processing massive amounts
of data.
• To solve a key point: embedding algorithms into a scalable streaming
framework.
LESSONS LEARNED
• ROI: Industry demands tools that assist in making decisions affecting many
complex machines.
• To meet that particular demand, we need more than amazing
visualizations and simple data mining methods.
Technically, it is a challenge:
• Kafka + Storm + Redis + HBase can be a winning choice.
• There’s no free lunch, and every case is different.
• Translate your algorithms into a path the data will cross: a directed
graph, a topology. Then simplify. Fail. Try again.
• Your team must know your problem: from how heat in a wind rotor
behaves to how failures in Storm propagate.
LISTENING TO YOUR ASSETS
NEM Solutions
+34 943 30 93 28
info@nemsolutions.com
@NEMSolutions
Thank you!
