© 2019 KNIME AG. All Rights Reserved.
Discover Unknown Frauds and Anomalies
using Machine Learning
Kathrin Melcher
kathrin.melcher@knime.com
Anomaly Detection: Use Cases
Fault Detection
Fraud Detection
Predictive Maintenance
Intrusion
Medicine
Heart Beat
Sensor Data
AssemblingDetails
Transactions
Networks
Finance
IoT
Weather Information
System Health Monitoring
Anomaly Detection
What do all those use cases have in common?
• Discover rare events that shouldn’t happen => often no labeled data
• Find a problem before other people see it => anomaly is unknown
Example with 2 dimensions, e.g. vibration and heat of an engine:
Historical data of normal
New data samples
How can we detect the anomalies?
Use Case 1:
Fraud Detection
The Dataset
• Kaggle Dataset https://www.kaggle.com/mlg-ulb/creditcardfraud
• 284 807 credit card transactions performed in September 2013 by
European cardholders
• 492 (0.2 %) transactions in the dataset are fraudulent
• Features:
– 28 principal components
– Time from the first transaction
– Amount of money
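The class imbalance quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the two counts come from the slide, while the "always normal" baseline is an added assumption used to show why plain accuracy says little on such data. The slide's 0.2 % is the rounded value of roughly 0.17 %.

```python
# Counts taken from the dataset description above.
TOTAL_TRANSACTIONS = 284_807
FRAUDULENT = 492

# Share of fraudulent transactions in the whole dataset.
fraud_rate = FRAUDULENT / TOTAL_TRANSACTIONS

# Assumption for illustration: a trivial model that labels every
# transaction as "normal" is correct on all non-fraud rows.
baseline_accuracy = 1 - fraud_rate

print(f"fraud rate:        {fraud_rate:.3%}")
print(f"baseline accuracy: {baseline_accuracy:.3%}")
```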
The KNIME Community Workflow Hub
https://hub.knime.com
Fraud Detection with Labeled Data
Model Training
Model Deployment
Fraud Detection using Autoencoder
Input Layer, Hidden Layers, Output Layer
Input $x$, output $x'$
The network structure on the left:
$x \in \mathbb{R}^5$
$h_1(x) = f_1(W_1 x) \in \mathbb{R}^3$
$h_2(x) = f_2(W_2 h_1) \in \mathbb{R}^2$
$h_3(x) = f_3(W_3 h_2) \in \mathbb{R}^3$
$x' = f_4(W_4 h_3) \in \mathbb{R}^5$
Each layer:
$h_i = f(W_i h_{i-1})$, with $h_0 = x$
f is a non-linear activation function, e.g. tanh, relu.
Training of the network:
$\min_{W_i,\ i \in (1,2,3,4)} E(x, x')$ with $E(x, x') = \frac{1}{n} \sum_{k=1}^{n} (x - x')^2$
Execution of the Network:
$(x_{new} - x'_{new})^2 > \delta \Rightarrow$ anomaly
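The execution step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the KNIME workflow itself: the 5-3-2-3-5 layer sizes are read from the slide, tanh is one of the activations it mentions, the weights are random and untrained (training is omitted), the error is averaged over the five input components, and the names `init_weights`, `forward`, `reconstruction_error`, and `is_anomaly` are hypothetical.

```python
import math
import random

LAYER_SIZES = [5, 3, 2, 3, 5]  # input, three hidden layers, output


def init_weights(sizes, seed=0):
    """Random (untrained) matrices W_1..W_4; W_i maps layer i-1 to layer i."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(x, weights):
    """Apply h_i = tanh(W_i h_{i-1}) layer by layer and return x'."""
    h = x
    for W in weights:
        h = [math.tanh(sum(w * v for w, v in zip(row, h))) for row in W]
    return h


def reconstruction_error(x, x_rec):
    """Mean squared difference between the input x and its reconstruction x'."""
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / len(x)


def is_anomaly(x, weights, threshold):
    """Flag x when its reconstruction error exceeds the chosen threshold."""
    return reconstruction_error(x, forward(x, weights)) > threshold


weights = init_weights(LAYER_SIZES)
sample = [0.1, -0.2, 0.3, 0.0, 0.5]  # made-up input vector
reconstruction = forward(sample, weights)
error = reconstruction_error(sample, reconstruction)
```

With trained weights, the threshold would typically be chosen from the reconstruction errors observed on normal transactions; how it is set in the original workflow is not stated on the slide.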
Fraud Detection using Autoencoder
Deployment via REST on KNIME Server
Workflow deployed as (REST) web service on KNIME Server
Workflow calling another workflow on KNIME Server
Use Case II
Anomaly Detection in Predictive Maintenance (IoT)
The Data
• 28 time series from 28 sensors attached to 8 different parts of a rotor
• Time Series are FFT-derived Spectral Amplitudes
[date, time, FFT frequency, FFT amplitude]
• After Preprocessing: 313 time series for different frequency bands (100
Hz-wide) falling between 0 Hz and 1200 Hz and different sensors
• One breakdown, which is visible only from some sensors and only in
some frequency bands
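The preprocessing into 100 Hz-wide bands could look roughly like the sketch below. The band width and the 0 to 1200 Hz range come from the slide; averaging the amplitudes per sensor, band, and date is an assumption (the slide does not say how values inside a band are combined), the time column is left out for brevity, and the function names and sample rows are made up.

```python
from collections import defaultdict
from statistics import mean

BAND_WIDTH_HZ = 100   # from the slide
MAX_FREQ_HZ = 1200    # from the slide


def band_of(freq_hz):
    """Return the (low, high) band containing freq_hz, or None if out of range."""
    if not 0 <= freq_hz < MAX_FREQ_HZ:
        return None
    low = int(freq_hz // BAND_WIDTH_HZ) * BAND_WIDTH_HZ
    return (low, low + BAND_WIDTH_HZ)


def band_series(rows):
    """Average amplitude per (sensor, band, date); averaging is an assumption.

    rows: iterable of (sensor, date, frequency_hz, amplitude) tuples.
    """
    grouped = defaultdict(list)
    for sensor, date, freq_hz, amplitude in rows:
        band = band_of(freq_hz)
        if band is not None:
            grouped[(sensor, band, date)].append(amplitude)
    return {key: mean(values) for key, values in grouped.items()}


# Made-up sample rows for one sensor on one day.
rows = [
    ("A1-SV3", "2008-07-01", 20.0, 0.4),
    ("A1-SV3", "2008-07-01", 80.0, 0.6),
    ("A1-SV3", "2008-07-01", 550.0, 1.2),
]
series = band_series(rows)
```

Each key of `series` then identifies one point of one spectral time series, such as A1-SV3 in the [500, 600] Hz band.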
Data Visualization: Time Plots by Frequency Bands
[Figure: time plots of A1-SV3 in the [0, 100] Hz and [500, 600] Hz bands, marking the old motor piece, the breaking point on July 21, 2008, and the new motor piece.]
Only some spectral time series show the breakdown.
White paper: https://files.knime.com/sites/default/files/inline-images/knime_anomaly_detection_visualization.pdf
Community Hub: https://hub.knime.com/knime/workflows/*Ra9zhH3q0zKo-tUu
Control Chart (Rule Based)
• Define signal boundaries based on anomaly-free time windows
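$\mathrm{UCL} = \mathrm{avg} + 2 \cdot \mathrm{stddev}$; measure $> \mathrm{UCL} \Rightarrow$ level 1 alarm
$\mathrm{LCL} = \mathrm{avg} - 2 \cdot \mathrm{stddev}$; measure $< \mathrm{LCL} \Rightarrow$ level 1 alarm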
• Sequence of level 1 alarms across many time series => level 2 alarm
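The control-chart rules above can be sketched as follows. The limits avg ± 2 · stddev follow the slide; the use of the sample standard deviation, the rule "at least `min_series` series alarm at the same time step" for level 2, the function names, and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean, stdev


def control_limits(reference, k=2.0):
    """Return (LCL, UCL) = avg -/+ k * stddev of an anomaly-free window."""
    avg = mean(reference)
    sd = stdev(reference)  # sample standard deviation (assumption)
    return avg - k * sd, avg + k * sd


def level1_alarms(values, lcl, ucl):
    """Flag each measure that falls outside [LCL, UCL]."""
    return [v < lcl or v > ucl for v in values]


def level2_alarms(alarms_by_series, min_series):
    """Flag time steps where at least min_series series raise a level 1 alarm.

    alarms_by_series: dict mapping a series name to a list of booleans,
    all lists covering the same time steps.
    """
    steps = zip(*alarms_by_series.values())
    return [sum(step) >= min_series for step in steps]


reference = [10, 11, 9, 10, 10, 11, 9, 10]  # made-up anomaly-free window
lcl, ucl = control_limits(reference)
alarms = level1_alarms([10, 12, 8, 11], lcl, ucl)
```

In this toy window the average is 10 and the limits lie roughly 1.5 above and below it, so 12 and 8 raise level 1 alarms while 10 and 11 do not.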
White paper: https://files.knime.com/sites/default/files/181212_Whitepaper_Anomaly_Detection_Predictive_Maintenance_KNIME.pdf
Community Hub: https://hub.knime.com/knime/workflows/*OwayKRE08PXVWiqx
Predictive Maintenance
Learn “normal”: Training Set
[Figure: time plots of A1-SV3 in the [0, 100] Hz and [500, 600] Hz bands, marking the training set (31 August 2007) and the breaking point (July 21, 2008).]
Idea:
1. Train an Auto-Regressive (AR) model for each time series on “normal” data
2. Apply the model and calculate the distance between predicted and real values
3. Define alarm levels based on distance statistics
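The three steps can be sketched for a single time series as below. This is a simplified stand-in, not the workflow from the Community Hub: the model order is fixed at 1 (the slide does not give the order), the fit is an ordinary least-squares line from the previous value to the current one, the alarm level mean + 2 · stddev of the training errors is an assumption, and the sine-shaped data and function names are made up.

```python
import math
from statistics import mean, stdev


def fit_ar1(series):
    """Least-squares fit of x_t = intercept + slope * x_{t-1} (order 1 assumed)."""
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = mean(prev), mean(curr)
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    slope = sxy / sxx
    return mean_curr - slope * mean_prev, slope


def prediction_errors(series, intercept, slope):
    """Absolute distance between each real value and its one-step prediction."""
    return [
        abs(c - (intercept + slope * p))
        for p, c in zip(series[:-1], series[1:])
    ]


# Step 1: train on made-up "normal" data.
normal = [10 + math.sin(t) for t in range(50)]
intercept, slope = fit_ar1(normal)

# Step 3: alarm level from the error statistics on the training set (assumption).
train_errors = prediction_errors(normal, intercept, slope)
threshold = mean(train_errors) + 2 * stdev(train_errors)

# Step 2: apply the model to new data containing one injected spike.
new = [10 + math.sin(t) for t in range(50, 60)]
new[5] += 8
alarms = [e > threshold for e in prediction_errors(new, intercept, slope)]
```

Here `alarms[4]` corresponds to the spiked value `new[5]`, because the first value of a series has no predecessor and therefore no prediction.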
Training of an Auto-Regressive Model
Community Hub: https://hub.knime.com/knime/workflows/*OwayKRE08PXVWiqx
Deployment of an Auto-Regressive Model
Community Hub: https://hub.knime.com/knime/workflows/*OwayKRE08PXVWiqx
Time Series Production: Use KNIME to Send Alarms !!!
Free Copy Practicing Data Science
• 2nd Edition: 22 Case Studies!
• Available at KNIME Press:
https://www.knime.com/knimepress
• Select book “Practicing Data Science”
• For free copy, use promotion code:
MEETUP-VIENNA-PDS
The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by
KNIME AG under license from KNIME GmbH, and are registered in the United States.
KNIME® is also registered in Germany.