Anomaly Detection in Deep Learning
Adam Gibson - Skymind
Deep Learning book
Dl4j
Skymind
We take deep learning models to production on-premise
Using Scala (think Python for production)
Java Virtual Machine stack (e.g. first-class access to big data systems) connected to C++ for native compute
We make SKIL (Skymind Intelligence Layer): a system for building and running deep learning applications in production
What’s an “Anomaly?”
Abnormal Patterns in Data
Fraud Detection - “Bad credit card Transactions”
Also fraud detection - Detecting fake locations in call detail records
Network Intrusion - Abnormal Activity in a network
Broken Computers in a data center
Brief Case Studies - e.g. Why am I up here?
Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/
Network Infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/
Network Infra - Save time and money by automatically migrating workloads before they break
Why Deep Learning?
Learns well from lots of data
Learns its own feature representation: robust to noise and able to pick up cross-domain patterns
Already applied in ads: Google invests heavily in the same kind of pattern recognition (targeting/relevance)
Techniques
Unsupervised - Use autoencoder reconstruction error and moving averages with dropout over a set time window
Supervised - RNNs learn from a set of yes/no labels in a time series. They learn from a series of time steps and predict when an anomaly is about to occur.
Use streaming/minibatches (all neural nets can learn like this)
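The unsupervised idea above can be sketched as a moving-average check over a stream of reconstruction errors. This is a minimal illustration, not the method from the talk: the error values are made up, the names `flag_anomalies`, `window` and `n_sigmas` are hypothetical, and the "mean plus a few standard deviations" rule is an assumed thresholding choice. Dropout is not modelled here.

```python
from collections import deque
from statistics import mean, stdev

def flag_anomalies(errors, window=5, n_sigmas=3.0):
    """Flag errors that sit far above the moving average of recent errors."""
    recent = deque(maxlen=window)   # sliding time window of past errors
    flags = []
    for err in errors:
        if len(recent) == window:
            avg, spread = mean(recent), stdev(recent)
            flags.append(err > avg + n_sigmas * spread)
        else:
            flags.append(False)     # not enough history yet
        # Assumption: only normal-looking points update the baseline,
        # so a spike does not inflate the moving average.
        if not flags[-1]:
            recent.append(err)
    return flags

# Made-up reconstruction errors; in practice these would come from an autoencoder.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.95, 0.10]
flags = flag_anomalies(errors)
```

Because each error is processed one at a time, the same function could sit behind a streaming consumer.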
AutoEncoder Anomaly Detection
Moving average anomaly with KL Divergence
Autoencoder learns to reconstruct its input (i.e. the input is also the label)
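To make "the input is the label" concrete, here is a deliberately tiny sketch: a linear 2 -> 1 -> 2 autoencoder trained with plain SGD in pure Python. Everything in it is an assumption for illustration (the toy data on the line y = x, the fixed initial weights, the learning rate); a real model would be deeper and built with a library such as DL4J.

```python
def train_autoencoder(data, epochs=200, lr=0.05):
    """Train a 2 -> 1 -> 2 linear autoencoder; the target is the input itself."""
    w = [0.5, 0.5]   # encoder weights (fixed init keeps the run deterministic)
    v = [0.1, 0.1]   # decoder weights
    for _ in range(epochs):
        for x in data:
            h = w[0] * x[0] + w[1] * x[1]               # encode to one hidden unit
            err = [v[i] * h - x[i] for i in range(2)]   # reconstruction minus input
            g = err[0] * v[0] + err[1] * v[1]           # error signal reaching h
            for i in range(2):
                v[i] -= lr * 2 * err[i] * h             # decoder gradient step
                w[i] -= lr * 2 * g * x[i]               # encoder gradient step
    return w, v

def reconstruction_error(x, w, v):
    """Mean squared difference between the input and its reconstruction."""
    h = w[0] * x[0] + w[1] * x[1]
    return sum((v[i] * h - x[i]) ** 2 for i in range(2)) / 2

# "Normal" data lies on the line y = x; the anomaly is off that line.
normal = [(t, t) for t in (-1.0, -0.5, 0.5, 1.0)]
w, v = train_autoencoder(normal)
normal_error = reconstruction_error((0.8, 0.8), w, v)
anomaly_error = reconstruction_error((1.0, -1.0), w, v)
```

The network only learns the pattern present in the training data, so the off-pattern point reconstructs badly and its error stands out.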
Recurrent Net Anomalies
Learn a softmax over time series:
Given a fixed window over the sequence, the goal is to predict the probability that an anomaly occurs
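A rough sketch of the data side of this setup: cut the series into fixed windows, pair each window with the yes/no label of the step that follows it, and turn two output scores into probabilities with a softmax. The RNN itself is omitted; `make_windows`, the sample series, and the logits passed to `softmax` are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def make_windows(series, labels, window):
    """Pair each fixed-length window with the label of the step right after it."""
    pairs = []
    for start in range(len(series) - window):
        end = start + window
        pairs.append((series[start:end], labels[end]))
    return pairs

series = [0.1, 0.2, 0.1, 0.3, 5.0, 0.2]   # made-up sensor readings
labels = [0, 0, 0, 0, 1, 0]               # 1 marks an anomalous time step
pairs = make_windows(series, labels, window=3)

# Hypothetical [normal, anomaly] scores that a trained network might emit.
probs = softmax([0.2, 1.5])
```

Labelling each window with the next step is one way to express "predict when an anomaly is about to occur"; labelling the window's last step instead would turn it into detection rather than prediction.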
Sequences Time Series/Windows with RNNs
See: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Some definitions
Reconstruction Error: Autoencoders are trained without labels (unsupervised pretraining) to reconstruct their input; the reconstruction error is how far the output is from the input.
KL Divergence: A measure of how much one probability distribution differs from another; training minimizes it.
RNN/Time Series: See http://deeplearning4j.org/usingrnns
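For reference, the KL divergence mentioned in the definitions can be computed for two discrete distributions as below. This is the standard formula; the two example distributions are made up, and the function assumes `q` is non-zero wherever `p` is non-zero.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 where p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

baseline = [0.5, 0.5]   # made-up "expected" distribution
observed = [0.9, 0.1]   # made-up "current" distribution
drift = kl_divergence(baseline, observed)
same = kl_divergence(baseline, baseline)
```

The value is zero when the two distributions match and grows as they diverge. It is not symmetric, so the order of the arguments matters.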
Production
Kafka/Spark Streaming/Flink/Apex
Neural networks as consumers of streaming updates
Data? Mostly log ingestion, could be video
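A sketch of what "network as a stream consumer" can look like: records are pulled from an iterator in minibatches and each batch is handed to an incremental `fit` call. The generator stands in for a Kafka/Spark Streaming/Flink source, and `RunningMeanModel` is a hypothetical placeholder for a real network; neither is taken from the talk.

```python
from itertools import islice

def minibatches(stream, batch_size):
    """Group an arbitrary (possibly unbounded) iterator into lists of batch_size."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

class RunningMeanModel:
    """Hypothetical stand-in for a network: it just tracks a running mean."""
    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def fit(self, batch):
        # Incremental update, so the model never needs the full dataset at once.
        for x in batch:
            self.count += 1
            self.mean += (x - self.mean) / self.count

stream = (float(i) for i in range(10))   # stand-in for a streaming source
model = RunningMeanModel()
sizes = []
for batch in minibatches(stream, 4):
    sizes.append(len(batch))
    model.fit(batch)
```

The last batch may be smaller than `batch_size`; whether to pad, drop, or train on it is a design choice left open here.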
Demo!
Kibana
Kafka
Elasticsearch
Logstash
NiFi
Cassandra
Lagom
Dl4j Ecosystem (DataVec, Nd4j, Dl4j, Arbiter)
Reference Architecture for Anomaly Detection
External world
Ingest from external with NiFi
Send to Kafka
Make a prediction about the data
Index the prediction in Elasticsearch with Logstash
Render the data with Kibana
Store raw events in Cassandra
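The data flow in this architecture can be sketched with in-memory stand-ins, to show which record goes where. Nothing here talks to the real systems: the lists replace the Cassandra table and the Elasticsearch index, the loop replaces a Kafka consumer, and `toy_predict` with its byte-count rule is a hypothetical scorer.

```python
def run_pipeline(events, predict):
    """Route each event: the raw copy to storage, the prediction to the search index."""
    raw_store = []   # stand-in for the Cassandra table of raw events
    index = []       # stand-in for the Elasticsearch index rendered by Kibana
    for event in events:             # stand-in for consuming a Kafka topic
        raw_store.append(event)      # keep the untouched event
        score = predict(event)       # make a prediction about the data
        index.append({"id": event["id"], "anomaly_score": score})
    return raw_store, index

def toy_predict(event):
    # Hypothetical scorer: treats large byte counts as suspicious.
    return 1.0 if event["bytes"] > 1000 else 0.0

events = [{"id": 1, "bytes": 120}, {"id": 2, "bytes": 50000}]
raw_store, index = run_pipeline(events, toy_predict)
```

Keeping raw events separate from indexed predictions means the model can be retrained or re-scored later without losing the original data.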
Summary
Real ML pipeline
Cassandra for storing raw data results
ELK (Elasticsearch, Logstash, Kibana) stack for alerting and
visualization
Kafka for model ingestion
Lagom for serving model predictions
NiFi for designing data pipelines
Questions?
Email: adam@skymind.io
Twitter: agibsonccc
Github: agibsonccc
