Anomaly detection in deep learning (Updated) English

Anomaly Detection in Deep Learning
Adam Gibson - Skymind

Skymind
We take Deep Learning models to production on premise
Using Scala (think Python for production)
Java Virtual Machine stack connected to C++ (eg: first-class
access to big data systems) with native compute
We make SKIL (Skymind Intelligence Layer): a system for building
and running deep learning applications in production

What’s an “Anomaly?”
Abnormal Patterns in Data
Fraud Detection - “Bad credit card Transactions”
Also fraud detection - Detecting fake locations with call detail
records
Network Intrusion - Abnormal Activity in a network
Broken Computers in a data center

Brief Case Studies - eg: Why am I up here?
Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/
Network Infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/

Network Infra - Save time and money by automatically migrating
workloads before a failure happens

Why Deep Learning?
Learns well from lots of data
Learns its own feature representation: robust to noise and allows for
learning cross-domain patterns
Already applied in ads: Google itself invests lots in this same
kind of pattern recognition (targeting/relevance)

Techniques
Unsupervised - Use autoencoder reconstruction error and moving averages with
dropout over a set time window
Supervised - RNNs learn from a series of time steps labeled yes/no (anomaly or not)
and predict when an anomaly is about to occur.
Use streaming/minibatches (all neural nets can learn like this)

AutoEncoder Anomaly Detection
Moving average anomaly with KL Divergence
Autoencoder learns to reconstruct data (ie: the input is also the label)
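
A minimal sketch of the moving-average idea, assuming the per-sample reconstruction errors have already been computed by an autoencoder (not shown). It swaps the KL divergence comparison for a simpler rule: flag a point when its error is more than k standard deviations above the moving average of the previous window. The function name, the window size and k are hypothetical choices for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(errors, window=5, k=3.0):
    """Flag points whose reconstruction error is far above the recent moving average.

    errors: per-sample reconstruction errors (assumed to come from an autoencoder).
    window: number of previous errors kept in the moving average (hypothetical default).
    k: how many standard deviations above the average counts as anomalous.
    """
    history = deque(maxlen=window)
    flags = []
    for err in errors:
        if len(history) == window:
            avg = mean(history)
            std = pstdev(history)
            flags.append(err > avg + k * std)
        else:
            # Not enough history yet to judge this point.
            flags.append(False)
        history.append(err)
    return flags


# Made-up error values: one spike in an otherwise flat series.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.95, 0.10]
flags = flag_anomalies(errors)
```

Note that a flagged error still enters the history, so one spike raises the threshold for the next few points; a production version may want to exclude flagged values from the average.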

Recurrent Net Anomalies
Learn a softmax over time series:
Given a fixed window, the goal is to predict a probability of an anomaly
occurring given a sequence
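
A rough sketch of the data side of this setup, not the RNN itself: cut a series into fixed windows, pair each window with the label of the step that follows it, and turn raw class scores into probabilities with a softmax. The helper names, the toy series and the scores passed to softmax are all hypothetical; in practice the scores would come from the network's output layer.

```python
import math


def sliding_windows(series, labels, window):
    """Pair each fixed-length window with the label of the step right after it."""
    pairs = []
    for start in range(len(series) - window):
        end = start + window
        pairs.append((series[start:end], labels[end]))
    return pairs


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: label 1 marks the anomalous step.
series = [1.0, 1.1, 0.9, 1.0, 5.0, 1.0]
labels = [0, 0, 0, 0, 1, 0]
pairs = sliding_windows(series, labels, window=3)

# Hypothetical scores for the classes [normal, anomaly].
probs = softmax([0.2, 1.5])
```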

Sequences Time Series/Windows with RNNs
See: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Some definitions
Reconstruction Error: Autoencoders can learn from
unsupervised pretraining and learn how to reconstruct data.
Minimize KL Divergence (the delta between two probability
distributions)
RNN/Time Series: See http://deeplearning4j.org/usingrnns
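
The two definitions above can be written out in a few lines. This is a sketch under assumptions: reconstruction error is taken to be mean squared error (other losses are possible), and the KL divergence is the discrete form over two distributions defined on the same outcomes. Function names are illustrative.

```python
import math


def reconstruction_error(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

KL divergence is zero when the two distributions match and grows as they drift apart. It is not symmetric: KL(p || q) and KL(q || p) generally differ.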

Production
Kafka/Spark Streaming/Flink/Apex
Neural networks as consumer of streaming updates
Data? Mostly log ingestion, could be video
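
A sketch of a network acting as a consumer of streaming updates, with a plain Python iterator standing in for the Kafka/Spark/Flink source. Events are grouped into minibatches and each batch is scored as it arrives. The names minibatches, consume and toy_score are hypothetical, and toy_score is a fixed cutoff, not a real model.

```python
from itertools import islice


def minibatches(stream, batch_size):
    """Group an iterable of events into lists of at most batch_size items."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def consume(stream, score_batch, batch_size=4):
    """Score each minibatch as it arrives and collect the results."""
    results = []
    for batch in minibatches(stream, batch_size):
        results.extend(score_batch(batch))
    return results


def toy_score(batch):
    # Hypothetical stand-in for a model: flag values above a fixed cutoff.
    return [value > 10 for value in batch]


# Made-up event values standing in for parsed log records.
events = iter([1, 2, 50, 3, 4, 5, 6, 70, 2])
flags = consume(events, toy_score, batch_size=4)
```

Because the batching works on any iterable, the same loop could sit behind a real streaming client without changing the scoring code.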

Demo!
Kibana
Kafka
Elasticsearch
Logstash
NiFi
Cassandra
Lagom
DL4J ecosystem (DataVec, ND4J, DL4J, Arbiter)

Reference Architecture for Anomaly Detection
[Diagram boxes:]
External World
Ingest from external with NiFi
Send to Kafka
Make a prediction about the data
Index the prediction in Elasticsearch with Logstash
Render the data with Kibana
Store raw events in Cassandra

Summary
Real ML pipeline
Cassandra for storing raw data results
ELK (Elasticsearch, Logstash, Kibana) stack for alerting and
visualization
Kafka for model ingestion
Lagom for serving model predictions
NiFi for designing data pipelines

Questions?
Email: adam@skymind.io
Twitter: agibsonccc
Github: agibsonccc
