Deep Learning for Industries
Rahul Kumar
Chief AI Scientist
@BotSupply.ai / Jatana.ai
Demystifying Deep Learning
Hands on word2vec

About
@hellorahulk
https://github.com/goodrahstar/
https://medium.com/@hellorahulk
www.hellorahulk.com

DE3p Larenn1g mhica3ns wrok smliair to hOw biarns wrok.
Tehse mahcnies wrok by s33nig f22Uy pa773rns and cnonc3t1ng t3Hm t0
fU22y cnoc3tps. T3hy wRok l4y3r by ly43r, j5ut lK1e a f1L37r, t4k1NG
cmopl3x scn33s aNd br3k41ng tH3m dwon itno s1pmLe iD34s.

Deep Learning machines work similar to how brains work.
These machines work by seeing fuzzy patterns and connecting them to
fuzzy concepts. They work layer by layer, just like a filter, taking complex
scenes and breaking them down into simple ideas.

So far...
Some input vector (very few assumptions made).
Will discuss in detail

In many real-world applications input vectors have structure.
Spectrograms
Images
Text

Neural Networks:
A pinch of history

Hubel & Wiesel
1959: Receptive fields of single neurones in the cat's striate cortex
1962: Receptive fields, binocular interaction and functional architecture in the cat's visual cortex
1968...

car 99%
Computer Vision 2011

Computer Vision 2011, Page 1

Computer Vision 2011, Page 2

Computer Vision 2011, Page 3+ code complexity :(

[224x224x3] image -> f -> 1000 numbers, indicating class scores
Feature Extraction: vector describing various image statistics
[224x224x3] image -> f -> 1000 numbers (training)

“Run the image through 20 layers of 3x3
convolutions and train the filters with SGD.”

DNN Approach
CNN Features off-the-shelf: an Astounding Baseline for Recognition
[Razavian et al, 2014]

e.g. with TensorFlow
The power is easily accessible.
# Python 2
$ sudo pip install --upgrade tensorflow
# Python 3
$ sudo pip3 install --upgrade tensorflow

TensorFlow: Programming Paradigm
1. Load library and MNIST data
2. Design neural network architecture
3. Select optimization algorithm
4. Initialize the session and variables
5. Train the model
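The same five steps can be walked through without any framework. What follows is a minimal sketch, not the TensorFlow code from the slides (which was not preserved): it assumes a tiny synthetic two-class dataset in place of MNIST and a hand-written logistic-regression model, and every name in it is hypothetical. Its accuracy is not comparable to the MNIST figure in the talk.

```python
import math
import random

# 1. Load data: a synthetic stand-in for MNIST (assumption, not the slide's data).
random.seed(0)

def make_point():
    label = random.randint(0, 1)
    center = 2.0 if label == 1 else -2.0
    return [random.gauss(center, 1.0), random.gauss(center, 1.0)], label

train = [make_point() for _ in range(200)]
test = [make_point() for _ in range(100)]

# 2. Design the architecture: one linear layer followed by a sigmoid.
def predict(w, b, x):
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# 3. Select the optimization algorithm: plain SGD with a fixed learning rate.
learning_rate = 0.1

# 4. Initialize the variables.
w, b = [0.0, 0.0], 0.0

# 5. Train the model: one SGD step per example, for a few epochs.
for epoch in range(5):
    for x, y in train:
        p = predict(w, b, x)
        grad = p - y  # gradient of the cross-entropy loss with respect to z
        w = [w[0] - learning_rate * grad * x[0],
             w[1] - learning_rate * grad * x[1]]
        b -= learning_rate * grad

accuracy = sum((predict(w, b, x) > 0.5) == (y == 1) for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.3f}")
```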

95.3%

Convolutional Neural Networks

ConvNets are everywhere…
e.g. Google Photos search
Face Verification, Taigman et al. 2014 (FAIR)
Self-driving cars
[Goodfellow et al. 2014]
Ciresan et al. 2013

Whale recognition, Kaggle Challenge
Satellite image analysis, Mnih and Hinton, 2010
Galaxy Challenge, Dieleman et al. 2015
WaveNet, van den Oord et al. 2016
Image captioning, Vinyals et al. 2015

DeepDream, reddit.com/r/deepdream
NeuralStyle, Gatys et al. 2015
deepart.io, Prisma, etc.

[224x224x3] image -> f -> 1000 numbers (training)
Only two basic operations are involved throughout:
1. Dot products w^T x
2. Max operations max(.)
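Both operations fit in a few lines. A minimal sketch, assuming made-up weights and inputs; the max operation is shown in its ReLU form, max(0, value).

```python
def dot(w, x):
    """Dot product w^T x of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))

def relu(value):
    """Max operation: max(0, value)."""
    return max(0.0, value)

w = [0.5, -1.0, 2.0]   # hypothetical weights
x = [1.0, 2.0, 3.0]    # hypothetical input

score = dot(w, x)      # 0.5 - 2.0 + 6.0 = 4.5
print(relu(score))     # 4.5
print(relu(-score))    # 0.0
```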

e.g. 200K numbers e.g. 10 numbers

Convolution Layer
32x32x3 image: width 32, height 32, depth 3

Convolution Layer
32x32x3 image, 5x5x3 filter
Convolve the filter with the image
i.e. “slide over the image spatially,
computing dot products”

Convolution Layer
32x32x3 image, 5x5x3 filter
Filters always extend the full
depth of the input volume

Convolution Layer
32x32x3 image, 5x5x3 filter
1 number:
the result of taking a dot product between the
filter and a small 5x5x3 chunk of the image
(i.e. 5*5*3 = 75-dimensional dot product + bias)
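The "1 number" can be computed directly. A minimal sketch, assuming random values for a hypothetical image chunk, filter, and bias: both 5x5x3 blocks are flattened to 75 numbers and combined in one dot product.

```python
import random

random.seed(0)

# Hypothetical 5x5x3 image chunk and 5x5x3 filter, stored as [row][col][channel].
chunk = [[[random.random() for _ in range(3)] for _ in range(5)] for _ in range(5)]
filt = [[[random.random() for _ in range(3)] for _ in range(5)] for _ in range(5)]
bias = 0.1  # hypothetical bias

# Flatten both to 75-dimensional vectors.
chunk_vec = [v for row in chunk for pixel in row for v in pixel]
filt_vec = [v for row in filt for pixel in row for v in pixel]

# One output number: 75-dimensional dot product plus bias.
output = sum(f * c for f, c in zip(filt_vec, chunk_vec)) + bias
print(len(chunk_vec))  # 75
print(output)
```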

Convolution Layer
32x32x3 image, 5x5x3 filter
convolve (slide) over all
spatial locations
activation map: 28x28x1
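The 28 comes from counting filter positions: with stride 1 and no padding (an assumption the slide's numbers imply but do not state), a 5-wide filter fits in 32 - 5 + 1 = 28 places along each axis. A minimal sketch with an all-ones image and filter, both hypothetical:

```python
def conv2d_single_filter(image, filt, bias=0.0):
    """Slide one filter over an image (stride 1, no padding); return the activation map."""
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    f = len(filt)  # filter is f x f x depth
    out = []
    for i in range(height - f + 1):
        row = []
        for j in range(width - f + 1):
            total = bias
            for di in range(f):
                for dj in range(f):
                    for c in range(depth):
                        total += filt[di][dj][c] * image[i + di][j + dj][c]
            row.append(total)
        out.append(row)
    return out

image = [[[1.0] * 3 for _ in range(32)] for _ in range(32)]  # hypothetical 32x32x3 image
filt = [[[1.0] * 3 for _ in range(5)] for _ in range(5)]     # hypothetical 5x5x3 filter

activation_map = conv2d_single_filter(image, filt)
print(len(activation_map), len(activation_map[0]))  # 28 28
print(activation_map[0][0])                         # 75.0
```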

Convolution Layer
32x32x3 image, 5x5x3 filter
consider a second, green filter
convolve (slide) over all
spatial locations
activation maps (28x28x1)

Convolution Layer
32x32x3 image -> activation maps (28x28x6)
For example, if we had 6 5x5 filters, we’ll get 6 separate activation maps.
We stack these up to get a “new image” of size 28x28x6!
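Stacking is only a change of layout. A minimal sketch, assuming six placeholder 28x28 maps (each filled with its own filter index so the result is easy to read):

```python
# Six hypothetical 28x28 activation maps, one per filter.
maps = [[[float(k)] * 28 for _ in range(28)] for k in range(6)]

# Stack along the depth axis: volume[i][j][k] is map k at location (i, j).
volume = [[[maps[k][i][j] for k in range(6)] for j in range(28)] for i in range(28)]

print(len(volume), len(volume[0]), len(volume[0][0]))  # 28 28 6
print(volume[0][0])  # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
```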

ConvNet is a sequence of Convolution Layers, interspersed with activation
functions
Sigmoid, Tanh, ReLU
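The three activation functions named above, written out with their standard definitions as a minimal sketch; the sample inputs are arbitrary.

```python
import math

def sigmoid(x):
    """Squashes x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes x into the range (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Passes positive values through and clips negatives to 0."""
    return max(0.0, x)

for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x), relu(x))
```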

ConvNet is a sequence of Convolution Layers, interspersed with activation
functions
32x32x3 -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….
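The shapes in this chain can be checked with a small helper. A minimal sketch, assuming stride 1 and no padding (consistent with 32 -> 28 -> 24, though the slide does not state it); `conv_output_shape` is a hypothetical name.

```python
def conv_output_shape(input_shape, num_filters, filter_size, stride=1):
    """Output shape of a conv layer with no padding."""
    height, width, _depth = input_shape
    out_h = (height - filter_size) // stride + 1
    out_w = (width - filter_size) // stride + 1
    return (out_h, out_w, num_filters)

shape = (32, 32, 3)
shapes = [shape]
for num_filters, filter_size in [(6, 5), (10, 5)]:
    shape = conv_output_shape(shape, num_filters, filter_size)
    shapes.append(shape)

print(shapes)  # [(32, 32, 3), (28, 28, 6), (24, 24, 10)]
```

The output depth of each layer equals its number of filters, which is why the second layer's filters are 5x5x6.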

two more layers to go: POOL/FC

Pooling layer
- makes the representations smaller and more manageable
- operates over each activation map independently:

MAX POOLING
Single depth slice (x across, y down):
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4
max pool with 2x2 filters and stride 2:
6 8
3 4
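The pooling example can be reproduced in code. A minimal sketch using the same 4x4 depth slice; `max_pool_2x2` is a hypothetical helper name.

```python
def max_pool_2x2(slice_2d):
    """Max pool a single depth slice with 2x2 filters and stride 2."""
    pooled = []
    for i in range(0, len(slice_2d) - 1, 2):
        row = []
        for j in range(0, len(slice_2d[0]) - 1, 2):
            window = [slice_2d[i][j], slice_2d[i][j + 1],
                      slice_2d[i + 1][j], slice_2d[i + 1][j + 1]]
            row.append(max(window))
        pooled.append(row)
    return pooled

depth_slice = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
print(max_pool_2x2(depth_slice))  # [[6, 8], [3, 4]]
```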

Fully Connected Layer (FC layer)
- Contains neurons that connect to the entire input volume, as in ordinary Neural Networks
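"Connect to the entire input volume" means each output neuron has one weight per input number. A minimal sketch, assuming a tiny hypothetical 2x2x3 volume and two output neurons:

```python
def fully_connected(volume, weights, biases):
    """Each output neuron takes a dot product with the entire flattened input volume."""
    flat = [v for row in volume for pixel in row for v in pixel]
    return [sum(w * x for w, x in zip(neuron_w, flat)) + b
            for neuron_w, b in zip(weights, biases)]

# Hypothetical 2x2x3 input volume and 2 output neurons (12 weights each).
volume = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
          [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]]
weights = [[1.0] * 12, [0.0] * 12]
biases = [0.0, 1.0]

print(fully_connected(volume, weights, biases))  # [78.0, 1.0]
```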

Addressing other tasks...

image (224x224x3) -> CNN -> features (7x7x512)
A block of compute with a few
million calculations.

image (224x224x3) -> CNN -> features (7x7x512) -> predicted thing
A block of compute with a few million calculations.
predicted thing vs. desired thing

image (224x224x3) -> CNN -> features (7x7x512) -> predicted thing
A block of compute with a few million parameters.
predicted thing vs. desired thing
this part changes
from task to task

Image Classification
thing = a vector of probabilities for different classes
image (224x224x3) -> CNN -> features (7x7x512) -> fully connected layer ->
e.g. vector of 1000 numbers giving
probabilities for different classes.
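The slide does not say how raw scores become probabilities; softmax is a common choice and is assumed here. A minimal sketch with three hypothetical class scores instead of 1000:

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical scores for 3 classes
probs = softmax(scores)
print(probs)
print(probs.index(max(probs)))  # 0
```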

Image Captioning
image (224x224x3) -> CNN -> features (7x7x512) -> RNN ->
A sequence of 10,000-dimensional vectors
giving probabilities of different words in the
caption.

Localization
image (224x224x3) -> CNN -> features (7x7x512) -> fully connected layer ->
Class probabilities (as before)
4 numbers:
- X coord
- Y coord
- Width
- Height

Reinforcement Learning
image (160x210x3) -> CNN -> features -> fully connected ->
e.g. vector of 8 numbers giving probability
of wanting to take any of the 8 possible
ATARI actions.
Mnih et al. 2015

Segmentation
image (224x224x3) -> CNN -> features (7x7x512) -> deconv layers ->
224x224x20 array of class probabilities at each pixel.
image class “map”
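One common way to turn per-pixel probabilities into a class "map" is to pick the most probable class at each pixel; the slide does not name the method, so treat this as an assumption. A minimal sketch on a hypothetical 2x2 image with 3 classes rather than 224x224 with 20:

```python
def class_map(probabilities):
    """For each pixel, pick the index of the most probable class."""
    return [[max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
            for row in probabilities]

# Hypothetical per-pixel class probabilities, shape 2x2x3.
probabilities = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
]
print(class_map(probabilities))  # [[0, 1], [2, 0]]
```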

Hands on word2vec
Colab link: https://goo.gl/7H7mVo

Thank you!
"Deep learning" offers us great power - and poses unique risks.
Can we Vectorise them?
