Introduction to Convolutional Neural Networks

Hello!
I am Frederick Apina
Machine Learning Engineer @ParrotAI
I am here because I love to give
presentations.

“When I think about strong
innovations in terms of
automation, cognitive computing,
and artificial intelligence, they will
be coming a lot from Tanzania as
well.”

1. What are Convolutional Neural Networks?

CNN
Convolutional Neural Networks (ConvNets or CNNs) are a
category of neural networks that have proven very effective in
areas such as image recognition and classification. ConvNets
have been successful in identifying faces, objects, and
traffic signs, as well as powering vision in robots and
self-driving cars.

There are four main operations in a ConvNet:
1. Convolution
2. Non-Linearity (ReLU)
3. Pooling or Sub-Sampling
4. Classification (Fully Connected Layer)

An Image is a matrix of pixel values

A channel is a conventional term for a certain component of an
image. An image from a standard digital camera has three channels (red,
green and blue). You can imagine them as three 2D matrices stacked on top of each
other (one for each color), each holding pixel values in the range 0 to 255.

A grayscale image, on the other hand, has just one channel. For the purposes of
this presentation, we will only consider grayscale images, so a single 2D
matrix represents an image. The value of each pixel in the matrix ranges
from 0 to 255, with 0 indicating black and 255 indicating white.
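
To make the "image as a matrix" idea concrete, here is a minimal sketch in plain Python. The tiny 2x2 image, the helper names `split_channel` and `to_grayscale`, and the simple channel-averaging rule are all assumptions made for illustration; real image libraries usually weight the three channels differently when converting to grayscale.

```python
# A hypothetical 2x2 RGB image: each pixel is (red, green, blue), values 0..255.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def split_channel(image, index):
    """Return one channel as a 2D matrix of pixel values."""
    return [[pixel[index] for pixel in row] for row in image]


def to_grayscale(image):
    """Average the three channels (a simplification assumed for brevity)."""
    return [[sum(pixel) // 3 for pixel in row] for row in image]


red = split_channel(rgb_image, 0)
gray = to_grayscale(rgb_image)
print(red)   # [[255, 0], [0, 255]]
print(gray)  # [[85, 85], [85, 255]]
```

The grayscale result is the single 2D matrix that the rest of the walkthrough operates on.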

The Convolution Step
ConvNets derive their name from the “convolution” operator. The primary
purpose of convolution in a ConvNet is to extract features from the
input image. Convolution preserves the spatial relationship between pixels by
learning image features using small squares of input data.
[Figure: a 5x5 matrix (the image) and a 3x3 matrix]

In CNN terminology, the 3×3 matrix is called a ‘filter’, ‘kernel’ or ‘feature detector’, and the matrix
formed by sliding the filter over the image and computing the dot product at each position is called the ‘Convolved
Feature’, ‘Activation Map’ or ‘Feature Map’.
[Figure: example input image]
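
The sliding dot product described above can be sketched in a few lines of plain Python. The pixel and filter values below are illustrative assumptions (the slides only give the 5x5 and 3x3 sizes), and `convolve2d` is a hypothetical helper that assumes a square image, a square filter and no padding. As in most deep learning libraries, the filter is not flipped, so strictly speaking this computes a cross-correlation.

```python
def convolve2d(image, kernel, stride=1):
    """Slide the kernel over the image and sum the element-wise products."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    feature_map = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Illustrative 5x5 binary image and 3x3 filter (values assumed for the example).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

print(convolve2d(image, kernel))  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

A 5x5 input and a 3x3 filter with stride 1 give a 3x3 feature map, because the filter fits in three positions along each axis.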

In practice, a CNN learns the values of these filters on its own during the training process (although we still need to
specify parameters such as the number of filters, the filter size and the architecture of the network before training).
The more filters we have, the more image features get extracted, and the better our network tends to become at
recognizing patterns in unseen images.
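The size of the Feature Map (Convolved Feature) is controlled by three parameters that we need to
decide before the convolution step is performed:
1. Depth: the number of filters used for the convolution
2. Stride: the number of pixels by which the filter slides over the input at each step
3. Zero-padding: the number of zeros added around the border of the input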
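
As a rough sketch of how these three parameters set the feature map size, the helper below applies the commonly used formula (W - F + 2P) / S + 1 per spatial side, with the depth equal to the number of filters. The formula is not stated in the slides, and the function name `feature_map_shape` and its argument names are hypothetical; it assumes a square input and a square filter.

```python
def feature_map_shape(input_size, filter_size, num_filters, stride=1, padding=0):
    """Return (height, width, depth) of the feature maps for a square input."""
    span = input_size - filter_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the padded input evenly")
    side = span // stride + 1
    return (side, side, num_filters)


print(feature_map_shape(5, 3, num_filters=1))             # (3, 3, 1)
print(feature_map_shape(5, 3, num_filters=2, padding=1))  # (5, 5, 2)
print(feature_map_shape(7, 3, num_filters=4, stride=2))   # (3, 3, 4)
```

A larger stride shrinks the feature map, zero-padding can keep it the same size as the input, and each additional filter adds one more 2D feature map to the output depth.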

Introduction to Non-Linearity (ReLU)
ReLU stands for Rectified Linear Unit and is a non-linear operation. Its output is given by:
Output = max(0, Input)

ReLU is an element-wise operation (applied per pixel) that replaces all negative pixel values in the
feature map with zero. The purpose of ReLU is to introduce non-linearity in our ConvNet, since most of
the real-world data we want our ConvNet to learn is non-linear, whereas convolution itself is a linear operation.
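
A minimal sketch of ReLU applied to a feature map, written in plain Python. The function name `relu` and the sample values are assumptions for illustration.

```python
def relu(feature_map):
    """Replace every negative value with zero, leaving the rest unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]


feature_map = [
    [-3, 5],
    [2, -1],
]
print(relu(feature_map))  # [[0, 5], [2, 0]]
```

The output has the same shape as the input; only the negative entries change.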

The Pooling Step
Spatial Pooling (also called subsampling or downsampling) reduces the dimensionality of each feature
map but retains the most important information. Spatial Pooling can be of different types: Max,
Average, Sum, etc.

The function of Pooling is to progressively reduce the spatial size of the input
representation. In particular, pooling:
1. makes the input representations (feature dimension) smaller and more manageable;
2. reduces the number of parameters and computations in the network, which helps
control overfitting;
3. makes the network largely invariant to small transformations, distortions and translations in
the input image (a small distortion in the input will usually not change the output of Pooling,
since we take the maximum / average value in a local neighborhood).
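
The pooling types listed above can be sketched with one small helper. The name `pool2d`, its `reducer` argument, the 2x2 window with stride 2, and the sample feature map are all assumptions chosen for illustration.

```python
def pool2d(feature_map, size=2, stride=2, reducer=max):
    """Apply `reducer` (max, sum, ...) to each size x size window."""
    rows = (len(feature_map) - size) // stride + 1
    cols = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(rows):
        row = []
        for j in range(cols):
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


rectified = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
print(pool2d(rectified))               # max pooling: [[6, 8], [3, 4]]
print(pool2d(rectified, reducer=sum))  # sum pooling: [[13, 21], [8, 8]]
```

Here a 4x4 feature map shrinks to 2x2, and changing a non-maximal value inside a window leaves the max-pooled output unchanged, which is the small-distortion robustness described in point 3.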

Fully Connected Layer
The term “Fully Connected” implies that every neuron in the previous layer is
connected to every neuron in the next layer.
The output from the convolutional and pooling layers represents high-level
features of the input image. The purpose of the Fully Connected layer is to use
these features for classifying the input image into various classes based on
the training dataset.
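
A minimal sketch of a fully connected layer used as a classifier. The slides do not say how class scores become probabilities; softmax is assumed here because it is a common choice. The function names, the three input features, the four classes and all weight values are hypothetical.

```python
import math


def fully_connected(features, weights, biases):
    """Each output neuron is a weighted sum over every input feature."""
    return [
        sum(w * x for w, x in zip(neuron_weights, features)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (assumed output function)."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


features = [0.5, 1.0, 0.0]  # flattened output of the pooling layers (made up)
weights = [                 # one row of weights per class, 4 classes
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]
biases = [0.0, 0.0, 0.0, 0.0]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))
print(scores)           # [0.5, 1.0, 0.0, 1.5]
print(predicted_class)  # 3
```

Every feature contributes to every class score, which is what "fully connected" means in practice.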

Putting it all together – Training using Backpropagation
The Convolution + Pooling layers act as feature extractors for the input
image, while the Fully Connected layer acts as a classifier.

The overall training process of the Convolutional Network may be summarized as below:
• Step 1: Initialize all filters and parameters / weights with random values.
• Step 2: The network takes a training image as input, goes through the forward propagation step (convolution,
ReLU and pooling operations along with forward propagation in the Fully Connected layer) and finds the
output probabilities for each class.
• Step 3: Calculate the total error at the output layer (summation over all 4 classes in the example):
Total Error = ∑ ½ (target probability – output probability)²
• Step 4: Use Backpropagation to calculate the gradients of the error with respect to all weights in the network,
and use gradient descent to update all filter values / weights and parameter values to minimize the output
error.
• Step 5: Repeat steps 2-4 for all images in the training set.
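
The steps above can be sketched on a deliberately simplified model: a single linear fully connected layer trained on hypothetical, already-extracted features, with no convolution, ReLU, pooling or softmax. With only one layer, backpropagation reduces to one gradient formula. The function names, learning rate, epoch count and toy data are all assumptions; this illustrates the loop structure rather than a full CNN trainer.

```python
import random


def init_weights(num_classes, num_features, seed=0):
    """Step 1: start from small random weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(num_features)]
            for _ in range(num_classes)]


def forward(weights, features):
    """Step 2 (simplified): one linear layer producing a score per class."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def total_error(target, output):
    """Step 3: Total Error = sum over classes of 1/2 * (target - output)^2."""
    return sum(0.5 * (t - o) ** 2 for t, o in zip(target, output))


def train(weights, samples, learning_rate=0.1, epochs=200):
    """Steps 4 and 5: gradient descent over every training example."""
    for _ in range(epochs):
        for features, target in samples:
            output = forward(weights, features)
            for c, row in enumerate(weights):
                # d(error)/d(weight[c][k]) = (output[c] - target[c]) * features[k]
                delta = output[c] - target[c]
                for k, x in enumerate(features):
                    row[k] -= learning_rate * delta * x
    return weights


# Toy data: 4 classes, each with a one-hot feature vector and one-hot target.
samples = [
    ([1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]),
    ([0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]),
    ([0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0]),
    ([0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 1.0]),
]

weights = init_weights(num_classes=4, num_features=4)
error_before = sum(total_error(t, forward(weights, f)) for f, t in samples)
train(weights, samples)
error_after = sum(total_error(t, forward(weights, f)) for f, t in samples)
print(error_before, error_after)
```

In a real ConvNet the same loop applies, but the gradient in step 4 is propagated backwards through the pooling, ReLU and convolution layers so that the filter values are updated as well.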

Visualizing Convolutional Neural Networks

Conclusion
Deep Learning is a continuously growing and
relatively new concept, and the vast amount of
resources can be a touch overwhelming both for those
looking to get into the field and for those
already ingrained in it. A good way of coping is to
get a good general knowledge of machine learning
and then find a well-structured path to follow (be
it a project or research).

Thanks!
Any questions?
You can find me at:
✗ Fred.apina@gmail.com
