Convolutional Neural Network
AmirHossein Sajedi
Master of Computer Engineering – Artificial Intelligence
Deep learning
Deep learning is an artificial intelligence technique that imitates the way the human brain processes data and creates patterns for use in decision making. Deep learning is a subset of machine learning in artificial intelligence whose networks are capable of learning, without supervision, from data that is unstructured or unlabeled. It is also known as deep neural learning or a deep neural network.
Convolutional Neural Network
CNN is a supervised deep learning method. The origin of applying deep learning to object recognition tasks can be traced to the convolutional neural networks (CNNs) of the early 1990s.
CNN-based architectures have attracted intense interest in computer vision since October 2012, shortly after the ImageNet competition results were released.
CNN ARCHITECTURE
CNNs are very similar to ordinary neural networks.
They are made up of neurons that have learnable weights and biases.
Each neuron receives some inputs, performs a dot product, and optionally follows it with a nonlinearity.
Their connectivity, however, is restricted to be spatially local.
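A minimal sketch of a single such neuron (names and shapes below are my own illustration, not from the slides):

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: dot product of a local input patch with its weights, plus bias, then a nonlinearity."""
    return np.maximum(0.0, np.dot(w, x) + b)

# Hypothetical local 3x3x3 patch of the input, flattened to a vector.
x = np.random.rand(27)
w = np.random.rand(27)   # learnable weights
b = 0.1                  # learnable bias
print(neuron(x, w, b))
```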
CNN ARCHITECTURE
CNNs are comprised of three types of layers:
Convolutional layers
Pooling layers
Fully-connected layers
Unlike in standard ANNs, the neurons within convolutional and pooling layers connect only to a small region of the layer preceding them.
CNN ARCHITECTURE
Each neuron in a layer is connected only to a small region of the layer before it, rather than to all of its neurons in a fully-connected manner.
CONVOLUTIONAL LAYER
The convolution operation extracts different features of the input. The first convolution layer extracts low-level features such as edges, lines, and corners. Higher-level layers extract higher-level features.
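As an illustrative sketch (the hand-crafted filter below is my own example; a trained layer learns its own filters), the convolution slides a small kernel over the input and responds strongly where the corresponding feature, here a vertical edge, is present:

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' 2D cross-correlation, the operation used by convolutional layers."""
    H, W = image.shape
    F = kernel.shape[0]
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+F, j:j+F] * kernel)
    return out

# Hypothetical vertical-edge filter; a trained first conv layer learns filters of this kind.
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)

image = np.random.rand(8, 8)
print(conv2d(image, edge_filter).shape)   # (6, 6)
```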
LOCAL CONNECTIVITY
Every neuron in a convolutional layer is connected only to a small region of the input volume. The spatial extent of this region is commonly referred to as the receptive field size of the neuron. The extent of the connectivity along the depth axis is nearly always equal to the depth of the input.
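A minimal NumPy sketch of this local connectivity (shapes are my own illustration):

```python
import numpy as np

# Hypothetical input volume: a 32x32 RGB image (height, width, depth).
input_volume = np.random.rand(32, 32, 3)

F = 5            # receptive field size (a hyperparameter)
y, x = 10, 10    # spatial position of one neuron

# The neuron sees only a small FxF spatial region,
# but always the full depth of the input (here all 3 channels).
receptive_field = input_volume[y:y+F, x:x+F, :]
print(receptive_field.shape)   # (5, 5, 3)
```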
CONVOLUTIONAL LAYER - DEPTH
The depth of the output volume produced by the convolutional layer is the number of neurons within the layer looking at the same region of the input.
Reducing this hyperparameter can significantly minimise the total number of neurons in the network, but it can also significantly reduce the pattern recognition capabilities of the model.
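As a small illustration (shapes chosen by me, not from the slides), the output depth equals the number of filters, i.e., the number of neurons that look at the same input region:

```python
import numpy as np

K = 12                                   # depth hyperparameter: number of filters
filters = np.random.rand(K, 3, 3, 3)     # K filters, each 3x3 over a depth-3 input
biases = np.random.rand(K)

patch = np.random.rand(3, 3, 3)          # one local region of the input

# Each of the K neurons computes its own dot product over the same region.
column = np.array([np.sum(f * patch) + b for f, b in zip(filters, biases)])
print(column.shape)                      # (12,): one value per filter, i.e. output depth K
```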
STRIDE
We must specify the stride with which we slide the filter. When the stride is 1, we move the filters one pixel at a time. When the stride is 2 (or, rarely, 3 or more), the filters jump 2 pixels at a time as we slide them around. This produces spatially smaller output volumes.
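A small sketch of how the stride changes the positions a filter visits, and hence the output width (the numbers here are my own illustration):

```python
W, F = 7, 3   # input width and filter size (illustrative values, not from the slides)

for S in (1, 2):
    positions = list(range(0, W - F + 1, S))   # left edges the filter is placed at
    print(f"stride {S}: positions {positions} -> output width {len(positions)}")

# stride 1: positions [0, 1, 2, 3, 4] -> output width 5
# stride 2: positions [0, 2, 4]       -> output width 3
```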
CONSTRAINTS ON STRIDES
When the input has size W = 10, no zero-padding is used (P = 0), and the filter size is F = 3, it is impossible to use stride S = 2, because
(W − F + 2P)/S + 1 = (10 − 3 + 0)/2 + 1 = 4.5,
which is not an integer.
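As a quick check of this arithmetic in plain Python (values taken from the slide):

```python
W, F, P, S = 10, 3, 0, 2            # values from the slide
size = (W - F + 2 * P) / S + 1
print(size)                         # 4.5
print(size.is_integer())            # False: stride 2 does not tile this input cleanly
```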
ZERO-PADDING
Sometimes it will be convenient to pad the
input volume with zeros around the border.
The nice feature of zero padding is that it will
allow us to control the spatial size of the
output volumes.
We will use it to exactly preserve the spatial
size of the input volume so the input and
output width and height are the same.
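A minimal sketch of that, assuming a 3x3 filter with stride 1 (NumPy, shapes chosen for illustration):

```python
import numpy as np

x = np.random.rand(32, 32)               # one depth slice of the input
x_pad = np.pad(x, pad_width=1)           # zero border of width P = 1
print(x_pad.shape)                       # (34, 34)
print((x_pad.shape[0] - 3) // 1 + 1)     # 32: a 3x3, stride-1 filter now preserves the size
```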
THE SPATIAL SIZE OF THE OUTPUT
W: the input volume size
F: the receptive field size of the Conv layer neurons
S: the stride with which they are applied
P: the amount of zero-padding
The output spatial size is then (W − F + 2P)/S + 1.
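A small helper that applies this formula (my own code, not from the slides):

```python
def conv_output_size(W, F, S, P):
    """Spatial output size of a conv layer: (W - F + 2P)/S + 1."""
    size = (W - F + 2 * P) / S + 1
    if not size.is_integer():
        raise ValueError("hyperparameters do not tile the input evenly")
    return int(size)

print(conv_output_size(W=32, F=3, S=1, P=1))   # 32: zero-padding preserves the input size
print(conv_output_size(W=5,  F=3, S=2, P=1))   # 3:  matches the CONV demo later in the deck
```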
POOLING LAYER / SUBSAMPLING LAYERS
 It is common to periodically insert a pooling layer in between successive Conv layers.
 It progressively reduces the spatial size of the representation, which reduces the amount of parameters and computation in the network and hence also helps control overfitting.
 It makes the features robust against noise and distortion.
 The pooling layer operates independently on every depth slice of the input and resizes it spatially using the pooling operation.
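As a minimal sketch of the most common choice, 2x2 max pooling with stride 2 on a single depth slice (the helper and shapes below are my own illustration, not from the slides):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on one depth slice (H and W assumed even)."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

slice_ = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(slice_))
# [[ 5.  7.]
#  [13. 15.]]
```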
TYPES OF POOLING
NON-LINEAR LAYERS
Neural networks in general and CNNs in
particular rely on a nonlinear trigger function to
signal distinct identification of likely features
on each hidden layer. CNNs may use a variety
of specific functions such as rectified linear
units (ReLUs) and continuous trigger (non-
linear) functions to efficiently implement this
nonlinear triggering.
RELU
In comparison to the other non-linear functions used in CNNs (e.g., hyperbolic tangent, absolute value of hyperbolic tangent, and sigmoid), the advantage of a ReLU is that the network trains many times faster.
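A minimal sketch of the function itself, next to one of the saturating alternatives mentioned above (NumPy used for illustration):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: elementwise max(0, x)."""
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))      # [0.  0.  0.  1.5 3. ]
print(np.tanh(x))   # saturating alternative (hyperbolic tangent)
```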
NORMALIZATION LAYER
Many types of normalization layers have been proposed for use in ConvNet architectures, sometimes with the intention of implementing inhibition schemes observed in the biological brain. However, these layers have since fallen out of favor because, in practice, their contribution has been shown to be minimal, if any.
FULLY-CONNECTED LAYER
They are often used as the final layers of a CNN. These layers mathematically sum a weighting of the previous layer of features, indicating the precise mix of “ingredients” needed to determine a specific target output. In the case of a fully connected layer, all the elements of all the features of the previous layer are used in the calculation of each element of each output feature.
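As a sketch of that all-to-all computation (the shapes are my own illustration, matching the CIFAR-10 example later in the deck):

```python
import numpy as np

# Hypothetical shapes: a 16x16x12 feature volume flattened, then mapped to 10 class scores.
features = np.random.rand(16, 16, 12)
W = np.random.rand(10, 16 * 16 * 12)   # every output element uses every input element
b = np.random.rand(10)

scores = W @ features.ravel() + b
print(scores.shape)   # (10,)
```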
EXAMPLE ARCHITECTURE: OVERVIEW
 A simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]:
 INPUT [32x32x3] will hold the raw pixel values of the image (width, height, color channels).
 CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and the small region they are connected to in the input volume. This may result in a volume such as [32x32x12] if we decide to use 12 filters.
 RELU layer will apply an elementwise activation function, such as max(0, x). This leaves the size of the volume unchanged.
 POOL layer will perform a downsampling operation along the spatial dimensions, resulting in a volume such as [16x16x12].
 FC (i.e., fully-connected) layer will compute the class scores, resulting in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score among the 10 categories of CIFAR-10.
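The slides do not tie this example to a particular framework; as one possible sketch (PyTorch is my own choice here), the same [INPUT - CONV - RELU - POOL - FC] pipeline could look like this:

```python
import torch
import torch.nn as nn

# Sketch of the [INPUT - CONV - RELU - POOL - FC] architecture described above.
model = nn.Sequential(
    nn.Conv2d(3, 12, kernel_size=3, padding=1),  # 32x32x3 -> 32x32x12 (12 filters, zero-padded)
    nn.ReLU(),                                   # elementwise max(0, x), size unchanged
    nn.MaxPool2d(kernel_size=2, stride=2),       # 32x32x12 -> 16x16x12
    nn.Flatten(),
    nn.Linear(12 * 16 * 16, 10),                 # class scores for the 10 CIFAR-10 classes
)

x = torch.randn(1, 3, 32, 32)   # one CIFAR-10-sized RGB image (NCHW layout)
print(model(x).shape)           # torch.Size([1, 10])
```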
CONV DEMO
 Since 3D volumes are hard to visualize, all the volumes (the input volume in blue, the weight volumes in red, the output volume in green) are visualized with each depth slice stacked in rows. The input volume is of size W1=5, H1=5, D1=3, and the CONV layer parameters are K=2, F=3, S=2, P=1. That is, we have two filters of size 3×3×3, and they are applied with a stride of 2. Therefore, the output volume has spatial size (5 − 3 + 2)/2 + 1 = 3. Moreover, notice that a padding of P=1 is applied to the input volume, making the outer border of the input volume zero. The visualization iterates over the output activations (green) and shows that each element is computed by elementwise multiplying the highlighted input (blue) with the filter (red), summing it up, and then offsetting the result by the bias.
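The demo's numbers can be reproduced with a short NumPy sketch (random values stand in for the demo's actual input and filter weights):

```python
import numpy as np

# Parameters from the demo: 5x5x3 input, K=2 filters of size 3x3x3, stride 2, padding 1.
W1, H1, D1, K, F, S, P = 5, 5, 3, 2, 3, 2, 1
print((W1 - F + 2 * P) // S + 1)              # 3: output spatial size

x = np.random.rand(H1, W1, D1)                # stand-in for the demo's input volume
x_pad = np.pad(x, ((P, P), (P, P), (0, 0)))   # zero border of width P, as in the demo
w0 = np.random.rand(F, F, D1)                 # one of the K filters
b0 = 1.0                                      # its bias

# Output activation at spatial position (i, j): elementwise multiply, sum, add bias.
i, j = 0, 0
patch = x_pad[i*S:i*S+F, j*S:j*S+F, :]
print(np.sum(patch * w0) + b0)
```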
GOOGLE-NET - Example
 GoogLeNet, the ILSVRC 2014 winner, was a convolutional network from Szegedy et al. at Google. Its main contribution was the development of an Inception module that dramatically reduced the number of parameters in the network (4M, compared to AlexNet's 60M). Additionally, the paper uses average pooling instead of fully connected layers at the top of the ConvNet, eliminating a large number of parameters that do not seem to matter much. There are also several follow-up versions of GoogLeNet, most recently Inception-v4.
Thank you
