Deep Learning
Supervised by Asst. Prof. Dr. Mohammad Najem
By: Huda Hamdan Ali
Contents
 Introduction and overview
 Deep learning challenges
 Deep N.N
 Unsupervised Preprocessing Networks
 Deep Belief Networks
 Denoising auto encoder
 Stacked Auto Encoders
 Deep Boltzmann Machines
 CNN – Convolutional Neural Networks
 Recurrent N.N
 Long Short-Term Memory RNN (LSTM)
 Generative Adversarial Networks
 Deep Reinforcement Learning
 Applications.
Introduction
 Deep learning (also known as deep structured
learning or hierarchical learning) is part of a broader family
of machine learning methods based on learning data
representations, as opposed to task-specific algorithms.
Learning can be supervised, semi-
supervised or unsupervised.
 Deep learning uses a cascade of multiple layers of nonlinear
processing units for feature extraction and transformation.
Each successive layer uses the output from the previous layer
as input.
 Deep learning architectures such as deep neural
networks, deep belief networks and recurrent neural
networks have been applied to fields including computer
vision, speech recognition and natural language processing, among others.
Introduction cont..
 Deep learning algorithms can be applied to unsupervised learning
tasks.
 This is an important benefit because unlabeled data are more
abundant than labeled data.
Inspired by the Brain
 The first layers of neurons that receive information in the
visual cortex are sensitive to specific edges, while brain regions
further down the visual pipeline are sensitive to more complex
structures such as faces.
 Our brain has lots of neurons connected together and the
strength of the connections between neurons represents long
term knowledge.
Deep Learning Training Overview
 Train networks with many layers (multiple layers work together to build an
improved feature space)
 First layer learns 1st-order features (e.g. edges…)
 2nd layer learns higher-order features (combinations of first-layer
features, combinations of edges, etc.)
 Some models learn in an unsupervised mode and discover general
features of the input space – serving multiple tasks related to the
unsupervised instances (image recognition, etc.)
 Final layer of transformed features is fed into supervised layer(s)
 The entire network is often subsequently fine-tuned using supervised training of
the whole net, starting from the initial weightings learned in the unsupervised phase
Deep Learning Architecture
A deep neural network consists of a hierarchy of layers, whereby each
layer transforms the input data into more abstract representations (e.g.
edge -> nose -> face). The output layer combines those features to make
predictions.
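As an illustrative aside (not from the original slides), the sketch below expresses such a hierarchy as a small PyTorch network; the framework choice and the layer sizes are assumptions.

```python
import torch
import torch.nn as nn

# A minimal sketch of a deep feedforward network, assuming PyTorch.
# The layer sizes (784 -> 256 -> 64 -> 10) are arbitrary illustrative choices.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # low-level features (e.g. "edges")
    nn.Linear(256, 64), nn.ReLU(),    # higher-level combinations (e.g. "parts")
    nn.Linear(64, 10),                # output layer combines features into predictions
)

x = torch.randn(32, 784)              # a batch of 32 flattened inputs
logits = model(x)                     # shape: (32, 10)
print(logits.shape)
```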
What did it learn?
No more feature engineering
Problems with Back Propagation
 The gradient becomes progressively more dilute as it propagates backward (illustrated in the sketch below)
 Below the top few layers, the correction signal is minimal
 Training gets stuck in local minima
 Especially since the weights start out far from ‘good’ regions
(i.e., random initialization)
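The sketch below is an added illustration (assuming PyTorch; depths, widths and the dummy loss are arbitrary) of how the gradient reaching the first layer shrinks as more sigmoid layers are stacked:

```python
import torch
import torch.nn as nn

# Measure how the gradient at the first layer shrinks with depth
# when sigmoid activations are used. All sizes are illustrative.
for depth in (2, 10, 30):
    layers = []
    for _ in range(depth):
        layers += [nn.Linear(64, 64), nn.Sigmoid()]
    net = nn.Sequential(*layers)

    x = torch.randn(16, 64)
    loss = net(x).pow(2).mean()       # dummy loss, just to backpropagate something
    loss.backward()

    first_layer_grad = net[0].weight.grad.norm().item()
    print(f"depth={depth:2d}  ||grad at first layer|| = {first_layer_grad:.2e}")
```

With sigmoid activations and default initialization, the printed norms drop sharply with depth, which is the dilution described above.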
DNN challenges
 As with ANNs, many issues can arise with naively trained DNNs. Two
common issues are overfitting and computation time.
 DNNs are prone to overfitting because of the added layers of
abstraction, which allow them to model rare dependencies in the
training data.
 Regularization methods such as Ivakhnenko's unit pruning,
weight decay (L2 regularization) or sparsity (L1 regularization) can be
applied during training to combat overfitting. Alternatively, dropout
regularization randomly omits units from the hidden layers during
training (a sketch follows this list).
 This helps to exclude rare dependencies.
 Finally, data can be augmented via methods such as cropping
and rotating such that smaller training sets can be increased in size
to reduce the chances of overfitting.
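As a hedged illustration of two of these regularizers, the sketch below (assuming PyTorch; sizes and rates are arbitrary) applies dropout to a hidden layer and weight decay through the optimizer:

```python
import torch
import torch.nn as nn

# Dropout (randomly omitting hidden units during training) and weight decay
# (passed to the optimizer). Sizes and rates are illustrative assumptions.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Dropout(p=0.5),                # randomly zero 50% of hidden units while training
    nn.Linear(256, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

model.train()                         # dropout active
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = nn.CrossEntropyLoss()(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

model.eval()                          # dropout disabled at test time
```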
Challenge cont..
 Training a DNN involves choosing many parameters, such as
the size (number of layers and number of units per layer),
the learning rate and the initial weights.
 Sweeping through the parameter space for optimal
parameters may not be feasible due to the cost in time
and computational resources.
 Various tricks such as batching (computing the gradient
on several training examples at once rather than on
individual examples) speed up computation; a sketch follows this list.
 The large processing throughput of GPUs has produced
significant speedups in training, because the matrix and
vector computations required are well-suited for GPUs.
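A minimal sketch of batching (an added illustration, assuming PyTorch; sizes are arbitrary): one forward/backward pass computes the gradient over a whole mini-batch, optionally on a GPU if one is available.

```python
import torch
import torch.nn as nn

# One gradient step over a mini-batch instead of a loop over single examples.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(100, 10).to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(256, 100, device=device)          # a mini-batch of 256 examples
targets = torch.randint(0, 10, (256,), device=device)

loss = loss_fn(model(inputs), targets)                  # loss averaged over the batch
optimizer.zero_grad()
loss.backward()                                         # one gradient for all 256 examples
optimizer.step()
```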
Challenge Cont..
 Alternatively, we may look for another type of
neural network with a straightforward, convergent
training algorithm.
 CMAC (cerebellar model articulation controller) is one such
network. For example, there is no need to
adjust learning rates or randomize initial weights for
CMAC. The training process can be guaranteed to
converge in one step with a new batch of data, and the
computational complexity of the training algorithm is
linear with respect to the number of neurons involved.
Greedy Layer-Wise Training
1. Train the first layer using your data without the labels (unsupervised)
 Since there are no targets at this level, labels don't help.
2. Freeze the first layer's parameters and train the second
layer, using the output of the first layer as the unsupervised input to the
second layer
3. Repeat this for as many layers as desired
 This builds a set of robust features
4. Use the outputs of the final layer as inputs to a supervised
layer/model and train the last supervised layer(s) (leave the early
weights frozen)
5. Unfreeze all weights and fine-tune the full network by training with
a supervised approach, starting from the pre-training weight settings (a sketch follows below)
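A hedged sketch of these steps (an added illustration, assuming PyTorch and using small autoencoders for the unsupervised stage; RBMs are another common choice, and all sizes and epoch counts are arbitrary):

```python
import torch
import torch.nn as nn

# Greedy layer-wise pre-training sketch. Each layer is pre-trained to produce
# codes that a temporary decoder can reconstruct the layer's input from.
def pretrain_layer(layer, data, epochs=5):
    decoder = nn.Linear(layer.out_features, layer.in_features)
    opt = torch.optim.Adam(list(layer.parameters()) + list(decoder.parameters()), lr=1e-3)
    for _ in range(epochs):
        recon = decoder(torch.relu(layer(data)))
        loss = nn.MSELoss()(recon, data)
        opt.zero_grad(); loss.backward(); opt.step()

data = torch.randn(512, 784)                       # unlabeled data
layer1, layer2 = nn.Linear(784, 256), nn.Linear(256, 64)

pretrain_layer(layer1, data)                       # step 1: first layer, unsupervised
with torch.no_grad():                              # freeze layer 1 and compute its outputs
    h1 = torch.relu(layer1(data))
pretrain_layer(layer2, h1)                         # step 2: second layer on layer-1 codes

# Remaining steps, sketched together: add a supervised output layer and
# fine-tune the whole network starting from the pre-trained weights.
labels = torch.randint(0, 10, (512,))
net = nn.Sequential(layer1, nn.ReLU(), layer2, nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
for _ in range(5):
    loss = nn.CrossEntropyLoss()(net(data), labels)
    opt.zero_grad(); loss.backward(); opt.step()
```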
David Corne and Nick Taylor, Heriot-Watt University – dwcorne@gmail.com
These slides and related resources: http://www.macs.hw.ac.uk/~dwcorne/Teaching/dmml.html
Deep Belief Networks(DBNs)
 Unsupervised pre-training provides a good initialization
of the network
 Probabilistic generative model
 Deep architecture – multiple layers
 Supervised fine-tuning
 Generative: Up-down algorithm
 Discriminative: backpropagation
DBN Greedy training
 First step:
 Construct an RBM with an input layer v and a hidden
layer h
 Train the RBM
 A restricted Boltzmann machine (RBM) is
a generative stochastic artificial neural network that
can learn a probability distribution over its set of inputs (a training sketch follows).
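As an added illustration, the sketch below trains a small binary RBM with one step of contrastive divergence (CD-1); the sizes, learning rate and the random "data" are assumptions.

```python
import torch

# Binary RBM trained with one step of contrastive divergence (CD-1).
n_visible, n_hidden, lr = 784, 128, 0.01
W = torch.randn(n_visible, n_hidden) * 0.01
b_v = torch.zeros(n_visible)                     # visible bias
b_h = torch.zeros(n_hidden)                      # hidden bias

def sample(p):                                   # Bernoulli sample from probabilities
    return (torch.rand_like(p) < p).float()

v0 = (torch.rand(64, n_visible) < 0.5).float()   # a batch of binary "data"

# Positive phase: infer hidden units from the data
ph0 = torch.sigmoid(v0 @ W + b_h)
h0 = sample(ph0)

# Negative phase: reconstruct the visibles, then the hidden probabilities again
pv1 = torch.sigmoid(h0 @ W.t() + b_v)
v1 = sample(pv1)
ph1 = torch.sigmoid(v1 @ W + b_h)

# CD-1 update: difference of data-driven and reconstruction-driven statistics
W += lr * (v0.t() @ ph0 - v1.t() @ ph1) / v0.shape[0]
b_v += lr * (v0 - v1).mean(dim=0)
b_h += lr * (ph0 - ph1).mean(dim=0)
```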
Auto-Encoders
 A type of unsupervised learning,
 An autoencoder is typically a feedforward neural network which aims to
learn a compressed, distributed representation (encoding) of a dataset.
 Conceptually, the network is trained to “recreate” the input, i.e., the
input and the target data are the same. In other words, you’re trying to
output the same thing you were given as input, but compressed in some way.
 In effect, we want a few small nodes in the middle to really learn the
data at a conceptual level, producing a compact representation that in
some way captures the core features of our input.
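A minimal autoencoder sketch (an added illustration, assuming PyTorch; layer sizes and training details are arbitrary), trained to reproduce its input through a narrow middle layer:

```python
import torch
import torch.nn as nn

# The input is also the target; the 32-unit code is the compact representation.
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())   # compress to 32 numbers
decoder = nn.Linear(32, 784)                             # expand back to the input size
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.rand(128, 784)              # unlabeled examples
for _ in range(100):
    recon = decoder(encoder(x))
    loss = nn.MSELoss()(recon, x)     # "recreate the input"
    opt.zero_grad(); loss.backward(); opt.step()

codes = encoder(x)                    # compact representation of the data
```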
Denoising Auto-Encoder
Stacked (Denoising) Auto-Encoders
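These two slides name the denoising and stacked variants; as a hedged sketch (assuming PyTorch, with an arbitrary noise level), a denoising autoencoder corrupts its input but is still asked to reconstruct the clean original, and a stacked version would repeat the process on the learned codes:

```python
import torch
import torch.nn as nn

# Denoising autoencoder sketch: corrupt the input, reconstruct the clean version.
# A stacked variant would train the next autoencoder on the codes from this one.
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
decoder = nn.Linear(64, 784)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x_clean = torch.rand(128, 784)
for _ in range(100):
    x_noisy = x_clean + 0.3 * torch.randn_like(x_clean)   # corrupt the input
    recon = decoder(encoder(x_noisy))
    loss = nn.MSELoss()(recon, x_clean)                    # target is the *clean* input
    opt.zero_grad(); loss.backward(); opt.step()
```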
Deep Boltzmann Machines
DBMs vs. DBNs
Convolutional Neural Nets (CNN)
Convolutional Neural Nets (CNN)
Convolution layers act as feature detectors that automatically learn to filter out
unneeded information from the input by using convolution kernels.
Pooling layers compute the max or average value of a particular feature over
a region of the input data (downsizing the input images). Pooling also helps to detect
objects in unusual places and reduces memory size.
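A minimal CNN sketch (an added illustration, assuming PyTorch and 28x28 grayscale inputs; all sizes are arbitrary) combining the two layer types described above:

```python
import torch
import torch.nn as nn

# Convolution layers learn filters; max pooling downsizes each feature map.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # 16 learned 3x3 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                             # halve height and width
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # classifier over the pooled features
)

images = torch.randn(8, 1, 28, 28)               # e.g. a batch of 28x28 grayscale images
print(cnn(images).shape)                         # torch.Size([8, 10])
```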
CNN
 High accuracy for image applications – Breaking all records and
doing it using just raw pixel features.
 Special purpose net – Just for images or problems with strong grid-like
local spatial/temporal correlation
 Once trained on one problem (e.g. vision) could use same net (often
tuned) for a new similar problem – general creator of vision features
 Unlike traditional nets, handles variable-sized inputs (see the sketch after this list)
 Same filters and weights, just convolve across different sized image and
dynamically scale size of pooling regions (not # of nodes), to normalize
 Different sized images, different length speech segments, etc.
 Lots of hand crafting and CV tuning to find the right recipe of
receptive fields, layer interconnections, etc.
 Lots more Hyperparameters than standard nets, and even than other
deep networks, since the structures of CNNs are more handcrafted
 CNNs getting wider and deeper with speed-up techniques (e.g. GPU,
ReLU, etc.) and lots of current research, excitement, and success
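As a hedged illustration of the variable-sized-input point above (assuming PyTorch; an adaptive pooling layer stands in for dynamically scaling the pooling regions):

```python
import torch
import torch.nn as nn

# The same convolution filters slide over any image size; adaptive pooling
# resizes its pooling regions so the classifier always sees a fixed-size vector.
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d((4, 4)),                # pooling regions scale with the input
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 10),
)

for size in (32, 64, 97):                        # different image sizes, same network
    out = net(torch.randn(1, 3, size, size))
    print(size, out.shape)                       # always torch.Size([1, 10])
```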
Recurrent Neural Nets (RNN)
Long Short-Term Memory RNN (LSTM)
Deep Reinforcement Learning
Generative Adversarial Networks
Deep learning Applications
Deep Learning in Computer Vision
Image Segmentation
Deep Learning in Computer Vision
Image Captioning
Deep Learning in Computer Vision
Image Compression
Deep Learning in Computer Vision
Image Localization
Deep Learning in Computer Vision
Image Transformation – Adding features
Deep Learning in Computer Vision
Image Colorization
Image Generation – From Descriptions
Style Transfer – morph images into paintings
Deep Learning in Audio Processing
Sound Generation
Deep Learning in NLP
Syntax Parsing
Deep Learning in NLP
Generating Text
Deep Learning in Medicine
Skin Cancer Diagnoses
Deep Learning in Medicine
Detection of diabetic eye disease
Deep Learning in Science
Saving Energy
Deep Learning in Cryptography
Learning to encrypt and decrypt communication
Autonomous drone navigation with deep learning
Finally ..
 That’s the basic idea.
 There are many types of deep learning:
 different kinds of autoencoders, variations on architectures and
training algorithms, etc.
 A very fast-growing area…
Thanks for your attention
2017 // Huda

Editor's Notes

  • #3: Pre-training
  • #16: If we do fully supervised training from the start, we may not get the benefits of building up the incrementally abstracted feature space. Steps 1–4 are called pre-training, as they get the weights close enough that standard training in step 5 can be effective. Do fine-tuning for sure if there is lots of labeled data; with little labeled data it is not as helpful.
  • #23: Though deep nets were done first, we start with autoencoders because they are simpler. Mention Zipser's autoencoder with reverse engineering, then Cottrell's compression work, which could not be reverse-engineered. If h is smaller than x, this is "undercomplete" autoencoding; otherwise a "regularized" autoencoder would be used. One can use just the new features in the new training set, or concatenate both the original and new features.
  • #32: Dynamic size – the pooling region just sums/maxes over an area with one final weight, so there are no hard changes when we adjust the pooling-region size. Simard 2003: a simple, consistent CNN structure, 5x5 convolutions with 2x2 subsampling, with 5 features in the first C-layer, 50 in the next, until the maps become too small. They don't actually use a pooling layer; instead they connect every other node, which samples rather than max/average. Each layer reduces the feature size by (n-3)/2. Just two layers for MNIST. They also use elastic distortions, a type of jitter, to get more data. 99.6% – best at the time; distortions also help a lot with a standard MLP, so this is an approach with less hyperparameter fiddling. Ciresan and Schmidhuber 2012, multi-column DNN: CNNs of depth 6-10 (deeper if the initial input image is bigger), wider receptive fields, 1-2 hidden layers in the MLP; the columns are CNNs (an ensemble with different parameters, features, etc.) whose outputs are averaged; jittered inputs, multi-day GPU training, annealed learning rate (.001 dropping to .00003), 99.76% on MNIST.