Introduction to Deep Learning
Nandita Bhaskhar
Content adapted from CS231n and past CS229 teams
April 29th, 2022
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools
Classical Approaches Saturate!
https://xkcd.com/1425/
● Computer vision is especially hard for conventional image processing techniques
● Humans are just intrinsically better at perceiving the world!
What about the MLPs we learnt in class?
Recall:
● Input Layer
● Hidden layer
● Activations
● Outputs
Pic Credit: Becoming Human: Artificial Intelligence Magazine
What about the MLPs we learnt in class?
Expensive to learn. Will not generalize well.
Does not exploit the order and local relations in the data!
64x64x3 = 12288 inputs, so each neuron in the first hidden layer alone needs 12288 weights. We also want many layers.
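To make the cost concrete, here is a rough count of the weights in the first fully connected layer for a 64x64x3 input. The hidden width of 1000 is a made-up value for illustration, not a number from the slides.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

n_inputs = 64 * 64 * 3   # flattened RGB image
hidden = 1000            # hypothetical hidden width, chosen for illustration
first_layer = dense_params(n_inputs, hidden)

print("inputs:", n_inputs)
print("parameters in the first layer:", first_layer)
```

Every extra hidden layer adds its own weight matrix on top of this, which is why a plain MLP on raw pixels becomes expensive quickly.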
Overview
● Motivation for deep learning
● Areas in Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools
What are different pillars of deep learning?
● Convolutional NN: Image
● Recurrent NN: Time Series
● Deep RL: Control System
● Graph NN: Networks/Relational
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools
Convolutional Neural Networks
Convolutional Neural Network | Recurrent NN | Deep RL | Graph NN
Let us look at images in detail
2D Convolution
Pic Credit: Apple, Chip
Convolving Filters
Sharpening
https://ai.stanford.edu/~syyeung/cvweb/tutorials.html
Edge Detection: Laplacian Filters
Kernel using the 4 direct neighbours:
 0 -1  0
-1  4 -1
 0 -1  0
Kernel using all 8 neighbours:
-1 -1 -1
-1  8 -1
-1 -1 -1
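As a small sketch of what convolving with a Laplacian filter does, the code below slides the 4-neighbour kernel over a tiny made-up image with one vertical edge. The helper name `conv2d_valid` and the test image are assumptions for illustration; plain Python lists are used only to keep the example self-contained.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

LAPLACIAN_4 = [[0, -1, 0],
               [-1, 4, -1],
               [0, -1, 0]]

# 5x5 image: dark on the left (0), bright on the right (1), so one vertical edge.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
edges = conv2d_valid(image, LAPLACIAN_4)

# A constant image has no edges, so the response is zero everywhere.
flat = conv2d_valid([[7] * 5 for _ in range(5)], LAPLACIAN_4)

for row in edges:
    print(row)
```

The response is zero in the flat region and non-zero only next to the edge. Because this kernel is symmetric, sliding it without flipping gives the same result as a true convolution.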
Convolving Filters
● Why not extract features using filters?
● Better, why not let the data dictate what filters to use?
● Learnable filters!!
Convolution on multiple channels
● Images are generally RGB!
● How would a filter work on an image with RGB channels?
● The filter should also have 3 channels.
● Now the output has a channel for every filter we have used.
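The channel bookkeeping above can be sketched as follows. The function name `conv2d_multichannel` and the two toy filters are hypothetical; the point is only that each filter spans all input channels and contributes exactly one output channel.

```python
def conv2d_multichannel(image, filters):
    """image: H x W x C nested lists; filters: list of kh x kw x C kernels.
    Returns an out_h x out_w x K feature map, one channel per filter."""
    kh, kw = len(filters[0]), len(filters[0][0])
    channels = len(image[0][0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            pixel = []
            for f in filters:
                pixel.append(sum(image[i + a][j + b][c] * f[a][b][c]
                                 for a in range(kh)
                                 for b in range(kw)
                                 for c in range(channels)))
            row.append(pixel)
        out.append(row)
    return out

# 4x4 RGB image where every pixel is (R, G, B) = (1, 2, 3).
image = [[[1, 2, 3] for _ in range(4)] for _ in range(4)]

# Two hypothetical 2x2x3 filters: one sums every channel, one looks only at red.
sum_all = [[[1, 1, 1] for _ in range(2)] for _ in range(2)]
red_only = [[[1, 0, 0] for _ in range(2)] for _ in range(2)]

features = conv2d_multichannel(image, [sum_all, red_only])
print(len(features), len(features[0]), len(features[0][0]))  # height, width, channels
```

With two filters the output has two channels, regardless of how many channels the input had.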
Slide Credit: CS231n
Parameter Sharing
The fewer the parameters, the less computationally intensive the training. This is a win-win, as we are reusing the same filter parameters at every position.
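A rough comparison of parameter counts shows how much sharing saves. The layer sizes here (sixteen 3x3 filters, output kept at 64x64 by padding) are assumptions picked for illustration.

```python
def conv_params(kh, kw, c_in, n_filters):
    """Each filter has kh*kw*c_in weights plus one bias, shared across positions."""
    return (kh * kw * c_in + 1) * n_filters

def dense_params(n_in, n_out):
    """A fully connected layer has a separate weight for every input-output pair."""
    return (n_in + 1) * n_out

# Hypothetical layer: 64x64x3 input mapped to a 64x64x16 output.
conv = conv_params(3, 3, 3, 16)
dense = dense_params(64 * 64 * 3, 64 * 64 * 16)

print("conv layer parameters: ", conv)
print("dense layer parameters:", dense)
```

The convolutional layer's count does not depend on the image size at all, only on the filter size and the number of channels.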
Translational invariance
Since we are training filters to detect cats and then moving these filters over the data, a differently positioned cat will also get detected by the same set of filters.
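A one-dimensional toy version of this idea is sketched below. The "cat" pattern and the filter are made up. Strictly speaking, the filter response moves along with the pattern, and it is taking the maximum over positions (as pooling does) that makes the final score independent of position.

```python
def correlate1d(signal, kernel):
    """Slide the kernel along the signal (no padding) and sum the products."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

kernel = [1, 2, 1]                     # hypothetical "cat detector"
signal_a = [0, 1, 2, 1, 0, 0, 0, 0]    # pattern near the start
signal_b = [0, 0, 0, 0, 1, 2, 1, 0]    # same pattern, shifted right by 3

resp_a = correlate1d(signal_a, kernel)
resp_b = correlate1d(signal_b, kernel)

print(resp_a, "peak at", resp_a.index(max(resp_a)))
print(resp_b, "peak at", resp_b.index(max(resp_b)))
```

The same filter fires with the same strength in both cases; only the location of the peak changes.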
Filters? Layers of filters?
Images that maximize filter outputs at certain layers. We observe that the images get more complex as filters are situated deeper.
Deeper layers can learn deeper embeddings: an eye is made up of multiple curves, and a face is made up of two eyes.
How do we use convolutions?
Let convolutions extract features!
Image credit: LeCun et al.
Fun Fact: Convolution really is just a linear operation
● In fact, convolution is a giant matrix multiplication.
● We can expand the 2-dimensional image into a vector and the conv operation into a matrix.
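The matrix view can be checked on a tiny example. The sketch below builds the matrix for a 2x2 kernel on a 4x4 image and compares it with the sliding-window result; the helper names and the test values are assumptions for illustration.

```python
def conv2d_valid(image, kernel):
    """Sliding-window version (no padding), used as the reference."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]

def conv_matrix(kernel, height, width):
    """One row per output pixel; kernel weights sit at the input pixels it touches."""
    kh, kw = len(kernel), len(kernel[0])
    matrix = []
    for i in range(height - kh + 1):
        for j in range(width - kw + 1):
            row = [0] * (height * width)
            for a in range(kh):
                for b in range(kw):
                    row[(i + a) * width + (j + b)] = kernel[a][b]
            matrix.append(row)
    return matrix

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0], [0, -1]]

flat_image = [p for row in image for p in row]         # 16-vector
matrix = conv_matrix(kernel, 4, 4)                     # 9 x 16, mostly zeros
via_matrix = matvec(matrix, flat_image)
direct = [v for row in conv2d_valid(image, kernel) for v in row]

print(via_matrix == direct)
```

The matrix is sparse and repeats the same few kernel weights in every row, which is the parameter sharing seen from the linear-algebra side.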
How do we learn?
We now have a network with:
● a bunch of weights
● a loss function
To learn:
● Just do gradient descent and backpropagate the error derivatives
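A minimal sketch of that loop on a one-weight model, assuming a mean squared error loss and toy data generated by y = 2x. In a real network the gradient comes from backpropagation; here it is written out by hand because the model is so small.

```python
def loss_and_grad(w, xs, ys):
    """Mean squared error of the model y = w * x, and its derivative w.r.t. w."""
    n = len(xs)
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    return loss, grad

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # toy data generated by y = 2x

w = 0.0                     # initial weight
learning_rate = 0.05        # illustrative value
for _ in range(200):
    loss, grad = loss_and_grad(w, xs, ys)
    w -= learning_rate * grad   # step against the gradient

print(w, loss)
```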
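How do we learn?
Instead of plain gradient descent, there are “optimizers”:
● Momentum: gradient + momentum (a running accumulation of past gradients)
● Nesterov: momentum + gradient evaluated at the look-ahead point
● Adagrad: normalize with the sum of squared gradients
● RMSprop: normalize with a moving average of squared gradients
● Adam: RMSprop + momentum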
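Two of these update rules, written out for a single weight on the toy objective f(w) = (w - 3)^2. The objective, step counts and hyperparameter values are assumptions for illustration; the update formulas follow the usual textbook form of momentum and Adam.

```python
import math

def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sgd_momentum(w, steps=300, lr=0.1, beta=0.9):
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity + grad(w)   # accumulate past gradients
        w -= lr * velocity
    return w

def adam(w, steps=2000, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    m = 0.0   # moving average of gradients (the momentum part)
    v = 0.0   # moving average of squared gradients (the RMSprop part)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_momentum = sgd_momentum(0.0)
w_adam = adam(0.0)
print(w_momentum, w_adam)
```

Both runs should end close to the minimum at w = 3; how fast they get there depends heavily on the chosen hyperparameters.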
Mini-batch Gradient Descent
Expensive to compute the gradient for a large dataset:
● Memory size
● Compute time
Mini-batch: takes a sample of the training data
How do we sample intelligently?
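One common way to sample, sketched under the assumption that the dataset fits in a Python list: shuffle once per epoch, then walk through the shuffled data in chunks. The function name `minibatches` is hypothetical.

```python
import random

def minibatches(data, batch_size, rng):
    """Yield shuffled mini-batches that together cover the data exactly once."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]

rng = random.Random(0)      # fixed seed so the run is repeatable
data = list(range(10))      # stand-in for ten training examples
batches = list(minibatches(data, 4, rng))

for batch in batches:
    print(batch)
```

Each example is seen once per epoch, and the gradient is estimated from only `batch_size` examples at a time. The last batch may be smaller than the others.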
Is deeper better?
Deeper networks seem to be more powerful but harder to train.
● Loss of information during forward propagation
● Loss of gradient info during back propagation
There are many ways to “keep the gradient going”
Solution
Connect the layers, create a gradient highway or information highway.
ResNet (2015)
Image credit: He et al.
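A deliberately simplified picture of the gradient highway, using scalar linear "layers" rather than a real ResNet block. With a plain layer h -> w*h the gradient through 20 layers is a product of 20 small numbers; with a skip connection h -> h + w*h each factor becomes 1 + w, so the signal does not die out. The weight value 0.1 and the depth are made up.

```python
def plain_gradient(weights):
    """d(output)/d(input) for a stack of layers h -> w * h."""
    grad = 1.0
    for w in weights:
        grad *= w            # derivative of w * h with respect to h
    return grad

def residual_gradient(weights):
    """d(output)/d(input) for a stack of layers h -> h + w * h."""
    grad = 1.0
    for w in weights:
        grad *= 1.0 + w      # the skip connection contributes the constant 1
    return grad

weights = [0.1] * 20         # 20 layers with small weights (illustrative)
plain = plain_gradient(weights)
residual = residual_gradient(weights)
print(plain, residual)
```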
Initialization
● Can we initialize all weights to zero?
● If all the weights are the same, we will not be able to break the symmetry of the network and all filters will end up learning the same thing.
● Large numbers might knock ReLU units out.
● Once ReLU units are knocked out and their output is zero, their gradient flow also becomes zero.
● We need small random numbers at initialization.
● Standard deviation: 1/sqrt(n), i.e. variance 1/n, where n is the number of inputs to the unit
● Mean: 0
Popular initialization setups
(Xavier, He) (Uniform, Normal)
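A sketch of the normal variants, assuming the simplified rule variance = gain / n_in: gain 1 gives the 1/n scaling above, gain 2 gives He initialization for ReLU layers. Glorot and Bengio's original Xavier rule uses 2 / (n_in + n_out) instead; the function names and layer sizes here are hypothetical.

```python
import math
import random

def scaled_normal(n_in, n_out, gain, rng):
    """Zero-mean Gaussian weights with variance gain / n_in."""
    std = math.sqrt(gain / n_in)
    return [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]

def empirical_variance(weights):
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)
    return sum((w - mean) ** 2 for w in flat) / len(flat)

rng = random.Random(0)
xavier_like = scaled_normal(100, 200, 1.0, rng)   # target variance 1/100
he = scaled_normal(100, 200, 2.0, rng)            # target variance 2/100

var_xavier = empirical_variance(xavier_like)
var_he = empirical_variance(he)
print(var_xavier, var_he)
```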
Dropout
● What does cutting off some network connections do?
● Trains multiple smaller networks in an ensemble.
● Can drop entire layer too!
● Acts like a really good regularizer
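A minimal sketch of dropout on a list of activations, using the common "inverted" form in which kept units are scaled up during training so that nothing needs to change at test time. The function signature is an assumption, not a library API.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Zero each activation with probability drop_prob; rescale the survivors."""
    if not training or drop_prob == 0.0:
        return list(activations)          # no-op at test time
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
activations = [1.0] * 10000
dropped = dropout(activations, 0.5, rng)
at_test_time = dropout(activations, 0.5, rng, training=False)

mean_after = sum(dropped) / len(dropped)
print(mean_after)
```

Because of the rescaling, the expected activation stays the same with and without dropout, while each training step sees a different thinned network.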
Tricks for training
● Data augmentation if your data set is small. This helps the network generalize better.
● Early stopping if validation loss starts rising while training loss keeps falling.
● Random hyperparameter search or grid search?
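Early stopping is often implemented with a "patience" counter on the validation loss. The sketch below is one such rule under that assumption; the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch whose weights to keep.

    Training stops once the validation loss has not improved
    for `patience` consecutive epochs.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break                      # give up: no improvement for too long
    return best_epoch

# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.50]

stop_early = early_stopping_epoch(val_losses, patience=2)
stop_late = early_stopping_epoch(val_losses, patience=10)
print(stop_early, stop_late)
```

A small patience stops at the first dip and never sees the later improvement, so patience is itself a hyperparameter to tune.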
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools
CNN sounds like fun!
What are some deep learning pillars?
● Recurrent NN: Time Series
● Convolutional NN
● Deep RL
● Graph NN
We can also have 1D architectures (remember this)
● CNN works on any data where there is a local pattern
● We use 1D convolutions on DNA sequences, text sequences and music notes
● But what if the time series has causal dependency or any kind of sequential dependency?
To address sequential dependency?
Use recurrent neural network (RNN)
Figure: one time step of an RNN cell (labels: Previous output, Latent Output, One time step, RNN Cell)
They are really the same cell, NOT many different cells like kernels of CNN
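The "same cell at every step" point can be sketched with a scalar hidden state. The weights below are arbitrary illustrative values and the tanh update is the usual vanilla RNN form, assumed here rather than taken from the slides.

```python
import math

def rnn_step(h_prev, x, w_h, w_x, b):
    """One time step: mix the previous state with the current input."""
    return math.tanh(w_h * h_prev + w_x * x + b)

def run_rnn(inputs, w_h=0.5, w_x=1.0, b=0.0, h0=0.0):
    """Unroll the cell over a sequence; the SAME weights are used at every step."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(h, x, w_h, w_x, b)
        states.append(h)
    return states

states = run_rnn([1.0, 0.0, -1.0])
print(states)

# The final state depends on the order of the inputs, unlike a plain sum.
forward = run_rnn([1.0, 0.0])[-1]
backward = run_rnn([0.0, 1.0])[-1]
print(forward, backward)
```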
Unrolling an RNN
How does RNN produce result?
Figure: the RNN unrolled over the input “I love CS !” (labels: Evolving “embedding”, Result after reading full sentence)
There are 2 popular types of RNN cells
● Long Short Term Memory (LSTM)
  Figure labels: Store in “long term memory”, Response to current input
● Gated Recurrent Unit (GRU)
  Figure labels: Update gate, Reset gate, Response to current input
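A scalar sketch of one GRU step showing what the update and reset gates do. The parameter names in the dictionary are made up, and the sign convention for the update gate (z close to 1 means "take the new candidate") is one of two in common use; some libraries define it the other way round.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(h_prev, x, p):
    """One GRU step with a scalar state; p holds hypothetical scalar weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])    # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])    # reset gate
    # Response to the current input; the reset gate decides how much old state it sees.
    candidate = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Update gate near 0 keeps the old state, near 1 replaces it with the candidate.
    return (1.0 - z) * h_prev + z * candidate

base = {"wz": 1.0, "uz": 1.0, "bz": 0.0,
        "wr": 1.0, "ur": 1.0, "br": 0.0,
        "wh": 1.0, "uh": 1.0, "bh": 0.0}

keep_old = gru_step(0.7, 5.0, dict(base, bz=-100.0))   # update gate forced shut
take_new = gru_step(0.7, 5.0, dict(base, bz=100.0))    # update gate forced open
print(keep_old, take_new)
```

With the update gate shut the state passes through unchanged, which is how the cell can carry information across many time steps.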
Recurrent AND deep?
Taking last value
Pay “attention” to everything
Stacking Attention Model
“Recurrent” AND convolutional?
Temporal convolutional network
● Temporal dependency achieved through “one-sided” (causal) convolution
● More efficient because deep learning packages are optimized for matrix multiplication, which is how convolution is computed
● No hard sequential dependency between time steps, so they can be computed in parallel
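A small sketch of a one-sided convolution: the output at time t only looks at the current and earlier inputs, with zeros assumed before the start of the sequence. The kernel and signal values are made up.

```python
def causal_conv1d(signal, kernel):
    """output[t] = sum over k of kernel[k] * signal[t - k], zero before t = 0."""
    out = []
    for t in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            if t - k >= 0:                 # never look into the future
                total += weight * signal[t - k]
        out.append(total)
    return out

signal = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, 0.5]                        # average of current and previous value
out = causal_conv1d(signal, kernel)

# Changing a future input must not change earlier outputs.
changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)
print(out)
print(changed)
```

Unlike the RNN loop, every output position here can be computed independently of the others.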
More? Take CS230, CS236, CS231N, CS224N
● Convolutional NN: Image
● Recurrent NN: Time Series
● Deep RL: Control System
● Graph NN: Networks/Relational
Not today, but take CS234 and CS224W
● Convolutional NN: Image
● Recurrent NN: Time Series
● Deep RL: Control System
● Graph NN: Networks/Relational
Overview
● Motivation for deep learning
● Areas of Deep Learning
● Convolutional neural networks
● Recurrent neural networks
● Deep learning tools
Tools for deep learning
Popular Tools
Specialized Groups
Where can I get free stuff?
Google Colab
● Free (limited-ish) GPU access
● Works nicely with TensorFlow
● Links to Google Drive
Register a new Google Cloud account
=> Instant $300??
=> AWS free tier (limited compute)
=> Azure education account, $200?
To SAVE money, CLOSE your GPU instance (~$1 an hour)
Azure Notebook
Kaggle kernel???
Amazon SageMaker?
Good luck!
Well, have fun too :D
