Neural Networks
By Debajyoti Karmaker
1. Computer Science
2. Artificial Intelligence
3. Machine Learning
4. Neural Networks
5. Deep Learning
Hype about Deep Learning
Challenges: Semantic gap
Challenges: Viewpoint variation
Challenges: Deformation
Challenges: Occlusion
Challenges: Background clutter
Challenges: Intra-class variation
Dataset: CIFAR-10
• Training images: 50,000
• Each image is 32 × 32 × 3
• Test images: 10,000
• Labels: 10
Nearest Neighbor Classifier
Distance Metric
L1 distance (Manhattan): $d_1(I_1, I_2) = \sum_p \lvert I_1^p - I_2^p \rvert$
L2 distance (Euclidean): $d_2(I_1, I_2) = \sqrt{\sum_p \left( I_1^p - I_2^p \right)^2}$
• Instant training: just memorize all the data
• Expensive at test time: compare against every training image
• Test time slows down linearly with the size of the training set
• CNNs flip this trade-off: expensive to train, cheap to evaluate
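A minimal NumPy sketch of this nearest-neighbor classifier (illustrative, not the slides' code; assumes CIFAR-10 images flattened to 3072-vectors and the L1 distance above):

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # "Instant" training: just memorize all data.
        # X: (N, 3072) flattened images, y: (N,) integer labels.
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        # Expensive at test time: compare each test image
        # against every training image.
        y_pred = np.zeros(X.shape[0], dtype=self.y_train.dtype)
        for i in range(X.shape[0]):
            # L1 (Manhattan) distance to all training images
            distances = np.sum(np.abs(self.X_train - X[i]), axis=1)
            y_pred[i] = self.y_train[np.argmin(distances)]
        return y_pred
```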
K-Nearest Neighbors
Hyperparameters: choices like k and the distance metric are not learned from the data — we must set them ourselves.

Data set splits:
Train | Test — tuning hyperparameters on the test set means overfitting to it.
Train | Validation | Test — tune on the validation split; touch the test set only once, at the end.

Setting hyperparameters — cross-validation: split the training data into folds (here five) and let each fold take a turn as the validation set, averaging the results across the five runs:
Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Test
(the validation fold rotates on each run; a code sketch follows the CIFAR-10 results below)
Cross-validation on the CIFAR-10 dataset: performance ≈ 29%.
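A sketch of 5-fold cross-validation for choosing k, following the diagram above; `knn_accuracy` is a hypothetical helper (not from the slides) that fits a k-NN classifier on the training folds and scores it on the validation fold:

```python
import numpy as np

def pick_k_by_cross_validation(X, y, k_choices, num_folds=5):
    # Split the training data into folds; each fold takes a turn
    # as the validation set, and accuracies are averaged per k.
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    mean_acc = {}
    for k in k_choices:
        accs = []
        for i in range(num_folds):
            X_val, y_val = X_folds[i], y_folds[i]
            X_tr = np.concatenate(X_folds[:i] + X_folds[i + 1:])
            y_tr = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            # knn_accuracy is a hypothetical helper (see lead-in)
            accs.append(knn_accuracy(X_tr, y_tr, X_val, y_val, k))
        mean_acc[k] = np.mean(accs)
    return max(mean_acc, key=mean_acc.get)
```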
Machine Learning
Pipeline: Training Data → Feature Extraction → Classifier
A test image goes through the same feature extraction and is classified (e.g. "bird").
Linear Classification
Parametric approach: $f(x, W) = Wx + b$
Input: a [32 × 32 × 3] image (3072 numbers in total), with its pixels stretched into a single column.
Output: 10 numbers indicating the class scores.
Shapes: $W$ is 10 × 3072, $x$ is 3072 × 1, so $Wx$ (and the bias $b$) is 10 × 1.

Toy example — a 2 × 2 image (4 pixels) and 3 classes, $f(x_i; W, b) = W x_i + b$:

$$\begin{bmatrix} 0.2 & -0.5 & 0.1 & 2.0 \\ 1.5 & 1.3 & 2.1 & 0.0 \\ 0.0 & 0.25 & 0.2 & -0.3 \end{bmatrix} \begin{bmatrix} 56 \\ 231 \\ 24 \\ 2 \end{bmatrix} + \begin{bmatrix} 1.1 \\ 3.2 \\ -1.2 \end{bmatrix} = \begin{bmatrix} -96.8 \\ 437.9 \\ 60.75 \end{bmatrix} \quad \begin{matrix} \text{Bird score} \\ \text{Dog score} \\ \text{Cat score} \end{matrix}$$

The input image $\begin{bmatrix} 56 & 231 \\ 24 & 2 \end{bmatrix}$ has its pixels stretched into the single column $x_i$ above.
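The toy example above, reproduced as a few lines of NumPy (a sketch; the numbers are exactly those on the slide):

```python
import numpy as np

# The toy linear classifier from the slide: 4 pixels, 3 classes.
W = np.array([[0.2, -0.5, 0.1,  2.0],
              [1.5,  1.3, 2.1,  0.0],
              [0.0, 0.25, 0.2, -0.3]])
x = np.array([56, 231, 24, 2])     # 2x2 image stretched into a column
b = np.array([1.1, 3.2, -1.2])

scores = W @ x + b                 # f(x; W, b) = Wx + b
print(scores)                      # [-96.8, 437.9, 60.75]
```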
Linear classifier on CIFAR-10
Visualized templates (one row of W per class): plane, car, bird, cat, deer, dog, frog, horse, ship, truck
Multiclass SVM Loss
Given a dataset of examples $(x_i, y_i)$:
• $x_i$ is the image
• $y_i$ is the (integer) label
Scores vector: $s = f(x_i, W)$
SVM loss (hinge loss): $L_i = \sum_{j \neq y_i} \max(0,\, s_j - s_{y_i} + 1)$

Scores for three example images (correct classes: bird, dog, cat):

        img 1   img 2   img 3
Bird     3.2     1.3     2.2
Dog      5.1     4.9     2.5
Cat     -1.7     2.0    -3.1
Losses   2.9     0      12.9

Loss over the dataset: $L = \frac{2.9 + 0 + 12.9}{3} = 5.27$

For image 2 (correct class dog, $s_{y_i} = 4.9$):
$L_2 = \max(0, 1.3 - 4.9 + 1) + \max(0, 2.0 - 4.9 + 1) = \max(0, -2.6) + \max(0, -1.9) = 0 + 0 = 0$

A zero-loss W is not unique: set $W = W \cdot 2$ and all scores double, yet image 2 still has zero loss:
$\max(0, 2.6 - 9.8 + 1) + \max(0, 4.0 - 9.8 + 1) = \max(0, -6.2) + \max(0, -4.8) = 0 + 0 = 0$
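A small sketch of the hinge loss for one example, checked against image 2 above (assuming class order bird, dog, cat):

```python
import numpy as np

def svm_loss(scores, y):
    # Multiclass SVM (hinge) loss for one example.
    # scores: (C,) class scores, y: index of the correct class.
    margins = np.maximum(0, scores - scores[y] + 1)
    margins[y] = 0                 # skip the j == y_i term
    return margins.sum()

# Image 2 from the slide: scores (bird, dog, cat), correct class = dog
print(svm_loss(np.array([1.3, 4.9, 2.0]), 1))   # 0.0
# Image 1: correct class = bird
print(svm_loss(np.array([3.2, 5.1, -1.7]), 0))  # 2.9
```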
Regularization
Data loss: the model's predictions should match the training data.
Regularization: the model should be "simple", so it works on test data.
Full loss = data loss + regularization.
Weight Regularization

$$L = \frac{1}{N} \sum_{i=1}^{N} \sum_{j \neq y_i} \max\left(0,\; f(x_i, W)_j - f(x_i, W)_{y_i} + 1\right) + \lambda\, R(W)$$

Common choices of $R(W)$ (the standard definitions for the penalties the slide names):
• L2 regularization: $R(W) = \sum_k \sum_l W_{k,l}^2$ — model complexity as the norm of the weights (prefers a smaller norm)
• L1 regularization: $R(W) = \sum_k \sum_l \lvert W_{k,l} \rvert$ — model complexity as the number of non-zero weights (prefers many zeros)
• Elastic net (L2 + L1): $R(W) = \sum_k \sum_l \left(\beta W_{k,l}^2 + \lvert W_{k,l} \rvert\right)$
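A vectorized sketch of the full loss above, assuming examples as rows of X and W of shape (classes × dimensions); illustrative, not the slides' code:

```python
import numpy as np

def svm_loss_l2reg(W, X, y, lam):
    # Average hinge loss over N examples plus an L2 penalty.
    # W: (C, D), X: (N, D), y: (N,) integer labels, lam: lambda.
    N = X.shape[0]
    scores = X @ W.T                           # (N, C)
    correct = scores[np.arange(N), y]          # score of the true class
    margins = np.maximum(0, scores - correct[:, None] + 1)
    margins[np.arange(N), y] = 0               # skip j == y_i terms
    data_loss = margins.sum() / N
    reg_loss = lam * np.sum(W * W)             # L2: sum of squared weights
    return data_loss + reg_loss
```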
Softmax Classifier (Multinomial Logistic Regression)
Scores are interpreted as unnormalized log probabilities of the classes:

$$P(Y = k \mid X = x_i) = \frac{e^{s_k}}{\sum_j e^{s_j}}, \quad \text{where } s = f(x_i; W)$$

We want to maximize the log likelihood — or, for a loss function, minimize the negative log likelihood of the correct class:

$$L_i = -\log\!\left(\frac{e^{s_{y_i}}}{\sum_j e^{s_j}}\right)$$

       unnormalized               unnormalized
       log probabilities          probabilities          probabilities
Bird    3.2                        24.5                   0.13
Dog     5.1       --exp-->        164.0   --normalize-->  0.87
Cat    -1.7                         0.18                  0.00
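A sketch of the softmax loss with the usual max-shift for numerical stability (the shift is an implementation detail not on the slide; it leaves the probabilities unchanged):

```python
import numpy as np

def softmax_loss(scores, y):
    # Negative log likelihood of the correct class.
    shifted = scores - scores.max()        # stability shift
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return -np.log(probs[y])

scores = np.array([3.2, 5.1, -1.7])        # bird, dog, cat from the slide
print(np.round(np.exp(scores), 2))         # ~[24.5, 164.0, 0.18]
print(softmax_loss(scores, 1))             # ~0.14 if "dog" is correct
```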
Optimization
A first, very bad idea: random search — sample many random weight matrices and keep the best one.
Result: 15.5% accuracy. Not bad against 10% chance, but far from the state of the art (~95%).
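A minimal sketch of that random-search baseline, assuming a `loss_fn` that evaluates a candidate W on the training data (the scale and try count here are illustrative):

```python
import numpy as np

def random_search(loss_fn, shape, num_tries=1000):
    # Very bad idea #1: try random weight matrices, keep the best.
    best_W, best_loss = None, float('inf')
    for _ in range(num_tries):
        W = np.random.randn(*shape) * 0.001
        loss = loss_fn(W)
        if loss < best_loss:
            best_W, best_loss = W, loss
    return best_W, best_loss
```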
Numeric Gradient
Follow the slope: nudge one dimension of W by a small h and see how the loss changes — the finite-difference approximation (with $e_k$ the k-th standard basis vector):

$$\frac{\partial L}{\partial W_k} \approx \frac{L(W + h\,e_k) - L(W)}{h}$$

Current W: [0.34, -1.11, 0.78, 0.12, 0.55, 2.81, -3.1, -1.5, 0.33, …], loss 1.25347
W + h (first dim): [0.34 + 0.0001, -1.11, 0.78, …], loss 1.25322
→ gradient dW[0] = (1.25322 − 1.25347) / 0.0001 = −2.5
W + h (second dim): [0.34, −1.11 + 0.0001, 0.78, …], loss 1.25353
→ gradient dW[1] = (1.25353 − 1.25347) / 0.0001 = 0.6
Gradient dW: [−2.5, 0.6, ?, ?, …] — each remaining entry costs one more loss evaluation.
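The finite-difference procedure above as a sketch (slow: one extra loss evaluation per dimension of W):

```python
import numpy as np

def numeric_gradient(f, W, h=1e-4):
    # Finite-difference approximation of the gradient of f at W.
    grad = np.zeros_like(W)
    f_W = f(W)
    for i in range(W.size):
        W.flat[i] += h                       # nudge one dimension
        grad.flat[i] = (f(W) - f_W) / h
        W.flat[i] -= h                       # restore
    return grad
```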
Computational Graph
Example: $f(x, y, z) = (x + y) \cdot z$, with x = −2, y = 5, z = −4.

Forward pass: $q = x + y = 3$, then $f = q \cdot z = -12$.

Local derivatives:
$q = x + y \Rightarrow \dfrac{\partial q}{\partial x} = 1,\ \dfrac{\partial q}{\partial y} = 1$
$f = qz \Rightarrow \dfrac{\partial f}{\partial q} = z,\ \dfrac{\partial f}{\partial z} = q$

Want: $\dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial z}$. Start at the output with $\dfrac{\partial f}{\partial f} = 1$ and work backward:
$\dfrac{\partial f}{\partial z} = q = 3$
$\dfrac{\partial f}{\partial q} = z = -4$
Chain rule: $\dfrac{\partial f}{\partial x} = \dfrac{\partial f}{\partial q} \cdot \dfrac{\partial q}{\partial x} = -4 \cdot 1 = -4$
Chain rule: $\dfrac{\partial f}{\partial y} = \dfrac{\partial f}{\partial q} \cdot \dfrac{\partial q}{\partial y} = -4 \cdot 1 = -4$
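The same forward and backward pass written out in plain Python (each backward line multiplies an upstream gradient by a local derivative):

```python
# Forward and backward pass for f(x, y, z) = (x + y) * z.
x, y, z = -2.0, 5.0, -4.0

# Forward pass
q = x + y              # q = 3
f = q * z              # f = -12

# Backward pass (chain rule, from the output back to the inputs)
df_df = 1.0
df_dz = q * df_df      # 3
df_dq = z * df_df      # -4
df_dx = 1.0 * df_dq    # -4  (add gate passes the gradient through)
df_dy = 1.0 * df_dq    # -4
```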
Computational Graph

$$f(w, x) = \frac{1}{1 + e^{-(w_0 x_0 + w_1 x_1 + w_2)}}$$

Inputs: $w_0 = 2.00$, $x_0 = -1.00$, $w_1 = -3.00$, $x_1 = -2.00$, $w_2 = -3.00$.

Forward pass: $w_0 x_0 = -2.00$ and $w_1 x_1 = 6.00$; their sum is $4.00$; adding $w_2$ gives $1.00$; then $\times(-1) \to -1.00$; $\exp \to 0.37$; $+1 \to 1.37$; $1/x \to 0.73$.

Local derivatives used on the backward pass:
$f(x) = e^x \Rightarrow \dfrac{df}{dx} = e^x$
$f_c(x) = c + x \Rightarrow \dfrac{df}{dx} = 1$
$f(x) = \dfrac{1}{x} \Rightarrow \dfrac{df}{dx} = -\dfrac{1}{x^2}$
$f_a(x) = ax \Rightarrow \dfrac{df}{dx} = a$

Backward pass (upstream gradient × local derivative at each gate, starting from 1.00 at the output):
1/x gate: $\left(-\frac{1}{1.37^2}\right)(1.00) = -0.53$
+1 gate: $(1)(-0.53) = -0.53$
exp gate: $(e^{-1})(-0.53) = -0.20$
×(−1) gate: $(-1)(-0.20) = 0.20$
The two add gates distribute the 0.20 unchanged, so $\frac{\partial f}{\partial w_2} = 0.20$, and the multiply gates give
$\frac{\partial f}{\partial w_0} = x_0 \cdot 0.20 = -0.20$, $\frac{\partial f}{\partial x_0} = w_0 \cdot 0.20 = 0.40$, $\frac{\partial f}{\partial w_1} = x_1 \cdot 0.20 = -0.40$, $\frac{\partial f}{\partial x_1} = w_1 \cdot 0.20 = -0.60$.
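The same computation in plain Python; this sketch collapses the 1/x, +1, exp, and ×(−1) gates into the sigmoid shortcut σ′ = σ(1 − σ), which reproduces the 0.20 on the slide:

```python
import math

# Forward/backward for f(w, x) = 1 / (1 + exp(-(w0*x0 + w1*x1 + w2))).
w0, x0, w1, x1, w2 = 2.0, -1.0, -3.0, -2.0, -3.0

# Forward pass
s = w0 * x0 + w1 * x1 + w2        # 1.0
f = 1.0 / (1.0 + math.exp(-s))    # 0.73

# Backward pass: the sigmoid's local gradient is f * (1 - f)
ds = f * (1 - f)                  # ~0.20, as on the slide
dw0, dx0 = x0 * ds, w0 * ds       # -0.20, 0.40
dw1, dx1 = x1 * ds, w1 * ds       # -0.40, -0.60
dw2 = 1.0 * ds                    # 0.20
```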
Patterns in backward flow
• Add gate: gradient distributor
• Max gate: gradient router
• Mul gate: gradient switcher
Deep Learning frameworks
• Torch
• Theano
• Caffe
• Keras
• …
Vectorized example
$f(q) = \lVert q \rVert^2 = q_1^2 + \dots + q_n^2$, with $q = W x$, where $x \in \mathbb{R}^n$ and $W \in \mathbb{R}^{n \times n}$.

$$W = \begin{bmatrix} 0.1 & 0.5 \\ -0.3 & 0.8 \end{bmatrix}, \qquad x = \begin{bmatrix} 0.2 \\ 0.4 \end{bmatrix}$$

Forward pass:

$$q = W x = \begin{bmatrix} W_{1,1} x_1 + \dots + W_{1,n} x_n \\ \vdots \\ W_{n,1} x_1 + \dots + W_{n,n} x_n \end{bmatrix} = \begin{bmatrix} 0.22 \\ 0.26 \end{bmatrix}, \qquad f(q) = 0.116$$

Backward pass (upstream gradient 1.00 at f):

$$\frac{\partial f}{\partial q_i} = 2 q_i \;\Rightarrow\; \nabla_q f = \begin{bmatrix} 0.44 \\ 0.52 \end{bmatrix}$$

$$\frac{\partial q_k}{\partial W_{i,j}} = \mathbb{1}_{k=i}\, x_j \;\Rightarrow\; \nabla_W f = 2\, q\, x^{\top} = \begin{bmatrix} 0.088 & 0.176 \\ 0.104 & 0.208 \end{bmatrix}$$

$$\frac{\partial q_k}{\partial x_i} = W_{k,i} \;\Rightarrow\; \nabla_x f = 2\, W^{\top} q = \begin{bmatrix} -0.112 \\ 0.636 \end{bmatrix}$$
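The vectorized example in NumPy (shapes and numbers match the slide):

```python
import numpy as np

# Vectorized forward/backward for f(q) = ||q||^2 with q = W @ x.
W = np.array([[0.1, 0.5],
              [-0.3, 0.8]])
x = np.array([0.2, 0.4])

# Forward pass
q = W @ x              # [0.22, 0.26]
f = np.sum(q ** 2)     # 0.116

# Backward pass
dq = 2 * q             # [0.44, 0.52]
dW = np.outer(dq, x)   # [[0.088, 0.176], [0.104, 0.208]]
dx = W.T @ dq          # [-0.112, 0.636]
```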
Editor's Notes
• #16 (Machine Learning): Features can be colors, edges, SIFT, SURF, HoG, etc. — invariant to lighting, scale, and rotation.
• #17 (Linear Classification): Template-matching approach — each row of W corresponds to a template for one class; the inner (dot) product gives the similarity between the template and the image.
• #19 (Multiclass SVM Loss): The loss function L_i takes in the predicted scores coming from the function f.
• #20 (Multiclass SVM Loss): Linear mapping; loss function in full form.
• #22 (Weight Regularization): λ = regularization strength (a hyperparameter).
• #25 (Numeric Gradient): Finite-difference approximation — very slow.
• #28 (Patterns in backward flow): Intuitive interpretation.
• #29 (Deep Learning frameworks): A giant collection of layers or gates with thin connectivity between layers.