Amirkabir University of Technology
Department of Computer Engineering
and Information Technology
Image Classification with
Deep Convolutional
Neural Networks
Sepehr Rasouli
Outline
• Introduction to Image Classification
& Deep Networks
• Proposed Method
• Main Idea
• Data Set
• Architecture
• Techniques
• Comparison & Results
• Conclusion
Image Classification
Why Deep Learning?
• "Shallow" vs. "deep" architectures
• Learn a feature hierarchy all the way from pixels to classifier
[Figure: shallow pipeline = hand-designed feature extraction + trainable classifier; deep pipeline = Layer 1 … Layer N of learned features + a simpler classifier]
Our Method
• Deep Convolutional Neural Network
• 5 convolutional and 3 fully connected layers
• 650,000 neurons, 60 million parameters
• Techniques used to boost performance
• ReLU nonlinearity
• Training on Multiple GPUs
• Overlapping max pooling
• Data Augmentation
• Dropout
Overall Architecture
• Trained with stochastic gradient descent on two NVIDIA GPUs for about a week (five to six days)
• 650,000 neurons, 60 million parameters, 630 million connections
• The last layer contains 1,000 neurons, which produce a distribution over the 1,000 class labels (a minimal sketch of such an architecture follows)
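For concreteness, here is a minimal PyTorch sketch of an AlexNet-style network. PyTorch, the exact crop size, and the single-device layout are assumptions for illustration; the original model splits computation across two GPUs, which this sketch omits.

```python
# A minimal sketch of an AlexNet-style network: 5 conv layers with ReLU,
# overlapping max pooling after layers 1, 2, and 5, then 3 FC layers.
import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),          # overlapping pooling
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
            nn.Linear(4096, num_classes),   # 1,000 logits
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = AlexNetSketch()
logits = model(torch.randn(1, 3, 227, 227))   # one 227x227 RGB crop
probs = logits.softmax(dim=1)                 # distribution over 1,000 labels
```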
Dataset
• ImageNet
§ Over 15 million high-quality labeled images
§ About 22,000 categories
§ Collected from the web, labeled by humans on Amazon's Mechanical
Turk
§ Variable-resolution images
• ILSVRC Competition
§ ImageNet Large Scale Visual Recognition Challenge
§ Annual competition of image classification at large scale
§ Subset of ImageNet: 1,000 categories with about 1,000 images each (1.2M images in total)
§ Classification task: make 5 guesses about the image label (the top-5 criterion; see the sketch below)
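As an illustration of the top-5 criterion, here is a small PyTorch sketch (my own, not from the slides): a prediction counts as correct if the true label is among the model's five highest-scoring classes.

```python
# Top-5 error: fraction of images whose true label is NOT among the
# model's 5 highest-scoring class predictions.
import torch

def top5_error(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # logits: (batch, 1000) class scores; labels: (batch,) true class ids
    top5 = logits.topk(5, dim=1).indices           # (batch, 5)
    hit = (top5 == labels.unsqueeze(1)).any(dim=1) # true label in top 5?
    return 1.0 - hit.float().mean().item()

logits = torch.randn(4, 1000)
labels = torch.randint(0, 1000, (4,))
print(f"top-5 error: {top5_error(logits, labels):.3f}")
```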
Rectified Linear Units
$x = w_1 f(Z_1) + w_2 f(Z_2) + w_3 f(Z_3)$

x is called the total input to the neuron, and f(x) is its output.
• f(x) = tanh(x): very bad (slow to train)
• f(x) = max(0, x): very good (quick to train)
Rectified Linear Units
• Biological plausibility: one-sided, compared to the antisymmetry of tanh
• Sparse activation: for example, in a randomly initialized network, only about 50% of hidden units are activated (have a non-zero output)
• Efficient gradient propagation: the gradient is constant (1) over the positive half, so it neither vanishes nor saturates the way tanh's does
• Efficient computation: only comparison, addition, and multiplication (see the sketch below)
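A small NumPy illustration of these points (my own sketch, not from the slides): tanh's gradient collapses toward zero for large inputs, while ReLU's is exactly 1 wherever the unit is active.

```python
# ReLU and its derivative vs. the tanh derivative: tanh saturates
# (gradient -> 0 for large |x|), ReLU's gradient is 1 for all x > 0.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)            # only a comparison, no exp()

def relu_grad(x):
    return (x > 0).astype(x.dtype)       # 1 where active, 0 elsewhere

x = np.array([-3.0, -0.5, 0.5, 3.0])
print(relu(x), relu_grad(x))             # [0. 0. 0.5 3.] [0. 0. 1. 1.]
print(1 - np.tanh(x) ** 2)               # tanh gradient: ~0.01 at |x| = 3
```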
Training on Multiple GPUs
• The network is spread across two GPUs
• NVIDIA GTX 580 GPUs with 3 GB of memory each
• The architecture is particularly well suited to cross-GPU parallelization
• Relies on a very efficient GPU implementation of CNNs (a toy sketch of the split follows)
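To illustrate the flavor of this model parallelism, here is a toy two-GPU split of a single convolutional layer (hypothetical layer sizes and names; the paper's actual scheme is more involved, communicating between GPUs only at certain layers).

```python
# Toy model parallelism: half of a conv layer's 96 kernels live on each
# GPU; both devices convolve the same input, outputs are concatenated.
import torch
import torch.nn as nn

dev0, dev1 = torch.device("cuda:0"), torch.device("cuda:1")
conv_a = nn.Conv2d(3, 48, kernel_size=11, stride=4).to(dev0)
conv_b = nn.Conv2d(3, 48, kernel_size=11, stride=4).to(dev1)

def split_conv(x):
    ya = conv_a(x.to(dev0))              # first 48 feature maps on GPU 0
    yb = conv_b(x.to(dev1))              # remaining 48 on GPU 1
    return torch.cat([ya, yb.to(dev0)], dim=1)   # gather: (N, 96, 55, 55)

out = split_conv(torch.randn(8, 3, 227, 227))
```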
Results & Comparison
• ILSVRC-2010 test set

Model              Top-1  Top-5
Sparse coding [3]  47.1%  28.2%
SIFT + FVs [4]     45.7%  25.7%
CNN (ours)         37.5%  17.0%

Comparison of results on the ILSVRC-2010 test set: Sparse coding [3] was the ILSVRC-2010 winner, and SIFT + FVs [4] was the previous best published result.
Conclusion
• A large, deep convolutional neural network for large-scale image classification was proposed
• 5 convolutional layers, 3 fully connected layers
• 650,000 neurons, 60 million parameters
• Several techniques for boosting performance (ReLU, multi-GPU training, overlapping pooling, data augmentation, dropout)
• The proposed method won ILSVRC-2012
• Achieved a winning top-5 error rate of 15.3%, compared to 26.2% for the second-best entry
References
[1] http://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/fergus_dl_tutorial_final.pptx
[2] http://web.engr.illinois.edu/~slazebni/spring14/lec24_cnn.pdf
[3] A. Berg, J. Deng, and L. Fei-Fei. Large scale visual recognition challenge 2010. www.imagenet.org/challenges, 2010.
[4] J. Sánchez and F. Perronnin. High-dimensional signature compression for large-scale image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1665–1672. IEEE, 2011.
Thank you for your attention
Any Questions?
Results 2012
• ILSVRC-2012 results
• Proposed method: top-5 error rate of 16.422%
• Runner-up: top-5 error rate of 26.172%
Convolutional NNs
Pooling
• Spatial pooling summarizes each local region of a feature map with a single value
• Regions can be non-overlapping or overlapping
• The summary is the sum or the max over each region (see the sketch below)
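A small NumPy sketch of both variants (my own illustration) over non-overlapping 2x2 regions; overlapping pooling would slide the window with a stride smaller than its size.

```python
# Max and sum pooling over non-overlapping 2x2 regions of a 4x4 map.
import numpy as np

fmap = np.arange(16, dtype=float).reshape(4, 4)
# Rearrange into a 2x2 grid of 2x2 tiles: blocks[i, j] is one region.
blocks = fmap.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3)

max_pooled = blocks.max(axis=(2, 3))   # [[ 5.  7.] [13. 15.]]
sum_pooled = blocks.sum(axis=(2, 3))   # [[10. 18.] [42. 50.]]
```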
Dropout
• Independently set each hidden unit's activity to zero with probability 0.5
• Used in the two globally connected hidden layers at the net's output (a small sketch follows)
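A minimal NumPy sketch of the masking step. Note this uses the "inverted dropout" convention (rescaling at training time), which is an assumption of mine; the paper instead multiplies activations by 0.5 at test time.

```python
# Inverted dropout: zero each unit independently with probability p at
# training time, rescale survivors so the expected activation is unchanged.
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p=0.5, train=True):
    if not train:
        return h                          # no masking at test time
    mask = rng.random(h.shape) >= p       # keep each unit with prob 1 - p
    return h * mask / (1.0 - p)           # rescale so E[output] == h

h = np.ones(8)
print(dropout(h))   # roughly half the units zeroed, the rest scaled to 2.0
```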