MACHINE LEARNING – CONVOLUTIONAL NEURAL NETWORK
Basic Structure of CNN
• Input Layer: Accepts input images as pixel data.
• Convolutional Layer: Applies filters to extract features.
• ReLU Layer: Introduces non-linearity to the network.
• Pooling Layer: Reduces spatial dimensions of feature maps.
• Fully Connected Layer: Final layer for classification.
Convolutional Layer
• Filters/Kernels: Small weight matrices that detect specific features in input images.
• Stride: Sets how many pixels the filter moves at each step across the input.
• Padding: Adds pixels around the input to maintain dimensions.
• Output: Produces feature maps indicating where each feature was detected.
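As a rough illustration of how the filter, stride, and padding interact, the sketch below implements a single-channel convolution in plain Python. The function name `conv2d`, the sample image, and the 2x2 filter are assumptions made for this example. Like most deep learning libraries, it computes cross-correlation, meaning the filter is not flipped.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image` (lists of lists) and return the feature map."""
    # Zero-pad the image on all four sides.
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]
    for row in image:
        padded.append([0] * padding + list(row) + [0] * padding)
    padded += [[0] * width for _ in range(padding)]

    k = len(kernel)  # assumes a square k x k kernel
    out_h = (len(padded) - k) // stride + 1
    out_w = (width - k) // stride + 1
    feature_map = []
    for i in range(out_h):
        out_row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Multiply the window by the kernel element-wise and sum.
            total = sum(
                padded[top + a][left + b] * kernel[a][b]
                for a in range(k)
                for b in range(k)
            )
            out_row.append(total)
        feature_map.append(out_row)
    return feature_map


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
kernel = [[1, 0], [0, -1]]  # hypothetical 2x2 filter

print(conv2d(image, kernel))             # [[-4, -4, 3], [-4, -4, 6], [7, 8, 9]]
print(conv2d(image, kernel, stride=2))   # [[-4, 3], [7, 9]]
print(len(conv2d(image, kernel, padding=1)))  # 5 rows: padding enlarges the output
```

A larger stride skips positions and shrinks the feature map, while padding adds border positions and enlarges it.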
Padding in CNN
• Zero Padding: Adds zeros around the input image; with enough padding ("same" padding) the output keeps the input's dimensions.
• Valid Padding: No padding, so the output feature map is smaller than the input.
• Role: Helps preserve edge information during convolution.
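The effect of padding on size follows the standard output-size formula, floor((n + 2p - k) / s) + 1, for input size n, kernel size k, padding p, and stride s. The helper names below are assumptions for illustration, and the 28-pixel input with a 5x5 filter is just a sample case.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output width (or height) for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1


def same_padding(k):
    """Padding per side that preserves size for stride 1 and odd k."""
    return (k - 1) // 2


n, k = 28, 5
print(conv_output_size(n, k))                           # valid padding: 24
print(conv_output_size(n, k, padding=same_padding(k)))  # zero ("same") padding: 28
```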
Pooling Layer
• Purpose: Reduces dimensionality and computation in the network.
• Max Pooling: Selects the maximum value from each pooling region.
• Average Pooling: Takes the average value from each pooling region.
• Impact: Retains the strongest responses and adds tolerance to small shifts; the smaller feature maps can also help limit overfitting.
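A minimal sketch of both pooling variants, assuming non-overlapping windows (stride equal to the window size). The function name `pool2d` and the sample feature map are made up for this example.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling: the stride equals the window size."""
    reduce = max if mode == "max" else (lambda v: sum(v) / len(v))
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


fmap = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
print(pool2d(fmap, mode="max"))      # [[7, 8], [9, 6]]
print(pool2d(fmap, mode="average"))  # [[4.0, 5.0], [4.5, 3.0]]
```

Either way a 4x4 map becomes 2x2, so later layers process a quarter of the values.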
Basic Mathematics of CNN (B&W Image)
• Convolution: Applies a filter matrix across the image to detect features.
• Example: Sliding a 3x3 filter over a grayscale image, producing a feature map.
• ReLU: Applies non-linearity after convolution.
• Pooling: Reduces the size of the resulting feature map.
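The three steps can be traced on a tiny grayscale image. This is a sketch under stated assumptions: the 6x6 image (a bright vertical stripe) and the vertical-edge filter are invented for illustration, and the convolution uses no padding and stride 1.

```python
def convolve(image, kernel):
    """Valid (no padding), stride-1 convolution with a square kernel."""
    k = len(kernel)
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(len(image[0]) - k + 1)
        ]
        for i in range(len(image) - k + 1)
    ]


def relu(fmap):
    """Replace every negative value with 0."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# 6x6 grayscale image with a bright vertical stripe in the middle.
image = [[0, 0, 9, 9, 0, 0] for _ in range(6)]

# Hypothetical filter that responds where brightness rises from left to right.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve(image, kernel)  # 4x4
activated = relu(feature_map)          # falling edge (negative response) becomes 0
pooled = max_pool(activated)           # 2x2

print(feature_map[0])  # [27, 27, -27, -27]
print(activated[0])    # [27, 27, 0, 0]
print(pooled)          # [[27, 0], [27, 0]]
```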
Basic Mathematics of CNN (Colored Image)
• Convolution: Applies a filter with one 2D slice per RGB channel; each slice has its own weights.
• Result: Produces a single combined feature map from all channels.
• Example: Sliding a filter across an RGB image and summing the per-channel results at each position.
• Pooling: Reduces the size of the resulting feature map while preserving important information.
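A small sketch of the multi-channel case: each channel is convolved with its own slice of the filter, and the per-channel results are summed into one feature map. The function names, the 3x3 image, and the weights are all assumptions for illustration.

```python
def conv2d_single(channel, kernel):
    """Valid, stride-1 convolution of one 2D channel with one 2D kernel slice."""
    k = len(kernel)
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(len(channel[0]) - k + 1)
        ]
        for i in range(len(channel) - k + 1)
    ]


def conv2d_rgb(image, kernel, bias=0):
    """image and kernel are lists of 3 channels; returns one 2D feature map."""
    per_channel = [conv2d_single(c, k) for c, k in zip(image, kernel)]
    rows, cols = len(per_channel[0]), len(per_channel[0][0])
    # Sum the per-channel responses position by position, then add the bias.
    return [
        [sum(fm[i][j] for fm in per_channel) + bias for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical 3x3 RGB image, one 2D list per channel.
red = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
green = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
blue = [[0, 0, 3], [0, 3, 0], [3, 0, 0]]

# One filter = three 2x2 slices, each with its own weights.
kernel = [
    [[1, 0], [0, 1]],  # red slice
    [[0, 0], [0, 0]],  # green slice (ignores green)
    [[0, 1], [1, 0]],  # blue slice
]

print(conv2d_rgb([red, green, blue], kernel))  # [[2, 6], [6, 2]]
```

Three input channels still yield one output map per filter; a layer with many filters produces many such maps.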
Fully Connected Layer
• Purpose: Takes the flattened feature maps as input, with every input connected to every output unit.
• Function: Combines features for final classification.
• Uses: Softmax (multi-class) or sigmoid (binary or multi-label) activation functions for output.
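To make the flatten, combine, and softmax steps concrete, here is a minimal sketch in plain Python. The feature maps and weights are made up; a trained network would learn the weights.

```python
import math


def flatten(feature_maps):
    """Turn a list of 2D feature maps into one flat vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum of all inputs per output unit."""
    return [
        sum(w * x for w, x in zip(unit_weights, inputs)) + b
        for unit_weights, b in zip(weights, biases)
    ]


def softmax(logits):
    """Convert scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical 2x2 pooled feature maps -> vector of length 8.
features = flatten([[[1, 0], [0, 1]], [[2, 2], [0, 0]]])

# Made-up weights for 3 classes.
weights = [
    [0.5, 0, 0, 0.5, 0, 0, 0, 0],
    [0, 0, 0, 0, 0.5, 0.5, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
]
biases = [0.0, 0.0, 0.0]

probs = softmax(dense(features, weights, biases))
print(probs)                    # three probabilities that sum to 1
print(probs.index(max(probs)))  # 1, the predicted class index
```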
LeNet-5 Architecture
• Designed for handwritten digit recognition (MNIST dataset).
• Structure: 2 convolutional layers and 2 subsampling layers, followed by fully connected layers of 120, 84, and 10 units (the 120-unit layer is formally a convolution in the original design).
• Key Feature: Simple and efficient, early CNN model.
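The layer sizes can be traced with the output-size formula. This sketch assumes the original design's 32x32 grayscale input, 5x5 convolutions without padding, and 2x2 subsampling with stride 2; the variable names are illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a conv or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size, channels = 32, 1  # 32x32 grayscale input
layers = [
    # (name, kernel, stride, output channels)
    ("C1 conv 5x5", 5, 1, 6),
    ("S2 subsample 2x2", 2, 2, 6),
    ("C3 conv 5x5", 5, 1, 16),
    ("S4 subsample 2x2", 2, 2, 16),
]

shapes = []
for name, kernel, stride, out_channels in layers:
    size = conv_out(size, kernel, stride)
    channels = out_channels
    shapes.append((name, size, size, channels))
    print(f"{name:18s} -> {size}x{size}x{channels}")

flattened = size * size * channels
print("flattened:", flattened)  # 400 values feed the dense layers (120 -> 84 -> 10)
```

The spatial size shrinks 32, 28, 14, 10, 5 while the channel count grows, a pattern the later architectures repeat at larger scale.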
AlexNet Architecture
• Winner of the ImageNet (ILSVRC) competition in 2012.
• Structure: 5 convolutional layers, 3 fully connected layers.
• Features: Uses ReLU, dropout, and data augmentation.
• Impact: Revolutionized deep learning and computer vision.
VGG-16 Architecture
• Uses 16 layers (13 convolutional, 3 fully connected).
• Features: Smaller filters (3x3) with deeper networks.
• Strength: Achieves high accuracy with a simple structure.
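One way to see why small stacked filters work well: two 3x3 layers cover the same 5x5 region as a single 5x5 layer but need fewer weights. The sketch below assumes stride 1 and a hypothetical channel count of 64 in and out; biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weight count of one conv layer (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convs with the same kernel."""
    return 1 + layers * (kernel - 1)


channels = 64  # hypothetical channel count, kept the same in and out

two_3x3 = 2 * conv_weights(3, channels, channels)
one_5x5 = conv_weights(5, channels, channels)

print(stacked_receptive_field(3, 2))  # 5, same coverage as a single 5x5
print(two_3x3, one_5x5)               # 73728 102400
```

The stacked version also applies a non-linearity between the two layers, which a single 5x5 layer cannot.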
ResNet Architecture
• Introduces Residual Learning to combat vanishing gradients.
• Structure: Skip connections or shortcuts between layers.
• Impact: Allows very deep networks (e.g., ResNet-50, ResNet-101).
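A skip connection can be sketched as output = ReLU(F(x) + x). The code below is a simplified stand-in, not the actual ResNet block: plain matrix-vector products replace the convolutions, batch normalization is omitted, and all names and values are assumptions.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def linear(vector, weights):
    """Matrix-vector product; stands in for a conv layer in this sketch."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two layers with a ReLU in between."""
    fx = linear(relu(linear(x, w1)), w2)
    # The skip connection adds the input back, so F only learns a correction.
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all-zero weights F(x) = 0, so the block passes x through unchanged.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 3.0]
```

Because the identity path is always present, a block can fall back to passing its input through, which is part of why stacking many of them remains trainable.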
Inception (GoogLeNet) Architecture
• Introduces Inception modules: parallel convolutional filters.
• Structure: Multiple filter sizes (1x1, 3x3, 5x5) in parallel.
• Impact: Efficient and scalable for large-scale image recognition.
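The parallel branches keep the same height and width, so their outputs are concatenated along the channel axis. The sketch below counts output channels and compares the cost of a 5x5 branch with and without a 1x1 reduction placed in front of it, which is how this design keeps larger filters affordable. The branch sizes are illustrative values, not a definitive specification.

```python
def conv_multiplies(height, width, kernel, in_channels, out_channels):
    """Multiplications for a stride-1 conv that preserves spatial size."""
    return height * width * kernel * kernel * in_channels * out_channels


# Illustrative sizes for one module: 28x28 input with 192 channels.
h, w, c_in = 28, 28, 192

# Output channels per parallel branch: 1x1, 3x3, 5x5, and pooling + 1x1.
branch_channels = {"1x1": 64, "3x3": 128, "5x5": 32, "pool": 32}

# Concatenating along the channel axis adds up the branch widths.
out_channels = sum(branch_channels.values())
print(out_channels)  # 256

# Cost of the 5x5 branch with and without a 1x1 reduction to 16 channels.
direct = conv_multiplies(h, w, 5, c_in, 32)
reduced = conv_multiplies(h, w, 1, c_in, 16) + conv_multiplies(h, w, 5, 16, 32)
print(direct, reduced)  # the reduced path needs far fewer multiplications
```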
Transfer Learning
• Concept: Uses a pre-trained model on a new but related task.
• Benefits: Speeds up training, requires less data, and improves performance.
• Example: Using a pre-trained model like ResNet for a new image classification task.
Object Localization
• Purpose: Identifies the location of objects within an image.
• Methods: Bounding box regression, Region Proposal Networks (RPNs).
• Applications: Object detection, image segmentation.
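Bounding box regression is commonly set up so that the network predicts offsets relative to a reference (anchor) box rather than raw coordinates. The sketch below decodes such offsets using the parameterization popularized by the R-CNN family of detectors; the function name, anchor, and offset values are assumptions for illustration.

```python
import math


def decode_box(anchor, deltas):
    """Apply regression offsets (tx, ty, tw, th) to an anchor box.

    Boxes are (center_x, center_y, width, height).
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    return (
        xa + tx * wa,       # shift the center in units of anchor width
        ya + ty * ha,       # shift the center in units of anchor height
        wa * math.exp(tw),  # scale the width; exp keeps it positive
        ha * math.exp(th),  # scale the height
    )


anchor = (50.0, 50.0, 20.0, 40.0)        # hypothetical anchor box
deltas = (0.5, -0.25, 0.0, math.log(2))  # hypothetical network outputs

print(decode_box(anchor, deltas))  # center (60.0, 40.0), width 20.0, height about 80
```

Predicting relative offsets keeps the regression targets in a similar range regardless of object size.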
Landmark Detection
• Definition: Detects specific key points or landmarks within an image.
• Applications: Facial recognition, medical imaging (e.g., key anatomical points).
• Methods: CNNs used to detect and regress the position of landmarks.
Conclusion
• CNNs have revolutionized computer vision tasks.
• Architectures like LeNet, AlexNet, VGG, ResNet, and Inception paved the way for modern image processing.
• Transfer learning, object localization, and landmark detection expand the versatility of CNNs.
