Image Segmentation with Deep Learning
Antonio Rueda-Toicen and Imran Kocabiyik
Berlin Computer Vision Group
December 2020
https://www.meetup.com/Berlin-Computer-Vision-Group/
Agenda
● Image segmentation
■ Semantic segmentation
● Fully convolutional networks, U-net
■ Instance segmentation
● Mask R-CNN
■ Panoptic segmentation
● Feature Pyramid Networks
○ Public datasets
■ COCO
■ Google Open Images
○ Implementations: Detectron2, Fast.ai
Classification, detection, and segmentation
Classification refers to image-wide labels
Detection refers to localizing objects with labeled bounding boxes
Segmentation refers to pixel-wise localization of the labels
Goals of supervised image segmentation
Given an input image, we wish to obtain:
1. A class label for each individual pixel in the image. This is also called pixel-wise localization.
2. The probability score associated with each class label.
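A minimal sketch of how both outputs can be read from a network's raw per-pixel scores, assuming the network emits one logit map per class. The array shapes, the three-class setup, and the function name `labels_and_scores` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def labels_and_scores(logits):
    """logits: float array of shape (num_classes, H, W)."""
    # Softmax over the class axis, shifted by the max for numerical stability.
    shifted = logits - logits.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)
    labels = probs.argmax(axis=0)   # (H, W): class index of each pixel
    scores = probs.max(axis=0)      # (H, W): probability of the chosen class
    return labels, scores

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 5))   # 3 hypothetical classes, 4x5 image
labels, scores = labels_and_scores(logits)
print(labels.shape, scores.shape)
```

Both outputs have the spatial size of the input, which is what distinguishes segmentation from image-wide classification.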
Applications of image segmentation
http://withoutbg.com/
https://www.segmentive.ai/
Segmentation as pixel-wise localization
Instance segmentation requires object detection
Panoptic segmentation
https://arxiv.org/pdf/1801.00868.pdf
Explore it in the detectron2 inference notebook
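One way to picture panoptic segmentation is as a merge of instance masks ("things") with a semantic label map ("stuff"). The sketch below is a simplified heuristic under stated assumptions: instances arrive sorted by score, overlaps go to the higher-scoring instance, and leftover pixels take their semantic class. The function name and class ids are hypothetical, and this is not Detectron2's implementation.

```python
import numpy as np

def merge_panoptic(semantic, instance_masks, instance_classes):
    """
    semantic: (H, W) int array of class ids from a semantic model ('stuff').
    instance_masks: list of (H, W) bool arrays, highest score first ('things').
    instance_classes: class id of each instance.
    Returns (segment_ids, segments): an (H, W) map of segment ids and a dict
    mapping segment id -> (class id, is_thing).
    """
    segment_ids = np.zeros(semantic.shape, dtype=np.int32)
    segments = {}
    next_id = 1
    # Things first: each instance claims the pixels nobody has claimed yet.
    for mask, cls in zip(instance_masks, instance_classes):
        free = mask & (segment_ids == 0)
        if free.any():
            segment_ids[free] = next_id
            segments[next_id] = (int(cls), True)
            next_id += 1
    # Stuff fills the remaining pixels, one segment per semantic class.
    for cls in np.unique(semantic[segment_ids == 0]):
        region = (semantic == cls) & (segment_ids == 0)
        segment_ids[region] = next_id
        segments[next_id] = (int(cls), False)
        next_id += 1
    return segment_ids, segments

semantic = np.array([[0, 0, 1],
                     [0, 1, 1]])
person_a = np.array([[True, False, False],
                     [True, False, False]])
person_b = np.array([[True, True, False],
                     [False, False, False]])
segment_ids, segments = merge_panoptic(semantic, [person_a, person_b], [7, 7])
print(segment_ids)
```

Every pixel ends up with exactly one segment id, and two objects of the same class keep separate ids.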
“Fully Convolutional” networks draw segmentation masks
All layers in the network are convolutional. There is no fully connected (aka “dense”) layer like in most classifiers. Instead, we use the local info of each pixel's neighborhood.
What is a convolution filter?
https://setosa.io/ev/image-kernels/
Convolution of 3x3 and stride = 1 without padding
Effect: the output loses one pixel on each border (two pixels per dimension)
Convolution of 3x3 and stride = 1 with zero padding
Effect: the output preserves the original image size
Convolution of 3x3 and stride = 2 with zero padding
Effect: the output is downsampled to about half its size
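The three cases follow from the usual output-size formula, floor((n + 2p - k) / s) + 1, for input size n, kernel size k, padding p and stride s. The sketch below checks it with a deliberately naive single-channel convolution. The 6x6 input and the box-blur kernel are arbitrary choices for illustration, and a zero padding of 1 pixel is assumed for the padded cases.

```python
import numpy as np

def conv_output_size(n, kernel=3, stride=1, padding=0):
    """Output length along one dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

def conv2d(image, kernel, stride=1, padding=0):
    """Naive single-channel 2D convolution (cross-correlation, as in deep learning libraries)."""
    image = np.pad(image, padding)          # zero padding on every side
    k = kernel.shape[0]
    out_h = (image.shape[0] - k) // stride + 1
    out_w = (image.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = (patch * kernel).sum()
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
box_blur = np.full((3, 3), 1 / 9)

print(conv2d(image, box_blur).shape)                       # (4, 4)
print(conv2d(image, box_blur, padding=1).shape)            # (6, 6)
print(conv2d(image, box_blur, stride=2, padding=1).shape)  # (3, 3)
```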
U-net for semantic segmentation
All layers in the network are convolutional. There is no fully connected (aka “dense”) layer like in most classifiers. We need this fully convolutional architecture to label images pixel by pixel while preserving their local info.
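A shape-level sketch of the U-net idea: downsample in an encoder, upsample in a decoder, and concatenate the matching encoder features (the skip connection) before predicting per-pixel classes. To stay short it makes several simplifications that are assumptions of this sketch, not of U-net itself: 1x1 convolutions with random weights stand in for the 3x3 convolution blocks, there are no nonlinearities, and there is only one down/up step.

```python
import numpy as np

def max_pool_2x2(x):
    """x: (C, H, W) with even H and W; halves the spatial size."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample_2x(x):
    """Nearest-neighbor upsampling; doubles the spatial size."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, weight):
    """weight: (C_out, C_in); mixes channels independently at every pixel."""
    return np.einsum("oc,chw->ohw", weight, x)

rng = np.random.default_rng(0)
image = rng.normal(size=(3, 8, 8))                           # RGB input, 8x8

enc1 = conv1x1(image, rng.normal(size=(4, 3)))               # (4, 8, 8)
enc2 = conv1x1(max_pool_2x2(enc1), rng.normal(size=(8, 4)))  # (8, 4, 4)

up = upsample_2x(enc2)                                       # (8, 8, 8)
merged = np.concatenate([up, enc1], axis=0)                  # skip connection: (12, 8, 8)
logits = conv1x1(merged, rng.normal(size=(2, 12)))           # (2, 8, 8): 2 classes per pixel
print(logits.shape)
```

The output keeps the 8x8 spatial size of the input, so every pixel receives its own class scores.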
Image pyramids
Image Pyramids in Feature Pyramid Networks (FPNs)
Convolutional networks implement “pyramids”
The deeper we go into the network, the more semantic information is packed into feature maps with smaller x,y dimensions
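As a concrete illustration, assuming a ResNet-50 backbone and an input whose sides are divisible by 32: the stages commonly called C2 to C5 have strides 4, 8, 16 and 32 relative to the input and 256, 512, 1024 and 2048 channels. The helper name below is hypothetical.

```python
stages = ["C2", "C3", "C4", "C5"]
strides = [4, 8, 16, 32]
channels = [256, 512, 1024, 2048]   # ResNet-50 stage widths

def feature_map_shapes(height, width):
    """(channels, height, width) of each backbone stage for a given input size."""
    return {
        name: (c, height // s, width // s)
        for name, s, c in zip(stages, strides, channels)
    }

shapes = feature_map_shapes(224, 224)
for name, shape in shapes.items():
    print(name, shape)
```

For a 224x224 input the spatial side shrinks from 56 to 7 while the channel count grows from 256 to 2048.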
Resnets
Nearest neighbor interpolation
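A minimal sketch of nearest neighbor upsampling on a single-channel array: every pixel is copied into a block of `factor` x `factor` pixels, so no new values are invented. The function name is an illustrative choice.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """x: (H, W) array; each pixel becomes a factor x factor block."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

small = np.array([[1, 2],
                  [3, 4]])
print(upsample_nearest(small))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```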
Resnets in feature pyramid networks
1x1 convolution
Feature Pyramid Networks
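The top-down pathway of a feature pyramid network can be sketched with the two ingredients named on the previous slides: a 1x1 convolution brings each backbone map to a common channel count, and nearest neighbor upsampling lets a coarser map be added to the next finer one. This sketch uses random weights and omits the 3x3 smoothing convolution that FPNs usually apply after each merge. All names and sizes are illustrative assumptions.

```python
import numpy as np

def conv1x1(x, weight):
    """x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def upsample_2x(x):
    """Nearest-neighbor upsampling of a (C, H, W) map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(features, lateral_weights):
    """
    features: backbone maps ordered fine to coarse, each (C_i, H_i, W_i),
              every level half the spatial size of the previous one.
    lateral_weights: one (D, C_i) matrix per level for the 1x1 lateral conv.
    Returns the pyramid maps, fine to coarse, all with D channels.
    """
    laterals = [conv1x1(f, w) for f, w in zip(features, lateral_weights)]
    outputs = [laterals[-1]]                 # start from the coarsest level
    for lateral in reversed(laterals[:-1]):
        outputs.append(lateral + upsample_2x(outputs[-1]))
    return outputs[::-1]

rng = np.random.default_rng(0)
features = [rng.normal(size=(8, 16, 16)),
            rng.normal(size=(16, 8, 8)),
            rng.normal(size=(32, 4, 4))]
lateral_weights = [rng.normal(size=(4, f.shape[0])) for f in features]
pyramid = fpn_top_down(features, lateral_weights)
print([p.shape for p in pyramid])
```

Each pyramid level keeps its own resolution but now carries information from the deeper, more semantic levels.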
Mask R-CNN
The COCO dataset
http://cocodataset.org/#explore
The Google Open Images Dataset
https://storage.googleapis.com/openimages/web/index.html
https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=segmentation&r=false&c=%2Fm%2F03g8mr
https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=detection&c=%2Fm%2F04rmv
Detectron2
detectron2/MODEL_ZOO.md at master · facebookresearch/detectron2 · GitHub
Inference (Colab notebook)
Training (Colab notebook)
Generating validation set plots
Panoptic segmentation with feature pyramid network (FPN-50)
Detectron2 config files
https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
Model output format
https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
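The linked documentation describes the instance output as an `Instances` object with fields such as `scores`, `pred_classes` and `pred_masks` (one mask of shape H x W per detection). The sketch below shows a typical post-processing step on that kind of output. It uses plain NumPy arrays as stand-ins for the tensors, so it runs without Detectron2. The helper `filter_instances` and the example values are hypothetical.

```python
import numpy as np

def filter_instances(scores, pred_classes, pred_masks, threshold=0.5):
    """Keep detections whose score is at least `threshold`."""
    keep = scores >= threshold
    return scores[keep], pred_classes[keep], pred_masks[keep]

# Stand-in arrays with the documented shapes: N = 3 detections on a 4x4 image.
scores = np.array([0.98, 0.40, 0.75])
pred_classes = np.array([0, 16, 0])
pred_masks = np.zeros((3, 4, 4), dtype=bool)
pred_masks[0, :2, :2] = True
pred_masks[1, 2:, 2:] = True
pred_masks[2, 1:3, 1:3] = True

kept_scores, kept_classes, kept_masks = filter_instances(scores, pred_classes, pred_masks)
areas = kept_masks.sum(axis=(1, 2))   # pixel count of each remaining mask
print(kept_classes, areas)
```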
Example Case: Image Matting
Objective
Using a Unet
Matting algorithm: [figure not preserved]
Using trimap or instance segments?
[Figure panel label: instance segment]
Results
Photo: Ayo Ogunseinde
https://unsplash.com/photos/THIs-cpyebg
Photo: Eugen Proskouriakov
https://unsplash.com/photos/C-gvAA8q3Tc
Photo: Mathieu Renier
https://unsplash.com/photos/4WBvCqeMaDE
Photo: Gulyás Bianka
https://unsplash.com/photos/3WOh54znPGU
For more examples: withoutbg.com
Which things should be kept in this picture?
Kid, ball, 2 dogs, 9 people?
Photo: Treddy Chen
https://unsplash.com/photos/UdQWvefOXJk
Issue: When there is more than one person in the image...
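The slides do not spell out how the matte is used, but image matting is commonly framed with the compositing equation I = alpha * F + (1 - alpha) * B, where alpha is a per-pixel opacity in [0, 1]. A minimal sketch under that assumption, with made-up pixel values and a hypothetical function name:

```python
import numpy as np

def composite(foreground, background, alpha):
    """
    foreground, background: (H, W, 3) float arrays in [0, 1].
    alpha: (H, W) float matte in [0, 1]; 1 keeps the foreground pixel.
    """
    a = alpha[..., None]                  # broadcast over the color channels
    return a * foreground + (1.0 - a) * background

foreground = np.full((2, 2, 3), 0.8)
background = np.zeros((2, 2, 3))
alpha = np.array([[1.0, 0.5],
                  [0.0, 0.25]])
result = composite(foreground, background, alpha)
print(result[..., 0])
```

Unlike a hard segmentation mask, the matte can take fractional values, which matters for hair and other soft edges.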
Review questions
- How do we compute the confusion matrix for a segmentation mask? How do we
compute it for a bounding box?
- Can we use the Intersection over Union equation to evaluate the quality of a
segmentation mask?
- What’s the recall of a classifier that only outputs ‘1’ (positive class)?
- What’s the precision of a classifier that outputs a single true positive, with all its
other predictions being equal to ‘0’ (negative class)?
- Why does precision go down when recall increases?
- Does the F1 measure weigh precision and recall equally?
- What’s the appeal of using Detectron2? Do we need to write a PyTorch model to
use it for inference or training?
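When working through the metric questions, it can help to compute the quantities on a toy example. The sketch below treats two binary masks pixel by pixel and derives the confusion-matrix counts, IoU, precision, recall and F1. The masks and the function name are made up for illustration.

```python
import numpy as np

def mask_metrics(pred, target):
    """pred, target: boolean arrays of the same shape (binary masks)."""
    tp = int(np.sum(pred & target))
    fp = int(np.sum(pred & ~target))
    fn = int(np.sum(~pred & target))
    tn = int(np.sum(~pred & ~target))
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "iou": iou, "precision": precision, "recall": recall, "f1": f1}

target = np.zeros((4, 4), dtype=bool)
target[:2, :2] = True          # ground truth: top-left 2x2 block
pred = np.zeros((4, 4), dtype=bool)
pred[:2, 1:3] = True           # prediction: same block shifted right by one pixel

metrics = mask_metrics(pred, target)
print(metrics)
```

The same function can be called with an all-True prediction to explore the question about a classifier that only outputs the positive class.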
Google Colab Notebooks
● Unet in FastAI 2
● Mask R-CNN and Panoptic Segmentation with Detectron2
Review questions
- How does panoptic segmentation combine instance and semantic
segmentation? Which method produces the ‘stuff’? Which method produces
the ‘things’?
- Is semantic segmentation more computationally costly than instance
segmentation? Why?
- Is panoptic segmentation more computationally costly than instance
segmentation? Why?
References
● Stanford’s cs231n lecture on Object Detection and Segmentation
● PyImageSearch tutorial on Mask R-CNN
