Prototype Mixture Models
for Few-shot Semantic Segmentation
University of Chinese Academy of Sciences, Beijing, China
Yonsei University Severance Hospital CCIDS
Choi Dongmin
Abstract
• Few-shot segmentation

- challenging

- single prototype from the support image causes semantic ambiguity
• Prototype mixture models (PMMs)

- correlate diverse image regions with multiple prototypes

- leverage the semantics to activate objects in the query image

- S.O.T.A on Pascal VOC and MS-COCO

Introduction
Nguyen et al. Feature Weighting and Boosting for Few-Shot Segmentation. ICCV 2019
Few-shot Segmentation
Segmenting the Query image based on a feature representation learned on training images
given Support images and the related segmentation Support masks
Introduction
Single Prototype Model vs Prototype Mixture Model
A single prototype causes "semantic ambiguity" and deteriorates the distribution of features.

PMMs focus on solving the semantic ambiguity problem.
Introduction
Prototype Mixture Model
Expectation-Maximization (EM) algorithm

treats each feature vector within the mask region as a positive sample
Mixed prototypes
Diverse foreground regions
Related Works
Semantic Segmentation
Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. TPAMI 2018
S.O.T.A methods : UNet, PSPNet, DeepLab
Related Works
Few-shot learning
• Metric Learning

- train networks to predict whether two images/regions belong to the
same category
• Meta-learning

- specify optimization or loss functions which force faster adaptation
of the parameters to new categories with few examples

• Data Augmentation

- generate additional examples for unseen categories
Related Works
Few-shot learning
• Metric Learning

Chen et al. A CLOSER LOOK AT FEW-SHOT CLASSIFICATION. ICLR 2019
simple prototypes for each class, which capture representative and discriminative features
Related Works
Few-shot Segmentation
• Largely following the Metric Learning framework

- Feed learned knowledge to a metric module to segment query images
Shaban et al. One-Shot Learning for Semantic Segmentation. BMVC 2017
OSLSM (two-branch network)
Support branch
Query branch
Related Works
Few-shot Segmentation
• Largely following the Metric Learning framework

- Feed learned knowledge to a metric module to segment query images
Zhang et al. SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation. CoRR abs/1810.09091 (2018)
SG-One, which uses a prototype vector
Prototype vector
Related Works
Few-shot Segmentation
• Largely following the Metric Learning framework

- Feed learned knowledge to a metric module to segment query images
Wang et al. PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment. ICCV 2019
PANet, with a prototype alignment regularization between support and query branches
Related Works
Few-shot Segmentation
• Metric Learning in few-shot segmentation

- The core component is the prototype vector, which is commonly computed by global average pooling (GAP)

- However, GAP typically disregards the spatial extent of objects and tends to mix semantics from various parts

- Existing methods use a single prototype to represent object regions, so the semantic ambiguity problem remains unsolved
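The mixing effect of a single pooled prototype can be illustrated with a toy example. This is a minimal sketch, assuming the prototype is a masked average over the support feature map; the function name `masked_average_prototype` and the toy features are hypothetical and not taken from the paper.

```python
import numpy as np

def masked_average_prototype(features, mask):
    """Average the feature vectors inside the mask into one prototype.

    features: (W, H, C) support feature map.
    mask:     (W, H) binary foreground mask at feature resolution.
    """
    mask = mask.astype(features.dtype)
    total = (features * mask[..., None]).sum(axis=(0, 1))
    return total / max(mask.sum(), 1.0)

# Toy support features: the object has two parts with orthogonal features.
features = np.zeros((2, 2, 2))
features[0, 0] = [1.0, 0.0]   # part A
features[0, 1] = [0.0, 1.0]   # part B
mask = np.array([[1, 1], [0, 0]])

prototype = masked_average_prototype(features, mask)
print(prototype)  # [0.5 0.5]
```

The pooled vector sits halfway between the two parts and matches neither of them exactly, which is the kind of semantic mixing the bullets above describe.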
The Proposed Approach
Overview
Support branch
Query branch
Negative sample set S−
Positive sample set S+
Activate query features in a duplex way (P-Match and P-Conv)
The Proposed Approach
Prototype Mixture Models
The support features $S \in \mathbb{R}^{W \times H \times C}$ are spatially partitioned into
foreground samples $S^+$ and background samples $S^-$
($S^+$: feature vectors within the mask of the support image)
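The partition into foreground and background samples can be sketched as a simple boolean indexing step. This is an illustration only: it assumes the support mask has already been resized to the W x H feature resolution, and `partition_features` is a hypothetical helper name.

```python
import numpy as np

def partition_features(features, mask):
    """Split a (W, H, C) feature map into foreground and background samples.

    Returns S_pos with shape (N_fg, C) and S_neg with shape (N_bg, C).
    """
    flat = features.reshape(-1, features.shape[-1])   # (W*H, C)
    is_fg = mask.reshape(-1).astype(bool)
    return flat[is_fg], flat[~is_fg]

rng = np.random.default_rng(0)
S = rng.normal(size=(4, 5, 8))          # W=4, H=5, C=8
mask = np.zeros((4, 5), dtype=int)
mask[1:3, 1:4] = 1                      # 6 foreground locations

S_pos, S_neg = partition_features(S, mask)
print(S_pos.shape, S_neg.shape)
```

Every spatial location ends up in exactly one of the two sets, so the sample counts add up to W x H.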
The Proposed Approach
Prototype Mixture Models
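PMMs : a probability mixture model

$p(s_i|\theta) = \sum_{k=1}^{K} w_k \, p_k(s_i|\theta)$

- $w_k$ ($0 \le w_k \le 1$, $\sum_{k=1}^{K} w_k = 1$) : the mixing weights

- $\theta$ : the model parameters

- $s_i \in S$ : the $i$-th feature sample

- $p_k(s_i|\theta)$ : the $k$-th base model, which is a probability model
based on a Kernel distance function (vector distance)

$p_k(s_i|\theta) = \beta(\theta) \, e^{\mathrm{Kernel}(s_i, \mu_k)} = \beta_c(\kappa) \, e^{\kappa \mu_k^T s_i}$

- $\beta_c(\kappa) = \frac{\kappa^{c/2-1}}{(2\pi)^{c/2} I_{c/2-1}(\kappa)}$ : normalization constant

- $\mu_k \in \theta$ : one of the parameters

* $\theta = \{\mu, \kappa\}$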
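As a sanity check on the base model, the special case c = 2 can be evaluated with NumPy alone, because the normalization constant then reduces to 1 / (2π I_0(κ)) and `np.i0` provides the order-0 modified Bessel function of the first kind. The sketch below is illustrative: the two mean vectors, the weights and κ = 5 are made-up values, and real feature maps have far more than two channels.

```python
import numpy as np

def vmf_density_2d(s, mu, kappa):
    """Base model p_k(s | theta) for c = 2: beta_c(kappa) * exp(kappa * mu^T s)."""
    beta = 1.0 / (2.0 * np.pi * np.i0(kappa))
    return beta * np.exp(kappa * (s @ mu))

def mixture_density(s, mus, weights, kappa):
    """p(s | theta) = sum_k w_k * p_k(s | theta)."""
    comps = np.stack([vmf_density_2d(s, mu, kappa) for mu in mus], axis=-1)
    return comps @ weights

# Unit vectors sampled densely on the circle (the sphere for c = 2).
angles = np.linspace(0.0, 2.0 * np.pi, 20000, endpoint=False)
samples = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # (N, 2)

mus = np.array([[1.0, 0.0], [0.0, 1.0]])   # two unit-norm mean vectors
weights = np.array([0.3, 0.7])             # mixing weights, sum to 1

density = mixture_density(samples, mus, weights, kappa=5.0)
total = density.mean() * 2.0 * np.pi       # numerical integral over the circle
print(total)
```

The numerical integral comes out very close to 1, which confirms that the weighted sum of normalized base models is itself a valid density.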
The Proposed Approach
Prototype Mixture Models
Model Learning using EM algorithm
E-step :
Given the model parameters and the extracted sample features,
compute the expectation of the sample $s_i$

$E_{ik} = \frac{p_k(s_i|\theta)}{\sum_{k=1}^{K} p_k(s_i|\theta)} = \frac{e^{\kappa \mu_k^T s_i}}{\sum_{k=1}^{K} e^{\kappa \mu_k^T s_i}}$

M-step :
The expectation is used to update the mean vectors of PMMs

$\mu_k = \frac{\sum_{i=1}^{N} E_{ik} s_i}{\sum_{i=1}^{N} E_{ik}}$

($N = W \times H$ is the number of samples)
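The two EM steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the random initialization from the samples, the value κ = 20, K = 2 and the L2 normalization of samples and means are all assumptions made here so that the inner product behaves like the kernel above; only the default of 10 iterations comes from the experimental setup in the slides.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def e_step(samples, mu, kappa):
    """E_ik = exp(kappa * mu_k^T s_i) / sum_k exp(kappa * mu_k^T s_i)."""
    logits = kappa * (samples @ mu.T)               # (N, K)
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability only
    resp = np.exp(logits)
    return resp / resp.sum(axis=1, keepdims=True)

def em_prototypes(samples, num_prototypes=2, kappa=20.0, num_iters=10, seed=0):
    """Estimate K prototype (mean) vectors from (N, C) samples."""
    rng = np.random.default_rng(seed)
    samples = l2_normalize(samples)                 # assumption: unit-norm features
    init = rng.choice(len(samples), size=num_prototypes, replace=False)
    mu = samples[init].copy()                       # assumption: init from samples
    for _ in range(num_iters):
        resp = e_step(samples, mu, kappa)
        # M-step: mu_k = sum_i E_ik s_i / sum_i E_ik
        mu = (resp.T @ samples) / (resp.sum(axis=0)[:, None] + 1e-12)
        mu = l2_normalize(mu)                       # assumption: keep means unit-norm
    return mu, e_step(samples, mu, kappa)

# Toy foreground samples drawn around two distinct "object parts".
rng = np.random.default_rng(1)
part_a = rng.normal(loc=[5.0, 0.0, 0.0], scale=0.1, size=(50, 3))
part_b = rng.normal(loc=[0.0, 5.0, 0.0], scale=0.1, size=(50, 3))
S_pos = np.concatenate([part_a, part_b])

mu_pos, E = em_prototypes(S_pos, num_prototypes=2)
print(mu_pos.shape, E.shape)
```

Running the same routine on the background samples would give the negative prototypes; each row of `E` sums to 1 because it is a soft assignment of one sample over the K prototypes.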
The Proposed Approach
Prototype Mixture Models
Model Learning using EM algorithm
The mean vectors $\mu^+ = \{\mu_k^+, k = 1, \dots, K\}$ and
$\mu^- = \{\mu_k^-, k = 1, \dots, K\}$ are used as
prototype vectors to extract convolution features
for the query image.

Such a prototype vector can represent
a region around an object part
The Proposed Approach
Prototype Mixture Models
PMMs as Representation (P-Match)
$\mu^+$ squeezes representation information about an object part
and can be used to match and activate the query features $Q$

$Q' = \text{P-Match}(\mu_k^+, Q), \quad k = 1, \dots, K$
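The slides do not spell out the P-Match operator itself. A minimal sketch, assuming it follows the tile-and-concatenate pattern used by CANet-style models (each positive prototype is broadcast over the query grid and concatenated with the query features along the channel axis, before any further convolution); `p_match` and the shapes below are hypothetical.

```python
import numpy as np

def p_match(prototype, query):
    """Tile one (C,) prototype over the (W, H, C) query grid and concatenate.

    Assumed behaviour for illustration; returns an array of shape (W, H, 2C).
    """
    w, h, c = query.shape
    tiled = np.broadcast_to(prototype, (w, h, c))
    return np.concatenate([query, tiled], axis=-1)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 4, 8))          # query features, W=H=4, C=8
mu_pos = rng.normal(size=(3, 8))        # K=3 positive prototypes

activated = [p_match(mu_k, Q) for mu_k in mu_pos]
print(len(activated), activated[0].shape)
```

In a full model the concatenated tensor would then pass through learned convolution layers; that part is omitted here.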
The Proposed Approach
Prototype Mixture Models
PMMs as Classifiers (P-Conv)
Each prototype vector, incorporating discriminative information
across feature channels, can be seen as a classifier,
which produces probability maps $M_k = \{M_k^+, M_k^-\}$

$M_k = \text{P-Conv}(\mu_k^+, \mu_k^-, Q), \quad k = 1, \dots, K$
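One way to read "prototype as classifier" is to use each prototype as a 1 x 1 convolution kernel over the query features. The sketch below makes that assumption and additionally normalizes each foreground/background pair with a softmax; the pairing, the softmax and the name `p_conv` are illustrative choices, not details confirmed by the slides.

```python
import numpy as np

def p_conv(mu_pos_k, mu_neg_k, query):
    """Score every query location against one positive and one negative prototype.

    query: (W, H, C). Returns (M_pos, M_neg), each of shape (W, H).
    """
    kernels = np.stack([mu_pos_k, mu_neg_k])        # (2, C), acts like a 1x1 conv
    scores = query @ kernels.T                      # (W, H, 2)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability only
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)      # assumed softmax over the pair
    return probs[..., 0], probs[..., 1]

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 4, 8))
mu_pos = rng.normal(size=(3, 8))
mu_neg = rng.normal(size=(3, 8))

maps = [p_conv(mu_pos[k], mu_neg[k], Q) for k in range(3)]
M_pos_0, M_neg_0 = maps[0]
print(M_pos_0.shape)
```

Because of the softmax, the foreground and background maps of each pair sum to 1 at every location.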
The Proposed Approach
Prototype Mixture Models
P-Match and P-Conv
The semantic info across channels and discriminative info related to object
parts are collected from the support features $S$ to activate the query features $Q$
The Proposed Approach
Residual Prototype Mixture Models
Ensemble by stacking multiple PMMs

to further enhance the model representative capacity
Experiments
• Baseline : CANet w/o iterative optimization

• Data Augmentation

: normalization, horizontal flipping, random cropping and random resizing

• PyTorch 1.0 & NVIDIA 2080Ti GPUs

• The EM algorithm iterates 10 rounds

• Optimization

: Cross-entropy Loss with SGD (init lr = 0.0035, momentum 0.9,

200,000 iterations, 8 pairs of support-query images per batch),

LR decay following DeepLab’s policy

• For each training step, the categories in the train split are randomly selected
and then the support-query pairs are randomly sampled in the selected
categories.
Zhang et al. CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning. CVPR 2019

Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. TPAMI 2018
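The learning-rate decay mentioned in the optimization bullet can be sketched as follows, assuming "DeepLab's policy" refers to the polynomial ("poly") schedule. The initial rate 0.0035 and the 200,000 iterations come from the slide; the exponent 0.9 is the value commonly used with DeepLab and is an assumption here.

```python
def poly_lr(iteration, base_lr=0.0035, max_iters=200_000, power=0.9):
    """Poly schedule: lr = base_lr * (1 - iteration / max_iters) ** power."""
    return base_lr * (1.0 - iteration / max_iters) ** power

for it in (0, 100_000, 200_000):
    print(it, poly_lr(it))
```

The rate starts at the initial value and decays smoothly to zero at the final iteration.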
Experiments
• Dataset

- Pascal-$5^i$ : 20 object categories are partitioned into 4 splits,
with 3 for training and 1 for testing

- COCO-$20^i$ : 80 classes are divided into 4 splits of 20 classes each,
and the val dataset is used for evaluation

• Evaluation Metric : mIoU
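The split protocol and the metric can be sketched as below. The contiguous assignment of class IDs to folds (fold i tests on classes 5i+1 to 5i+5, with IDs starting at 1) is an assumption about the convention, and both helper names are hypothetical.

```python
import numpy as np

def pascal_5i_split(fold, num_classes=20, num_folds=4):
    """Return (train_classes, test_classes) for one fold, using 1-based class IDs."""
    per_fold = num_classes // num_folds
    test = list(range(fold * per_fold + 1, (fold + 1) * per_fold + 1))
    train = [c for c in range(1, num_classes + 1) if c not in test]
    return train, test

def binary_iou(pred, target):
    """Intersection over union between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0   # both masks empty: treated as a perfect match (a convention)
    return np.logical_and(pred, target).sum() / union

train_classes, test_classes = pascal_5i_split(fold=0)
print(test_classes)  # [1, 2, 3, 4, 5]

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(binary_iou(pred, target))  # 0.5
```

mIoU is then the mean of the per-class IoU values over the test classes of a fold.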
Experiments
Ablation Study
Experiments
Performance
Conclusion
• PMMs

- correlate diverse image regions with multiple prototypes to solve the
semantic ambiguity problem

- During training, PMMs incorporate rich channel-wise and spatial
semantics from limited support images

- During inference, PMMs are matched with query features in a duplex
manner to perform accurate semantic segmentation

- S.O.T.A of few-shot segmentation

- Capture the diverse semantics of object parts given few support
examples
Thank you

