CutMix - Regularization Strategy to Train Strong
Classifiers with Localizable Features
Changjin Lee
Intro
● Many data augmentation and regularization methods have been proposed for vision tasks
● Regularization by random feature removal works reasonably well
○ dropout
○ regional dropout - remove random spatial regions
● However, regional dropout throws away the removed pixels - a severe conceptual limitation
❖ How can the deleted regions be put to use while keeping the generalization and localization benefits of
regional dropout?
❖ The paper addresses this with CutMix, which fills the deleted region with a patch from another image.
CutMix Intro
● Cut out a region and replace it with a patch from another image
● The ground truth labels are mixed as well, in proportion to the
share of pixels each of the two images contributes
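A small worked example of the label mixing may help. This is only a sketch: `mix_labels` and the two-class "dog"/"cat" setup are hypothetical names chosen for illustration, not taken from the paper.

```python
def mix_labels(label_a, label_b, patch_area, image_area):
    """Mix two one-hot labels by the share of pixels each image contributes."""
    lam = 1.0 - patch_area / image_area  # share of pixels kept from image A
    return [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]


# Hypothetical example: a 112x112 "cat" patch pasted into a 224x224 "dog"
# image covers one quarter of the pixels.
dog = [1.0, 0.0]
cat = [0.0, 1.0]
mixed = mix_labels(dog, cat, patch_area=112 * 112, image_area=224 * 224)
print(mixed)  # [0.75, 0.25]
```

The mixed label stays a valid distribution: its entries sum to 1.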
Advantages
● No information loss - every pixel of the training image carries image content
● Better localization - the model must learn to identify an
object from a partial view
Comparison: Regional Dropout
● CutMix is similar to regional dropout in that both remove a region of the image
● Regional Dropout - randomly removes a region of the image
● CutMix - randomly removes a region and fills it with a patch from another image
Comparison: Synthesizing training data
● Synthesizing techniques such as Stylized ImageNet push the model to focus on shape rather than texture
● CutMix also generates new samples, but adds only trivial training cost
Comparison: Mixup
● Mixup samples are locally ambiguous and unnatural
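The difference between the three approaches is easiest to see on a toy image. The sketch below assumes single-channel images stored as nested lists and a box given as (y1, y2, x1, x2); the function names are illustrative only.

```python
def cutout(img, box):
    """Regional dropout: zero out the pixels inside box = (y1, y2, x1, x2)."""
    y1, y2, x1, x2 = box
    return [[0.0 if y1 <= y < y2 and x1 <= x < x2 else v
             for x, v in enumerate(row)] for y, row in enumerate(img)]


def mixup(img_a, img_b, lam):
    """Mixup: blend every pixel of the two images with weight lam."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def cutmix(img_a, img_b, box):
    """CutMix: fill the box with the matching pixels of the other image."""
    y1, y2, x1, x2 = box
    return [[img_b[y][x] if y1 <= y < y2 and x1 <= x < x2 else v
             for x, v in enumerate(row)] for y, row in enumerate(img_a)]


img_a = [[1.0] * 4 for _ in range(4)]  # toy 4x4 image, all ones
img_b = [[9.0] * 4 for _ in range(4)]  # toy 4x4 image, all nines
box = (0, 2, 0, 2)                     # top-left 2x2 region

print(cutout(img_a, box)[0])         # [0.0, 0.0, 1.0, 1.0]
print(mixup(img_a, img_b, 0.75)[0])  # [3.0, 3.0, 3.0, 3.0]
print(cutmix(img_a, img_b, box)[0])  # [9.0, 9.0, 1.0, 1.0]
```

Regional dropout leaves a hole, Mixup changes every pixel, and CutMix keeps each pixel identical to one of the two source images.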
Complementary to other training techniques
● CutMix complements weight decay, batch normalization, and noise injection
● CutMix operates only at the data level
CutMix Algorithm
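training images: $x_A$, $x_B$ of size (W*H*C), with labels $y_A$, $y_B$
binary mask: $M$ of size (W*H), 0 inside the box B and 1 elsewhere
new training sample: $\tilde{x} = M \odot x_A + (1 - M) \odot x_B$
new label: $\tilde{y} = \lambda y_A + (1 - \lambda) y_B$
combination ratio: $\lambda$ sampled from Beta(ɑ,ɑ) -> ɑ=1, so $\lambda$ is uniform on (0, 1)
box B: covers a share $1 - \lambda$ of the image area; the remaining share $\lambda$ keeps the pixels of $x_A$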
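The sampling of the combination ratio, the box B and the binary mask can be sketched with the standard library alone. This is an illustration under assumptions, not the paper's code: images are single-channel nested lists (the C axis is dropped), the box is drawn around a uniformly sampled centre and clipped at the borders, and all function names are hypothetical.

```python
import math
import random


def sample_box(width, height, lam, rng=random):
    """Sample a box B whose area is about (1 - lam) of the image."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w, cut_h = int(width * cut_ratio), int(height * cut_ratio)
    cx, cy = rng.randrange(width), rng.randrange(height)  # box centre
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    return x1, y1, x2, y2


def box_mask(width, height, box):
    """Binary mask M: 0 inside B (pixels come from x_B), 1 elsewhere."""
    x1, y1, x2, y2 = box
    return [[0 if x1 <= x < x2 and y1 <= y < y2 else 1
             for x in range(width)] for y in range(height)]


def apply_mask(x_a, x_b, mask):
    """Element-wise M * x_A + (1 - M) * x_B."""
    return [[m * a + (1 - m) * b for a, b, m in zip(row_a, row_b, row_m)]
            for row_a, row_b, row_m in zip(x_a, x_b, mask)]


rng = random.Random(0)           # seeded only to make the sketch repeatable
W, H = 8, 8
lam = rng.betavariate(1.0, 1.0)  # alpha = 1
box = sample_box(W, H, lam, rng)
mask = box_mask(W, H, box)
# Clipping can shrink the box, so recompute the ratio from the mask itself
# before mixing the labels (an assumption of this sketch).
lam_adjusted = sum(map(sum, mask)) / (W * H)
```

Recomputing the ratio from the mask keeps the label weights equal to the actual pixel shares, even when the box is cut off at an image border.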
CutMix on Class Activation Map (CAM)
● CAMs computed with a vanilla ResNet-50
● Cutout makes the model attend to less discriminative parts, such as the belly
● Mixup uses every pixel, but the blended image is unnatural and
confuses the model about which object to attend to
● This confusion results in suboptimal performance
● CutMix successfully localizes both objects
Performances
Performance: ImageNet
Performance: CIFAR-10
Layer 0: input level (best)
Layer 1: after conv-bn
Layer 2: after layer 1
…
Variations of CutMix degrade performance
-> The original, input-level CutMix is the best!
PyTorch Implementation
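A minimal PyTorch sketch of one CutMix step follows. It is an illustration under assumptions, not the official implementation linked in the references: the function names are hypothetical, the batch layout is assumed to be (N, C, H, W), and targets are assumed to be integer class indices.

```python
import math
import random

import torch
import torch.nn.functional as F


def rand_bbox(height, width, lam):
    """Sample a box covering roughly (1 - lam) of an image, clipped to its borders."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = random.randrange(height), random.randrange(width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return y1, y2, x1, x2


def cutmix_batch(images, targets, alpha=1.0):
    """Mix each image in an (N, C, H, W) batch with a randomly chosen partner."""
    lam = random.betavariate(alpha, alpha)
    perm = torch.randperm(images.size(0), device=images.device)
    height, width = images.shape[-2:]
    y1, y2, x1, x2 = rand_bbox(height, width, lam)
    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    # Recompute lam from the clipped box so labels match the actual pixel ratio.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (height * width)
    return mixed, targets, targets[perm], lam


def cutmix_loss(logits, targets_a, targets_b, lam):
    """Cross-entropy against both labels, weighted by the pixel ratio."""
    return (lam * F.cross_entropy(logits, targets_a)
            + (1.0 - lam) * F.cross_entropy(logits, targets_b))
```

In a training loop this would be used roughly as `mixed, ta, tb, lam = cutmix_batch(x, y)` followed by `loss = cutmix_loss(model(mixed), ta, tb, lam)`, where `model` stands for any classifier. Weighting two cross-entropy terms is equivalent to training against the mixed soft label.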
Dive Deeper…
❖ Random crops sometimes paste in uninformative patches (e.g. background only), which likely hurts performance
➢ Possible improvements: 1) use object detection to choose the patch (inefficient?), 2) limit the crop size, e.g. λ ~ U(0.5, 0.8)?
❖ Weakly Supervised Object Localization
❖ Image Captioning
(Figure: a bad and a good CutMix sample)
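The second idea, limiting the crop size, could be tried by changing only how the combination ratio is drawn. The sketch below assumes λ keeps its meaning from the algorithm slide (the share of the original image that is kept), so U(0.5, 0.8) would restrict the pasted patch to between 20% and 50% of the image area. `sample_lambda` is a hypothetical helper, and whether this actually helps is untested here.

```python
import random


def sample_lambda(low=0.5, high=0.8, rng=random):
    """Draw lambda from U(low, high) instead of Beta(1, 1), i.e. U(0, 1)."""
    return rng.uniform(low, high)


rng = random.Random(0)  # seeded only to make the sketch repeatable
lams = [sample_lambda(rng=rng) for _ in range(1000)]
patch_fractions = [1.0 - lam for lam in lams]  # share of the area that is pasted
print(min(patch_fractions), max(patch_fractions))
```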
References
● https://arxiv.org/pdf/1905.04899.pdf
● https://github.com/clovaai/CutMix-PyTorch
Blog Post
● https://jasonlee-cp.github.io/paper/CutMix/
