PR-045, 5th Nov, 2017
MVPLAB @ Yonsei Univ.
1. Fully Convolutional Networks for Semantic Segmentation
Arxiv 2014, CVPR 2015, TPAMI 2017
2. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Arxiv 2014, ICLR 2015
3. Multi-Scale Context Aggregation by Dilated Convolutions
Arxiv 2015, ICLR 2016
4. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Arxiv 2016, Accepted to TPAMI
5. Pyramid Scene Parsing Network
Arxiv 2016, CVPR 2017
6. Rethinking Atrous Convolution for Semantic Image Segmentation
Arxiv 2017
Short names used in this talk:
1. FCN
2. DeepLab
3. DilatedConv
4. DeepLab v2
5. PSPNet
6. DeepLab v3
Microsoft COCO: Common Objects in Context, Arxiv 2015
Pixel-level
Dense Prediction
Instance-level
Object Detection
Today
Slides from ICCV 17 COCO Challenge Workshop by FAIR
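• IoU (Intersection over Union) = TP / (TP + FP + FN)
[Figure: ground truth (GT) and prediction regions. Their overlap is the True Positive area, GT outside the prediction is False Negative, and prediction outside GT is False Positive.]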
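The IoU definition above can be checked on a toy label map. This is a minimal sketch in plain Python; the function name `iou_per_class` and the two tiny label grids are made up for illustration and are not taken from any benchmark toolkit.

```python
def iou_per_class(gt, pred, num_classes):
    """Return IoU = TP / (TP + FP + FN) for every class label."""
    flat_gt = [v for row in gt for v in row]
    flat_pred = [v for row in pred for v in row]
    ious = []
    for c in range(num_classes):
        pairs = list(zip(flat_gt, flat_pred))
        tp = sum(1 for g, p in pairs if g == c and p == c)  # both say class c
        fp = sum(1 for g, p in pairs if g != c and p == c)  # predicted c, wrong
        fn = sum(1 for g, p in pairs if g == c and p != c)  # missed class c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


# Hypothetical 2x4 ground truth and prediction with two classes.
gt = [[0, 0, 1, 1],
      [0, 0, 1, 1]]
pred = [[0, 1, 1, 1],
        [0, 0, 0, 1]]
ious = iou_per_class(gt, pred, num_classes=2)
mean_iou = sum(ious) / len(ious)
```

Averaging the per-class values gives a mean IoU, which is the kind of single number reported in the result tables of this talk.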
Pascal VOC 2012
Cityscapes
• Pixel Level Segmentation
• Instance Level Segmentation
“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs”, 2016
• Networks
• DilatedConv
• DeepLab
• PSPNet
• DeepLab v3
Baseline
“Fully Convolutional Networks for Semantic Segmentation”, 2014
VGG16 (Classification)
224x224x3 → Conv1, Pool1 → 112x112x64 → Conv2, Pool2 → 56x56x128 → Conv3, Pool3 → 28x28x256 → Conv4, Pool4 → 14x14x512 → Conv5, Pool5 → 7x7x512 → Fully Connected → Softmax
VGG16 (Classification): the Fully Connected and Softmax layers produce a 1x1 result, so location information is lost.
VGG16 (Segmentation)
224x224x3 → Conv1, Pool1 → 112x112x64 → Conv2, Pool2 → 56x56x128 → Conv3, Pool3 → 28x28x256 → Conv4, Pool4 → 14x14x512 → Conv5, Pool5 → 7x7x512 → 1x1x512 Conv → 7x7 Heatmap
VGG16 (Segmentation): 7x7 Heatmap → x32 Upsample → Softmax
Location information remains usable: this is a property of convolution.
VGG16 (Segmentation) on an arbitrary 32x32 image or patch
32x32x3 → Conv1, Pool1 → 16x16x64 → Conv2, Pool2 → 8x8x128 → Conv3, Pool3 → 4x4x256 → Conv4, Pool4 → 2x2x512 → Conv5, Pool5 → 1x1x512 → 1x1x512 Conv → 1x1 Heatmap → x32 Upsample → Softmax
Inputs of various sizes can be used, because what the network learns is conv filters.
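The size arithmetic in these diagrams can be reproduced with a short sketch. It assumes, as the diagrams suggest, that the conv layers preserve spatial size and that each of the five pooling stages halves it; the helper name `vgg16_feature_sizes` is hypothetical.

```python
def vgg16_feature_sizes(height, width, num_pools=5):
    """Spatial size after each 2x2, stride-2 pooling stage.

    Assumption: the conv layers are padded so that only pooling
    changes the spatial size.
    """
    sizes = [(height, width)]
    for _ in range(num_pools):
        height, width = height // 2, width // 2
        sizes.append((height, width))
    return sizes


sizes_224 = vgg16_feature_sizes(224, 224)  # ends at a 7x7 heatmap
sizes_32 = vgg16_feature_sizes(32, 32)     # ends at a 1x1 heatmap
```

Because no layer depends on a fixed input size, the same filters give a 7x7 heatmap for a 224x224 input and a 1x1 heatmap for a 32x32 patch.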
Every 32x32 patch: 32x32x3 → Conv1, Pool1 → 16x16x64 → Conv2, Pool2 → 8x8x128 → Conv3, Pool3 → 4x4x256 → Conv4, Pool4 → 2x2x512 → Conv5, Pool5 → 1x1x512 → 1x1 Heatmap
Skip connection: (1x1 Heatmap, x2 Upsample) + (Pool4 output, 1x1 Conv) → x16 Upsample → Softmax
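Every 32x32 patch: 32x32x3 → Conv1, Pool1 → 16x16x64 → Conv2, Pool2 → 8x8x128 → Conv3, Pool3 → 4x4x256 → Conv4, Pool4 → 2x2x512 → Conv5, Pool5 → 1x1x512 → 1x1 Heatmap
Skip connections (1x1 Conv on the skip branches): (1x1 Heatmap, x2 Upsample) + Pool4 branch → x2 Upsample, + Pool3 branch → x8 Upsample → Softmax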
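The two skip-connection diagrams can be sketched as a sequence of "upsample by 2, then add". This is only an illustration on single-channel score maps: nearest-neighbour upsampling is used here as a stand-in for the upsampling layers in the slides, and all function names and input values are hypothetical.

```python
def upsample2x(score):
    """Nearest-neighbour x2 upsampling of a 2-D score map (list of lists)."""
    out = []
    for row in score:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized score maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_stride8(score_pool5, score_pool4, score_pool3):
    """Fuse stride-32, stride-16 and stride-8 score maps."""
    fused16 = add_maps(upsample2x(score_pool5), score_pool4)  # stride 32 -> 16
    fused8 = add_maps(upsample2x(fused16), score_pool3)       # stride 16 -> 8
    return fused8  # an x8 upsample and softmax would follow


# Score maps for one 32x32 patch: 1x1 (Pool5), 2x2 (Pool4), 4x4 (Pool3).
pool5 = [[1.0]]
pool4 = [[0.0, 1.0], [2.0, 3.0]]
pool3 = [[0.0] * 4 for _ in range(4)]
fused = fuse_stride8(pool5, pool4, pool3)
```

The coarse map contributes the same value everywhere, while the finer skip branches add the spatial detail that the x32 path alone cannot provide.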
Method           PASCAL VOC 2012   Cityscapes (IoU / iIoU)
FCN-8s-CVPR15    62.2%             65.3% / 41.7%
FCN-8s-PAMI17    67.2%
From the classification perspective
1. Learning Deconvolution Network for Semantic Segmentation, 2015, Noh et al.
2. Attention to Scale: Scale-Aware Semantic Image Segmentation, 2015, Chen et al.
• Is downsampling really necessary?
• Is it really necessary to process inputs at multiple scales separately?
Figure from DeepLab
Figure from DeepLab-v2
Figures from DilatedConv: Dilated Convolution and Receptive Field
• 3x3 Conv r=1 → 3x3 range
• 3x3 Conv r=1, then 3x3 Conv r=2 → 7x7 range
• 3x3 Conv r=1, then r=2, then r=4 → 15x15 range
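The receptive-field growth in these figures can be checked empirically in one dimension. The sketch below pushes a single impulse through three all-ones 3-tap dilated convolutions with rates 1, 2 and 4 and counts how many output positions it reaches; the function name and the 1-D simplification are assumptions made for brevity.

```python
def dilated_conv1d_same(signal, weights, rate):
    """Zero-padded 1-D dilated convolution whose taps are `rate` samples apart."""
    half = len(weights) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = i + (k - half) * rate
            if 0 <= idx < n:          # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


impulse = [0.0] * 31
impulse[15] = 1.0

x = impulse
widths = []
for rate in (1, 2, 4):
    x = dilated_conv1d_same(x, [1.0, 1.0, 1.0], rate)
    widths.append(sum(1 for v in x if v != 0.0))  # extent reached so far
```

The widths come out as 3, 7 and 15, matching the 3x3, 7x7 and 15x15 ranges per axis, while every layer still has only three weights per axis.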
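[Figure from McCallum's introduction to CRFs, contrasting models of $p(y, x)$ and $p(y \mid x)$]
• Graph model
• Each node = the label of a pixel
• Latent variable of each node = the pixel
• The relations between nodes are modeled probabilistically
• The probabilistic model is learned so as to maximize the posterior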
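Maximize Posterior
$$P(X \mid I) = \frac{1}{Z(I)} \exp\Big(-\sum_{c} \phi_c(X_c \mid I)\Big)$$
where $Z(I)$ is the normalization term, $I$ is the image, and $X$ is the labeling.
Minimize Energy
$$E(X \mid I) = \sum_{c} \phi_c(X_c \mid I), \qquad E(X) = \sum_{c} \psi_c(X_c)$$
$$E(x) = \sum_{i} \psi_i(x_i) + \sum_{i,j} \psi_{i,j}(x_i, x_j)$$
Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, Krähenbühl et al., NIPS 2011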
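Fully connected CRF energy (unary term + pairwise term over all pixel pairs)
$$E(x) = \sum_{i} \psi_i(x_i) + \sum_{i,j} \psi_{i,j}(x_i, x_j)$$
$$\psi_i(x_i) = -\log P(x_i)$$
$$\psi_{i,j}(x_i, x_j) = \mu(x_i, x_j)\left[ w_1 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}\right) + w_2 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}\right) \right]$$
$$\mu(x_i, x_j) = \begin{cases} 1 & x_i \neq x_j \\ 0 & \text{otherwise} \end{cases}$$
$p_i, p_j$: pixel positions
$I_i, I_j$: pixel RGB values
$w_1, w_2$: kernel weights
$\sigma_\alpha, \sigma_\beta, \sigma_\gamma$: hyper-parameters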
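Unary Term (from the classifier)
Pairwise Term: what does it mean?
If two pixels are similar (in position and in RGB) but their labels differ, the energy increases as a penalty.
Exact inference is computationally expensive, so a mean-field approximation is used.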
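The behaviour of the pairwise term can be made concrete with a small sketch. The function below evaluates the two-kernel potential for one pair of pixels; the default kernel weights and sigma values are placeholders chosen for illustration, not settings from any of the papers.

```python
import math


def pairwise_potential(label_i, label_j, pos_i, pos_j, rgb_i, rgb_j,
                       w1=1.0, w2=1.0,
                       sigma_alpha=10.0, sigma_beta=10.0, sigma_gamma=3.0):
    """Pairwise potential with Potts compatibility (all defaults are hypothetical)."""
    if label_i == label_j:
        return 0.0  # mu(x_i, x_j) = 0 when the labels agree
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    d_rgb = sum((a - b) ** 2 for a, b in zip(rgb_i, rgb_j))
    appearance = w1 * math.exp(-d_pos / (2 * sigma_alpha ** 2)
                               - d_rgb / (2 * sigma_beta ** 2))
    smoothness = w2 * math.exp(-d_pos / (2 * sigma_gamma ** 2))
    return appearance + smoothness


# Two neighbouring pixels with different labels.
similar_colors = pairwise_potential(0, 1, (0, 0), (0, 1),
                                    (200, 30, 30), (198, 32, 29))
different_colors = pairwise_potential(0, 1, (0, 0), (0, 1),
                                      (200, 30, 30), (20, 200, 220))
same_label = pairwise_potential(1, 1, (0, 0), (0, 1),
                                (200, 30, 30), (198, 32, 29))
```

Neighbouring pixels with nearly identical colors pay a larger penalty for disagreeing than pixels on opposite sides of a color edge, which is how the CRF sharpens boundaries.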
$$P(X \mid I) = \frac{1}{Z(I)} \exp\big(-E(X)\big)$$
$$\psi_{i,j}(x_i, x_j) = \mu(x_i, x_j)\left[ w_1 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}\right) + w_2 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}\right) \right]$$
Instead of working with $P$ directly, define $Q(X) = \prod_i Q_i(X_i)$ and minimize $D_{KL}(Q \,\|\, P)$. This yields the following update rule.
Update Rule
$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{ -\psi_u(x_i) - \sum_{l'} \mu(l, l') \sum_{m} w^{(m)} \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l') \Big\}$$
• Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials: Supplementary Material
• Chapter 11.5 of Koller and Friedman, “Probabilistic Graphical Models: Principles and Techniques”, 2009
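Initialization:
$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\big(-\psi_u(x_i)\big)$$
Repeat until convergence:
• Message Passing: $\tilde{Q}_i^{(m)}(l) = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l)$
• Weighting: $\check{Q}_i(l) = \sum_{m} w^{(m)} \tilde{Q}_i^{(m)}(l)$
• Compatibility Transform: $\hat{Q}_i(l) = \sum_{l'} \mu(l, l')\, \check{Q}_i(l')$
• Adding Unary (Local Update): $\acute{Q}_i(l) = -\psi_u(x_i) - \hat{Q}_i(l)$
• Normalization (Softmax): $Q_i = \frac{1}{Z_i} \exp\big(\acute{Q}_i(l)\big)$
Taken together, the steps compute
$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{ -\psi_u(x_i) - \sum_{l'} \mu(l, l') \sum_{m} w^{(m)} \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l') \Big\}$$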
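The five steps can be written down directly for a tiny problem. The sketch below is a naive O(N²) mean-field loop on four pixels in a row, not the efficient filtering-based inference of the cited paper; the kernels, weights, sigmas and toy unaries are all made-up values for illustration.

```python
import math


def normalize_neg_energy(energies):
    """Softmax over labels of exp(-energy), shifted for numerical stability."""
    low = min(energies)
    exps = [math.exp(-(e - low)) for e in energies]
    total = sum(exps)
    return [v / total for v in exps]


def mean_field(unary, feats, kernels, weights, compat, num_iters=5):
    """unary[i][l] is psi_u(x_i = l); each kernel is a function k(f_i, f_j)."""
    n, num_labels = len(unary), len(unary[0])
    q = [normalize_neg_energy(u) for u in unary]           # initialization
    for _ in range(num_iters):
        updated = []
        for i in range(n):
            weighted = [0.0] * num_labels
            for kernel, w in zip(kernels, weights):
                for j in range(n):
                    if j == i:
                        continue
                    k_ij = kernel(feats[i], feats[j])      # message passing
                    for l in range(num_labels):
                        weighted[l] += w * k_ij * q[j][l]  # weighting
            pairwise = [sum(compat(l, lp) * weighted[lp]   # compatibility transform
                            for lp in range(num_labels))
                        for l in range(num_labels)]
            energies = [unary[i][l] + pairwise[l]          # adding unary
                        for l in range(num_labels)]
            updated.append(normalize_neg_energy(energies)) # normalization
        q = updated
    return q


def appearance(fi, fj):
    # feature = (position, intensity); sigma_alpha = 2.0, sigma_beta = 0.2
    return math.exp(-(fi[0] - fj[0]) ** 2 / (2 * 2.0 ** 2)
                    - (fi[1] - fj[1]) ** 2 / (2 * 0.2 ** 2))


def smoothness(fi, fj):
    # sigma_gamma = 1.0
    return math.exp(-(fi[0] - fj[0]) ** 2 / (2 * 1.0 ** 2))


def potts(l, lp):
    return 0.0 if l == lp else 1.0


feats = [(0, 0.1), (1, 0.1), (2, 0.1), (3, 0.9)]
probs = [[0.9, 0.1], [0.9, 0.1], [0.45, 0.55], [0.1, 0.9]]  # classifier output
unary = [[-math.log(p) for p in row] for row in probs]
q = mean_field(unary, feats, [appearance, smoothness], [1.0, 0.3], potts)
labels = [row.index(max(row)) for row in q]
```

The third pixel starts with a weak preference for label 1, but it looks like its two left neighbours, so the pairwise term pulls it to label 0; the fourth pixel has a different intensity and keeps label 1.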
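CRF Learning with Validation (DeepLab v2)
$$\psi_{i,j}(x_i, x_j) = \mu(x_i, x_j)\left[ w_1 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}\right) + w_2 \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}\right) \right]$$
The CRF is used as a post-processing step after the classification network to recover detail.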
[Figure: the image $I$ passes through the network to produce the unary term $\psi_u(x_i)$, which initializes $Q_i$. $Q_i$ and $I$ then go through Message Passing → Weighting → Compatibility Transform → Addition of $\psi_u(x_i)$ → Normalization, and the resulting $Q_i$ is fed back as the input of the next iteration.]
Conditional Random Fields as Recurrent Neural Networks, Zheng et al., ICCV 2015
Iteration → RNN → End-to-End
“Multi-Scale Context Aggregation by Dilated Convolutions”, 2015
8-Layer Context Module of DilatedConv
A new network built from cascaded dilated convolutions. Input: 64x64xC.

Layer   Convolution   Dilation   Receptive field   Output channels (Basic)   Output channels (Large)
1       3x3           r=1        3x3               C                         2C
2       3x3           r=1        5x5               C                         2C
3       3x3           r=2        9x9               C                         4C
4       3x3           r=4        17x17             C                         8C
5       3x3           r=8        33x33             C                         16C
6       3x3           r=16       65x65             C                         32C
7       3x3           r=1        67x67             C                         32C
8       1x1           r=1        67x67             C                         C
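The receptive-field column of the context module follows from a simple recurrence: each stride-1 layer with kernel size k and dilation r widens the field by (k - 1)·r. The sketch below recomputes the column and also lists the zero padding that would keep a 64x64 map at 64x64; the padding column and the helper name are additions for illustration.

```python
def context_module_rows(layers):
    """Return (kernel, rate, receptive_field, same_padding) per layer."""
    rf = 1
    rows = []
    for kernel, rate in layers:
        rf += (kernel - 1) * rate            # growth of the receptive field
        padding = (kernel - 1) * rate // 2   # padding that preserves the map size
        rows.append((kernel, rate, rf, padding))
    return rows


# (kernel size, dilation rate) for the eight layers of the context module.
LAYERS = [(3, 1), (3, 1), (3, 2), (3, 4), (3, 8), (3, 16), (3, 1), (1, 1)]
rows = context_module_rows(LAYERS)
receptive_fields = [rf for _, _, rf, _ in rows]
paddings = [pad for _, _, _, pad in rows]
```

Doubling the rate at every layer makes the receptive field grow exponentially with depth while the number of weights per layer stays fixed.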
VGG16
32x32x3 → Conv1, Pool1 → 16x16x64 → Conv2, Pool2 → 8x8x128 → Conv3, Pool3 → 4x4x256 → Conv4, Pool4 → 2x2x512 → Conv5, Pool5 → 1x1x512 → FC6 → 1x1 Heatmap → x32 Upsample → Softmax
VGG16 Front-End Module of DilatedConv
32x32x3 → Conv1, Pool1 → 16x16x64 → Conv2, Pool2 → 8x8x128 → Conv3, Pool3 → 4x4x256 → Conv4 (pooling removed) → 4x4x512 → Conv r=2 (pooling removed) → 4x4x512 → Conv r=4 → 4x4 Heatmap → x8 Upsample → Softmax
Front-End Module for Context Module
Input (padded to match the required size) → Conv1, Pool1 → Conv2, Pool2 → Conv3, Pool3 → Conv4 → Conv r=2 → Conv r=4 → 64x64xC Heatmap → Context Module → Heatmap
C = 21 (number of classes)
• Context Module proposed: a new network composed only of dilated conv layers
• Front-End Module proposed: VGG16 with pooling removed and replaced by dilated convs
Method                   PASCAL VOC 2012   Cityscapes (IoU / iIoU)
FCN-8s-CVPR15            62.2%             65.3% / 41.7%
FCN-8s-PAMI17            67.2%
DeepLab v1               71.6%             63.1% / 34.5%
CRF-RNN                  72.0%             62.5% / 34.4%
Dilated Conv FrontEnd    71.3%
Dilated Conv + Context   73.5%
Dilated Conv + CRFRNN    75.3%
Cityscapes result for the Dilated Conv rows (10-Layer Context): 67.1% / 42.0%
“Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs”, 2014, as DeepLab v1
“DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs”, 2016, as DeepLab v2
Emphasis on CRFs
DeepLab v1
• Uses a CRF
• Proposes the hole algorithm
DeepLab v2
• Atrous Conv. (Dilated Conv.)
• CRFs
• Proposes ASPP
• Additional training methods, etc.
32x32x3 → Conv1, Pool1 → 16x16x64 → Conv2, Pool2 → 8x8x128 → Conv3, Pool3 → 4x4x256 → Conv4 → 4x4x512 → Conv r=2 → 4x4x512 → fc6 r=4 → fc7 1x1 → fc8 1x1 → x8 Upsample → Softmax
Similar to the DilatedConv front-end; the rate of the fc6 layer differs.
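ASPP proposed: Atrous Conv. (Dilated Conv.)
• Dilated conv alone can hardly be said to look at multiple scales
• Dilated conv is used in place of pooling in order to keep the resolution
• Feeding multi-scale inputs does of course improve performance
• ASPP is proposed instead, to obtain that effect in a simple way
At the fc6 layer, atrous convolutions are applied in parallel and then fused.
Inspired by K. He's Spatial Pyramid Pooling.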
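The "parallel atrous convolutions, then fusion" idea can be sketched in one dimension. The example below runs four 3-tap branches over the same input and sums them; the rates (6, 12, 18, 24) are my recollection of the larger ASPP setting in DeepLab v2 and should be treated as an assumption, as should the all-ones weights and the 1-D simplification.

```python
def atrous_conv1d(signal, weights, rate):
    """Zero-padded 1-D atrous convolution with taps `rate` samples apart."""
    half = len(weights) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = i + (k - half) * rate
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp_1d(signal, branch_weights, rates=(6, 12, 18, 24)):
    """Apply one atrous branch per rate to the same input, then fuse by summation."""
    branches = [atrous_conv1d(signal, w, r)
                for w, r in zip(branch_weights, rates)]
    return [sum(vals) for vals in zip(*branches)]


impulse = [0.0] * 51
impulse[25] = 1.0
fused = aspp_1d(impulse, [[1.0, 1.0, 1.0]] * 4)
```

A single input location influences outputs at distances 6, 12, 18 and 24 at once, so each output position sees context at several scales without any rescaled copy of the input.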
• Poly Learning Rate
• ASPP
• CRF
• VGG16 → ResNet
• Multi-scale Inputs
• Pretrained on MS-COCO
• Data Augmentation
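The first item in the list, the "poly" learning rate policy, is a one-line formula: the base rate is multiplied by (1 - iter / max_iter) raised to a power. The sketch below uses power = 0.9, which is the value I recall from the DeepLab v2 paper; the base rate and iteration counts are hypothetical.

```python
def poly_learning_rate(base_lr, iteration, max_iter, power=0.9):
    """'Poly' policy: base_lr * (1 - iteration / max_iter) ** power."""
    return base_lr * (1.0 - iteration / max_iter) ** power


# Hypothetical schedule: base rate 0.001 over 20,000 iterations.
schedule = [poly_learning_rate(0.001, it, 20000)
            for it in (0, 5000, 10000, 15000, 20000)]
```

The rate starts at the base value and decays smoothly to zero at the last iteration, without the hand-picked step boundaries of a step schedule.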
Effect of ASPP
Effect of ResNet + CRF
Method                   PASCAL VOC 2012   Cityscapes (IoU / iIoU)
FCN-8s-CVPR15            62.2%             65.3% / 41.7%
FCN-8s-PAMI17            67.2%
DeepLab v1               71.6%             63.1% / 34.5%
CRF-RNN                  72.0%             62.5% / 34.4%
Dilated Conv FrontEnd    71.3%
Dilated Conv + Context   73.5%
Dilated Conv + CRFRNN    75.3%
DeepLab v2               79.7%             70.4% / 42.6%
Cityscapes result for the Dilated Conv rows (10-Layer Context): 67.1% / 42.0%
“Pyramid Scene Parsing Network”, 2016
A deep network with a suitable global scene-level prior can greatly improve scene parsing performance.
• If the surroundings are a river, the object is more likely a Boat than a Car
• Building or Skyscraper?
• Similar textures
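• Pyramid Pooling Module + Concat
• Auxiliary Loss (ResNet)
[Figure: ResNet with DilatedConv produces a 1/8-resolution feature map; each pyramid level is followed by a 1x1 convolution; an auxiliary loss is checked at an intermediate layer]
Ablation results:
• Average pooling works better
• Effect of the 1x1 convolution
• Deeper is better
• Effect of the auxiliary loss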
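The pyramid pooling and concat steps can be sketched on a single-channel feature map. The bin sizes (1, 2, 3, 6) are the ones I recall from the PSPNet paper; this sketch leaves out the 1x1 convolution after each pooling level and uses nearest-neighbour instead of bilinear upsampling, both simplifications made for brevity.

```python
import math


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2-D map (list of lists) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for by in range(bins):
        y0, y1 = (by * h) // bins, math.ceil((by + 1) * h / bins)
        row = []
        for bx in range(bins):
            x0, x1 = (bx * w) // bins, math.ceil((bx + 1) * w / bins)
            cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Nearest-neighbour upsampling of a square grid back to h x w."""
    bins = len(pooled)
    return [[pooled[y * bins // h][x * bins // w] for x in range(w)]
            for y in range(h)]


def pyramid_pooling(fmap, bin_sizes=(1, 2, 3, 6)):
    """Return the input plus one upsampled context map per pyramid level."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(fmap, bins), h, w))
    return channels  # concatenation along the channel axis


fmap = [[y * 6 + x for x in range(6)] for y in range(6)]  # toy 6x6 feature map
channels = pyramid_pooling(fmap)
```

The 1-bin level carries the global average to every position, which is the scene-level prior the slides describe, while the finer levels keep progressively more local context.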
“Rethinking Atrous Convolution for Semantic Image Segmentation”, 2017
• Cascaded Atrous Convolutions
• Multi-Grid
ex) Output Stride = 16: the feature map is 1/16 of the original resolution
Each block has three conv layers; the conv rate of each layer is adjusted
• Modified ASPP + Batch Normalization
• Inference Strategy on Val Set
• Pretrained on COCO
• Bootstrapping
• Pretrained on JFT-300M
Applied only to Block 4!
Bootstrapping: at every iteration, train with more data for the hard labels
Inference strategy: train at Output Stride = 16, test at Output Stride = 8
With ASPP added, Multi-Grid = (1, 2, 4) works well
+ Effect of image pooling
With ASPP added, testing at Output Stride = 8 helps
Deeper is better
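The interaction between output stride and Multi-Grid can be worked through numerically. The sketch below assumes a ResNet whose native stride is 32, so that the unit rate of Block 4 is 32 divided by the output stride, and multiplies that unit rate by the Multi-Grid tuple; this follows my reading of the DeepLab v3 paper, and the function names and the 512-pixel input are hypothetical.

```python
def feature_map_size(input_size, output_stride):
    """Spatial size of the final feature map for a given output stride."""
    return input_size // output_stride


def block4_rates(output_stride, multi_grid=(1, 2, 4), native_stride=32):
    """Atrous rates of the three convs in Block 4.

    Assumption: the unit rate is native_stride // output_stride, and each
    layer's final rate is the unit rate times its Multi-Grid entry.
    """
    unit_rate = native_stride // output_stride
    return tuple(unit_rate * g for g in multi_grid)


rates_train = block4_rates(16)             # training setting in the slides
rates_test = block4_rates(8)               # denser feature map at test time
size_train = feature_map_size(512, 16)
size_test = feature_map_size(512, 8)
```

Halving the output stride doubles both the feature map resolution and the rates, so the same weights cover the same image extent, which is why a model trained at Output Stride = 16 can be tested at Output Stride = 8.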
Method                   PASCAL VOC 2012   Cityscapes (IoU / iIoU)
FCN-8s-CVPR15            62.2%             65.3% / 41.7%
FCN-8s-PAMI17            67.2%
DeepLab v1               71.6%             63.1% / 34.5%
CRF-RNN                  72.0%             62.5% / 34.4%
Dilated Conv FrontEnd    71.3%
Dilated Conv + Context   73.5%
Dilated Conv + CRFRNN    75.3%
DeepLab v2               79.7%             70.4% / 42.6%
PSPNet                   85.4%             81.2% / 59.6%
DeepLab v3               85.7%             81.3% / 62.1%
DeepLab v3-JFT           86.9%
Cityscapes result for the Dilated Conv rows (10-Layer Context): 67.1% / 42.0%
Method                   PASCAL VOC 2012   Cityscapes (IoU / iIoU)   Contribution
FCN-8s-CVPR15            62.2%             65.3% / 41.7%             FCN
FCN-8s-PAMI17            67.2%                                       FCN
DeepLab v1               71.6%             63.1% / 34.5%             Dilated + CRF
CRF-RNN                  72.0%             62.5% / 34.4%             CRF (End-to-End)
Dilated Conv FrontEnd    71.3%                                       Cascade Dilated
Dilated Conv + Context   73.5%                                       Cascade Dilated
Dilated Conv + CRFRNN    75.3%                                       Cascade Dilated
DeepLab v2               79.7%             70.4% / 42.6%             Dilated + ASPP + CRFs + ResNet
PSPNet                   85.4%             81.2% / 59.6%             Pyramid Pooling + Aux. Loss
DeepLab v3               85.7%             81.3% / 62.1%             Modified Layer & ASPP + BatchNorm + Training Strategies
DeepLab v3-JFT           86.9%
Cityscapes result for the Dilated Conv rows (10-Layer Context): 67.1% / 42.0%
Q&A?
