Aerial Object Detection
HyeongJun Kwon
2019-2
Contents
2
1. ClusDet
2. RoI Transformer
3. SCRDet
4. GcGAN
5. CBAM
ClusDet
3
Network Overview
ClusDet
4
Objective: handle images in which objects are sparse, non-uniformly distributed, and tend to be
highly clustered in certain regions
Problem with existing methods:
- They do not address sparse, non-uniform object distributions that are highly clustered in certain regions
Proposed Method:
- Cluster Proposal Sub-network (CPNet)
- Scale Network
ClusDet
5
Cluster Proposal Sub-network (CPNet)
: like an RPN, but attached to the top (high-level) feature maps of the feature extractor, because clusters need a large receptive field
ClusDet
6
ICM (iterative cluster merging): algorithm that merges overlapping cluster regions
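The merging idea can be sketched as follows. This is a simplified illustration and not the authors' implementation: the function names, the IoU threshold `thresh`, and the cap `max_clusters` are assumptions chosen for the example. Each pass keeps the best-scoring cluster box and absorbs every box that overlaps it into one enclosing box; passes repeat until few enough clusters remain or nothing changes.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_pass(boxes, scores, thresh):
    """One merge pass: absorb overlapping boxes into the best-scoring one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = set()
    out_boxes, out_scores = [], []
    for i in order:
        if i in used:
            continue
        used.add(i)
        x1, y1, x2, y2 = boxes[i]
        for j in order:
            if j not in used and iou(boxes[i], boxes[j]) > thresh:
                used.add(j)
                # Grow the enclosing box so it covers the absorbed cluster.
                x1, y1 = min(x1, boxes[j][0]), min(y1, boxes[j][1])
                x2, y2 = max(x2, boxes[j][2]), max(y2, boxes[j][3])
        out_boxes.append((x1, y1, x2, y2))
        out_scores.append(scores[i])
    return out_boxes, out_scores


def iterative_cluster_merging(boxes, scores, thresh=0.3, max_clusters=3):
    """Repeat merge passes until few enough clusters remain or nothing changes."""
    while len(boxes) > max_clusters:
        new_boxes, new_scores = merge_pass(boxes, scores, thresh)
        if len(new_boxes) == len(boxes):
            break
        boxes, scores = new_boxes, new_scores
    return boxes, scores
```

For example, four proposals forming two overlapping pairs collapse into two enclosing cluster boxes when `max_clusters=2`.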
ClusDet
7
ScaleNet & padding and partition (PP)
: prevent extreme object scales from degrading detection performance
ClusDet
8
Experiments
RoI Transformer
9
Network Overview
RoI Transformer
10
Objective: detect oriented, densely packed objects
Problems with existing methods:
- Expensive computation
- Rotation-invariant features are not learned
Proposed Method:
- RRoI learner
- Rotated Position Sensitive RoI Align (RPS RoI Align)
RoI Transformer
11
RRoI learner. For computational efficiency, each RRoI is matched to a rotated ground truth (RGT) before the angle regression target $t_\theta^*$ is determined.
RPS RoI Align. Rotation + PS RoI pooling + RoI Align
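A minimal sketch of how regression targets for a rotated RoI, including the angle target, could be computed. The parameterisation below (centre offsets expressed in the RoI's rotated frame, log size ratios, and an angle offset wrapped to one full turn) is written from memory of the RoI Transformer paper and should be treated as an assumption; the function names and the `(cx, cy, w, h, theta)` box layout are hypothetical.

```python
import math


def encode_rroi_target(roi, gt):
    """Offsets of a rotated ground truth relative to a rotated RoI.

    Both boxes are (cx, cy, w, h, theta) with theta in radians.
    """
    xr, yr, wr, hr, tr = roi
    xg, yg, wg, hg, tg = gt
    dx, dy = xg - xr, yg - yr
    # Express the centre shift in the RoI's own (rotated) coordinate frame.
    tx = (dx * math.cos(tr) + dy * math.sin(tr)) / wr
    ty = (dy * math.cos(tr) - dx * math.sin(tr)) / hr
    tw = math.log(wg / wr)
    th = math.log(hg / hr)
    # Angle offset wrapped to [0, 2*pi) and normalised to [0, 1).
    tt = ((tg - tr) % (2 * math.pi)) / (2 * math.pi)
    return tx, ty, tw, th, tt


def decode_rroi_target(roi, t):
    """Inverse of encode_rroi_target: apply offsets t to the rotated RoI."""
    xr, yr, wr, hr, tr = roi
    tx, ty, tw, th, tt = t
    # Rotate the frame-local shift back into image coordinates.
    dx = tx * wr * math.cos(tr) - ty * hr * math.sin(tr)
    dy = tx * wr * math.sin(tr) + ty * hr * math.cos(tr)
    theta = (tr + tt * 2 * math.pi) % (2 * math.pi)
    return xr + dx, yr + dy, wr * math.exp(tw), hr * math.exp(th), theta
```

Encoding a ground truth against an RoI and decoding the result returns the original ground-truth box, which is a quick way to check that the two functions agree.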
RoI Transformer
12
RoI Transformer
13
Experiments
SCRDet
14
Network Overview
SCRDet
15
Objective: detect oriented, densely packed objects
Challenges in object detection:
- Small objects
- Cluttered arrangement
- Arbitrary orientation
Proposed Method:
- Sampling fusion network (SF-Net) for the small-object issue
- Multi-dimensional attention network (MDA-Net) for suppressing background noise
SCRDet
16
SF-Net: module that combines feature fusion and finer sampling
Feature fusion: combines low-level and high-level information, as in FPN, TDM, etc.
Finer sampling: a small anchor stride achieves a higher EMO (expected max overlap) score than a large one
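A small numerical illustration of why a finer anchor stride helps small objects. This is not the EMO definition itself; it is a simplified worst case under stated assumptions: a square object, a single square anchor of the same size, and an object centre that sits as far as possible from every anchor centre (half a stride along each axis). The function name is hypothetical.

```python
def worst_case_iou(obj_size, stride):
    """IoU between a square object and the nearest same-size anchor when the
    object centre is as far as possible from every anchor centre."""
    d = stride / 2.0                      # worst-case offset along each axis
    overlap = max(0.0, obj_size - d)      # overlap length along each axis
    inter = overlap * overlap
    union = 2.0 * obj_size * obj_size - inter
    return inter / union


for stride in (16, 8, 4):
    print(stride, round(worst_case_iou(16, stride), 3))
```

For a 16-pixel object the worst-case overlap rises from about 0.14 at stride 16 to about 0.62 at stride 4, so far more small objects clear a typical positive-sample IoU threshold.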
SCRDet
17
MDA-Net: suppresses noise using pixel attention + channel attention
Channel attention: uses an SE module
Pixel attention: uses an Inception module; the attention loss is computed against a binary map derived from the RGT
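The two attention branches can be sketched in NumPy as below. This is a sketch under assumptions, not the SCRDet implementation: the Inception module is replaced by a single 1x1 convolution for brevity, all weight matrices (`w1`, `w2`, `w_sal`) are hypothetical parameters supplied by the caller, and the attention loss is taken to be a pixel-wise cross-entropy against the binary foreground map.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """SE-style gate. feat: (C, H, W); w1: (C // r, C); w2: (C, C // r)."""
    squeezed = feat.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)    # bottleneck + ReLU
    return sigmoid(w2 @ hidden)                # per-channel weight in (0, 1)


def pixel_attention(feat, w_sal):
    """Two-channel saliency map (background, foreground) from a 1x1 conv
    (stand-in for the Inception module), softmax over the two channels."""
    logits = np.einsum("kc,chw->khw", w_sal, feat)   # (2, H, W)
    logits = logits - logits.max(axis=0, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=0, keepdims=True)


def attention_loss(probs, fg_mask):
    """Pixel-wise cross-entropy against a binary foreground mask (H, W)."""
    picked = np.where(fg_mask == 1, probs[1], probs[0])
    return float(-np.log(picked + 1e-12).mean())


def mda(feat, w1, w2, w_sal):
    """Scale the feature map by channel weights and foreground probability."""
    ca = channel_attention(feat, w1, w2)[:, None, None]
    pa = pixel_attention(feat, w_sal)[1][None, :, :]
    return feat * ca * pa
```

Because both gates lie in (0, 1), the output never exceeds the input in magnitude; background pixels and uninformative channels are attenuated rather than removed.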
SCRDet
18
SCRDet
19
IoU-smooth L1 loss: addresses the boundary discontinuity problem
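A minimal sketch of the idea, assuming the form recalled from the SCRDet paper: the smooth L1 term supplies only the gradient direction, while the loss magnitude is set by |-log(IoU)|. The rotated IoU is passed in by the caller here (computing it is outside this sketch), and the offset vectors and the IoU value in the usage note are illustrative, not taken from the slides.

```python
import numpy as np


def smooth_l1(pred, target, beta=1.0):
    """Standard smooth L1, summed over the regression parameters."""
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return float(loss.sum())


def iou_smooth_l1(pred, target, iou, eps=1e-6):
    """Regression loss whose magnitude follows the IoU, not the raw offsets."""
    base = smooth_l1(pred, target)
    if base == 0.0:
        return 0.0
    # Forward value of this factor is 1. In an autodiff framework the
    # denominator would be detached so that the gradient direction still
    # comes from the smooth L1 term.
    direction = base / abs(base)
    magnitude = abs(-np.log(max(iou, eps)))
    return float(direction * magnitude)
```

Near the angular boundary, a prediction can differ from the target by large raw offsets (width and height swapped, angle near the other end of its range) while covering almost the same region. With illustrative offsets `[0, 0, 1.1, -1.1, -1.5]` against a zero target, plain smooth L1 gives 2.2, whereas the IoU-based magnitude at IoU 0.9 is only about 0.105, so the loss no longer spikes at the boundary.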
SCRDet
20
SCRDet
21
GcGAN
22
Network Overview
GcGAN
23
Objective: infer the marginal distributions of the source domain and the target domain
Problem with existing methods:
- Existing constraints overlook a special property of images
: geometric transformations do not change semantic structure
Proposed Method:
- A geometry-consistency constraint, which enables one-sided mapping
GcGAN
24
Geometric consistency constraints
$\mathcal{L}_{geo}(G_{XY}, G_{\tilde{X}\tilde{Y}}, X, Y)$
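The constraint can be sketched numerically as follows, assuming the two-term L1 form recalled from the GcGAN paper: translating a transformed image should match transforming the translated image, in both directions. The generators are passed in as plain callables, the geometric transformation `f` is taken to be a 90-degree clockwise rotation, and all function names are hypothetical.

```python
import numpy as np


def rot90_cw(img):
    """Geometric transformation f: rotate 90 degrees clockwise."""
    return np.rot90(img, k=-1, axes=(0, 1))


def rot90_ccw(img):
    """Inverse transformation f^{-1}: rotate 90 degrees counter-clockwise."""
    return np.rot90(img, k=1, axes=(0, 1))


def geometry_consistency_loss(g_xy, g_xy_tilde, x, f=rot90_cw, f_inv=rot90_ccw):
    """L1 gap between 'translate then transform' and 'transform then translate'.

    g_xy translates an original image; g_xy_tilde translates a transformed one.
    """
    y = g_xy(x)                      # translation of the original image
    y_tilde = g_xy_tilde(f(x))       # translation of the transformed image
    term1 = np.abs(y - f_inv(y_tilde)).mean()
    term2 = np.abs(y_tilde - f(y)).mean()
    return float(term1 + term2)
```

A generator that acts pixel-wise commutes with rotation, so the loss is zero; a generator that adds a fixed horizontal pattern does not, and the loss becomes positive. In training, this term would be added to the two GAN losses.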
GcGAN
25
Experiments
GcGAN
26
Experiments
CBAM (Convolutional Block Attention Module)
27
Network Overview
CBAM (Convolutional Block Attention Module)
28
Experiments on MS COCO
Result
29
Baseline (RoI Transformer with Faster R-CNN) on DOTA 1
Plane  BD     Bridge GTF    SV     LV     Ship   TC     BC     ST     SBF    RA     Harbor SP     HC     mAP
88.52  80.13  52.45  71.01  63.16  79.63  85.17  90.68  85.50  82.37  51.82  37.22  72.09  63.28  57.89  70.73
88.01  78.34  52.56  71.64  61.33  79.89  83.97  90.61  85.14  83.30  50.22  37.72  67.59  62.14  62.10  70.5
Results (Baseline)
Results (Baseline + CBAM)
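A quick arithmetic check of the two result rows, assuming mAP here is the unweighted mean of the 15 per-class APs. The rows are named `row_1` and `row_2` in the order they appear on the slide; the variable and function names are hypothetical.

```python
CLASSES = ["Plane", "BD", "Bridge", "GTF", "SV", "LV", "Ship", "TC",
           "BC", "ST", "SBF", "RA", "Harbor", "SP", "HC"]

row_1 = [88.52, 80.13, 52.45, 71.01, 63.16, 79.63, 85.17, 90.68,
         85.50, 82.37, 51.82, 37.22, 72.09, 63.28, 57.89]
row_2 = [88.01, 78.34, 52.56, 71.64, 61.33, 79.89, 83.97, 90.61,
         85.14, 83.30, 50.22, 37.72, 67.59, 62.14, 62.10]


def mean_ap(aps):
    """Unweighted mean of per-class average precision values."""
    return sum(aps) / len(aps)


# Per-class change from the first row to the second row.
deltas = {c: round(b - a, 2) for c, a, b in zip(CLASSES, row_1, row_2)}

print(round(mean_ap(row_1), 2), round(mean_ap(row_2), 2))
print(sorted(deltas.items(), key=lambda kv: kv[1]))
```

The first row averages to 70.73, matching the mAP shown. The second row averages to about 70.30, slightly below the 70.5 shown, which may reflect rounding or a transcription difference in one of the entries. The largest per-class changes are Harbor (-4.50) and HC (+4.21); most other classes move by less than two points.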
Result
30
[Figure panels: CBAM | GT]

Editor's Notes

  • #3: This is the overall structure of the cluster detection network. Its main parts are the cluster proposal network, the scale network, and the detection network.
  • #4: This is the overall structure of the cluster detection network. Its main parts are the cluster proposal network, the scale network, and the detection network.
  • #5: The goal of ClusDet is to address the problems that arise because objects in an image are non-uniformly distributed and tend to be clustered. The authors point out that prior work has not discussed this issue. To address it, they propose a cluster proposal network and a scale network.
  • #6: CPNet has almost the same form as an RPN. However, because it needs a large receptive field, it is attached to the top (high-level) feature maps of the feature extractor.
  • #7: Because too many clusters are sometimes proposed, the authors also propose ICM. In brief, it is an algorithm that merges smaller clusters into the largest cluster.
  • #8: The scale network is proposed to address the relative-size problem of the objects within a cluster.
  • #9: The results are as follows.
  • #10: The structure of RoI Transformer is as follows. The RRoI learner and the RPS RoI Align method are its contributions.
  • #11: RoI Transformer aims to solve the oriented and densely packed detection task. Prior work (rotated object detection) has a very high computational cost and fails to learn rotation-invariant features. The authors therefore propose the RRoI learner and the RPS RoI Align method.
  • #12: Briefly, before the feature map is generated, a sub-module extracts the predicted position and angle values from the feature map, and these values are then used to rotate it. RPS RoI Align is a module that combines rotated PS RoI pooling with RoI Align. It can be expressed as the following formula.
  • #13: For learning invariant features, applying the angle transformation after warping during training lets the network learn rotation-invariant features.
  • #14: The results are as follows.
  • #15: Next is the model structure of SCRDet. A sampling fusion (SF) network and a multi-dimensional attention network have been added.
  • #16: This network also aims to solve the oriented and densely packed detection task. The authors identify the following three difficulties in object detection. To address them, they propose SF-Net and MDA-Net, respectively.
  • #17: SF-Net fuses features using the first and second layers of the feature extractor. An Inception module is attached to the second layer. For finer sampling, since a smaller anchor stride gives a higher EMO score, they choose a small anchor stride.
  • #18: MDA-Net is a network that suppresses noise by adding attention modules. The attention loss serves to reduce blur.
  • #20: This is the IoU-smooth L1 loss, which I consider the most important contribution. It alleviates the boundary discontinuity problem in bbox regression and thereby stabilizes training.
  • #21: These are the results. SCRDet can be seen to outperform the other methods.
  • #22: These are the ablation study results.
  • #23: Looking at the networks briefly, from left to right they are CycleGAN, DistanceGAN, and GcGAN.
  • #24: The goal is to obtain the marginal distributions of the source domain and the target domain. The authors argue that the problem with prior work is that it overlooked a special property of images. They therefore propose geometry-consistency constraints, which enable one-sided mapping.
  • #25: Nothing here is very complicated, so, explaining briefly while looking at the formula: part of the objective function reduces the difference between translating a transformed image and transforming a translated image. The full objective function is the sum of the GAN loss, the GAN loss on the transformed image, and the geometric consistency loss.
  • #26: These are the results.
  • #27: These are the results.
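  • #28: Additionally, I looked for a suitable module to improve the performance of existing object detection networks, and among network-engineering papers I found a sub-module called CBAM, which I will briefly introduce. It is an attention module, and I applied it to the backbone network.
  • #29: The paper reports that it also improves performance in object detection.
  • #30: These are the RoI Transformer results presented in the paper. Below them are the mAP results of RoI Transformer and the +CBAM network, which I obtained by running training myself. The per-category AP distributions are similar, and there was no significant change in performance.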