Going Deeper with Convolutions
Table of Contents
1. R-CNN & Network-in-Network
2. Introduction & Related Work
3. Motivation and High Level Considerations
4. Architectural Details
5. GoogLeNet
6. Training Methodology
7. ILSVRC 2014 Classification & Detection Challenge Setup and Results
1. R-CNN & Network-in-Network
R-CNN
Network-in-Network
http://nmhkahn.github.io/Casestudy-CNN
2. Introduction & Related Work
INTRODUCTION
- GoogLeNet (ILSVRC 2014) uses 12 times fewer parameters than AlexNet
→ yet it is more accurate
RELATED WORK
- Inception layer → repeated many times
- Network-in-Network → dimension reduction
- R-CNN → multi-box & ensemble
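Why the Network-in-Network style 1x1 convolution counts as "dimension reduction" can be seen from a rough weight count. The sketch below is illustrative only: the channel counts (256, 64, 32) are hypothetical and not taken from the slides, and biases are ignored.

```python
def conv_weights(in_ch: int, out_ch: int, kernel: int) -> int:
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return in_ch * out_ch * kernel * kernel


# Hypothetical channel counts, chosen only for illustration.
IN_CH, OUT_CH, REDUCED_CH = 256, 64, 32

# A 5x5 convolution applied directly to the input.
direct = conv_weights(IN_CH, OUT_CH, 5)

# The same 5x5 convolution preceded by a 1x1 convolution that shrinks the channels.
reduced = conv_weights(IN_CH, REDUCED_CH, 1) + conv_weights(REDUCED_CH, OUT_CH, 5)

print(f"direct 5x5:      {direct} weights")
print(f"1x1 then 5x5:    {reduced} weights")
print(f"reduction ratio: {direct / reduced:.1f}x")
```

The saving grows with the number of input channels, which is why the reduction matters most in the deeper, wider layers.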
3. Motivation and High Level Considerations
- The most straightforward way to improve performance is to increase the size of the model.
→ More parameters (higher risk of overfitting) and more computational resources
- Fully connected → sparsely connected architectures, even inside the convolutions
- Clustering sparse matrices into relatively dense submatrices
4. Architectural Details
- Inception architecture
An optimal local sparse structure, approximated by readily available dense components
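As a rough illustration of how an Inception module combines its parallel branches, the sketch below only does the bookkeeping: output channels after concatenation and weight count. The class and field names are made up for this example. The channel numbers are those listed for the inception (3a) module in Table 1 of the paper, not something stated in these slides, and biases are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InceptionConfig:
    """Channel counts of one Inception module (field names are hypothetical)."""
    in_ch: int
    c1x1: int
    c3x3_reduce: int
    c3x3: int
    c5x5_reduce: int
    c5x5: int
    pool_proj: int

    def out_channels(self) -> int:
        # The four branches are concatenated along the channel axis.
        return self.c1x1 + self.c3x3 + self.c5x5 + self.pool_proj

    def weights(self) -> int:
        # Convolution weights only; biases are ignored.
        return (
            self.in_ch * self.c1x1                    # 1x1 branch
            + self.in_ch * self.c3x3_reduce           # 1x1 reduction before 3x3
            + self.c3x3_reduce * self.c3x3 * 3 * 3    # 3x3 convolution
            + self.in_ch * self.c5x5_reduce           # 1x1 reduction before 5x5
            + self.c5x5_reduce * self.c5x5 * 5 * 5    # 5x5 convolution
            + self.in_ch * self.pool_proj             # 1x1 projection after max pooling
        )


# inception (3a): 192 input channels, values as listed in the paper's Table 1.
inception_3a = InceptionConfig(192, 64, 96, 128, 16, 32, 32)
print(inception_3a.out_channels(), inception_3a.weights())
```

The weight count comes out close to the figure the paper reports for this module (about 159K), which suggests the bookkeeping matches the intended structure.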
5. GoogLeNet
https://github.com/rlatjcj/Keras-Model/blob/master/Inception_v1/model.py
6. Training Methodology
- Stochastic gradient descent with 0.9 momentum
- Fixed learning rate schedule (the learning rate decreases by 4% every 8 epochs)
- Polyak averaging is used to create the final model used at inference time.
(http://ttic.uchicago.edu/~shubhendu/Pages/Files/Lecture6_flat.pdf)
- However, for transfer learning, options such as dropout or the learning rate can be changed.
- Crop patches whose size is distributed evenly between 8% and 100% of the image area and whose aspect ratio is chosen randomly between 3/4 and 4/3
- Random interpolation (bilinear, area, nearest neighbor, and cubic, with equal probability)
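The items above can be sketched in plain Python. This is a minimal sketch under several assumptions, not the authors' training code: the function names and the base learning rate are hypothetical, Polyak averaging is taken to be a plain running mean of the parameters (the slides do not say which variant was used), the aspect ratio is drawn uniformly, and the retry count and whole-image fallback are choices made here.

```python
import random

INTERPOLATIONS = ("bilinear", "area", "nearest", "cubic")


def learning_rate(base_lr: float, epoch: int) -> float:
    """Step decay: multiply the rate by 0.96 for every 8 completed epochs."""
    return base_lr * 0.96 ** (epoch // 8)


def polyak_update(avg: list, params: list, step: int) -> list:
    """Running mean of the parameter vectors seen so far (step counts from 1)."""
    return [a + (p - a) / step for a, p in zip(avg, params)]


def sample_crop(width: int, height: int, rng: random.Random):
    """Pick a crop covering 8%-100% of the image area, aspect ratio in [3/4, 4/3]."""
    for _ in range(10):  # retry count is an assumption
        area = rng.uniform(0.08, 1.0) * width * height
        ratio = rng.uniform(3 / 4, 4 / 3)
        w = round((area * ratio) ** 0.5)
        h = round((area / ratio) ** 0.5)
        if 0 < w <= width and 0 < h <= height:
            x = rng.randint(0, width - w)
            y = rng.randint(0, height - h)
            return x, y, w, h
    return 0, 0, width, height  # assumed fallback: use the whole image


def pick_interpolation(rng: random.Random) -> str:
    """Choose one of the four resize methods with equal probability."""
    return rng.choice(INTERPOLATIONS)


rng = random.Random(0)
print(learning_rate(0.01, 16))           # base rate 0.01 is a hypothetical value
print(sample_crop(224, 224, rng), pick_interpolation(rng))
```

The crop is returned as (x, y, width, height); resizing it to the network input size with the chosen interpolation is left to whatever image library is in use.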
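7. ILSVRC 2014 Classification & Detection
Classification
- Seven GoogLeNet models (including one wider version) were trained with the same initialization and learning rate policies.
They differ only in sampling methodology and in the order of the input images.
An ensemble of the models is used at prediction time.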
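The slides do not spell out how the ensemble combines its members. The paper averages softmax probabilities over the individual classifiers, so the sketch below assumes simple averaging. The function name and the probability values are made up for illustration.

```python
def ensemble_predict(prob_vectors):
    """Average per-model class probabilities; return (best_class, mean_probs)."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    mean = [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return best, mean


# Made-up softmax outputs from three hypothetical models over three classes.
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.5, 0.2],
]
best, mean = ensemble_predict(probs)
print(best, [round(m, 3) for m in mean])
```

Here the first model alone would pick class 0, but the averaged prediction picks class 1, which is the kind of disagreement an ensemble is meant to smooth out.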
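7. ILSVRC 2014 Classification & Detection
Detection
- The approach is similar to R-CNN, but augmented with the Inception model as the region classifier.
- The region proposal step is improved by combining Selective Search with multi-box predictions.
→ The superpixel size is doubled, which halves the number of proposals coming from the Selective Search algorithm.
- An ensemble of 6 ConvNets is used when classifying each region.
→ Unlike R-CNN, bounding box regression was not used, due to lack of time.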
Reference
- Inception v1
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, “Going Deeper with Convolutions”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015. (https://arxiv.org/pdf/1409.4842.pdf)
- R-CNN
R. Girshick, J. Donahue, T. Darrell and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation”, in IEEE Conference on Computer Vision and Pattern Recognition, 2014. (https://arxiv.org/pdf/1311.2524.pdf)
- Network-in-Network
M. Lin, Q. Chen and S. Yan, “Network in network”, in CoRR, 2013. (https://arxiv.org/pdf/1312.4400.pdf)
- AlexNet
A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks”, in Neural Information Processing Systems, 2012. (http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
Thank you
