Going Deeper with Convolutions

Contents
1. R-CNN & Network-in-Network
2. Introduction & Related Work
3. Motivation and High Level Considerations
4. Architectural Details
5. GoogLeNet
6. Training Methodology
7. ILSVRC 2014 Classification & Detection Challenge Setup and Results

R-CNN

Network-in-Network
http://nmhkahn.github.io/Casestudy-CNN

2. Introduction & Related Work
INTRODUCTION
- In ILSVRC 2014, GoogLeNet used 12x fewer parameters than AlexNet
→ yet it is more accurate
RELATED WORK
- Inception layer → Repeated many times
- Network-in-Network → Dimension reduction
- R-CNN → Multi-box & ensemble

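3. Motivation and High Level Considerations
- The most straightforward way to improve performance is to increase the size of the model.
→ More parameters (higher risk of overfitting) & more computational resources
- Move from fully connected to sparsely connected architectures, even inside the convolutions
- Cluster sparse matrices into relatively dense submatrices, which current hardware multiplies efficiently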

4. Architectural Details
- Inception architecture
Approximates an optimal local sparse structure with readily available dense components
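The effect of the 1x1 dimension reduction inside an Inception module can be checked with a quick weight count. The sketch below is illustrative only: the helper names are made up for this example, biases are ignored, and the filter counts are those listed for the inception (3a) module in Table 1 of the paper (192 input channels).

```python
def conv_weights(in_ch, out_ch, k):
    """Number of weights in a k x k convolution (biases ignored)."""
    return in_ch * out_ch * k * k


def inception_weights(in_ch, n1x1, n3x3_reduce, n3x3, n5x5_reduce, n5x5, pool_proj):
    """Weights and output channels of an Inception module with 1x1 reductions."""
    weights = (
        conv_weights(in_ch, n1x1, 1)                 # 1x1 branch
        + conv_weights(in_ch, n3x3_reduce, 1)        # 1x1 reduction before 3x3
        + conv_weights(n3x3_reduce, n3x3, 3)         # 3x3 branch
        + conv_weights(in_ch, n5x5_reduce, 1)        # 1x1 reduction before 5x5
        + conv_weights(n5x5_reduce, n5x5, 5)         # 5x5 branch
        + conv_weights(in_ch, pool_proj, 1)          # 1x1 projection after max pooling
    )
    out_ch = n1x1 + n3x3 + n5x5 + pool_proj          # branches are concatenated
    return weights, out_ch


def naive_weights(in_ch, n1x1, n3x3, n5x5):
    """Same branches without any 1x1 reduction; pooling keeps all input channels."""
    weights = (
        conv_weights(in_ch, n1x1, 1)
        + conv_weights(in_ch, n3x3, 3)
        + conv_weights(in_ch, n5x5, 5)
    )
    out_ch = n1x1 + n3x3 + n5x5 + in_ch
    return weights, out_ch


# inception (3a): 64 / 96 -> 128 / 16 -> 32 / pool proj 32, on 192 input channels
reduced = inception_weights(192, 64, 96, 128, 16, 32, 32)
naive = naive_weights(192, 64, 128, 32)
print("with 1x1 reduction:", reduced)
print("naive version:     ", naive)
```

With these numbers the reduced module needs well under half the weights of the naive one, and its output stays at 256 channels instead of growing to 416.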

5. GoogLeNet
https://github.com/rlatjcj/Keras-Model/blob/master/Inception_v1/model.py

6. Training Methodology
- Stochastic Gradient Descent with 0.9 momentum
- Fixed Learning Rate Schedule (Decreasing the learning rate by 4% every 8 epochs)
- Use Polyak averaging to create the final model used at inference time.
(http://ttic.uchicago.edu/~shubhendu/Pages/Files/Lecture6_flat.pdf)
- However, options such as dropout and the learning rate can be changed when doing transfer learning
- Sample image crops whose size is distributed evenly between 8% and 100% of the image area and whose aspect ratio is chosen randomly between 3/4 and 4/3
- Random interpolation when resizing (bilinear, area, nearest neighbor, cubic with equal probability)
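A rough sketch of the training recipe above, in plain Python. The function and class names are hypothetical, the base learning rate is an assumed placeholder (the slides do not give one), Polyak averaging is shown in its simplest form as a uniform running mean of parameter vectors, and drawing the aspect ratio uniformly is an assumption about what "chosen randomly" means.

```python
import random


def learning_rate(base_lr, epoch, decay=0.04, step=8):
    """Step schedule: multiply the rate by (1 - decay) once every `step` epochs."""
    return base_lr * (1.0 - decay) ** (epoch // step)


class PolyakAverage:
    """Uniform running mean of the parameter vectors seen during training."""

    def __init__(self):
        self.count = 0
        self.mean = None

    def update(self, params):
        self.count += 1
        if self.mean is None:
            self.mean = list(params)
        else:
            # Incremental mean: m_new = m + (p - m) / n
            self.mean = [m + (p - m) / self.count for m, p in zip(self.mean, params)]
        return self.mean


def sample_crop(width, height, rng=random, max_tries=10):
    """Sample (x, y, w, h) covering 8%-100% of the image area, aspect ratio in [3/4, 4/3]."""
    for _ in range(max_tries):
        area = rng.uniform(0.08, 1.0) * width * height
        ratio = rng.uniform(3 / 4, 4 / 3)            # width / height (assumed uniform)
        w = int(round((area * ratio) ** 0.5))
        h = int(round((area / ratio) ** 0.5))
        if 0 < w <= width and 0 < h <= height:
            x = rng.randint(0, width - w)
            y = rng.randint(0, height - h)
            return x, y, w, h
    return 0, 0, width, height                       # fallback: use the whole image


INTERPOLATIONS = ("bilinear", "area", "nearest", "cubic")


def sample_interpolation(rng=random):
    """Pick one resize interpolation method with equal probability."""
    return rng.choice(INTERPOLATIONS)
```

In a real pipeline the averaged parameters, not the last SGD iterate, would be the ones used at inference time, and the sampled crop would be resized to the network input size with the sampled interpolation method.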

7. ILSVRC 2014 Classification & Detection
Classification
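- Trained 7 GoogLeNet models (including one wider version) with the same initialization and learning rate policies
The models differ only in sampling methodologies and the order of the input images
An ensemble of the models is used at prediction time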
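Ensemble prediction can be sketched as averaging the class probabilities of the individual models and ranking classes by the mean. This is a minimal illustration, not the authors' code: `ensemble_predict` is a hypothetical helper and the three-class probabilities are made-up toy values.

```python
def ensemble_predict(prob_lists, top_k=5):
    """Average per-model class probabilities and return the top-k class indices."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    mean = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    # Rank class indices by averaged probability, highest first
    ranked = sorted(range(n_classes), key=lambda c: mean[c], reverse=True)
    return ranked[:top_k], mean


# Toy example: softmax outputs of three models over three classes
model_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
]
top, mean_probs = ensemble_predict(model_probs, top_k=2)
print(top)
```

Here the first model alone would pick class 0, but the averaged prediction ranks class 1 first.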

7. ILSVRC 2014 Classification & Detection
Detection
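- The approach is similar to R-CNN, but augmented with the Inception model as the region classifier
- The region proposal step is improved by combining Selective Search with multi-box predictions
→ The superpixel size is increased by 2x, which halves the number of proposals coming from the Selective Search algorithm
- Use an ensemble of 6 ConvNets when classifying each region
→ Unlike R-CNN, bounding box regression was not used, due to lack of time.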

References
- Inception v1
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, “Going Deeper with Convolutions”, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015. (https://arxiv.org/pdf/1409.4842.pdf)
- R-CNN
R. Girshick, J. Donahue, T. Darrell and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation”, in IEEE Conference on Computer Vision and Pattern Recognition, 2014. (https://arxiv.org/pdf/1311.2524.pdf)
- Network-in-Network
M. Lin, Q. Chen and S. Yan, “Network in network”, in CoRR, 2013. (https://arxiv.org/pdf/1312.4400.pdf)
- AlexNet
A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks”, in Neural Information Processing Systems, 2012. (http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
