Faster R-CNN:
Towards Real-Time Object Detection with
Region Proposal Networks
Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
[paper@NIPS15][arXiv][python][matlab][slides by R. Girshick]
Slides by Amaia Salvador [GDoc]
Computer Vision Reading Group (01/03/2016)
1. Introduction
Object Detection
Object Detection: Previously...
Hand-crafted features + Sliding Window: DPM
DPM. P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discriminatively Trained Part Based Models. In IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, Sep. 2010.
Object Detection: Previously...
CNN features + Object Proposals: R-CNN, SPPnet, Fast R-CNN
R-CNN. Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014, June). Rich feature hierarchies for accurate object detection and semantic segmentation. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 580-587). IEEE.
SPPnet. He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional networks for visual recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 37(9), 1904-1916.
Fast R-CNN. Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1440-1448).
Object Detection: Limitations
Object proposal computation is the bottleneck in current state-of-the-art object detection systems.
Proposal methods shown: Selective Search, CPMC, MCG
Selective Search. Van de Sande, K. E., Uijlings, J. R., Gevers, T., & Smeulders, A. W. (2011, November). Segmentation as selective search for object recognition. In Computer Vision (ICCV), 2011 IEEE International Conference on (pp. 1879-1886). IEEE.
CPMC. Carreira, J., & Sminchisescu, C. (2010, June). Constrained parametric min-cuts for automatic object segmentation. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on (pp. 3241-3248). IEEE.
MCG. Arbeláez, P., Pont-Tuset, J., Barron, J., Marques, F., & Malik, J. (2014). Multiscale combinatorial grouping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 328-335).
Faster R-CNN: Motivation
Replace external object proposals (Selective Search, CPMC, MCG) with a Region Proposal Network (RPN).
2. Methodology
Faster R-CNN: Overview
[Figure: Faster R-CNN pipeline. Labels: conv layers, conv layer 5, RPN, RPN proposals, RoI pooling layer, FC layers, class scores, class probabilities.]
Region Proposal Network (RPN)
For each of the k anchors at a sliding-window position, the RPN outputs objectness scores and a bounding box regression.
In practice, k = 9 (3 different scales and 3 aspect ratios).
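The k = 9 anchors can be sketched in a few lines of Python. The slide gives only the counts (3 scales, 3 aspect ratios); the concrete values used below (scales 128, 256, 512 pixels and ratios 1:2, 1:1, 2:1) are the defaults reported in the Faster R-CNN paper, and the function names `make_anchors` and `shift_anchors` are hypothetical helpers, not part of any released implementation.

```python
import math

def make_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return len(scales) * len(ratios) anchors as (x1, y1, x2, y2),
    centered at the origin. Each ratio is height / width, and each
    anchor keeps an area of scale ** 2."""
    anchors = []
    for scale in scales:
        area = float(scale) ** 2
        for ratio in ratios:
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

def shift_anchors(anchors, cx, cy):
    """Translate origin-centered anchors to a sliding-window center (cx, cy)."""
    return [(x1 + cx, y1 + cy, x2 + cx, y2 + cy) for x1, y1, x2, y2 in anchors]

base = make_anchors()
print(len(base))  # 9 anchors: 3 scales x 3 aspect ratios
```

Calling `shift_anchors(base, cx, cy)` at every sliding-window position gives the full anchor set for an image; the RPN then predicts one objectness score pair and one box regression per anchor.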
RPN: Loss Function
i: anchor index in minibatch
p_i: predicted probability of being an object for anchor i
p_i*: ground-truth objectness label
t_i: coordinates of the predicted bounding box for anchor i
t_i*: true box coordinates
L_cls: log loss
L_reg: smooth L1 loss
N_cls: number of anchors in minibatch (~256)
N_reg: number of anchor locations (~2400)
In practice λ = 10, so that both terms are roughly equally balanced.
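For reference, the annotated terms on this slide correspond to the multi-task loss as written in the Faster R-CNN paper. The equation is reproduced here from the paper, not from the slide text, so the symbol names (p_i, t_i and their starred ground-truth counterparts) are the paper's.

```latex
L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \, \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)

L_{reg}(t_i, t_i^*) = \sum_{j \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}(t_{i,j} - t_{i,j}^*),
\qquad
\mathrm{smooth}_{L_1}(x) =
\begin{cases}
0.5\,x^2 & \text{if } |x| < 1 \\
|x| - 0.5 & \text{otherwise}
\end{cases}
```

The factor p_i* switches the regression term off for negative anchors. With N_cls ≈ 256, N_reg ≈ 2400 and λ = 10, the two normalizing weights are 1/256 ≈ 0.0039 and 10/2400 ≈ 0.0042, which is why the classification and regression terms end up roughly equally weighted.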
RPN: Positive/Negative Samples
An anchor is labeled as positive if either:
(a) the anchor is the one with the highest IoU overlap with a ground-truth box, or
(b) the anchor has an IoU overlap with a ground-truth box higher than 0.7.
Negative labels are assigned to anchors with IoU lower than 0.3 for all ground-truth boxes.
50%/50% ratio of positive/negative anchors in a minibatch.
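The labeling rules above can be sketched as follows. This is a minimal illustration, not the authors' code: boxes are assumed to be `(x1, y1, x2, y2)` tuples, at least one ground-truth box is assumed to exist, and anchors that are neither positive nor negative receive the label -1 and are treated as ignored (the paper states that such anchors do not contribute to training). The helper names `iou` and `label_anchors` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 (positive), 0 (negative), -1 (ignored).
    Assumes gt_boxes is non-empty."""
    overlaps = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = []
    for row in overlaps:
        best = max(row)
        if best > pos_thresh:
            labels.append(1)    # rule (b): IoU above 0.7 with some ground-truth box
        elif best < neg_thresh:
            labels.append(0)    # IoU below 0.3 for all ground-truth boxes
        else:
            labels.append(-1)   # neither positive nor negative
    # Rule (a): the anchor with the highest IoU for each ground-truth box is positive.
    for j in range(len(gt_boxes)):
        best_i = max(range(len(anchors)), key=lambda i: overlaps[i][j])
        if overlaps[best_i][j] > 0:
            labels[best_i] = 1
    return labels

anchors = [(0, 0, 10, 10), (0, 0, 10, 4), (20, 20, 30, 30), (100, 100, 115, 115)]
gt_boxes = [(0, 0, 10, 8), (100, 100, 120, 120)]
labels = label_anchors(anchors, gt_boxes)
print(labels)  # [1, -1, 0, 1]
```

The last anchor has an IoU of only about 0.56 with the second ground-truth box, so it fails rule (b) but is still positive under rule (a), which guarantees every ground-truth box at least one positive anchor.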
Object Detection Network
Fast R-CNN
Object Detection Network: Loss (from Fast R-CNN)
Classification term (log loss): predicted class scores vs. true class scores.
Regression term (smooth L1 loss): predicted box coordinates vs. true box coordinates.
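A minimal sketch of this two-term loss for a single RoI is given below. The form (log loss on the true class plus a weighted smooth L1 loss on the box coordinates, skipped for background) follows the Fast R-CNN paper; the function names, the convention that class 0 is background, and the default weight `lam=1.0` are assumptions made for this illustration.

```python
import math

def smooth_l1(x):
    """Smooth L1: quadratic for |x| < 1, linear otherwise."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def detection_loss(class_probs, true_class, pred_box, true_box, lam=1.0):
    """Log loss on the true class plus smooth L1 on the box coordinates.
    Class 0 is assumed to be background, which has no box term."""
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        return cls_loss
    reg_loss = sum(smooth_l1(p - t) for p, t in zip(pred_box, true_box))
    return cls_loss + lam * reg_loss

# Foreground RoI: both terms contribute.
loss = detection_loss([0.2, 0.8], 1, [0.5, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, 0.0])
```

Compared with a plain L2 loss, the linear tail of smooth L1 keeps gradients bounded for large coordinate errors, which is the motivation given in the Fast R-CNN paper.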
Fast R-CNN: Positive/Negative Samples (from Fast R-CNN)
Positive samples are those whose IoU overlap with a ground-truth bounding box is ≥ 0.5.
Negative samples are drawn from those whose maximum IoU overlap with ground truth lies in the interval [0.1, 0.5).
25%/75% ratio of positive/negative samples in a minibatch.
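The sampling rule can be sketched as below. The sketch treats IoU of at least 0.5 as foreground, following the wording of the Fast R-CNN paper, so that the foreground range and the background interval [0.1, 0.5) do not leave a gap at exactly 0.5. The function name `sample_rois`, the per-image batch size of 64 and the fixed random seed are assumptions for illustration only.

```python
import random

def sample_rois(max_ious, batch_size=64, fg_fraction=0.25,
                fg_thresh=0.5, bg_lo=0.1, rng=None):
    """Pick RoI indices for one image: up to 25% foreground, the rest background.
    max_ious[i] is the maximum IoU of RoI i with any ground-truth box."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    fg = [i for i, v in enumerate(max_ious) if v >= fg_thresh]
    bg = [i for i, v in enumerate(max_ious) if bg_lo <= v < fg_thresh]
    n_fg = min(int(round(batch_size * fg_fraction)), len(fg))
    n_bg = min(batch_size - n_fg, len(bg))
    return rng.sample(fg, n_fg), rng.sample(bg, n_bg)

# 20 foreground candidates, 100 background candidates, 10 below the 0.1 floor.
max_ious = [0.9] * 20 + [0.3] * 100 + [0.05] * 10
fg_idx, bg_idx = sample_rois(max_ious)
print(len(fg_idx), len(bg_idx))  # 16 48
```

When an image has fewer foreground RoIs than the 25% quota, this sketch fills the remainder with background RoIs, so the ratio is an upper bound on positives and not an exact split.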
Faster R-CNN: Training
4-step training to share features for RPN and Fast R-CNN.
Faster R-CNN: 4-step training
Step 1: Train RPN initialized with an ImageNet pre-trained model. Conv layers: ImageNet weights (fine-tuned).
Step 2: Train Fast R-CNN with the RPN proposals learned in Step 1. Conv layers: ImageNet weights (fine-tuned).
Step 3: The model trained in Step 2 is used to initialize RPN, which is trained again. Conv layers: weights from Step 2 (fixed).
Step 4: Fine-tune the FC layers of Fast R-CNN using the same shared convolutional layers as in Step 3, with the RPN proposals learned in Step 3. Conv layers: weights from Steps 2 and 3 (fixed).
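The effect of the four steps on feature sharing can be traced with a toy sketch. No real training happens here: string tags stand in for the convolutional weights, and the function name `run_schedule` and the tag names are hypothetical. The only point is to show when the two networks end up holding the same conv layers.

```python
def run_schedule():
    """Track which conv weights each network holds after each step,
    using string tags in place of real tensors."""
    conv = {}
    # Step 1: RPN starts from ImageNet weights and fine-tunes the conv layers.
    conv["rpn"] = "imagenet+rpn_step1"
    # Step 2: Fast R-CNN also starts from ImageNet weights and fine-tunes its
    # own copy of the conv layers, trained on the proposals from step 1.
    conv["detector"] = "imagenet+detector_step2"
    shared_after_step2 = conv["rpn"] == conv["detector"]
    # Step 3: RPN is re-initialized from the step-2 detector. The conv layers
    # stay fixed; only the RPN-specific layers are trained.
    conv["rpn"] = conv["detector"]
    # Step 4: Fast R-CNN keeps the same fixed conv layers and only its FC
    # layers are fine-tuned, now on the proposals from step 3.
    shared_after_step4 = conv["rpn"] == conv["detector"]
    return shared_after_step2, shared_after_step4

print(run_schedule())  # (False, True)
```

After step 2 the two networks still have separate conv layers; sharing only appears once steps 3 and 4 freeze the conv layers taken from the step-2 detector.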
3. Experiments
Experiments: CNN Architectures
VGG-16: Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks
for large-scale image recognition. arXiv preprint arXiv:1409.1556.
ZF: Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional
networks. In Computer vision–ECCV 2014 (pp. 818-833). Springer International
Publishing.
Experiments: Datasets
Experiments I: VOC 2007 & ZF
Comparison of Fast R-CNN trained with external object proposals (SS: Selective Search, EB: EdgeBoxes) against Faster R-CNN.
Experiments II
[Tables: Detection Accuracy; Timing (ms)]
Experiments III
Experiments IV
One-Stage Detection:
1) Directly Refine and Classify Sliding Window locations
Two-Stage Proposal + Detection:
1) Learn Object Proposals
2) Refine and classify Object Proposals
Experiments V: MS COCO (arXiv)
Qualitative Results
4. Summary
Summary
● A Region Proposal Network that shares convolutional features with the object detection network makes the region proposal step nearly cost-free.
● Proposal quality improves with RPN relative to Selective Search (SS) and EdgeBoxes (EB).
● The object detection system runs at 5-17 fps.
● Faster R-CNN is the basis of the winners of COCO and
ILSVRC 2015 object detection competitions [1].
● RPN is also used in the winning entries of ILSVRC 2015
localization [1] and COCO 2015 segmentation competitions
[2].
[1] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,”
arXiv:1512.03385, 2015.
[2] J. Dai, K. He, and J. Sun, “Instance-aware semantic segmentation via multi-task
network cascades,” arXiv:1512.04412, 2015.
Thank you !
Questions?
