1
Minyoung Kim
May 1st, 2017
A Fast Object Detector for ADAS
Using Deep Learning
2
Panasonic Silicon Valley Laboratory (PSVL)
Cupertino, California
3
•  Pros
•  High performance
•  Beat state-of-the-art records in many tasks, including image classification and detection
•  Cons
•  Requires large datasets
•  High computational power
•  Deep Neural Networks with millions of parameters
•  Slower running time than most conventional algorithms
Object Detection with Deep Learning
4
Tradeoffs
Speed vs. Accuracy
5
Object Detection System
Building Object Detection System
•  Training Deep Neural Network for Classification
•  Pedestrian detection: Binary classification
•  Object Proposal Generation at different scales
•  Generate box proposals (1000 ~ 2000 boxes)
•  Selective Search*, Edge Boxes**
•  Merge largely overlapping boxes
•  Non Maximum Suppression
* J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders, IJCV 2013
** C. Lawrence Zitnick and Piotr Dollár, Microsoft Research
[Pipeline diagram labels: Proposal Generation; Run Recognizer (Recognition Network, Classification: Pedestrian or Background); Merge boxes]
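The final merge step on this slide, Non Maximum Suppression, can be sketched as a greedy loop: keep the highest-scoring box, drop every remaining box that overlaps it too much, and repeat. This is a minimal pure-Python sketch, not the presenter's implementation; the function names and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


proposals = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
confidences = [0.9, 0.8, 0.7]
kept = nms(proposals, confidences)
```

With 1000 to 2000 proposals per image this quadratic loop is still cheap compared with running the recognizer on every proposal.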
6
Time Consuming!
Proposal Generation & Scaling
•  Region proposal
•  Selective Search: 2 seconds per image (CPU)
•  Order of magnitude slower
•  Edge Boxes: 0.2 seconds per image
•  Scaling
•  Multiple forward propagations
•  Bottleneck
•  A forward propagation of an image
•  Less than 0.1 seconds (GPU)
Object Detection System
Proposal Generation
Scaling
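To see why proposal generation and scaling dominate, the per-image cost can be added up from the figures on this slide. The sketch below is only back-of-the-envelope arithmetic: the number of scales (5) is an assumed value that the slide does not give, and 0.1 seconds is used as an upper bound for one forward propagation.

```python
def pipeline_seconds(proposal_s: float, forward_s: float, num_scales: int) -> float:
    """Rough per-image time: proposal generation plus one forward pass per scale."""
    return proposal_s + forward_s * num_scales


FORWARD_S = 0.1        # upper bound from the slide (GPU)
ASSUMED_SCALES = 5     # hypothetical; not stated on the slide

selective_search = pipeline_seconds(2.0, FORWARD_S, ASSUMED_SCALES)
edge_boxes = pipeline_seconds(0.2, FORWARD_S, ASSUMED_SCALES)
single_pass = pipeline_seconds(0.0, FORWARD_S, 1)

for name, seconds in [("Selective Search", selective_search),
                      ("Edge Boxes", edge_boxes),
                      ("Single forward pass", single_pass)]:
    print(f"{name}: {seconds:.2f} s/image, {1.0 / seconds:.1f} fps")
```

Under these assumptions the network itself is the smallest term, which is the motivation for removing the separate proposal and scaling stages.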
7
PSVL Pedestrian Detection System
Our Pedestrian Detection System
INPUT
A Single Forward Propagation
OUTPUT
PSVL
Neural Detector
8
Recognition Network
Our Pedestrian Detection System
Add Regression Layer and Finetune
Fully Convolutional Network as Detector
Detection by a single forward propagation
9
Train DNN for recognition
•  GPU & Framework
•  NVIDIA Titan X, NVIDIA Tesla K80
•  Caffe*
•  Network Architectures
•  Modified GoogLeNet**
•  25~30 Convolutional layers
•  Input: Pedestrian and Backgrounds (80x32)
•  Output: Sigmoid or Softmax
•  Dataset
•  Caltech Pedestrian Detection Benchmark***
•  10 hours of 640(w) x 480(h) 30Hz video
•  About 250,000 frames with a total of 350,000 bounding boxes
Recognition Network
* http://caffe.berkeleyvision.org/
** C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich (2014)
*** http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
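The "Sigmoid or Softmax" output choice matters less than it may seem for a binary pedestrian/background classifier: a two-way softmax gives the same pedestrian probability as a sigmoid applied to the difference of the two logits. The sketch below checks this numerically; the function names and logit values are illustrative only.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax_pedestrian(z_background: float, z_pedestrian: float) -> float:
    """Pedestrian probability from a two-class softmax."""
    m = max(z_background, z_pedestrian)  # subtract the max for numerical stability
    e_bg = math.exp(z_background - m)
    e_ped = math.exp(z_pedestrian - m)
    return e_ped / (e_bg + e_ped)


z_bg, z_ped = 0.3, 1.7
p_softmax = softmax_pedestrian(z_bg, z_ped)
p_sigmoid = sigmoid(z_ped - z_bg)
print(p_softmax, p_sigmoid)
```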
10
Convert recognition network to a fully convolutional network
Fully Convolutional Network
Base Network (fully connected): limited input size
Fully Convolutional Network (kernel sliding): input size not limited
11
Regression Layer
•  Regress bounding boxes on useful features
•  Nx4 box coordinates data
•  N: Feature Map resolution (NX x NY)
•  Original GT Box: B = [x1, y1, x2, y2]
•  New GT Box: B’ = rel(B) / m (m: multiplier of Window Size)
Fully Convolutional Network
[Figure labels: 240, 120, m = 2; output feature map of resolution NX x NY with 4 box-coordinate channels]
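One possible reading of B' = rel(B) / m is sketched below. The slide does not define rel(B), so the code assumes, purely for illustration, that rel(B) expresses the box corners as offsets from the center of the window belonging to a feature-map cell, in units of the window size, and that m then scales the result. The window size of 120 and the center coordinates are likewise assumed values.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def encode(box: Box, center: Tuple[float, float],
           window: Tuple[float, float], m: float = 2.0) -> Box:
    """Assumed form of B' = rel(B) / m for one feature-map cell."""
    cx, cy = center
    ww, wh = window
    x1, y1, x2, y2 = box
    # Assumption: rel(B) = corner offsets from the window center, in window units.
    rel = ((x1 - cx) / ww, (y1 - cy) / wh, (x2 - cx) / ww, (y2 - cy) / wh)
    return (rel[0] / m, rel[1] / m, rel[2] / m, rel[3] / m)


def decode(target: Box, center: Tuple[float, float],
           window: Tuple[float, float], m: float = 2.0) -> Box:
    """Invert encode(): map a regressed target back to image coordinates."""
    cx, cy = center
    ww, wh = window
    t1, t2, t3, t4 = target
    return (t1 * m * ww + cx, t2 * m * wh + cy, t3 * m * ww + cx, t4 * m * wh + cy)


gt_box = (100.0, 40.0, 132.0, 120.0)
cell_center = (116.0, 80.0)      # hypothetical window center in pixels
window_size = (120.0, 120.0)     # hypothetical window size
target = encode(gt_box, cell_center, window_size)
restored = decode(target, cell_center, window_size)
```

Whatever the exact definition, the point of the normalization is that the regression layer predicts small, window-relative values rather than absolute pixel coordinates.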
12
Training detector network
•  Network Architectures
•  Custom loss functions
•  Feature Map: Cross Entropy Loss with Boosting
•  Boosting
•  Ped: Correct Results (TPs) + Ground Truths (FNs)
•  True Positive if IOU > 0.5
•  False Negative if Ground Truths not detected
•  NonPed: FPs
•  False Positive if IOU < 0.5
•  Regression: Euclidean Loss with Feature Map Data incorporated
PSVL Neural Detector
[Diagram labels: 640x480 Original Images; Fully Convolutional Network; Feature Map + Regression Layer; Box Coordinates]
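The boosting rule on this slide can be sketched as a function that sorts detector outputs into new training samples: detections with IOU > 0.5 against a ground truth are true positives, undetected ground truths are false negatives, and the remaining detections are false positives. This is an illustrative sketch, not the authors' training code; the names are hypothetical, and a detection with IOU exactly 0.5 is treated as a false positive, which the slide leaves unspecified.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def split_for_boosting(detections: List[Box], ground_truths: List[Box],
                       threshold: float = 0.5) -> Tuple[List[Box], List[Box]]:
    """Return (Ped samples, NonPed samples) following the slide's rule."""
    true_positives: List[Box] = []
    false_positives: List[Box] = []
    matched = set()
    for det in detections:
        best_iou, best_gt = 0.0, None
        for g, gt in enumerate(ground_truths):
            overlap = iou(det, gt)
            if overlap > best_iou:
                best_iou, best_gt = overlap, g
        if best_iou > threshold:
            true_positives.append(det)
            matched.add(best_gt)
        else:
            false_positives.append(det)
    # Ground truths that no detection matched are false negatives.
    false_negatives = [gt for g, gt in enumerate(ground_truths) if g not in matched]
    return true_positives + false_negatives, false_positives


gts = [(0, 0, 10, 10), (100, 100, 110, 110)]
dets = [(1, 1, 11, 11), (50, 50, 60, 60)]
ped, nonped = split_for_boosting(dets, gts)
```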
13
More Data
PSVL Neural Detector
14
Even fewer box predictions with Center-Height features
PSVL Neural Detector
15
Performance – Very Fast with Competitive Accuracy
•  From DeepCascade paper 1)
•  DeepCascade: NVIDIA K20
•  15 fps
•  Ours: NVIDIA GTX770
•  34 fps
•  Speed Adjustment
•  34 * 0.9699 ≈ 33 fps, using the factor from 2)
•  Ours: NVIDIA Titan X
•  51.422 fps w/o cuDNN
•  85.565 fps with cuDNN4
(*): Left-hand side for methods with unknown fps or less than 0.2 fps
(**): DeepCascade without extra data
(***): SpatialPooling+/Katamari methods use additional motion information
1) A. Angelova, A. Krizhevsky, V. Vanhoucke, A. Ogale, D. Ferguson (2015)
2) http://caffe.berkeleyvision.org/performance_hardware.html
[Plot: Performance of Pedestrian Detection Methods (Accuracy vs. Speed); axis directions: Faster, More accurate; PSVL ND marked; (*), (**), (***) refer to the notes above]
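The speed figures on this slide can be checked with a few lines of arithmetic. The sketch below reproduces the adjusted 33 fps figure and converts the Titan X frame rates into per-frame latency; the 0.9699 factor is taken from the slide as given, and the helper names are illustrative.

```python
def adjusted_fps(measured_fps: float, factor: float) -> float:
    """Scale a measured frame rate by a hardware adjustment factor."""
    return measured_fps * factor


def ms_per_frame(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


gtx770_adjusted = adjusted_fps(34.0, 0.9699)
titan_x_plain_ms = ms_per_frame(51.422)
titan_x_cudnn_ms = ms_per_frame(85.565)
cudnn_speedup = 85.565 / 51.422

print(f"GTX770 adjusted: {gtx770_adjusted:.2f} fps")
print(f"Titan X: {titan_x_plain_ms:.2f} ms/frame without cuDNN, "
      f"{titan_x_cudnn_ms:.2f} ms/frame with cuDNN "
      f"({cudnn_speedup:.2f}x faster)")
```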
16
Deploy PSVL ND on Google Nexus 9
•  Processor
•  NVIDIA Tegra K1
•  GPU: NVIDIA Kepler with 192 CUDA cores
•  Speed (without any optimization)
•  Base resolution (600x390): 5 fps
•  Lower resolution (280x240): 16 fps
ND on Portable Device
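One way to read the two Tegra K1 measurements is in pixels processed per second. The short calculation below suggests the throughput is roughly the same at both resolutions, which is what one would expect if a fully convolutional detector's cost scales with input area; that interpretation is an inference from the two data points, not a claim made on the slide.

```python
def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Pixels processed per second at a given resolution and frame rate."""
    return width * height * fps


base = pixels_per_second(600, 390, 5)     # base resolution
lower = pixels_per_second(280, 240, 16)   # lower resolution

area_ratio = (600 * 390) / (280 * 240)    # how many times fewer pixels per frame
fps_ratio = 16 / 5                        # how many times faster

print(base, lower)
print(f"area ratio {area_ratio:.2f}, fps ratio {fps_ratio:.2f}")
```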
17
Threshold Information
Probability and NMS
Threshold Bar
Detection box with Probability
Toggle for Threshold Bar
ND Application
18
ND Application Demo (Cluster with Titan X)
19
ND Application Demo at ITSWC 2015
GTX980m Tegra K1
20
More Approaches
•  Faster-RCNN (2015)*
•  Region Proposal Network
•  +10 ms
•  Anchor boxes
•  Predicts offsets & confidences
Object Detection from others
* S. Ren, K. He, R. Girshick, J. Sun (NIPS 2015)
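The "Anchor boxes" and "Predicts offsets" bullets refer to the box parameterization used in the Faster R-CNN paper: the network predicts (tx, ty, tw, th) relative to a fixed anchor, with center shifts scaled by the anchor size and width and height scaled exponentially. The sketch below decodes one prediction; the anchor values are illustrative.

```python
import math
from typing import Tuple

CenterBox = Tuple[float, float, float, float]  # (center x, center y, width, height)


def decode_anchor(anchor: CenterBox,
                  offsets: Tuple[float, float, float, float]) -> CenterBox:
    """Apply predicted offsets (tx, ty, tw, th) to an anchor box."""
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = offsets
    x = xa + tx * wa          # shift the center by a fraction of the anchor width
    y = ya + ty * ha          # shift the center by a fraction of the anchor height
    w = wa * math.exp(tw)     # scale width multiplicatively
    h = ha * math.exp(th)     # scale height multiplicatively
    return (x, y, w, h)


anchor = (100.0, 50.0, 32.0, 80.0)
unchanged = decode_anchor(anchor, (0.0, 0.0, 0.0, 0.0))
shifted = decode_anchor(anchor, (0.5, 0.0, math.log(2.0), 0.0))
```

Zero offsets return the anchor itself, so the network only has to learn corrections to a reasonable prior.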
21
Similar Approaches
•  YOLO (2016)*
•  Fully Convolutional Network
•  + fc layer + regression
•  YOLO9000 (2017)**
•  Improved localization/recall
•  Removes the fully connected layer
Object Detection from others
* J. Redmon, S. Divvala, R. Girshick, and A. Farhadi (CVPR 2016)
** J. Redmon and A. Farhadi (CVPR 2017)
22
OURS
* F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, K. Keutzer (arXiv:1602.07360)
PSVL Multiple-Object Detection System
•  Fire modules*
•  Only 13 MB size
•  16.5 fps on max scale (600x2200)
Performance (Speed, Accuracy)
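The small model size comes largely from Fire modules, which squeeze the channel count with 1x1 filters before expanding with a mix of 1x1 and 3x3 filters. The sketch below counts weights (biases ignored) for one Fire module against a plain 3x3 convolution with the same input and output channels. The channel numbers are illustrative values in the style of the SqueezeNet paper, not the configuration of the PSVL detector.

```python
def conv_weights(c_in: int, c_out: int, kernel: int) -> int:
    """Number of weights in a kernel x kernel convolution, ignoring biases."""
    return c_in * c_out * kernel * kernel


def fire_weights(c_in: int, squeeze: int, expand1x1: int, expand3x3: int) -> int:
    """Weights in a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expands."""
    return (conv_weights(c_in, squeeze, 1)
            + conv_weights(squeeze, expand1x1, 1)
            + conv_weights(squeeze, expand3x3, 3))


# Illustrative sizes: 96 input channels, 16 squeeze filters, 64 + 64 expand filters.
fire = fire_weights(96, 16, 64, 64)
plain = conv_weights(96, 64 + 64, 3)

print(fire, plain, f"{plain / fire:.1f}x fewer weights")
```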
23
PSVL Multiple-Object Detection System
•  Only real-time demo at ITSWC 2016
•  30+ fps (2 views per GPU)
Demo at ITSWC 2016
24
OURS (DEMO)
25
PSVL Neural Tracker
•  Critical Risk Management by tracking nearby objects (pedestrians, cars, cyclists)
•  arXiv:1609.09156
•  State-of-the-art on KITTI MOT
What’s Next?
[Network diagram labels: paired inputs (pairdata, data, datap) with weight sharing; feat, featp; concat, concatp; deconvI, deconvA; reluI, reluA; DIoU, DArat; NB; ContrastiveLoss]
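The ContrastiveLoss label in the diagram names a standard loss for paired inputs with shared weights: it pulls the features of matching pairs together and pushes non-matching pairs apart until they are at least a margin away. The sketch below follows the common formulation, with label 1 for a matching pair as in Caffe's layer of that name. The slide does not show how the tracker configures it, so the margin and the feature vectors are assumptions.

```python
import math
from typing import Sequence


def contrastive_loss(feat_a: Sequence[float], feat_b: Sequence[float],
                     same: bool, margin: float = 1.0) -> float:
    """Contrastive loss for one pair of feature vectors."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same:
        # Matching pair: penalize any distance between the features.
        return 0.5 * distance ** 2
    # Non-matching pair: penalize only if closer than the margin.
    return 0.5 * max(0.0, margin - distance) ** 2


same_object = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=True)
far_apart = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False)
too_close = contrastive_loss([0.0, 0.0], [0.0, 0.0], same=False)
```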
26
PSVL Neural Tracker
27
Conclusion
Speed & Accuracy
No Separate Region Proposal
Network Size Optimization
28
Thank You!

"A Fast Object Detector for ADAS using Deep Learning," a Presentation from Panasonic
