DhakaNet: Unstructured Vehicle Detection using Limited Computational Resources

Tarik Reza Toha¹, Masfiqur Rahaman², Saiful Islam Salim³, Mainul Hossain⁴, Arif Mohaimin Sadri⁵, and A. B. M. Alim Al Islam⁶
¹˒²˒³˒⁴˒⁶ Bangladesh University of Engineering and Technology, Bangladesh
⁵ University of Oklahoma, USA
ICDM 2021, Auckland, New Zealand
Fellowship from:

Overview of This Presentation
• Background and motivation
• Our proposed approach
• Experimental results and findings
• Conclusion and future work

Traffic Congestion in Dhaka
• Average public transport speed is 7 kph
– Expected to be only 4 kph (slower than walking speed) by 2035
• Wastes ~3.2 million working hours daily
• An annual loss of billions of dollars
[Source: World Bank Report (2018)]
1. https://openknowledge.worldbank.org/handle/10986/29925
2. https://www.thedailystar.net/frontpage/colossal-loss-1553002

Existing Solution for Limiting Traffic Jam
Adaptive traffic control system (Lee et al., 2020)
1. Capture on-road traffic images
2. Send images to server
3. Estimate traffic density and optimize signal timing
This centralized solution demands high-speed network connectivity, which is not always available at every road intersection in developing countries such as Bangladesh, India, and Kenya (Chauhan et al., 2019)

An Alternative Existing Solution for Limiting Traffic Jam
Decentralized adaptive traffic control system (Yeshwanth et al., 2017)
1. Capture traffic images and estimate traffic density
2. Send only the vehicle count to server
3. Optimize signal timing
• Embedded systems must be deployed at signalized intersections to estimate traffic density in real time
• This imposes severe computational constraints on the DL architectures that estimate traffic density on-road
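The "send only the vehicle count" step can be illustrated with a minimal sketch. It assumes the on-road detector returns a list of labelled detections with confidence scores; the function name `build_count_payload`, the field names, the 0.5 threshold, and the intersection identifier are all hypothetical and are not taken from the presentation.

```python
import json
from collections import Counter


def build_count_payload(detections, intersection_id, min_confidence=0.5):
    """Summarise one frame's detections as per-class vehicle counts.

    Only the counts are serialised, so the message stays small no matter
    how large the captured image was.
    """
    counts = Counter(
        d["label"] for d in detections if d["confidence"] >= min_confidence
    )
    payload = {
        "intersection": intersection_id,
        "total": sum(counts.values()),
        "by_class": dict(sorted(counts.items())),
    }
    # Compact separators keep the encoded message as short as possible.
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


# Hypothetical detector output for a single frame (illustration only).
frame_detections = [
    {"label": "bus", "confidence": 0.91},
    {"label": "rickshaw", "confidence": 0.84},
    {"label": "rickshaw", "confidence": 0.42},  # below threshold, ignored
    {"label": "car", "confidence": 0.77},
]
message = build_count_payload(frame_detections, intersection_id="demo-01")
print(message.decode("utf-8"), len(message), "bytes")
```

A message like this is tens of bytes, whereas a camera frame is typically far larger, which is why this design suits intersections with poor connectivity.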

Existing Deep Learning Architectures
• EfficientDet: Scalable and Efficient Object Detection
– Tan et al., CVPR, IEEE, 2020
• Scaled-YOLOv4: Scaling Cross Stage Partial Network
– Wang et al., CVPR, IEEE, 2021
• YOLOv5: Leading Edge Artificial Intelligence Solutions
– Jocher et al., Ultralytics Company, 2020
These architectures attain neither faster inference speed nor higher accuracy because of their inherent limitations

Our Contribution
We propose a novel low-resource DL architecture (DhakaNet) for faster and more accurate vehicle detection in street-view traffic images captured by on-road cameras

mCSP: Our Proposed Backbone Network
Modified Cross-Stage Partial Networks
• Increases the accuracy
• Increases the inference speed
[Figure panels: at the beginning layers; at the later (deeper) layers]

mPANet: Our Proposed Neck Network
Modified Path Aggregation Network
• Increases the accuracy with a small overhead
• Exactly ONE extra connection is possible

MSAM: Our Proposed Plugin Module
Multi-Scale Attention Module
• Fuses local features within the same layer
• Identifies meaningful features
• Localizes meaningful features

DhakaNet: Our Proposed Architecture
Scaling Factor = 0.29
[Figure labels: Modified CSP module; Multi-Scale Attention module]
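The slide states a scaling factor of 0.29 without saying what it scales. As a sketch only, the example below assumes it acts as a channel-width multiplier, in the style of the width multiple used by YOLOv5-type model configurations, with each width rounded up to a multiple of 8. The function `scale_channels` and the base widths are hypothetical and are not taken from the presentation.

```python
import math


def scale_channels(base_channels, factor, divisor=8):
    """Scale a layer width and round it up to a multiple of `divisor`."""
    return math.ceil(base_channels * factor / divisor) * divisor


SCALING_FACTOR = 0.29  # value stated on the slide

# Hypothetical unscaled layer widths, for illustration only.
base_widths = [64, 128, 256, 512, 1024]
scaled_widths = [scale_channels(c, SCALING_FACTOR) for c in base_widths]
print(scaled_widths)
```

Under this assumption, a factor below 1 shrinks every layer, which reduces both parameters and computation on an embedded device.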

Experimental Setup
• Ground truth
– Box label
• Data augmentation during training
– HSV, translation, mosaic, and horizontal flip
• Training configuration
– Input size: 768 × 768 × 3
– Training : Validation = 0.75 : 0.25
– Other configurations follow Jocher et al., 2020
• Training hardware (training purpose only)
– GeForce GTX 1070, 8 GB Memory
• Testing hardware (testing purpose only)
– Raspberry Pi 4 Model B, ARMv7 Processor, 4 GB RAM
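Two items from the setup above, the horizontal flip and the 0.75 : 0.25 training/validation split, can be sketched as follows. The sketch assumes box labels are stored as normalised (class_id, x_center, y_center, width, height) tuples; that format, the helper names, the seed, and the pool of 1000 image identifiers are assumptions for illustration. Flipping the image pixels themselves is not shown.

```python
import random


def hflip_boxes(boxes):
    """Mirror normalised (class_id, x_center, y_center, width, height) boxes.

    A horizontal flip only moves the box centre along x; width, height,
    and y_center are unchanged.
    """
    return [(cls, 1.0 - xc, yc, w, h) for cls, xc, yc, w, h in boxes]


def split_train_val(items, train_fraction=0.75, seed=0):
    """Shuffle deterministically, then cut into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical label and image identifiers, for illustration only.
boxes = [(0, 0.25, 0.50, 0.10, 0.20)]
flipped = hflip_boxes(boxes)

image_ids = [f"img_{i:04d}" for i in range(1000)]
train_ids, val_ids = split_train_val(image_ids)
print(flipped, len(train_ids), len(val_ids))
```

Fixing the seed keeps the split reproducible between runs, so validation images never leak into training when an experiment is repeated.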

Datasets Used for Performance Evaluation
Attribute             DhakaAI        IITM-HeTra-A     IITM-HeTra-B
Traffic               Unstructured   Unstructured     Unstructured
Location              Dhaka, BD      Chennai, India   Chennai, India
Training : Testing    3000 : 500     1201 : 216       1201 : 216
# of object classes   21             3                4
• DhakaAI dataset (Shihavuddin et al., 2020)
• IITM-HeTra datasets (A and B) (Mittal et al., 2018)

Evaluation Results on DhakaAI Dataset
DhakaNet achieves 50% faster inference speed, or 13% higher accuracy, compared to the existing architecture
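The percentages above are relative gains over a baseline. The sketch below shows the arithmetic, assuming "faster" is read as a relative increase in frames per second and "higher accuracy" as a relative increase in an accuracy score. Every number in it is hypothetical and chosen only to make the arithmetic easy to follow; none is a measurement from the presentation.

```python
def percent_gain(new_value, baseline_value):
    """Relative improvement of new_value over baseline_value, in percent."""
    return (new_value - baseline_value) / baseline_value * 100.0


def fps_from_latency(seconds_per_image):
    """Convert per-image inference latency into frames per second."""
    return 1.0 / seconds_per_image


# Hypothetical latencies in seconds per image (illustration only).
baseline_fps = fps_from_latency(0.6)
candidate_fps = fps_from_latency(0.4)
speed_gain = percent_gain(candidate_fps, baseline_fps)

# Hypothetical accuracy scores on a 0-1 scale (illustration only).
accuracy_gain = percent_gain(0.339, 0.300)

print(round(speed_gain, 1), round(accuracy_gain, 1))
```

Note that a 50% gain in frames per second corresponds to a one-third reduction in latency, so it matters which of the two quantities a reported speedup refers to.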

Evaluation Results on IITM-HeTra Datasets
[Figure panels: IITM-HeTra-A dataset; IITM-HeTra-B dataset]
DhakaNet achieves 51% faster inference speed and similar accuracy on both IITM-HeTra datasets compared to the existing architecture

Conclusion and Future Work
• Existing low-resource DL architectures attain neither faster inference speed nor higher accuracy because they do not overcome their inherent limitations
• We propose a new architecture for embedded systems named DhakaNet
– Delivers up to 51% faster or 13% more accurate detection on street-view traffic images
• We plan to develop a traffic signal optimization module for a coordinated and adaptive traffic signal system for Dhaka and similar cities in the future

Thank You
Questions are welcome!
Email: toha@tmdm.butex.edu.bd
