DhakaNet: Unstructured Vehicle Detection using Limited Computational Resources
Tarik Reza Toha1, Masfiqur Rahaman2, Saiful Islam Salim3, Mainul Hossain4, Arif Mohaimin Sadri5, and A. B. M. Alim Al Islam6
1,2,3,4,6 Bangladesh University of Engineering and Technology, Bangladesh
5 University of Oklahoma, USA
ICDM 2021, Auckland, New Zealand
Fellowship from: ICT Division of Bangladesh
2
Overview of This Presentation
• Background and motivation
• Our proposed approach
• Experimental results and findings
• Conclusion and future work
3
Traffic Congestion in Dhaka
• Average public transport speed is 7 kph
– Expected to be only 4 kph (slower than walking speed) by 2035
• Wastes ~3.2 million working hours daily
• An annual loss of billions of dollars
[Source: World Bank Report (2018)]
1. https://openknowledge.worldbank.org/handle/10986/29925
2. https://www.thedailystar.net/frontpage/colossal-loss-1553002
4
Existing Solution for Limiting Traffic Jam
Adaptive traffic control system (Lee et al., 2020)
Capture on-road traffic images → Send images to server → Estimate traffic density and optimize signal timing
This centralized solution demands high-speed network connectivity, which is not always available at every road intersection in developing countries such as Bangladesh, India, and Kenya (Chauhan et al., 2019)
5
An Alternative Existing Solution for Limiting Traffic Jam
Decentralized adaptive traffic control system (Yeshwanth et al., 2017)
Capture traffic images and estimate traffic density → Send only the vehicle count to server → Optimize signal timing
• Embedded systems must be deployed at signalized intersections to estimate traffic density in real time
• This imposes severe computational constraints on the DL architectures that estimate traffic density on-road
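A minimal sketch of the "send only the vehicle count" step, assuming the on-road device already holds a list of (class label, confidence) detections for the current frame. The function name build_count_payload, the field names, the intersection ID, and the example class labels are hypothetical; the slides do not specify a message format.

```python
import json
from collections import Counter


def build_count_payload(intersection_id, detections, timestamp):
    """Summarize one frame's detections as per-class vehicle counts.

    detections: list of (class_label, confidence) pairs (assumed format).
    Returns a small JSON string, so no image data leaves the device.
    """
    counts = Counter(label for label, _confidence in detections)
    message = {
        "intersection_id": intersection_id,  # hypothetical field
        "timestamp": timestamp,              # hypothetical field
        "total": sum(counts.values()),
        "counts": dict(counts),
    }
    return json.dumps(message, sort_keys=True)


# Illustrative detections, not taken from the paper.
detections = [("bus", 0.91), ("rickshaw", 0.84), ("rickshaw", 0.77), ("car", 0.66)]
payload = build_count_payload("demo-junction", detections, "2021-12-07T09:00:00Z")
print(payload)
```

The payload is a few hundred bytes per frame, which is the point of the decentralized design: the bandwidth cost no longer depends on image size.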
6
Existing Deep Learning Architectures
• EfficientDet: Scalable and Efficient Object Detection
– Tan et al., CVPR, IEEE, 2020
• Scaled-YOLOv4: Scaling Cross Stage Partial Network
– Wang et al., CVPR, IEEE, 2021
• YOLOv5: Leading Edge Artificial Intelligence Solutions
– Jocher et al., Ultralytics Company, 2020
These architectures exhibit either low inference speed or low detection accuracy because of their inherent limitations
7
Our Contribution
We propose a novel low-resource DL architecture (DhakaNet) for faster and more accurate vehicle detection in street-view traffic images captured by on-road cameras
8
mCSP: Our Proposed Backbone Network
Modified Cross-Stage Partial Networks
• Increases the accuracy
• Increases the inference speed
Diagram panels: at the beginning layers; at the later (deeper) layers
9
mPANet: Our Proposed Neck Network
Modified Path Aggregation Network
• Increases the accuracy using a small overhead
• Exactly ONE extra connection is possible
10
MSAM: Our Proposed Plugin Module
Multi-Scale Attention Module
• Fuses local features within the same layer
• Identifies meaningful features
• Localizes meaningful features
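The "fuses local features within the same layer" step is what the speaker notes call the spatial pyramid pooling (SPP) block. The pure-Python sketch below illustrates the general SPP idea only: stride-1 max pooling at several window sizes, concatenated with the input along the channel axis. The kernel sizes (5, 9, 13) are an assumption borrowed from common YOLO-style SPP blocks, and the convolutions that normally surround the pooling are omitted; the slides do not give DhakaNet's actual settings.

```python
def max_pool_same(channel, k):
    """Stride-1 max pooling over a k x k window on one 2D channel.

    Cells outside the border are ignored, which is equivalent to padding
    with negative infinity, so the output keeps the input's height and width.
    """
    h, w = len(channel), len(channel[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                channel[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp(feature_map, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with their pooled versions.

    feature_map: list of 2D channels (lists of rows).
    Returns len(feature_map) * (1 + len(kernel_sizes)) channels.
    """
    groups = [feature_map]
    for k in kernel_sizes:
        groups.append([max_pool_same(ch, k) for ch in feature_map])
    return [ch for group in groups for ch in group]


fm = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # one 3 x 3 channel
fused = spp(fm, kernel_sizes=(3,))
print(fused[1])  # pooled copy of the channel: [[5, 6, 6], [8, 9, 9], [8, 9, 9]]
```

Because every pooled copy has the same spatial size as the input, features from small and large neighborhoods can be stacked and mixed by the next layer.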
11
DhakaNet: Our Proposed Architecture
Scaling Factor = 0.29
• Modified CSP module
• Multi-Scale Attention module
12
12
Experimental Setup
• Ground truth
– Box label
• Data augmentation during training
– HSV, translation, mosaic, and horizontal flip
• Training configuration
– Input size: 768 × 768 × 3
– Training : Validation = 0.75 : 0.25
– Other configurations follow Jocher et al., 2020
• Training hardware: GeForce GTX 1070, 8 GB memory (training purpose only)
• Testing hardware: Raspberry Pi 4 Model B, ARMv7 processor, 4 GB RAM (testing purpose only)
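As an illustration of how one of the listed augmentations interacts with box labels, here is a minimal horizontal-flip sketch. It assumes labels are stored in the normalized (class_id, x_center, y_center, width, height) convention used by YOLO-style tooling; the slides say only "box label", so the format and the function names are assumptions.

```python
def hflip_image(image):
    """Mirror an image given as a list of pixel rows."""
    return [row[::-1] for row in image]


def hflip_boxes(boxes):
    """Mirror normalized (class_id, x_center, y_center, width, height) boxes.

    Only x_center changes; width, height, and y_center are unaffected.
    """
    return [(c, 1.0 - x, y, w, h) for c, x, y, w, h in boxes]


image = [[1, 2, 3], [4, 5, 6]]
boxes = [(0, 0.25, 0.5, 0.2, 0.4)]

flipped_image = hflip_image(image)
flipped_boxes = hflip_boxes(boxes)
print(flipped_image)  # [[3, 2, 1], [6, 5, 4]]
print(flipped_boxes)  # [(0, 0.75, 0.5, 0.2, 0.4)]
```

Flipping twice returns the original image and labels, which is a quick sanity check for any augmentation that has to keep pixels and boxes in sync.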
13
Datasets Used for Performance Evaluation
Attribute            | DhakaAI      | IITM-HeTra-A   | IITM-HeTra-B
Traffic              | Unstructured | Unstructured   | Unstructured
Location             | Dhaka, BD    | Chennai, India | Chennai, India
Training : Testing   | 3000 : 500   | 1201 : 216     | 1201 : 216
# of object classes  | 21           | 3              | 4
Sources: DhakaAI dataset (Shihavuddin et al., 2020); IITM-HeTra datasets (A and B) (Mittal et al., 2018)
14
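Evaluation Results on DhakaAI Dataset
[Figure: accuracy versus inference speed on the DhakaAI dataset; one region is marked "Not applicable"]
DhakaNet achieves 50% faster inference speed, or 13% higher accuracy, compared to the existing architecture (YOLOv5-small)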
15
Evaluation Results on IITM-HeTra Datasets
[Figure: two panels, IITM-HeTra-A dataset and IITM-HeTra-B dataset; each has a region marked "Not applicable"]
DhakaNet achieves 51% faster inference speed and similar accuracy on both IITM-HeTra datasets compared to the existing architecture (YOLOv5-small)
16
Final Output of DhakaNet
17
Conclusion and Future Work
• Existing low-resource DL architectures exhibit either low inference speed or low detection accuracy because of their inherent limitations
• We propose a new architecture for embedded systems named DhakaNet
– Delivers up to 51% faster or 13% more accurate detection over street-view traffic images
• In the future, we plan to develop a traffic signal optimization module for a coordinated and adaptive traffic signal system for Dhaka and similar cities
18
Thank You
Questions are welcome!
Email: toha@tmdm.butex.edu.bd

Editor's Notes

  • #1: Assalamualaikum, hello everyone. I am Tarik Reza Toha from Bangladesh University of Engineering and Technology. Today I am going to present my paper titled “DhakaNet: Unstructured Vehicle Detection using Limited Computational Resources”. The other authors are Masfiqur Rahaman, Saiful Islam Salim, Mainul Hossain, Arif Mohaimin Sadri, and A. B. M. Alim Al Islam. The fifth author is from the University of Oklahoma, and the other authors are from Bangladesh University of Engineering and Technology. This research was conducted under a fellowship from the ICT Division of Bangladesh. Before we start, let’s see the outline of my presentation.
  • #2: At first, we will talk about the background and motivation. Next, we will discuss the proposed approach. After that, we will show the experimental results and findings. Finally, we will conclude our presentation with some future work. Let’s start with the background of our study.
  • #3: Traffic congestion is one of the most serious challenges for the cities of developing countries such as Bangladesh, India, Kenya, etc. According to the World Bank Report, particularly in Dhaka (capital of Bangladesh), the average driving speed is 7 kilometers per hour (kph), which is expected to be only 4 kph (slower than walking speed) by 2035. Moreover, traffic congestion in Dhaka wastes about 3.2 million working hours daily and billions of dollars of the national economy annually. Therefore, it is extremely important to develop some solutions to fix this problem. Accordingly, there exist some solutions in the literature. Now, we will see some of the solutions.
  • #4: For example, Lee et al. proposed an adaptive traffic control system. In this approach, we need to capture on-road traffic images and upload them to the cloud for processing tasks such as vehicle detection. However, such cloud-based solutions demand high-speed network connectivity, which is not always available across all road intersections in developing countries. Hence, we need to decentralize our approach.
  • #5: In a decentralized approach, vehicle detection is performed on the embedded platforms and only the vehicle count is uploaded to the cloud. The fundamental difference between this approach and the previous work is that, here, we need to estimate the traffic density on the road, whereas the earlier study performed this on the central server. This approach alleviates the demand for a high-speed network. However, it imposes severe computational constraints (inherited from the embedded platforms) on the learning models. Therefore, we need to work on the deep learning architecture itself to make this sort of solution viable in practice. Accordingly, there exist some studies in the literature that adapt the DL architecture for this purpose. Let me show some of them.
  • #6: Tan et al. proposed a scalable and efficient object detection model named EfficientDet. Besides, Wang et al. proposed Scaled-YOLOv4 by scaling the cross-stage partial network. Next, Jocher et al., from the Ultralytics company, proposed YOLOv5 for real-time object detection. The common drawback of these solutions is that they exhibit either low inference speed or low accuracy because of their inherent limitations. To overcome these limitations, we have proposed DhakaNet in our paper.
  • #7: We propose a novel low-resource deep learning architecture named DhakaNet for faster and more accurate vehicle detection in street-view traffic images, which have been captured from on-road cameras. Next, we will see our proposed methodology.
  • #8: In our backbone network, we propose a modified CSP module that uses a higher number of filters and limits the number of bottlenecks. In the left module, we use one bottleneck, and in the right one, we use no bottleneck. The increased filters increase the accuracy, and the limited bottlenecks increase the inference speed. Next, we will see some existing neck networks along with their limitations.
  • #9: In our neck network, we propose a modified PANet that adds an extra edge from input to output nodes if they are at the same level. It fuses more features without adding much computational cost. Next, we will propose a novel plugin module to increase the accuracy of our low-resource architecture.
  • #10: In a limited-resource network, it is very difficult to learn semantically rich features from the images. Hence, we develop a novel multi-scale attention module to extract multi-scale and meaningful features from the images. This module has three blocks, namely spatial pyramid pooling (SPP), channel attention (CAM), and spatial attention (SAM). The SPP block fuses local features within the same convolutional layer. The CAM block identifies meaningful features, and the SAM block localizes these features in the input images. Using these three blocks, the accuracy increases significantly. So far, we have presented all the improvements made in our architecture in isolation. The impact of each improvement has been studied in our analysis and is presented in the ablation studies later. Now, we will show the combined architecture.
  • #11: In the backbone, we have used modified CSP modules. In the neck, we have used the modified path aggregation network. Besides, we have integrated three multi-scale attention modules prior to the detection layers. The rest of the layers are similar to YOLOv5-small.
  • #12: We train the DhakaNet architecture in a straightforward way. We prepare the ground truth using conventional box labels. For data augmentation, we change the hue-saturation-brightness values of the images. Besides, we use translation, horizontal flip, and mosaic operations. Moreover, we use a 768x768 image size, a 25% validation split, and the SGD optimizer. During object detection, we need to optimize three loss functions: bounding box regression, objectness, and classification. We use complete intersection over union for regression and binary cross-entropy for objectness and classification. Next, we see how these loss functions are used during the training stage.
  • #13: We use three unstructured traffic datasets in our performance evaluation. The first is DhakaAI, which consists of unstructured traffic images of Dhaka city. It has 3,000 training images, 500 test images, and 21 classes of vehicles. The second is IITM-HeTra-A, which contains unstructured traffic images of Chennai city. It has 1,201 training images, 216 test images, and 3 classes. The last is IITM-HeTra-B, which contains the same images as IITM-HeTra-A but has 4 classes of vehicles. Next, we see the performance evaluation metrics.
  • #14: Here, the data points on the right side imply faster detection, i.e., they require fewer computational resources, and the points on the top side imply more accurate detection. The red points are state-of-the-art models such as YOLOv4-tiny and YOLOv5-small. The green boxes are DhakaNet models: the left one is the original DhakaNet and the right one is a down-scaled version of DhakaNet. Here, we can see that YOLOv4-tiny delivers very poor inference speed, so it is not a realistic solution for a decentralized adaptive traffic control system. Compared to YOLOv5-small, DhakaNet delivers 50% faster inference speed, or 13% better accuracy. Next, we will see the evaluation results on the IITM-HeTra datasets.
  • #15: On the IITM-HeTra datasets, we can see that YOLOv4-tiny can be excluded because of its poor inference speed, as before. Compared to YOLOv5-small, DhakaNet delivers 51% faster inference speed and comparable accuracy on both datasets. Next, we will see the ablation studies.
  • #16: Here are the outputs of YOLOv5-small and DhakaNet over four challenging test images of the DhakaAI dataset. We can see that, under severe occlusion, DhakaNet performs much better than YOLOv5-small. Besides, in the night scene, YOLOv5-small misses all vehicles whereas DhakaNet detects several. Next, we will see an extension of our proposed approach.
  • #17: To conclude, existing limited-resource deep learning architectures exhibit either low inference speed or low detection accuracy due to inherent limitations. As a remedy, we propose two novel architectures, namely DhakaNet and DhakaNet-drone, for better real-time vehicle detection in embedded systems. Rigorous performance evaluation of DhakaNet shows up to 51% faster, or 13% more accurate, detection in embedded systems. Besides, DhakaNet-drone achieves up to 50% faster, or 17% more accurate, detection in embedded systems. In the future, we plan to devise an optimization module and implement a coordinated and adaptive traffic signal system in Dhaka city. Thank you.
  • #20: To analyze the effectiveness of the proposed modules of DhakaNet, we conduct an ablation study of DhakaNet over the DhakaAI dataset. For this purpose, we remove all the innovative changes from DhakaNet and consider this version as the baseline model. At every stage, we evaluate the object detection accuracy and the corresponding inference speed, which are presented in the table. Here, every row represents the results for one variant of DhakaNet. From this table, DhakaNet with all three modifications achieves the highest accuracy among all the variants. Note that the speed drops because new modules are added to the DhakaNet architecture. Next, we will see the efficacy of our MSAM module.
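
Note #12 names complete intersection over union (CIoU) for box regression and binary cross-entropy for objectness and classification. The sketch below follows the commonly published CIoU definition, CIoU = IoU − ρ²/c² − αv, for a single pair of boxes. The (x1, y1, x2, y2) box format, the function names, and the epsilon values are assumptions for illustration; this is not the authors' training code.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU: overlap area divided by union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(pred_box, target_box):
    """Regression loss: zero for a perfect match, larger for worse boxes."""
    return 1.0 - ciou(pred_box, target_box)


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


a = (0, 0, 1, 1)
b = (2, 2, 3, 3)
print(ciou_loss(a, a))  # close to 0 for identical boxes
print(ciou_loss(a, b))  # above 1 for disjoint boxes, since CIoU goes negative
```

Unlike plain IoU, the center-distance term still gives a useful signal when the predicted and target boxes do not overlap at all, which is why the disjoint pair above produces a loss greater than 1 rather than saturating at 1.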