INDIAN INSTITUTE OF TECHNOLOGY ROORKEE
CEN-499 Training Seminar
Under the Supervision of:
Prof. Sanhita Das
Submitted By:
Arpana Sharma (21113026)
Group 1 (M2)
B.Tech 4th Year, Civil Engineering
Vehicle Detection and Identification for
Autonomous Driving Applications
Table of Contents
1 Introduction
2 Data Collection
3 Methodology Overview
4 Data Annotation and Pre-processing
5 Model Description
6 Model Training
7 Vehicle Detection and Classification
8 Post-Processing
9 Result and Analysis
10 Conclusion
11 Future Enhancements
Introduction
Advanced Driver Assistance Systems (ADAS) play a crucial role in enhancing the safety and efficiency
of autonomous vehicles by providing drivers with essential information. The main objective of this
project is to achieve real-time vehicle detection and identification using advanced
computer vision models, namely YOLO v8 Nano and the Segment Anything Model (SAM).
Data Collection
Data collection was conducted over a 115 km stretch between Greater Noida and New Delhi.
A Video VBOX system equipped with 4 cameras and GPS was used. This setup ensured accurate
spatial and temporal alignment of the video frames, providing high-quality data.
The dataset comprises 5,200 images at 720×576 pixel resolution, annotated with approximately 18,000 objects.
Methodology Overview
Data Annotation and Preprocessing: Annotated images using Roboflow, including bounding
boxes and class labels. Augmented data to improve model robustness.
Model Training: Trained the YOLO v8 Nano model with the annotated dataset. Configured
for high accuracy and real-time performance.
Vehicle Detection: Detected vehicles in video frames using YOLO v8 Nano. Applied non-max
suppression to handle overlapping detections.
Segmentation: Utilized the Segment Anything Model (SAM) to segment vehicles from the
background and refine vehicle outlines.
Edge Detection: Applied edge detection algorithms (e.g., Canny and Sobel) to the
segmented images to extract and visualize vehicle contours.
Data Annotation and Pre-Processing
Data Annotation: Images were annotated using Roboflow, focusing on identifying different
types of vehicles, such as cars, trucks, buses, and motorcycles.
Data augmentation: Images underwent transformations such as rotation, flipping, scaling,
and brightness adjustment.
Data preprocessing: Images were normalized for consistent input. The dataset was split into
training (85%) and validation (15%) sets to evaluate model performance.
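As a rough sketch of the augmentation and split steps described above, the following self-contained Python uses a tiny intensity grid in place of a real image; all function names and values are illustrative, not taken from the project's actual pipeline.

```python
import random

def horizontal_flip(img):
    # Mirror each row of the image left-to-right.
    return [row[::-1] for row in img]

def adjust_brightness(img, factor):
    # Scale pixel intensities and clip to the valid 0-255 range.
    return [[min(255, int(p * factor)) for p in row] for row in img]

def train_val_split(items, train_frac=0.85, seed=42):
    # Shuffle deterministically, then cut at the 85% mark.
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]

image = [[10, 200], [30, 40]]
flipped = horizontal_flip(image)          # [[200, 10], [40, 30]]
brighter = adjust_brightness(image, 1.5)  # [[15, 255], [45, 60]]
train, val = train_val_split(range(100))  # 85 training / 15 validation items
```

In practice these operations would be applied by the annotation or training toolchain rather than hand-rolled, but the logic is the same.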
Model Description: YOLO v8 Nano
Key Features:
Real-Time Detection: Enables rapid processing of images for immediate object detection,
essential for dynamic environments like autonomous driving.
Improved Accuracy: Utilizes advanced techniques for better precision in detecting and
classifying vehicles.
Anchor-Free Design: Simplifies the model by eliminating predefined anchor boxes,
improving detection of varying object sizes and shapes.
Technical Aspects:
CIoU Loss Function: Enhances bounding box prediction by accounting for overlap, distance
between box centers, and aspect ratio:

L_CIoU = 1 − IoU + ρ²(b, b^gt) / c² + αv

where IoU is the Intersection over Union, ρ²(b, b^gt) is the squared Euclidean distance between
the predicted and ground-truth box centers, c is the diagonal length of the smallest box enclosing
both, v measures aspect-ratio consistency, and α is its trade-off weight.
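A minimal numerical sketch of the CIoU loss for axis-aligned boxes given as (x1, y1, x2, y2) follows; the function names are illustrative, and a real training loop would use the framework's built-in loss.

```python
import math

def iou(a, b):
    # Intersection-over-Union of two axis-aligned boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def ciou_loss(pred, gt):
    i = iou(pred, gt)
    # Squared distance between box centers (the rho^2 term).
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    # Squared diagonal of the smallest box enclosing both (the c term).
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term v and its weight alpha.
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / (1 - i + v + 1e-9)
    return 1 - i + rho2 / c2 + alpha * v

# A perfectly matched box gives (near-)zero loss; any offset increases it.
assert abs(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))) < 1e-6
assert ciou_loss((0, 0, 10, 10), (2, 2, 12, 12)) > 0
```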
Model Description: Segment Anything Model (SAM)
Versatile Segmentation: Designed to segment any object within an image, offering flexibility
in various applications, including vehicle identification.
Transformer-Based Architecture: Utilizes advanced Transformer vision models for high-
performance segmentation, enabling precise object delineation in complex scenes.
Prompt Engineering: Adapts the concept of prompts from NLP to image segmentation,
allowing SAM to effectively process and segment objects based on provided bounding boxes.
Technical Aspects:
Image Encoder: Converts input images into feature representations, enabling the model to
analyze and interpret visual data accurately.
Prompt Encoder: Integrates additional context from bounding boxes or other prompts to
refine segmentation results.
Mask Decoder: Generates binary masks to isolate objects from the background, providing
detailed segmentation of vehicles.
Model Training
Hyperparameter Configuration: Key parameters, such as learning rate, batch size (32), and
number of epochs (150), were set to optimize model performance.
Feature Extraction: The model utilizes convolutional layers to extract features from images.
The feature map F is represented as:

F = σ(W * X + b)

where W is the weight matrix, X is the input, b is the bias, and σ is the activation function.
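As a tiny numerical illustration of this feature-map computation, the sketch below evaluates a single convolution-style unit on a flattened 2×2 patch, with ReLU standing in for the activation function; all values are made up for illustration.

```python
def relu(z):
    # Rectified linear unit: a common choice of activation function.
    return max(0.0, z)

def feature_unit(W, x, b):
    # Dot product of the weight vector with an input patch, plus bias,
    # passed through the activation function: sigma(W . x + b).
    return relu(sum(w * xi for w, xi in zip(W, x)) + b)

W = [0.5, -0.25, 0.1, 0.0]  # weights for one convolutional filter
x = [4.0, 8.0, 2.0, 7.0]    # a 2x2 input patch, flattened
b = 0.3                     # bias
print(feature_unit(W, x, b))  # 0.5
```

A full convolutional layer simply repeats this computation for every filter at every spatial position of the image.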
Training Process:
GPU Acceleration: Training was performed in a GPU-enabled environment to expedite the
process and handle large-scale data efficiently.
Monitoring: Training progress was monitored using metrics such as loss and accuracy to
ensure effective learning and avoid overfitting.
Roboflow Integration: Facilitated dataset management and augmentation, allowing
experimentation with different configurations to improve model robustness.
Vehicle Detection and Classification
Input Frames: Video frames are fed into the YOLO v8 Nano model for analysis. Each frame is
processed individually to identify and classify vehicles.
Vehicle Detection: YOLO v8 Nano detects vehicles by drawing bounding boxes around them.
The model processes visual features to determine the presence and location of each vehicle.
Classification: Detected vehicles are classified into predefined categories (e.g., cars, trucks,
buses) based on visual characteristics and model training.
Non-Max Suppression: Redundant bounding boxes are removed to ensure that each vehicle
is detected only once, minimizing false positives.
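The non-max suppression step described above can be sketched as follows: keep the highest-scoring box, then discard any remaining box that overlaps it beyond an IoU threshold. This is a generic illustration, not the model's internal implementation.

```python
def iou(a, b):
    # Intersection-over-Union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.5):
    # Process boxes in descending score order; suppress heavy overlaps.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] -- the near-duplicate box 1 is suppressed
```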
Post-Processing
Masking and Segmentation: Binary masks are created for each detected vehicle using the
Segment Anything Model (SAM). This step isolates vehicles from the background, allowing
for precise segmentation.
Edge Detection: After segmentation, edge detection algorithms, such as Canny and Sobel,
are applied to highlight vehicle contours and boundaries. This enhances the visual
representation of vehicle edges.
Canny Edge Detection: Identifies edges based on intensity gradients, providing sharp and
well-defined boundaries.
Sobel Operator: Uses gradient calculation to detect edges in both x and y directions,
enhancing edge detail.
Polygonal Representation: The segmented vehicles are outlined with polygon borders, and
edge points are identified. This step refines the object contours for further analysis and
integration.
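A minimal pure-Python sketch of the Sobel step described above is shown below, applied to a tiny grayscale grid with a vertical step edge; a real pipeline would use an image-processing library, and the grid values here are illustrative.

```python
# Standard Sobel kernels for horizontal and vertical gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    # Convolve both kernels over interior pixels and combine the
    # gradients into an edge-strength map: sqrt(gx^2 + gy^2).
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(KX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(KY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge between dark (0) and bright (255) columns.
img = [[0, 0, 255, 255]] * 4
mag = sobel_magnitude(img)  # strong response (1020.0) along the step edge
```

Canny builds on the same gradients, adding noise smoothing, non-maximum suppression along the gradient direction, and hysteresis thresholding to produce thin, connected edges.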
Result and Analysis
1. Detection Accuracy
• Overview:
• The YOLOv8 Nano model was evaluated based on its ability to accurately detect and
classify vehicles in real-time.
• Performance metrics such as precision, recall, and F1-score were used to quantify
detection accuracy.
• Key Metrics:
• Precision: High precision indicates that the model accurately identifies vehicles
without many false positives.
• Recall: High recall suggests the model successfully detects most vehicles,
minimizing false negatives.
• F1-Score: Balances precision and recall, providing a single measure of accuracy.
• Performance Summary:
• Precision: 89.5%
• Recall: 92.3%
• F1-Score: 90.9%
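The three metrics above follow directly from true-positive, false-positive, and false-negative counts; the sketch below shows the formulas with illustrative counts, not the project's actual numbers.

```python
def detection_metrics(tp, fp, fn):
    # Precision: fraction of predicted vehicles that are real.
    precision = tp / (tp + fp)
    # Recall: fraction of real vehicles that were detected.
    recall = tp / (tp + fn)
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = detection_metrics(tp=900, fp=100, fn=50)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.9 0.947 0.923
```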
Result and Analysis
2. Confusion Matrix Analysis
• Purpose:
• The confusion matrix illustrates the model’s classification performance by showing
the number of true positives, false positives, true negatives, and false negatives.
• Observations:
• True Positives (TP): High number, indicating successful vehicle detections.
• False Positives (FP): Relatively low, reflecting the model's ability to avoid
misclassifying non-vehicles.
• True Negatives (TN): Not applicable in this context as the focus is on vehicle
detection.
• False Negatives (FN): Low, but still present, indicating some vehicles were missed.
• Impact on Results:
• A small number of false negatives suggests that while the model is generally
reliable, there may be scenarios where it misses certain vehicles, particularly in
challenging lighting or weather conditions.
Result and Analysis
3. Precision-Recall Curve Analysis
• Precision-Recall Curve:
• The curve shows the trade-off between precision and recall for different thresholds.
• A sharp drop in precision at lower recall values may indicate that the model
struggles with detection under certain conditions.
• Analysis:
• The area under the Precision-Recall curve (PR AUC) is high, confirming strong
overall performance.
• Recall-Confidence Curve: Shows how recall changes with varying confidence
thresholds.
• Precision-Confidence Curve: Highlights the model’s precision across different
confidence levels.
• Conclusions:
• High precision and recall indicate the model performs well across various traffic
scenarios.
• There is a trade-off at extreme thresholds where either false positives or false
negatives may increase, requiring careful threshold tuning.
Result and Analysis
4. Post-Processing Results
• Edge Detection Analysis:
• Post-processing with edge detection using Canny and Sobel algorithms helped
refine vehicle contours and boundaries.
• Edge detection was particularly effective in distinguishing closely packed vehicles
and overlapping objects.
• Masking and Segmentation:
• The SAM model provided clear segmentation, accurately isolating vehicles from the
background.
• Masking was effective in improving detection confidence, especially in cluttered
environments.
• Visual Examples:
• Detection and Segmentation: Display images of detected and segmented vehicles.
• Edge Detection: Show how edge detection enhances the visual identification of
vehicle contours.
Conclusion
• Project Achievements:
• Successfully implemented real-time vehicle detection and identification using YOLO v8 Nano and the
Segment Anything Model (SAM).
• Demonstrated high accuracy in detecting and classifying various vehicle types in video frames.
• Key Findings:
• YOLO v8 Nano: Provided rapid and precise vehicle detection with improved accuracy and an anchor-
free design, optimizing performance for real-time applications.
• SAM: Delivered effective segmentation of vehicles, enhancing the clarity of vehicle boundaries
through advanced Transformer-based architecture and prompt engineering.
• Performance Highlights:
• Achieved effective real-time processing with minimal latency.
• High segmentation quality with accurate isolation of vehicles from the background.
• Challenges:
• Identified areas for improvement, including handling occlusions and varying lighting conditions.
Future Enhancements
• Model Optimization:
• Further refinement is needed to address challenges such as class imbalances and improve detection
of smaller vehicles. Continued model training with a more diverse dataset can enhance
performance.
• Integration of Additional Sensors:
• Incorporating data from other sensors, such as LIDAR, could provide a more comprehensive
understanding of the environment, improving detection accuracy and robustness.
• Exploration of Enhanced Techniques:
• Investigate advanced data augmentation methods and explore more sophisticated model
architectures to further boost detection capabilities and system efficiency.
References
• Cao, X.; Wu, C.; Yan, P.; Li, X. Linear SVM classification using boosting HOG features for vehicle
detection in low-altitude airborne videos. In Proceedings of the 2011 IEEE International Conference on
Image Processing (ICIP), Brussels, Belgium, 11–14 September 2011; pp. 2421–2424.
• Laopracha, N.; Sunat, K. Comparative Study of Computational Time that HOG-Based Features Used
for Vehicle Detection. In Proceedings of the International Conference on Computing and Information
Technology, Helsinki, Finland, 21–23 August 2017; pp. 275–284.
• Siam, M.; Elhelw, M. Robust autonomous visual detection and tracking of moving targets in UAV
imagery. In Proceedings of the IEEE 9th International Conference on Signal Processing, Beijing,
China, 21–25 October 2012; Volume 2, pp. 1060–1066.
Thanks