International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 03 | Mar 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal
Object Detection and Translation for Blind People Using Deep Learning
Mayuresh Banne, Rahul Vhatkar, Ruchita Tatkare
Department of Information Technology, Vidyalankar Institute of Technology, Wadala
Abstract - Millions of people in India alone are visually impaired. It is therefore essential that visually impaired people be able to recognize the products they use every day, so we built a system that identifies such everyday objects for them. Many papers address assistance for blind people; this paper focuses on helping a blind person with objects of daily use. The system consists of a camera, a speaker and an image-processing unit: it detects an object, converts the result into audio and informs the blind person about that object. The hardware is a box containing a portable camera connected to a processing system. Images are captured by the camera and passed through real-time image recognition built on existing object detection models, and once an object is detected the information is translated into audio.
Key Words: System, Camera, Audio, Image processing, Object.
1. INTRODUCTION
Millions of people in this world cannot see their environment due to visual impairment. Although they develop alternative approaches to deal with daily routines, there are certain objects they simply cannot identify by touch alone. A variety of image processing and machine learning techniques have been applied to this problem, including matrix factorization, dictionary learning and, most recently, mask region convolutional neural networks (Mask RCNN). Mask RCNN is, in principle, very well suited to object detection and recognition. First, it generates proposals for regions where an object might be, based on the input image. Second, it predicts the class-id of the object, defines a bounding box and generates a pixel-level mask of the object for each first-stage proposal. Mask-RCNN extends Faster RCNN: Faster RCNN outputs a class label and a bounding-box offset for each object, while Mask-RCNN adds one more output, the object mask. Because the mask output is distinct from the class and box outputs, it requires a much finer spatial layout of the object [1], and Mask RCNN therefore includes pixel-to-pixel alignment that is not present in Faster RCNN.
A possible explanation for the limited exploration of CNNs in this setting, and for the difficulty of improving on simpler models, is the relative scarcity of labelled data for object detection.
2. LITERATURE SURVEY
2.1 Prof. Seema Udgirkar, Shivaji Sarokar, Sujit Gore, Dinesh Kakuste, Suraj Chaskar, “Object Detection System for
Blind People”.
In this paper the authors propose a smart vision system whose objective is to let the user move anywhere in the environment through a user-friendly interface; the project mainly focuses on the computer vision module. The authors built a system that finds obstacles near the user's head, especially while passing through a door; in short, it is designed to protect the head from injury. The product navigates a blind person in any environment, guiding the user towards an object and providing information about each obstacle using a buzzer and a vibrator as the two output modes. A user-control switch allows the user to choose the mode of operation. There are two modes, buzzer mode and vibration mode, both provided as outputs for the blind person, because the user might not be comfortable with one of them: the vibration motor can be irritating, and if there is a lot of noise in the surroundings the buzzer cannot be used because the user cannot hear it. The sensor control decides when to take a measurement, receives the output from the sensor and normalizes it to a control value. The sensor is mounted on a stepper motor that continuously sweeps through 90 degrees, dividing the scene into three portions (left, right and central) for obstacle detection. So when a blind person is walking and an obstacle appears, it is sensed by the sensor and reported through one of the two outputs [2]. From this paper we can reuse the image processing technique: since that project uses a camera to detect an obstacle, we can use the same technique in our object detection.
2.2. Amira S. Mahmoud, Sayed A. Mohamed, Reda A. El-Khoribi, Hisham M. AbdelSalam, "Object Detection Using Adaptive Mask RCNN in Optical Remote Sensing Images".
In this paper the authors use a mask region-based convolutional network (Mask-RCNN) for multi-class object detection in optical remote sensing images. Transfer learning, fine tuning and data augmentation are used to overcome
object scale variability and object density, and the adaptive Mask-RCNN is compared against other deep object detection methods. Mask-RCNN is an extended version of Faster RCNN that allows accurate pixel-based segmentation; it consists of two stages built on a feature pyramid network (FPN) and a region proposal network (RPN), and based on the input image a number of region proposals are generated. A standard convolutional neural network is used as the feature extractor, originally with the AlexNet and VGGNet architectures. Those networks suffer from the vanishing gradient problem, which leads to performance saturation and degradation, so the ResNet50 architecture was introduced: it adds skip connections, or shortcuts, that take the activation from one layer and feed it to a later layer. ResNet50 is a seminal architecture used across computer vision applications; here a version pre-trained on ImageNet (1000 classes) is used, which stays small because it relies on global average pooling rather than fully connected layers. The FPN extracts regions of interest at different levels and feeds them to the RPN. The RPN scans the feature map and predicts whether an object is present, which makes the pipeline much faster. Each region of interest proposed by the RPN is then given a classification and a bounding box, and Mask-RCNN adds a new branch whose output indicates whether each pixel is part of an object or not [3]. From this paper we can adopt Mask-RCNN for our object detection, since it is fast compared with other deep learning detectors and provides more accurate information thanks to the features it adds over Faster RCNN.
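To make the skip-connection idea concrete, the following is a minimal Keras sketch of a residual block. It is only an illustration of the shortcut described above, not code from the cited paper, and it assumes the input already has the same number of channels as the convolutions so the addition is valid.

    import tensorflow as tf
    from tensorflow.keras import layers

    def residual_block(x, filters):
        # Shortcut: keep the incoming activation untouched.
        shortcut = x
        # Two stacked convolutions form the main path of the block.
        y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        y = layers.Conv2D(filters, 3, padding="same")(y)
        # Skip connection: add the input back onto the transformed features.
        # This assumes x already has `filters` channels; otherwise a 1x1
        # convolution on the shortcut would be needed to match shapes.
        y = layers.Add()([shortcut, y])
        return layers.Activation("relu")(y)

    # Tiny usage example on a dummy feature map.
    inputs = tf.keras.Input(shape=(64, 64, 32))
    outputs = residual_block(inputs, filters=32)
    model = tf.keras.Model(inputs, outputs)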
2.3. N. Saranya, M. Nandinipriya, U. Priya, “Real Time Object Detection for Blind People”.
In this paper the authors describe detecting objects in an image and representing each one by its name in speech. The system also helps blind people with localization by encoding the audio into two-channel audio using 3D binaural sound. A video is captured with a portable camera device on the client side and streamed to a server for real-time image recognition with object detection, meaning the same object is identified and followed across a sequence of video frames. Because the video may contain noise, a noise reduction technique is applied to the frames to improve image quality, and object extraction detects the object based on the colour of the moving frame. Different features are then extracted from the frame; every object has characteristic shape-based features, and using them a rectangular bounding box and a centroid are plotted, and the position of the centroid is stored with the bounding box. A data streaming pipeline is built to enable quick communication: a raw image taken from the camera is encoded into a string and sent from the client to the server. The server sends the information directly to the Unity sound generator, which plays the binaural sound through a wireless audio device. This paper is useful for the sound conversion part, which helps a blind person in object detection.
From past work and the existing approaches, the following drawbacks have been noted:
1. The systems are not cost effective.
2. Lack of user acceptance.
3. High chance of the system breaking due to hardware failure.
Taking all of these drawbacks into account, we have formulated a proposed system which addresses them:
1. A low-cost system is used.
2. No special external hardware is needed.
3. A software-based system leads to a low chance of complete system failure.
3. FLOWCHART
As shown in the figure, an image of the object is taken from the camera and processed through the following steps:
Step 1. Place the object inside the box.
Step 2. The object is scanned by the camera placed inside the box.
Step 3. If the object is not detected, change its position until the camera detects it.
Step 4. Once the object is detected by the camera, extract region proposals using an algorithm such as Selective Search.
Step 5. Detection identifies the class-id attribute of the detected object, i.e. which class it belongs to.
Step 6. After the object frame is detected, a score is generated and a bounding box is created.
Step 7. The pyttsx3 library is used to convert text into speech: the class name identified by Mask RCNN is converted to a voice output.
Step 8. pyttsx3 is used for text-to-speech conversion in Python. Unlike some other libraries it works offline, and it is compatible with both Python 2 and Python 3.
Step 9. pyttsx3 initializes the engine; after that we can set the speaking rate and volume level and query the details of the current voice.
Step 10. engine.say() is used to output the information about the currently detected object (a short text-to-speech sketch follows these steps).
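A minimal sketch of the text-to-speech step in Steps 8-10, using the pyttsx3 API; the spoken string "bottle" is only a placeholder for whatever class name Mask RCNN reports.

    import pyttsx3

    # Initialize the offline text-to-speech engine (Step 9).
    engine = pyttsx3.init()

    # Optional tuning: speaking rate (words per minute) and volume (0.0 to 1.0).
    engine.setProperty("rate", 150)
    engine.setProperty("volume", 1.0)

    # Query the details of the currently available voices if needed.
    voices = engine.getProperty("voices")

    # Speak the detected class name (Step 10); "bottle" is a placeholder here.
    detected_class = "bottle"
    engine.say("Detected object is " + detected_class)
    engine.runAndWait()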
4. IMPLEMENTATION
In this section the implementation of the Blind-Aid application is described. Section A describes the actual methodology used, with respect to all the modules. Section B shows snapshots of the application, with descriptions presenting the implemented application in detail.
[A] Methodology Used:
We use a system into which a blind person can place an object; after processing, a voice is generated through which the blind person can identify that object. The system uses deep learning: the captured image of an object is converted into speech, which makes it easy for a blind person to understand what the object is. Image processing works as follows: an image taken from the camera is given as input to Mask-RCNN, which first generates region proposals for the object in the given input image, then predicts the class of the object, defines the bounding box and generates a pixel-level mask of the object based on the first-stage proposals. With the reference images below we can describe object detection and translation.
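As a hedged illustration of this detection step, the sketch below assumes the Matterport Mask_RCNN implementation with COCO-pretrained weights (mask_rcnn_coco.h5) and a hypothetical image file name; the paper does not name its exact implementation, but the rois/masks/class_ids/scores fields shown in the screenshots match this library's output format.

    import cv2
    import mrcnn.model as modellib
    from mrcnn.config import Config

    class InferenceConfig(Config):
        # Minimal inference configuration; 80 COCO classes plus background.
        NAME = "blind_aid"
        NUM_CLASSES = 1 + 80
        GPU_COUNT = 1
        IMAGES_PER_GPU = 1

    # Build the model in inference mode and load pretrained COCO weights
    # (assumed to be downloaded separately as mask_rcnn_coco.h5).
    config = InferenceConfig()
    model = modellib.MaskRCNN(mode="inference", config=config, model_dir="logs")
    model.load_weights("mask_rcnn_coco.h5", by_name=True)

    # Read the image captured inside the box (hypothetical file name) and run
    # both Mask R-CNN stages: region proposals, then class id, box and mask.
    image = cv2.cvtColor(cv2.imread("captured_object.jpg"), cv2.COLOR_BGR2RGB)
    results = model.detect([image], verbose=0)
    r = results[0]   # dict with "rois", "class_ids", "scores", "masks"

    # In the full system each class_id is mapped to its COCO class name
    # (e.g. "bottle") before being passed to the text-to-speech step.
    for class_id, score in zip(r["class_ids"], r["scores"]):
        print(class_id, score)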
[B] Screenshots:
System:
The figure above shows the system: it consists of a box with a camera mounted at the top; when an object is placed in the box, the camera detects it.
Process:
The figure above shows how object detection and translation work: OpenCV's VideoCapture() captures an image of the object, the resulting frame is passed through Mask-RCNN, which returns rois, masks, class_ids and scores, and the object is detected and masked. Using the pyttsx3 library we then translate the class_ids into speech with engine.say().
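A short sketch of this capture-and-speak flow, using OpenCV's VideoCapture and pyttsx3 as described above; the detection step itself is left as a placeholder here, since it is shown in the earlier Mask-RCNN sketch.

    import cv2
    import pyttsx3

    cap = cv2.VideoCapture(0)        # camera mounted inside the box
    engine = pyttsx3.init()          # offline text-to-speech engine

    ret, frame = cap.read()          # grab one frame of the placed object
    if ret:
        # Placeholder: in the real pipeline the frame is passed through
        # Mask-RCNN here, and the returned class_ids are mapped to names.
        detected_names = ["bottle"]
        for name in detected_names:
            engine.say("Detected object is " + name)
        engine.runAndWait()

    cap.release()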
Detected Object:
This is an image of a detected object, in this case a bottle, together with its score. The score can range from 0 to 1; the higher the score, the more confident the detection.
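To show how the score is typically used, the following is a small sketch that keeps only detections above a confidence threshold; the 0.7 cut-off is an assumed value, not one stated in the paper.

    def filter_detections(result, threshold=0.7):
        # result is the Mask-RCNN output dict with "rois", "class_ids",
        # "scores" and "masks"; the 0.7 threshold is an assumption.
        keep = result["scores"] >= threshold
        return {
            "rois": result["rois"][keep],
            "class_ids": result["class_ids"][keep],
            "scores": result["scores"][keep],
            # Masks are stacked along the last axis in this output format.
            "masks": result["masks"][..., keep],
        }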
5. FUTURE SCOPE
In future, the system will detect more of the objects that are used daily. Its accuracy will be increased to get better results, and it will also handle complex shapes that are difficult for a blind person to identify. Since this project currently uses a stationary system, we can overcome this limitation by using a smartphone: a smartphone is portable, so it is easy to carry.
6. CONCLUSION
The project entitled "Object Detection and Translation for Blind People Using Deep Learning" has been developed and satisfies all the proposed requirements. The system is highly usable and user friendly, and all the system objectives have been met. All phases of development were carried out according to the chosen methodology. The application executes successfully and fulfils the objectives of the project. Further extensions to this application can be made as required with minor modifications.
ACKNOWLEDGEMENT
We are pleased to present "Object Detection and Translation for Blind People Using Deep Learning" as our project and take this opportunity to express our profound gratitude to all those people who helped us in the completion of this project.
We thank our college for providing us with excellent facilities that helped us to complete and present this project. We would
also like to thank the staff members and lab assistants for permitting us to use computers in the lab as and when required.
We express our deepest gratitude towards our project guide Prof. Ichhanshu Jaiswal for his valuable and timely advice during the various phases of our project. We would also like to thank him for providing us with all proper facilities and support as the project coordinator, and for his support, patience and faith in our capabilities and for giving us flexibility in terms of working and reporting schedules.
Finally, we would like to thank everyone who has helped us directly or indirectly in our project.
REFERENCES
[1] K. He, G. Gkioxari, P. Dollár, R. Girshick, "Mask R-CNN", Facebook AI Research (FAIR).
[2] Prof. Seema Udgirkar, Shivaji Sarokar, Sujit Gore, Dinesh Kakuste, Suraj Chaskar, "Object Detection System for Blind People".
[3] Amira S. Mahmoud, Sayed A. Mohamed, Reda A. El-Khoribi, Hisham M. AbdelSalam, "Object Detection Using Adaptive Mask RCNN in Optical Remote Sensing Images".
[4] N. Saranya, M. Nandinipriya, U. Priya, "Real Time Object Detection for Blind People".
[5] Y. C. Wong, J. A. Lai, S. S. S. Ranjit, A. R. Syafeeza, N. A. Hamid, "Convolutional Neural Network for Object Detection System for Blind People".
[6] Shuihua Wang, "Object Detection and Recognition for Visually Impaired People".
[7] Xiang Zhang, "Simple Understanding of Mask RCNN", medium.com/@alittlepain833/simple-understanding-of-mask-rcnn.
[8] Hussain Rangoonwala, Vishal Kaushik, P Mohith, Dhanalakshmi Samiappan, "Text to Speech Conversion Module".