OCR Based Speech Synthesis
Bharat Thakur
Electrical & Electronics Engineering
Panjab University, Chandigarh
Bharat.puchd@gmail.com
Introduction
• Speech is a more efficient and effective mode of communication than text. This project work discusses an OCR based speech synthesis system built with LabVIEW 2013.
• The OCR application is developed with IMAQ Vision for LabVIEW, a software development tool, and it uses a commercial digital camera from any Android phone as the image acquisition device.
• The whole project can be divided into 2 parts:
1. Optical character recognition
2. Text to speech conversion
Components of OCR System
• The identity of each symbol is found by comparing the extracted features with descriptions of
the symbol classes obtained through a previous learning phase.
• Finally, contextual information is used to reconstruct the words and numbers of the original
text.
Fig 20. components of OCR system
System Analysis
An OCR based Speech Synthesis System is a computer-based system that reads text and gives voice output once the text has been scanned and submitted to an Optical Character Recognition (OCR) system.
Hardware Required:
1. Camera
2. Laptop
3. Speaker
Software Platform:
1. NI LabVIEW
2. NI Vision Assistant
Image Acquisition
• The image is captured with the digital camera of a Redmi Note 3 Android phone.
• The images are transmitted wirelessly to the processor by an Android app named “IP Webcam”, which streams over Internet Protocol at the IP address shown inside the app.
Fig 21. Block Diagram & Front Panel for Image Acquisition
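Outside LabVIEW, the acquisition step could be sketched in Python roughly as follows. This is an illustration only: the port 8080 and the `/shot.jpg` single-frame path are assumptions about how the IP Webcam app is configured, and the helper names are hypothetical. Use the address that the app displays.

```python
from urllib.request import urlopen


def snapshot_url(host: str, port: int = 8080, path: str = "/shot.jpg") -> str:
    """Build the URL of a single-frame endpoint (port and path are assumptions)."""
    return f"http://{host}:{port}{path}"


def fetch_snapshot(url: str, timeout: float = 5.0) -> bytes:
    """Download one encoded frame (for example JPEG bytes) from the phone."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

With the phone and the laptop on the same network, `fetch_snapshot(snapshot_url("192.168.1.5"))` would return the bytes of one frame; the address here is a placeholder.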
Binarization
Binarization is the process of converting a grayscale image (pixel values 0 to 255) into a binary image (pixel values 0 or 1) using a threshold value of 175. Pixels lighter than the threshold are turned white and the remainder black.
Fig 22. Binarization
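The thresholding rule can be written out in a few lines of Python. This is a minimal sketch and not the NI Vision implementation; it assumes the image is a list of rows of integers, and that a pixel exactly equal to the threshold counts as black.

```python
THRESHOLD = 175  # threshold value stated in the slides


def binarize(gray, threshold=THRESHOLD):
    """Map a grayscale image (rows of 0-255 ints) to a binary image.

    Pixels lighter than the threshold become 1 (white); the rest become 0 (black).
    """
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]


gray = [
    [250, 240, 30],
    [176, 175, 174],
]
print(binarize(gray))  # [[1, 1, 0], [1, 0, 0]]
```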
Template Matching
In template matching, the written words in the image are segmented into characters and then compared against a character set file with the extension .abc.
This character set file is created with NI Vision Assistant itself.
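The slides do not say how NI Vision performs the segmentation. One simple technique, shown here only as an illustrative stand-in, is a vertical projection: a run of adjacent columns that contain ink is treated as one character. The sketch assumes a single text line stored as rows of 0 (black ink) and 1 (white background); the function names are hypothetical.

```python
def segment_columns(binary):
    """Split a binary text line into (start, end) column ranges, one per character.

    A column belongs to a character if it contains at least one ink pixel (0).
    The `end` index is exclusive.
    """
    if not binary:
        return []
    width = len(binary[0])
    has_ink = [any(row[x] == 0 for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            spans.append((start, x))       # a character ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))       # character touching the right edge
    return spans


def crop(binary, span):
    """Cut the columns of one character out of the line image."""
    start, end = span
    return [row[start:end] for row in binary]


line = [
    [1, 0, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
]
print(segment_columns(line))  # [(1, 3), (5, 6)]
```

Projection-based splitting fails when neighbouring characters touch, which is one reason font and font size matter for this kind of system.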
After character-by-character segmentation, each character image is stored in a structure. Each character then has to be identified against the predefined character set.
Fig 23. Template Matching
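The comparison against a character set can be illustrated with a very small Python sketch. It is not the NI Vision algorithm, and the .abc file format is not reproduced here: the `TEMPLATES` dictionary is a hypothetical stand-in for the trained character set, and the score is simply the fraction of pixels that agree between two equally sized binary images.

```python
def similarity(a, b):
    """Fraction of pixels that agree between two equally sized binary images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same size")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("images must not be empty")
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def classify(glyph, templates):
    """Return (label, score) of the template that best matches the glyph."""
    return max(
        ((label, similarity(glyph, image)) for label, image in templates.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical 3x3 templates: 0 is ink, 1 is background.
TEMPLATES = {
    "I": [[1, 0, 1], [1, 0, 1], [1, 0, 1]],
    "L": [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
}

glyph = [[1, 0, 1], [1, 0, 1], [1, 1, 1]]  # an "I" with one pixel missing
label, score = classify(glyph, TEMPLATES)
print(label)  # I
```

A practical matcher would first scale each segmented character to the template size, since this score is only defined for images of equal dimensions.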
Recognition
Fig 24. Recognition
Text to Speech Synthesis
• Speech synthesis is the artificial production of human speech.
• A computer system used for this purpose is called a speech computer or speech synthesizer.
• In the text to speech module, the text recognized by the OCR system is the input to the speech synthesis system, which converts it into speech that can be heard through an earphone connected to the laptop or through the built-in speakers.
• ActiveX is the general name for a set of Microsoft technologies that allow you to reuse code.
Assemblies are implemented in 3 steps:
1. Constructor
2. Property Node
3. Invoke Node
Text to Speech Code
Text to Speak
The input given to the invoke node “Speak” in the last step is the text that gets converted to speech and is played through the speakers of the laptop.
Image → OCR → Text to Speech → Speakers
Fig 25. Text to Speech Code
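A text equivalent of the final “Speak” step might look like the Python sketch below. The slides do not name the speech engine behind the LabVIEW nodes, so the use of the Windows `SAPI.SpVoice` COM object (through the third-party pywin32 package) is an assumption, and both helper names are hypothetical.

```python
def speak(text, synthesizer):
    """Send recognized text to a synthesizer object that exposes Speak(text).

    Returns the cleaned text that was actually sent.
    """
    cleaned = " ".join(text.split())  # collapse line breaks left by the OCR step
    if cleaned:
        synthesizer.Speak(cleaned)
    return cleaned


def default_synthesizer():
    """Create a Windows SAPI voice (assumes Windows with pywin32 installed)."""
    import win32com.client  # third-party package: pywin32

    return win32com.client.Dispatch("SAPI.SpVoice")
```

On a Windows machine, `speak("HELLO\nWORLD", default_synthesizer())` would read the two words aloud. Passing the synthesizer in as an argument keeps the text handling testable without audio hardware.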
Final Code
Fig 26. Final Code
Fig 27. Steps inside the Vision Assistant
Results and Discussions
• Experiments suggest that the system detects the text with an accuracy of about 75-80%. However, the performance of the system depends heavily on the size of the font under investigation.
Fig 28. Front Panel for final code
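The slides do not state how the accuracy figure was measured. One common choice, shown here purely as an assumption, is character-level accuracy: one minus the edit distance between the recognized text and the reference text, divided by the reference length. The example strings are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, recognized):
    """Return 1 - edit_distance / len(reference), clamped at 0."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return max(0.0, 1.0 - edit_distance(reference, recognized) / len(reference))


# Two of ten characters are misread (O as 0, M as N).
print(character_accuracy("OCR SYSTEM", "0CR SYSTEN"))  # 0.8
```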
Future Prospects
The OCR based speech synthesis system using LabVIEW is an efficient program that gives good results for specific fonts and font sizes, but there is room for improvement.
• Multi-lingual
• Educational purposes
• Translator
• Volume options
• Omni-font
• Font sizes
THANK YOU
