Speech Recognizers & Generators 
Let’s Get Started… 
Presented by: P. Kahoro 
Presented to: Prof P. Okanda
Speech Recognizers: What are they? 
Speech is the vocalized form of human communication.
In computer science and electrical engineering, speech recognition (SR) is the translation of spoken words, including dictation, into text. It is also known as "automatic speech recognition" (ASR).
Speech recognition has evolved quite a bit over the past few years. Initially, it worked in discrete dictation mode, where you had to pause between each spoken word. Today, however, it uses continuous dictation. It has also become smarter, with its own set of grammar rules to work out the meaning of what is being said.
Terms and Concepts 
•Utterances 
•Pronunciation
•Grammar
•Speaker Dependent System 
•Speaker Independent System 
•Training 
•Accuracy
Terms & Concepts
Utterances: 
An utterance is any stream of speech between two periods of silence. Silence delineates the start and end of an utterance. An utterance can be a single word, or it can contain multiple words (a phrase or a sentence) 
Pronunciations: 
One piece of information that the speech recognition engine uses to process a word is its pronunciation, which represents what the speech engine thinks a word should sound like. 
Words can have multiple pronunciations associated with them. For example, the word “the” has at least two pronunciations in U.S. English: “thee” and “thuh”.
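The definition of an utterance given above (speech bounded by silence) can be sketched in code. This is a minimal illustration, not how any particular engine works: the function name `split_utterances`, the per-frame energy input, and the threshold values are all assumptions made for the example.

```python
from typing import List, Tuple


def split_utterances(
    energies: List[float],
    silence_threshold: float = 0.1,   # assumed value; real systems tune this
    min_silence_frames: int = 2,      # assumed: silence must last this long to end an utterance
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices (end exclusive) for each utterance."""
    utterances = []
    start = None      # frame where the current utterance began, if any
    silent_run = 0    # consecutive silent frames seen inside the current utterance
    for i, energy in enumerate(energies):
        if energy > silence_threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Enough silence: close the utterance at its last speech frame.
                utterances.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Input ended while an utterance was still open.
        utterances.append((start, len(energies) - silent_run))
    return utterances


frames = [0, 0, 0.5, 0.6, 0.05, 0.7, 0, 0, 0, 0.4, 0.3, 0]
print(split_utterances(frames))
```

A short dip in energy (a single quiet frame) stays inside the utterance, while a longer pause splits the stream into two utterances.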
Cont… 
Grammar: Grammars define the domain, or context, within which the recognition engine works. The engine compares the current utterance against the words and phrases in the active grammars. If the user says something that is not in the grammar, the speech engine will not be able to understand it correctly, so general-purpose speech engines usually have a very large grammar.
Accuracy: The ability of a recognizer can be examined by measuring its accuracy, that is, how well it recognizes utterances.
Training: 
Some speech recognizers have the ability to adapt to a speaker. When the system has this ability, it may allow training to take place.
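One way to put a number on the accuracy described above is the word error rate (WER). The slides do not name a metric, so this choice is an assumption. WER counts the word-level insertions, deletions and substitutions needed to turn the recognizer's output into the reference text, divided by the number of reference words. A minimal sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between hypothesis and reference, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("how far is it", "how far was it"))
```

A lower value means a more accurate recognizer, and 0.0 means the output matched the reference word for word.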
Cont… 
Speaker Dependent Systems: Speech recognition systems that require a user to train the system to his or her voice are known as speaker-dependent systems. Most desktop dictation systems, such as IBM ViaVoice, are speaker dependent.
Speaker Independent Systems: 
Speech recognition systems that do not require a user to train the system are known as speaker-independent systems.
How do humans do it? 
Articulation produces sound waves, which the ear conveys to the brain for processing.
How might computers do it? 
Digitization 
Acoustic analysis of the speech signal 
Linguistic interpretation 
Acoustic waveform 
Acoustic signal 
Speech recognition
How Does Speech Recognition Work?
•Audio input 
•Apply a "grammar" so the speech recognizer knows what phonemes to expect. 
•Acoustic Model 
•Recognized text
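The role of the grammar in this pipeline can be illustrated with a small sketch. It assumes, purely for illustration, that an acoustic model has already produced candidate phrases with scores; the names `ACTIVE_GRAMMAR` and `recognize` and the scores are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical active grammar: the only phrases the engine will accept.
ACTIVE_GRAMMAR: Set[str] = {
    "dial office",
    "dial home",
    "how far is it to the motorway junction",
}


def recognize(candidates: Dict[str, float],
              grammar: Set[str] = ACTIVE_GRAMMAR) -> Optional[str]:
    """Return the best-scoring candidate phrase that the grammar allows, or None."""
    allowed = {phrase: score for phrase, score in candidates.items() if phrase in grammar}
    if not allowed:
        return None  # the user said something outside the grammar
    return max(allowed, key=allowed.get)


# The acoustically best guess is rejected because it is not in the grammar.
print(recognize({"dial orifice": 0.7, "dial office": 0.6}))
print(recognize({"play music": 0.9}))
```

Restricting candidates to the grammar is what lets the engine prefer "dial office" over a better-scoring but out-of-domain phrase.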
How do computers do it? 
•First, the user gives a voice command over the microphone, which is passed to the sound card in the system. This analog signal is sampled and converted into digital form using a technique called Pulse Code Modulation (PCM). The resulting digital waveform is a stream of amplitude values that looks like a wavy line when plotted. 
•The digital signal is then split into short frames, and each frame is converted into the frequency domain. The incoming stream is now a set of discrete frequency bands, in a form that can be used by the speech recognizer. 
•The next stage involves recognizing these bands of frequencies. For this, the speech recognition software has a database of reference sound patterns for "phonemes", the basic sound units of a language, against which the incoming bands are matched.
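The first two steps above (PCM digitization, then conversion to the frequency domain) can be sketched with the standard library alone. This is an illustrative toy, not a production front end: the 8 kHz sample rate, 8-bit depth, 64-sample frame and the naive DFT are assumptions chosen to keep the example short.

```python
import cmath
import math
from typing import List


def pcm_encode(signal: List[float], bits: int = 8) -> List[int]:
    """Quantize samples in [-1.0, 1.0] to signed integers (linear PCM)."""
    max_level = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * max_level) for s in signal]


def dft_magnitudes(frame: List[float]) -> List[float]:
    """Magnitude of each frequency bin for one frame, using a naive O(n^2) DFT."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(total))
    return magnitudes


SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME_SIZE = 64     # samples per frame (assumed)

# Stand-in for the microphone signal: a pure 1000 Hz tone.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(FRAME_SIZE)]

pcm = pcm_encode(tone)                         # stream of integer amplitudes
mags = dft_magnitudes([s / 127 for s in pcm])  # one frame in the frequency domain
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(peak_bin, peak_hz)
```

The strongest frequency band of the frame lands on the tone that was fed in; a real recognizer would compute such bands for every frame and pass them on for phoneme matching.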
Hardware: 
Sound Cards 
Sound cards with the cleanest A/D (analog-to-digital) conversion are recommended. 
Microphone 
The best choice of microphone is the headset style. 
Computers / Processors 
The faster the processor, the better speech recognition works. For good speech recognition you should have a 1 GHz processor and 1 GB of RAM.
Where can it be used? 
•GPS: System control/navigation, e.g. GPS-connected digital maps: “How far is it to the motorway junction?” 
•Commercial/industrial applications: in-car steering systems 
•Mobile telephony: Voice dialing and hands-free use of a mobile phone in the car, e.g. “Dial office” 
•Home automation: heating, ventilation and air conditioning
Where can it be used? 
•Military: System control/navigation, e.g. in high-performance fighter aircraft and helicopters, and in training air traffic controllers 
•Computer and Video Games: Speech input has been used in a limited number of computer and video games. The Microsoft Xbox, Nintendo GameCube, and Sony PlayStation 2 consoles all offer games with speech input/output. 
•Education: e.g. students who are blind 
•Voice security systems: security locks of gates and doors 
•Wearable Computers: The most futuristic application is in the use and functionality of wearable computers.
Speech Recognition Software 
•Dragon NaturallySpeaking 
•IBM ViaVoice 
•Microsoft Speech Recognition System 
•MacSpeech Dictate 
•Philips SpeechMagic
Pros of Speech Recognition 
•Faster than handwriting. 
•Allows for better spelling, whether in text or documents. 
•Helpful for people with a mental or physical disability. 
•Hands-free capability.
Cons of Speech Recognition 
•No program is 100% perfect 
•Factors that affect the accuracy of speech recognition are: slang, homonyms, signal-to-noise ratio, and overlapping speech 
•Can be expensive depending on the program 
•Easily misinterprets vocal commands, e.g. Siri
Conclusion 
•Speech recognition can revolutionize the way people conduct business over the Web and differentiate world-class e-businesses. 
•VoiceXML ties speech recognition and telephony together. 
•Voice-enabled Web solutions are available today!
Generators: 
•In the broad sense, software generators are programs that produce other programs or outputs such as keys, passwords and numbers. In the narrower programming-language sense, a generator is a special routine that can be used to control the iteration behavior of a loop. In fact, all generators of this kind are iterators. 
•A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.
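The contrast between returning an array and yielding values one at a time can be shown in Python, which has generators built in. The function names below are made up for the example.

```python
from typing import Iterator, List


def squares_list(n: int) -> List[int]:
    """Builds the whole list in memory before returning it."""
    return [i * i for i in range(n)]


def squares_gen(n: int) -> Iterator[int]:
    """Yields one value at a time; nothing is computed until the caller asks."""
    for i in range(n):
        yield i * i


gen = squares_gen(5)
first = next(gen)   # the caller can use the first value immediately
rest = list(gen)    # the remaining values are produced on demand

print(first, rest)
print(iter(gen) is gen)  # a generator object is its own iterator
```

Calling `squares_gen(5)` looks like an ordinary function call, but the result behaves like an iterator, which is the point made in the bullet above.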
Types of software generators: 
•Key generator (key-gen) 
•Random password generator 
•Code generator 
•Natural language generator 
•Random test generator 
•Pseudorandom number generator
Key generator (key-gen) 
•A key generator (key-gen) is a computer program that generates a product licensing key, such as a serial number, necessary to activate a software application for use. 
•Key-gens may be legitimately distributed by software manufacturers for licensing software in commercial environments where software has been licensed in bulk for an entire site or enterprise, or they may be distributed illegitimately in circumstances of copyright infringement or software piracy. 
•A software license is a legal instrument that governs the usage and distribution of computer software. 
•Illegitimate key generators are typically distributed by software crackers. For example, key-gens used to crack non-genuine copies of Windows, such as Windows 8, are already available.
Random password generator 
•A random password generator is a software program or hardware device that takes input from a random or pseudo-random number generator and automatically generates a password. Random passwords can be generated manually, using simple sources of randomness such as dice or coins, or they can be generated using a computer. 
•While there are many examples of "random" password generator programs available on the Internet, generating randomness can be tricky, and many programs do not generate random characters in a way that ensures strong security. A common recommendation is to use open source security tools where possible, since they allow independent checks on the quality of the methods used. Note that simply generating a password at random does not ensure the password is a strong password, because it is possible, although highly unlikely, to generate an easily guessed or cracked password. In fact, there is no need at all for a password to have been produced by a perfectly random process: it just needs to be sufficiently difficult to guess.
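A minimal sketch of such a generator follows, using Python's standard `secrets` module, which draws from the operating system's cryptographically secure random source. The function name, the default length and the "one character from each class" rule are assumptions for the example, not a recommendation from the slides. The retry loop reflects the point above that a purely random draw can occasionally produce a weak password.

```python
import secrets
import string


def random_password(length: int = 16) -> str:
    """Generate a password containing lowercase, uppercase, digit and punctuation characters."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        password = "".join(secrets.choice(alphabet) for _ in range(length))
        # Reject the rare draw that misses a character class and try again.
        if (any(c in string.ascii_lowercase for c in password)
                and any(c in string.ascii_uppercase for c in password)
                and any(c in string.digits for c in password)
                and any(c in string.punctuation for c in password)):
            return password


print(random_password())
```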
Pseudorandom number generators 
•A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. 
•Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.
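The reproducibility mentioned above can be seen in a linear congruential generator (LCG), one of the simplest PRNG designs. The slides do not name an algorithm, so the LCG is an illustrative choice. The multiplier and increment below are widely published constants commonly attributed to Numerical Recipes, and the class name is made up for the example. An LCG is not suitable for security purposes such as password generation.

```python
class LinearCongruentialGenerator:
    """Toy PRNG: state_next = (a * state + c) mod m."""

    def __init__(self, seed: int, a: int = 1664525, c: int = 1013904223, m: int = 2 ** 32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next_int(self) -> int:
        """Advance the state and return it as an integer in [0, m)."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def next_float(self) -> float:
        """Return a float in [0.0, 1.0)."""
        return self.next_int() / self.m


# The same seed always reproduces the same sequence.
first_run = LinearCongruentialGenerator(seed=42)
second_run = LinearCongruentialGenerator(seed=42)
print([first_run.next_int() for _ in range(3)] == [second_run.next_int() for _ in range(3)])
```

Because the whole sequence is determined by the seed, a simulation or test that uses it can be rerun exactly, which a hardware random number generator cannot offer.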
Code generator 
•In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine. 
•Sophisticated compilers typically perform multiple passes over various intermediate forms. This multi-stage process is used because many algorithms for code optimization are easier to apply one at a time, or because the input to one optimization relies on the completed processing performed by another optimization.
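To make the idea of converting an intermediate representation into executable form concrete, here is a toy sketch. It assumes a made-up intermediate representation (nested tuples for arithmetic expressions) and a made-up stack-machine instruction set (`PUSH`, `ADD`, `SUB`, `MUL`); a real compiler targets actual machine code and is far more involved.

```python
from typing import List, Union

# Hypothetical intermediate representation: an int literal or (operator, left, right).
Expr = Union[int, tuple]

OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(expr: Expr) -> List[str]:
    """Code generator: walk the expression tree and emit stack-machine instructions."""
    if isinstance(expr, int):
        return [f"PUSH {expr}"]
    op, left, right = expr
    # Operands first, then the operator (post-order traversal).
    return generate(left) + generate(right) + [OPCODES[op]]


def run(program: List[str]) -> int:
    """Tiny stack machine that executes the generated instructions."""
    stack: List[int] = []
    for instruction in program:
        if instruction.startswith("PUSH "):
            stack.append(int(instruction.split()[1]))
        else:
            right = stack.pop()
            left = stack.pop()
            results = {"ADD": left + right, "SUB": left - right, "MUL": left * right}
            stack.append(results[instruction])
    return stack.pop()


ir = ("+", 2, ("*", 3, 4))   # represents 2 + 3 * 4
program = generate(ir)
print(program)
print(run(program))
```

An optimization pass, such as folding `3 * 4` into `12` before code generation, would operate on the tuple form, which is one reason compilers keep intermediate forms separate from the final output.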
Men have become the tools of their tools. -P. Kahoro 
The End
