Amadou

Field: Computational linguistics
Topic: Speech Recognition
How did SR come to be, and how does it
work? How beneficial is it for
students?
Presenter:
AMADOU ADAMOU AMINOU

Have you ever talked to your
computer or your phone, and had
it actually recognize what you
said and then do something as a
result?

Contents
• History
• Introduction
• Types of Speech Recognition
• Components of SR
• Applications
• Examples of SR
• Weaknesses and flaws
• Conclusion
• References

INTRODUCTION
WHAT IS SPEECH RECOGNITION?
• Speech recognition is a process by which a
computer takes a speech signal (recorded
using a microphone) and converts it into
words in real-time.
• Put simply, SR is the process of converting
spoken input to text.
PELLOM, B., Sonic: The University of Colorado Continuous Speech
Recognition System, 2001

History
• The first speech recognition system, called “AUDREY”, was
built by Bell Laboratories in the US in 1952. It could
understand only digits spoken by a single voice.
• Ten years later, labs in the US, Japan, and England developed
hardware dedicated to recognizing spoken sounds.
• Today's systems have large vocabularies and can recognize
continuous speech.
Eugene Weinstein - Speech Recognition, p. 15, Prentice Hall 1995

Why was SR invented?
• Individuals With Disabilities – Assists those who have a visual
impairment, hand immobility, dyslexia, etc.
• Medical Transcription – Reduces delays in writing out
medical transcriptions.
• Dictation – Converts spoken words to text in emails or other
documents (also helpful for English Language Learners).
• Access Menu Commands – Opens files using voice commands.
Eugene Weinstein - Speech Recognition, pp. 40-50, Prentice Hall 1995

A speech recognition system consists of:
• A microphone.
• Speech recognition software.
• A computer to take in and interpret the speech.
• A good-quality sound card for input and output.
• Clear, correct pronunciation by the speaker.
BOGUREV, B T. (1989) Computational Linguistics, London: Longman

Two types of SR
• Speaker-dependent systems
– Require “training” to “teach” the system an individual's voice
– More robust
– But less convenient
– And obviously less portable
• Speaker-independent systems
– Language coverage is reduced to compensate for the need to be
flexible in phoneme identification
– A clever compromise is to learn on the fly
Eugene Weinstein - Speech Recognition, pp. 40-45, Prentice Hall 1995
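
The “training” step of a speaker-dependent system can be pictured with a small sketch. The slides do not name a matching method; the sketch below assumes template matching with dynamic time warping (DTW), one classic approach for isolated words. The feature values and the names dtw_distance, recognize, and templates are made up for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical "training" data: one made-up feature contour per word,
# recorded from the single user the system is trained for.
templates = {
    "yes": [1.0, 3.0, 4.0, 3.0, 1.0],
    "no": [4.0, 2.0, 1.0, 1.0, 2.0],
}

# The same user says "yes" more slowly: same shape, stretched in time.
slow_yes = [1.0, 1.0, 3.0, 3.0, 4.0, 4.0, 3.0, 1.0]
print(recognize(slow_yes, templates))  # yes
```

Because the templates come from one voice, a different speaker's contours may match poorly, which is why such a system is less portable.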

Components
• Audio input
• Grammar
• Speech Recognition Engine
• Acoustic Model
• Recognized text
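
How these components fit together can be sketched as a toy pipeline. Everything here is an assumption for illustration: the sound labels stand in for processed audio input, ACOUSTIC_MODEL maps each label to words that sound alike, GRAMMAR lists the word pairs the system accepts, and recognize plays the role of the engine. A real engine works with probabilities over far larger models.

```python
from itertools import product

# Hypothetical acoustic model: each sound label maps to the words that sound like it.
ACOUSTIC_MODEL = {
    "DHEHR": ["there", "their"],
    "DAWG": ["dog"],
    "IHZ": ["is"],
    "BAHRKS": ["barks"],
}

# Hypothetical grammar: word pairs the system accepts as valid neighbours.
GRAMMAR = {("their", "dog"), ("dog", "barks"), ("there", "is"), ("is", "their")}


def recognize(sounds):
    """Toy engine: return the first word sequence that the grammar accepts."""
    candidates = [ACOUSTIC_MODEL[s] for s in sounds]
    for words in product(*candidates):
        # Every adjacent pair of words must be allowed by the grammar.
        if all(pair in GRAMMAR for pair in zip(words, words[1:])):
            return " ".join(words)
    return None  # nothing the grammar accepts


print(recognize(["DHEHR", "DAWG", "BAHRKS"]))  # their dog barks
print(recognize(["DHEHR", "IHZ"]))             # there is
```

The same sound “DHEHR” comes out as “their” or “there” depending on its neighbours, which is the job the grammar does for the engine.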

What’s hard about that?
• Digitization
– Converting analogue signal into digital representation.
• Signal processing
– Separating speech from background noise.
• Phonetics
– Variability in human speech.
• Phonology
– Recognizing individual sound distinctions (similar phonemes).
• Lexicology and syntax
– Disambiguating homophones.
– Features of continuous speech.
• Syntax and pragmatics
– Interpreting features.
– Filtering of performance errors (disfluencies).
Eugene Weinstein - Speech Recognition, pp. 15-67, Prentice Hall 1995.
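
The digitization step can be sketched in a few lines: measure the signal at regular intervals (sampling) and round each measurement to an integer level (quantization). The 440 Hz tone, the 16 kHz sample rate, and the 16-bit depth are assumptions chosen for illustration, not values from the slides; digitize and tone are hypothetical names.

```python
import math


def digitize(signal, duration_s, sample_rate_hz, bits):
    """Sample a continuous signal and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1              # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                   # time of the n-th sample
        value = max(-1.0, min(1.0, signal(t)))   # clip to the range [-1, 1]
        samples.append(round(value * max_level)) # quantize to an integer level
    return samples


def tone(t):
    """Stand-in for the analogue microphone signal: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)


pcm = digitize(tone, duration_s=0.01, sample_rate_hz=16000, bits=16)
print(len(pcm), pcm[:4])
```

A higher sample rate captures faster changes in the sound, and more bits per sample capture finer differences in loudness; both increase the amount of data the recognizer has to process.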

Potential uses in education
• Teaching students of foreign languages to
pronounce vocabulary correctly.
• Enabling students with physical disabilities
who cannot use a keyboard to enter text.
• Enabling students with text-interpretation
difficulties, e.g. dyslexia, to enter text verbally.
• Restricting access to high-security computers,
where a keyboard could be exploited by hackers.
BOGUREV, B T. (1989) Computational Linguistics, London: Longman

Applications of Speech Recognition
• Speech recognition applications include:
– Controlling devices in your home (e.g., video),
– Call routing (e.g., "I would like to make a collect call"),
– Simple data entry (e.g., entering a credit card number),
– Preparation of structured documents (e.g., a radiology
report),
– Speech-to-text processing (e.g., word processors or emails),
– Aircraft cockpit controls (usually termed Direct Voice Input).
BOGUREV, B T. (1989) Computational Linguistics, London: Longman

Example: Microsoft Speech
Recognition – Windows 7

SIRI & GOOGLE
Siri is an intelligent personal assistant
developed by Apple.
Google Now is an intelligent personal
assistant developed by Google.
Both use a combination of speaker-dependent and speaker-independent
SR systems.

Weaknesses and Flaws
• Low signal-to-noise ratio: the program needs to
hear the words spoken distinctly.
• Intensive use of device power.
• Homophones, e.g. “there” and “their”, “be” and
“bee”.
• Overlapping speech.
• No program is 100% accurate.
• Difficulty understanding dialects and accents.
Example: see video
A Practical Intro to the Computer Analysis of Language, Geoff Barnbrook,
1996, Edinburgh University Press.
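
The signal-to-noise ratio in the first bullet is usually reported in decibels: ten times the base-10 logarithm of speech power over noise power. A minimal sketch, assuming the speech and the noise are available as separate lists of samples (in practice the noise level has to be estimated); the sample values and the name snr_db are made up for illustration.

```python
import math


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from two lists of samples."""
    p_signal = sum(x * x for x in signal) / len(signal)  # mean signal power
    p_noise = sum(x * x for x in noise) / len(noise)     # mean noise power
    return 10 * math.log10(p_signal / p_noise)


speech = [1.0, -1.0, 1.0, -1.0]       # made-up loud, clear speech
hiss = [0.1, -0.1, 0.1, -0.1]         # made-up quiet background noise
print(round(snr_db(speech, hiss), 1))  # 20.0
```

Here the speech is 10 times larger in amplitude than the noise, so it carries 100 times the power, which is 20 dB. As the background gets louder the ratio drops and recognition errors become more likely.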

Conclusion
• Speech recognition can revolutionize the way people
conduct business over the Web and differentiate
world-class e-businesses.
• VoiceXML ties speech recognition and
telephony together.
• At some point in the future, speech
recognition may become speech
understanding.
• Voice-enabled Web solutions are available today!

References
• PELLOM, B., Sonic: The University of Colorado
Continuous Speech Recognition System, 2001
• BOGUREV, B T. (1989) Computational Linguistics,
London: Longman
• WEINSTEIN, Eugene, Speech Recognition, Prentice
Hall, 1995
• http://www.tldp.org/HOWTO/Speech-Recognition-
• BARNBROOK, Geoff, A Practical Intro to the Computer
Analysis of Language, 1996, Edinburgh
University Press.
