A Case Study on DSP (Speech Processing)
What is Speech Processing?
Speech processing is the application of digital signal processing (DSP) techniques to
the processing and analysis of speech signals.
The applications of speech processing include:
1. Speech Coding
2. Speech Recognition
3. Speech Verification
4. Speech Enhancement
5. Speech Synthesis
Process of Speech Production
The speech production process begins when the talker formulates a message in his or her
mind to transmit to the listener via speech. The next step is the conversion of the
message into a language code: the message is converted into a sequence of phonemes
corresponding to the sounds that make up the words, along with prosody markers denoting
the duration, loudness, and pitch of those sounds.
Figure: Schematic diagram of the speech production/perception process in human beings.
Information Rate of the Speech Signal
First Stage
The discrete symbol information rate in the raw message text is rather low: about
50 bits per second (bps), corresponding to about 8 sounds per second, where each sound
is one of about 50 distinct symbols. After the language code conversion, with the
inclusion of prosody information, the information rate rises to about 200 bps.
Second Stage
In the next stage the representation of the information in the signal becomes
continuous, with an equivalent rate of about 2000 bps at the neuromuscular control
level and about 30000-50000 bps at the acoustic signal level.
Third Stage
The continuous information rate at the basilar membrane is in the range of
30000-50000 bps, while at the neural transduction stage it is about 2000 bps.
Fourth Stage
The higher-level processing within the brain converts the neural signals to a
discrete representation, which ultimately is decoded into a low bit rate message.
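The first-stage figure can be checked with a little arithmetic. The sketch below assumes the roughly 50 symbols are equally likely and independent, which gives an upper bound rather than a measured rate; the function name and the dictionary are illustrative only, and each range quoted above is represented by its upper end.

```python
import math

def symbol_rate_bps(symbols_per_second: float, alphabet_size: int) -> float:
    """Information rate for equally likely, independent symbols (an assumption)."""
    return symbols_per_second * math.log2(alphabet_size)

# About 8 sounds per second, each one of about 50 symbols.
text_rate = symbol_rate_bps(8, 50)  # roughly 45 bps, in line with "about 50 bps"

# Approximate rates quoted in the four stages above, in bps.
stage_rates = {
    "message text": 50,
    "language code with prosody": 200,
    "neuromuscular control": 2000,
    "acoustic signal": 50000,
}

# How much larger each representation is than the raw text message.
expansion = {name: rate / stage_rates["message text"]
             for name, rate in stage_rates.items()}

if __name__ == "__main__":
    print(f"text rate estimate: {text_rate:.1f} bps")
    for name, factor in expansion.items():
        print(f"{name}: {factor:.0f}x the text rate")
```

On these numbers the acoustic signal carries on the order of a thousand times more bits per second than the text it encodes, which is the gap that speech coding tries to close.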
Classification of Speech Sound
Type 1: VOICED speech is produced when the vocal cords play an active role in the
production of sound, vibrating at a fundamental (pitch) frequency of roughly:
• 50-200 Hz for male speakers
• 150-300 Hz for female speakers
• 200-400 Hz for child speakers
Example: voiced sounds (A), (E), (I).
Type 2: UNVOICED speech is produced when the vocal cords are inactive.
The vocal cords are held open and air flows continuously through them.
Example: unvoiced sounds (S), (F).
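One common way to separate the two classes in practice is to look for periodicity in a short frame. The following is a minimal sketch using the normalised autocorrelation, not a method taken from the slides: the function name, the 0.3 threshold, and the 50-400 Hz search range (chosen to span the three speaker ranges above) are all assumptions.

```python
import numpy as np

def estimate_pitch_hz(frame, fs, fmin=50.0, fmax=400.0, threshold=0.3):
    """Return an estimated pitch in Hz, or None if the frame looks unvoiced.

    The frame must be longer than fs / fmin samples so that the longest
    candidate pitch period fits inside it.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        return None  # silence: no periodicity to measure

    # Autocorrelation for non-negative lags, normalised so that r[0] == 1.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:] / energy

    # Search only lags that correspond to plausible pitch periods.
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))

    # A weak peak means the frame is noise-like (assumed threshold).
    if r[lag] < threshold:
        return None
    return fs / lag

if __name__ == "__main__":
    fs = 8000
    t = np.arange(320) / fs  # 40 ms frame
    voiced = sum(np.sin(2 * np.pi * 100 * h * t) for h in (1, 2, 3))
    unvoiced = np.random.default_rng(0).standard_normal(320)
    print("voiced frame:", estimate_pitch_hz(voiced, fs))
    print("unvoiced frame:", estimate_pitch_hz(unvoiced, fs))
```

A periodic frame produces a strong autocorrelation peak at its pitch period, while a noise-like frame such as (S) or (F) has no comparable peak and is reported as unvoiced.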
Formant Frequencies
Speech normally exhibits about one formant frequency per 1 kHz of bandwidth.
For VOICED speech, the magnitude of the lower formant frequencies is successively
larger than the magnitude of the higher formant frequencies.
For UNVOICED speech, the magnitude of the higher formant frequencies is successively
larger than the magnitude of the lower formant frequencies.
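The "one formant per 1 kHz" rule of thumb can be motivated with a textbook idealisation that is not part of the slides: treat the vocal tract as a lossless uniform tube, closed at the glottis and open at the lips, whose resonances fall at odd multiples of c / (4L). The tube length of 17 cm and the speed of sound of 340 m/s below are assumed typical values, and the function name is illustrative.

```python
def uniform_tube_formants(length_m=0.17, speed_of_sound=340.0, count=4):
    """Resonances of a lossless tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]

if __name__ == "__main__":
    formants = uniform_tube_formants()
    # Resonances near 500, 1500, 2500 and 3500 Hz for the assumed tube.
    print([round(f) for f in formants])
    # Neighbouring resonances are c / (2 * L) apart, about 1000 Hz here.
    print(round(formants[1] - formants[0]))
```

Real vocal tracts are neither uniform nor lossless, so actual formants move away from these values as the articulators change shape, but the average spacing stays close to one per kilohertz for a tract of this length.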
Basic Assumption of Speech Processing
Parameters & Speech Sound
1. Phonemes are the smallest segments of speech sound that distinguish one word from
another. For example, /d/ and /b/ are distinct phonemes, as in "dark" and "bark".
2. It is important to realize that phonemes are abstract linguistic units and may
not be directly observed in the speech signal.
3. Different speakers producing the same string of phonemes convey the same
information yet sound different as a result of differences in dialect and vocal
tract length and shape.
4. There are about 40 phonemes in English.
5. A table of IPA (International Phonetic Alphabet) symbols lists each phoneme
together with sample words in which it occurs.
Model for Speech Production
An accurate account of how speech is produced requires a digital filter-based model of
the human speech production mechanism. The model must contain 4 steps:
Steps of Speech Production
Operation of the Vocal Tract
Lip/Nasal Radiation Process
Both Voiced and Unvoiced Speech
Time Frame: 10-20 ms
Overall Speech Production Model
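A digital filter-based model of this kind is often sketched as an excitation source feeding an all-pole vocal tract filter, followed by a simple radiation filter. The code below is one such sketch under stated assumptions, not the model drawn on the slide: the 8 kHz sampling rate, the formant centre frequencies and bandwidths, the first-difference radiation filter, and every function name are illustrative choices, and glottal pulse shaping is left out.

```python
import numpy as np

FS = 8000  # sampling rate in Hz (assumed)

def excitation(n_samples, voiced, pitch_hz=100.0, rng=None):
    """Impulse train for voiced frames, white noise for unvoiced frames."""
    if voiced:
        e = np.zeros(n_samples)
        period = int(round(FS / pitch_hz))
        e[::period] = 1.0  # one pulse per pitch period
        return e
    if rng is None:
        rng = np.random.default_rng(0)
    return 0.1 * rng.standard_normal(n_samples)

def vocal_tract_denominator(formants_hz, bandwidths_hz):
    """All-pole filter coefficients: one complex-conjugate pole pair per formant."""
    a = np.array([1.0])
    for f, bw in zip(formants_hz, bandwidths_hz):
        r = np.exp(-np.pi * bw / FS)   # pole radius set by the bandwidth
        theta = 2.0 * np.pi * f / FS   # pole angle set by the centre frequency
        a = np.convolve(a, [1.0, -2.0 * r * np.cos(theta), r * r])
    return a

def all_pole_filter(a, x):
    """y[n] = x[n] - sum over k >= 1 of a[k] * y[n - k], with a[0] == 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def synthesize_frame(voiced, n_samples=160,
                     formants_hz=(500.0, 1500.0, 2500.0),
                     bandwidths_hz=(80.0, 100.0, 120.0)):
    """One 20 ms frame at 8 kHz: excitation -> vocal tract -> lip radiation."""
    e = excitation(n_samples, voiced)
    a = vocal_tract_denominator(formants_hz, bandwidths_hz)
    v = all_pole_filter(a, e)
    return np.diff(v, prepend=0.0)  # radiation approximated by a first difference

if __name__ == "__main__":
    print(synthesize_frame(voiced=True)[:5])
    print(synthesize_frame(voiced=False)[:5])
```

The parameters are held fixed for one frame at a time because speech is treated as approximately stationary over the 10-20 ms time frame noted above; a longer utterance would update the pitch, the voiced/unvoiced switch, and the filter coefficients from frame to frame.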

