IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 04 Issue: 05 | May-2015, Available @ http://www.ijret.org 361
COMPARATIVE PERFORMANCE ANALYSIS OF CHANNEL
NORMALIZATION TECHNIQUES
Rupa Patel1, Aparna Gurjar2
1 Asst. Professor, Department of Computer Application, Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India
2 Asst. Professor, Department of Computer Application, Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India
Abstract
A major part of human interaction takes place through speech. A speech signal carries both useful and
unwanted information, and processing such signals involves enhancing the useful part. Unwanted information such as
noise significantly reduces the intelligibility of speech signals. Channel normalization algorithms suppress the
additive noise introduced into speech signals by the transmission channel or by the recording environment. Enhancing the
quality and intelligibility of speech signals improves the performance of speech systems such as automatic speech recognition
(ASR), voice communication and hearing aids, to name a few. Based on experimental results, this paper presents a comparative
analysis of channel normalization techniques to identify the most suitable algorithm for enhancing
speech signals.
Keywords: Cepstral Mean Normalization, Spectral Subtraction, Wiener filter, Signal to Noise Ratio
--------------------------------------------------------------------***----------------------------------------------------------------------
1. INTRODUCTION
Automatic Speech Recognition (ASR) is a technology that
transforms human speech into a symbolic representation.
A recognizer performs this transformation with the goal of
handling spontaneous speech from any speaker in any
environment.
The speech waveform produced by a speaker is transmitted
over some channel before it reaches the recording device,
and the channel distorts the original speech signal. Most
channel normalization techniques deal with either the
channel transfer characteristics or additive noise. A number
of channel normalization techniques have been developed to
minimize the effect of channel noise on speech recognition
systems.
This paper focuses on three channel normalization
techniques for reducing noise in speech signals: Cepstral
Mean Normalization, Spectral Subtraction and Wiener
filtering. The techniques are summarized in Table 1.
Table -1: Channel Normalization Techniques
Cepstral Mean Normalization (CMN)
• Noise compensation technique.
• Also referred to as Channel Mean Normalization.
• Reduces distortion caused by the transmission channel.
• Applied to cepstral data.
• Considered a high-pass filtering process [1].
• Based on the observation that a linear channel distortion becomes a constant offset in the cepstral domain [2].
• The cepstral mean is calculated across the whole speech utterance (the sequence of cepstral vectors) and then subtracted from the cepstral vector of each frame [3].
Spectral Subtraction (SS)
• A voice activity detector first decides whether a frame is a noise frame or a speech frame. The noise spectrum is estimated from the noise frames. The clean speech spectrum is then obtained by subtracting the noise spectrum from the corrupted speech spectrum, which improves the signal-to-noise ratio (SNR) [4, 5, 6, 7, 8].
Wiener Filter
• Adaptive filter [3].
• Useful for additive noise reduction and signal restoration [3].
• Based on tracking the a priori SNR using the decision-directed method proposed by Scalart et al. (1996) [9]. This method assumes that SNRpost = SNRprior + 1 [10].
• Noise is reduced by comparing the noisy speech signal with an estimate of the desired clean signal [11, 12]. The estimate is obtained by minimizing the mean square error (MSE) between the desired signal and the estimated signal [11].
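The observation behind CMN listed in Table 1 can be written out as a short derivation. The notation is introduced here for illustration and is not taken from the paper: x is the clean speech, h the channel impulse response, y the observed signal, c_t the cepstral vector of frame t, and T the number of frames in the utterance. The sketch assumes the channel is linear, time-invariant over the utterance, and short compared with the analysis frame.

```latex
\begin{align}
y[n] &= x[n] * h[n] \\
Y_t(\omega) &= X_t(\omega)\,H(\omega) \\
\log\lvert Y_t(\omega)\rvert &= \log\lvert X_t(\omega)\rvert + \log\lvert H(\omega)\rvert \\
c^{y}_t &= c^{x}_t + c^{h} \\
\bar{c}^{\,y} &= \frac{1}{T}\sum_{t=1}^{T} c^{y}_t = \bar{c}^{\,x} + c^{h} \\
\hat{c}_t &= c^{y}_t - \bar{c}^{\,y} = c^{x}_t - \bar{c}^{\,x}
\end{align}
```

Under these assumptions the channel term c^h cancels in the last line, which is why subtracting the utterance mean removes the channel offset. The price is that the mean of the clean speech cepstrum is removed as well.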
2. EXPERIMENTAL RESULTS
The setup used to analyze the algorithms and assess their
suitability for speech systems was implemented in MATLAB.
Results for seven samples are discussed here. The seven
speech signals are referred to as Case 1 to Case 7. The test
data contain additive noise.
Preprocessing Step
Speech signals are analyzed in short time segments,
referred to as analysis frames [13]. Only voiced speech
needs to be processed, so the input signals must first be
classified as voiced or unvoiced. A voice activity detection
(VAD) algorithm detects the presence of human speech in
the signal. The parameters used here for voiced/unvoiced
classification are the zero-crossing rate and the short-time
energy [14].
Zero-crossing analysis is a simple form of time-domain
analysis. For a discrete signal, a zero crossing occurs when
successive samples have different algebraic signs. The rate
at which zero crossings occur is a simple measure of the
frequency content of a signal. The zero-crossing rate is the
number of times within a given interval or frame that the
amplitude of the speech signal passes through zero. It is a
useful parameter for estimating whether speech is voiced or
unvoiced [15].
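The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the MATLAB code used for the experiments. The function name and the choice to normalize by the number of adjacent sample pairs are assumptions made here.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose algebraic signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = 0
    for prev, curr in zip(frame, frame[1:]):
        # Treat an exact zero as non-negative so it is not counted twice.
        if (prev >= 0) != (curr >= 0):
            crossings += 1
    return crossings / (len(frame) - 1)


# Illustrative frame: the sign changes three times across five adjacent pairs.
frame = [0.5, -0.2, -0.4, 0.3, 0.1, -0.6]
print(zero_crossing_rate(frame))
```

Normalizing by the number of pairs keeps the value between 0 and 1 regardless of frame length, which makes frames of different sizes comparable.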
The amplitude of the speech signal varies with time, and
the amplitude of unvoiced speech segments is much lower
than that of voiced segments. The short-time energy of the
speech signal provides a representation that reflects these
amplitude variations [16].
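A short sketch of short-time energy follows, assuming non-overlapping or overlapping rectangular frames and the mean squared amplitude as the energy measure. The frame length, hop size and function name are illustrative choices, not values reported in the paper.

```python
def short_time_energy(signal, frame_len, hop):
    """Mean squared amplitude of each analysis frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies


# Illustrative signal: a low-amplitude stretch followed by a high-amplitude one.
signal = [0.0, 0.1, -0.1, 0.0, 0.8, -0.9, 0.7, -0.8]
print(short_time_energy(signal, frame_len=4, hop=4))
```

The second frame yields a far larger energy than the first, which is the contrast between high-amplitude and low-amplitude segments that the classification relies on.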
To classify the input speech signals as voiced or unvoiced,
the voice activity detection algorithm is applied to each
of the case samples. The results of the preprocessing step
are recorded in Table 2.
Table -2: Results of Voice Activity Detector
Signals Energy ZCR
Case 1 0.60 0.07
Case 2 0.36 0.10
Case 3 0.68 0.07
Case 4 1.02 0.07
Case 5 0.36 0.10
Case 6 0.47 0.08
Case 7 0.43 0.07
All seven speech signals are classified as voiced speech
waveforms.
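A threshold rule of the following kind would reproduce this classification. The paper does not report its decision thresholds, so `ENERGY_MIN` and `ZCR_MAX` below are hypothetical values chosen only for illustration. The energy and ZCR figures are copied from Table 2.

```python
# Energy and ZCR values copied from Table 2.
TABLE_2 = {
    "Case 1": (0.60, 0.07),
    "Case 2": (0.36, 0.10),
    "Case 3": (0.68, 0.07),
    "Case 4": (1.02, 0.07),
    "Case 5": (0.36, 0.10),
    "Case 6": (0.47, 0.08),
    "Case 7": (0.43, 0.07),
}

# Hypothetical thresholds: the paper does not state the values it used.
ENERGY_MIN = 0.10
ZCR_MAX = 0.25


def classify(energy, zcr, energy_min=ENERGY_MIN, zcr_max=ZCR_MAX):
    """Label a segment 'voiced' when its energy is high and its ZCR is low."""
    return "voiced" if energy >= energy_min and zcr <= zcr_max else "unvoiced"


labels = {case: classify(e, z) for case, (e, z) in TABLE_2.items()}
print(labels)
```

With these assumed thresholds every case in Table 2 is labelled voiced. A segment with low energy and a high zero-crossing rate would be labelled unvoiced.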
The energy plot and zero-crossing plot for Case 1 are shown
in the figures below.
For Case 1:
Fig 1 Energy Plot of Waveform for Case 1
Fig 2. ZCR Plot of Waveform for Case 1
Similar energy and zero-crossing plots were obtained for the
remaining cases.
2.1 Cepstral Mean Normalization
All cepstral features were mean normalized, and the
normalization was performed over the full utterance.
The signal-to-noise ratio (SNR), which measures signal
strength relative to background noise, is computed for the
original voiced speech segment. CMN is then applied and the
SNR is computed again. The results are tabulated in Table 3
and plotted in Chart 1.
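The two operations in this step can be sketched as follows. The paper does not state how its SNR values were estimated. The `snr_db` function below assumes a clean reference signal is available and treats the difference between noisy and clean samples as noise, which may differ from the authors' procedure. The function names and the list-of-lists layout for cepstral vectors are also assumptions.

```python
import math


def snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


def cepstral_mean_normalize(cepstra):
    """Subtract the per-coefficient utterance mean from every frame."""
    n_frames = len(cepstra)
    n_coeffs = len(cepstra[0])
    mean = [sum(frame[k] for frame in cepstra) / n_frames for k in range(n_coeffs)]
    return [[frame[k] - mean[k] for k in range(n_coeffs)] for frame in cepstra]


# Two frames of two cepstral coefficients each (illustrative values).
cepstra = [[1.0, 2.0], [3.0, 6.0]]
# The same frames with a constant offset, standing in for a channel effect.
shifted = [[c + 0.5 for c in frame] for frame in cepstra]

print(cepstral_mean_normalize(cepstra))
print(cepstral_mean_normalize(shifted))
print(snr_db([1.0, 1.0, 1.0, 1.0], [1.1, 0.9, 1.1, 0.9]))
```

Both calls to `cepstral_mean_normalize` print the same vectors, showing that a constant cepstral offset disappears after mean subtraction. Note that `snr_db` is undefined when the noisy and clean signals are identical, since the noise power is then zero.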
Table -3: Results of CMN (SNR in dB)
Signal   Original   CMN       Improvement
Case 1   2.4565     33.6335   31.177
Case 2   0.7504     25.1226   24.3722
Case 3   7.1191     35.2533   28.1342
Case 4   6.7261     45.1343   38.4069
Case 5   9.4415     37.9693   28.5278
Case 6   1.2681     27.7092   26.4411
Case 7   3.4327     23.2372   19.8045
Chart -1: Comparison of SNR before & after CMN
2.2 Spectral Subtraction
Spectral subtraction estimates the clean speech spectrum by
subtracting the estimated additive noise spectrum from the
noisy speech spectrum. The SNR of the original noisy voiced
speech segment is computed, spectral subtraction is applied,
and the SNR is computed again. The results are tabulated in
Table 4 and depicted in Chart 2.
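The subtraction step can be sketched on magnitude spectra as follows. This is a minimal illustration that assumes the magnitude spectrum of each frame has already been computed and that a voice activity detector has flagged the noise-only frames. The spectral floor parameter `floor` and both function names are assumptions introduced here, not details from the paper.

```python
def estimate_noise(frames, is_noise):
    """Average the magnitude spectra of the frames flagged as noise-only."""
    noise_frames = [f for f, flag in zip(frames, is_noise) if flag]
    n_bins = len(frames[0])
    return [sum(f[k] for f in noise_frames) / len(noise_frames) for k in range(n_bins)]


def spectral_subtract(noisy_mag, noise_mag, floor=0.01):
    """Subtract the noise magnitude in each bin, clamped to a small floor."""
    return [max(y - n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


# Three frames of three frequency bins; the first two are noise-only.
frames = [[0.2, 0.1, 0.3], [0.2, 0.3, 0.1], [1.0, 0.2, 2.0]]
is_noise = [True, True, False]

noise_mag = estimate_noise(frames, is_noise)
print(spectral_subtract(frames[2], noise_mag))
```

The floor keeps the result from going negative in bins where the noise estimate exceeds the observed magnitude. In a full system the enhanced magnitudes would be combined with the noisy phase and transformed back to the time domain.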
Table -4: Results of Spectral Subtraction (SNR in dB)
Signal   Original   Spectral Subtraction   Improvement
Case 1   2.4565     14.5997                12.1432
Case 2   0.7504     6.7826                 6.0322
Case 3   7.1191     21.6013                14.4822
Case 4   6.7261     24.5745                17.8484
Case 5   9.4415     31.2225                21.781
Case 6   1.268157   5.8731                 4.5921
Case 7   3.4327     11.7850                8.3523
Chart -2: Comparison of SNR before & after SS
The improvement in SNR values indicates that the additive
noise in the speech signal has been reduced.
2.3 Wiener Filter
Wiener filtering is a basic approach to reducing noise in
a signal. The signal-to-noise ratio is estimated before and
after filtering. Table 5 and Chart 3 show the results of
Wiener filtering.
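The decision-directed Wiener gain summarized in Table 1 can be sketched for a single frequency bin as follows. This is an illustration under stated assumptions, not the authors' implementation: the noise power is taken as known and constant, the smoothing factor `alpha` defaults to a hypothetical 0.98, and the first frame falls back to the instantaneous estimate. It uses the relation SNRpost = SNRprior + 1 from Table 1 and the Wiener gain G = SNRprior / (1 + SNRprior).

```python
def wiener_decision_directed(noisy_power, noise_power, alpha=0.98):
    """Per-frame Wiener gains for one frequency bin.

    noisy_power: power of the noisy signal in this bin, one value per frame
    noise_power: estimated noise power in this bin (assumed constant)
    alpha: smoothing factor between past and current estimates (hypothetical default)
    """
    gains = []
    prev_clean_power = None
    for y_pow in noisy_power:
        snr_post = y_pow / noise_power
        # SNRpost = SNRprior + 1, so the instantaneous prior is SNRpost - 1.
        snr_inst = max(snr_post - 1.0, 0.0)
        if prev_clean_power is None:
            snr_prior = snr_inst
        else:
            snr_prior = (alpha * prev_clean_power / noise_power
                         + (1.0 - alpha) * snr_inst)
        gain = snr_prior / (1.0 + snr_prior)
        gains.append(gain)
        # Power of the enhanced signal, reused as the next frame's history.
        prev_clean_power = (gain ** 2) * y_pow
    return gains


# Two noise-level frames followed by one frame with strong signal energy.
print(wiener_decision_directed([1.0, 1.0, 9.0], noise_power=1.0))
```

The gain is zero while the bin holds only noise and rises when signal energy appears. A large `alpha` makes the a priori SNR follow the previous frame closely, which smooths the gain over time at the cost of a slower response to speech onsets.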
Table -5: Results of Wiener (SNR in dB)
Signal   Original   Wiener    Improvement
Case 1   2.4565     24.8676   22.4111
Case 2   0.7504     6.0803    5.3299
Case 3   7.1191     23.6743   16.5552
Case 4   6.7261     23.6149   16.8888
Case 5   9.4415     23.2383   14.3616
Case 6   1.268157   12.1107   10.8425
Case 7   3.4327     16.6735   13.4208
Chart -3: Comparison of SNR before & after Wiener
The computed SNR values and the comparative charts show an
improvement in SNR in every case, which indicates that
noise in the signal was reduced by each of the channel
normalization techniques.
3. ANALYSIS
Table 3 and Chart 1 summarize the results obtained with
cepstral mean normalization. The improvement in SNR ranges
from about 19.8 dB to 38.4 dB, which shows that CMN
significantly enhances the signal. The evaluation was
repeated with spectral subtraction and the Wiener filter.
For spectral subtraction the SNR improvement lies between
approximately 4.6 dB and 21.8 dB, as can be observed from
Table 4 and Chart 2, while for the Wiener filter it lies
between approximately 5.3 dB and 22.4 dB, as can be seen
from Table 5 and Chart 3.
A comparative analysis of these techniques is depicted in
Chart 4.
Chart -4: Improvements in SNR for CMN, Spectral
Subtraction and Wiener
From Chart 4 one can observe that the results obtained for
spectral subtraction and the Wiener filter are broadly
similar, and that CMN gives the better result.
Cepstral mean normalization enhances the original signal
by approximately 31%, spectral subtraction by 13% and the
Wiener filter by 15%. These figures indicate that cepstral
mean normalization is more effective at reducing the impact
of noise than the other two algorithms.
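The comparison can be checked directly against the Improvement columns of Tables 3, 4 and 5. The sketch below computes the minimum, maximum and arithmetic mean of each column. The arithmetic mean is only one possible summary and is not necessarily how the percentages quoted above were derived.

```python
# Improvement values (dB) copied from Tables 3, 4 and 5, Case 1 to Case 7.
IMPROVEMENT_DB = {
    "CMN": [31.177, 24.3722, 28.1342, 38.4069, 28.5278, 26.4411, 19.8045],
    "Spectral subtraction": [12.1432, 6.0322, 14.4822, 17.8484, 21.781, 4.5921, 8.3523],
    "Wiener": [22.4111, 5.3299, 16.5552, 16.8888, 14.3616, 10.8425, 13.4208],
}


def summarize(values):
    """Return (minimum, maximum, arithmetic mean) of a list of values."""
    return min(values), max(values), sum(values) / len(values)


for name, values in IMPROVEMENT_DB.items():
    low, high, mean = summarize(values)
    print(f"{name}: min {low:.2f} dB, max {high:.2f} dB, mean {mean:.2f} dB")
```

The mean improvements come out at roughly 28.1 dB for CMN, 12.2 dB for spectral subtraction and 14.3 dB for the Wiener filter, which matches the ordering reported in this section.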
4. CONCLUSION
The effectiveness of different channel normalization
techniques has been evaluated and the results have been
summarized. The results show that cepstral mean
normalization reduces distortion and proved to be the most
effective of the techniques compared.
REFERENCES
[1]. Stern, Richard M., Bhiksha Raj, and Pedro J. Moreno.
"Compensation for environmental degradation in automatic
speech recognition." Robust Speech Recognition for
Unknown Communication Channels. 1997.
[2]. Garner, Philip N. "Cepstral normalization and the signal
to noise ratio spectrum in automatic speech recognition."
Speech Communication 53.8 (2011): 991-1001.
[3]. http://recognize-speech.com/preprocessing/10-
preprocessing.
[4]. N.S. Mahlanyane and D.J. Mashao, "Channel
Normalization for GSM Speech Recognition", SATNAC
conference paper, UCT, 2003.
[5]. J.S. Lim, A.V. Oppenheim, "Enhancement and
bandwidth compression of noisy speech", Proc. IEEE, vol.
67, pp. 1586-1604, December 1979.
[6]. S. Boll, “Suppression of Acoustic Noise in Speech using
Spectral Subtraction,” IEEE Trans. Acoust., Speech, Signal
Process., vol.27, pp. 113-120, Apr. 1979.
[7]. http://sound.eti.pg.gda.pl/denoise/noise.html
[8]. Kim, Chanwoo. Signal processing for robust speech
recognition motivated by auditory processing. Diss. Johns
Hopkins University, 2010.
[9]. Scalart, Pascal. "Speech enhancement based on a priori
signal to noise estimation." Acoustics, Speech, and Signal
Processing, 1996. ICASSP-96. Conference Proceedings.,
1996 IEEE International Conference on. Vol. 2. IEEE, 1996.
[10]. www.mathworks.com
[11]. Md. Jahangir Alam, Md. Faqrul Alam Chowdhury,
Md. Fasiul Alam, "Comparative Study of a Priori
Signal-to-Noise Ratio (SNR) Estimation Approaches for Speech
Enhancement", Journal of Electrical and Electronics
Engineering, 2009, vol. 1.
[12]. Rupa Patel, Urmila Shrawankar, Dr. V.M. Thakare,
"Hiding Speaker Characteristics for Security", ICCCNT'12,
26th-28th July 2012, Coimbatore.
[13]. L. R. Rabiner, R.W. Schafer, "Digital Processing of
Speech Signals", Prentice-Hall, Inc., Reprint 2009.
[14]. Wei Li et al., "Voice-Based Recognition System for
Non-Semantics Information by Language and Gender",
Third International Symposium on Electronic Commerce
and Security, 2010.
[15]. http://en.wikipedia.org/wiki/Audio_normalization#Loudness_normalization
[16]. Bachu, R. G., et al. "Separation of Voiced and
Unvoiced using Zero crossing rate and Energy of the Speech
Signal." American Society for Engineering Education
(ASEE) Zone Conference Proceedings. 2008.

More Related Content

PDF
Paper id 28201448
PDF
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
PDF
International Journal of Engineering Research and Development (IJERD)
PDF
Speech enhancement using spectral subtraction technique with minimized cross ...
PPTX
Final ppt
PDF
Paper id 252014135
PDF
Speech Enhancement Based on Spectral Subtraction Involving Magnitude and Phas...
PDF
Multirate signal processing and decimation interpolation
Paper id 28201448
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
International Journal of Engineering Research and Development (IJERD)
Speech enhancement using spectral subtraction technique with minimized cross ...
Final ppt
Paper id 252014135
Speech Enhancement Based on Spectral Subtraction Involving Magnitude and Phas...
Multirate signal processing and decimation interpolation

What's hot (19)

PDF
Improvement of minimum tracking in Minimum Statistics noise estimation method
PDF
Broad phoneme classification using signal based features
PDF
Noise reduction in speech processing using improved active noise control (anc...
PDF
Noise reduction in speech processing using improved active noise control (anc...
PDF
Design and implementation of different audio restoration techniques for audio...
PDF
METHOD FOR REDUCING OF NOISE BY IMPROVING SIGNAL-TO-NOISE-RATIO IN WIRELESS LAN
PDF
Performance enhancement of dct based speaker recognition using wavelet de noi...
PDF
Speech signal analysis for linear filter banks of different orders
PDF
Lr3620092012
PDF
Adaptive noise estimation algorithm for speech enhancement
PDF
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ...
PDF
G010424248
PDF
Performance Analysis of Fractional Sample Rate Converter Using Audio Applicat...
PDF
Speech Compression Using Wavelets
PDF
Performance Analysis of Acoustic Echo Cancellation Techniques
PDF
IRJET- Pitch Detection Algorithms in Time Domain
PDF
Audio/Speech Signal Analysis for Depression
PDF
Compressive Sensing in Speech from LPC using Gradient Projection for Sparse R...
PDF
D011132635
Improvement of minimum tracking in Minimum Statistics noise estimation method
Broad phoneme classification using signal based features
Noise reduction in speech processing using improved active noise control (anc...
Noise reduction in speech processing using improved active noise control (anc...
Design and implementation of different audio restoration techniques for audio...
METHOD FOR REDUCING OF NOISE BY IMPROVING SIGNAL-TO-NOISE-RATIO IN WIRELESS LAN
Performance enhancement of dct based speaker recognition using wavelet de noi...
Speech signal analysis for linear filter banks of different orders
Lr3620092012
Adaptive noise estimation algorithm for speech enhancement
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ...
G010424248
Performance Analysis of Fractional Sample Rate Converter Using Audio Applicat...
Speech Compression Using Wavelets
Performance Analysis of Acoustic Echo Cancellation Techniques
IRJET- Pitch Detection Algorithms in Time Domain
Audio/Speech Signal Analysis for Depression
Compressive Sensing in Speech from LPC using Gradient Projection for Sparse R...
D011132635
Ad

Similar to Comparative performance analysis of channel normalization techniques (20)

PDF
01 8445 speech enhancement
PDF
F010334548
PDF
A novel speech enhancement technique
PDF
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
PDF
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
PDF
An effective evaluation study of objective measures using spectral subtractiv...
PDF
ROBUST FEATURE EXTRACTION USING AUTOCORRELATION DOMAIN FOR NOISY SPEECH RECOG...
PDF
Adaptive wavelet thresholding with robust hybrid features for text-independe...
PDF
PDF
PDF
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
PDF
Research: Applying Various DSP-Related Techniques for Robust Recognition of A...
PDF
An Effective Approach for Chinese Speech Recognition on Small Size of Vocabulary
PDF
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
PDF
Robust Speech Recognition Technique using Mat lab
PDF
IRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
PPTX
Icmmse slides
PDF
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
PDF
V041203124126
PDF
Investigations on the role of analysis window shape parameter in speech enhan...
01 8445 speech enhancement
F010334548
A novel speech enhancement technique
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea...
An effective evaluation study of objective measures using spectral subtractiv...
ROBUST FEATURE EXTRACTION USING AUTOCORRELATION DOMAIN FOR NOISY SPEECH RECOG...
Adaptive wavelet thresholding with robust hybrid features for text-independe...
A GAUSSIAN MIXTURE MODEL BASED SPEECH RECOGNITION SYSTEM USING MATLAB
Research: Applying Various DSP-Related Techniques for Robust Recognition of A...
An Effective Approach for Chinese Speech Recognition on Small Size of Vocabulary
Investigation of-combined-use-of-mfcc-and-lpc-features-in-speech-recognition-...
Robust Speech Recognition Technique using Mat lab
IRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
Icmmse slides
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
V041203124126
Investigations on the role of analysis window shape parameter in speech enhan...
Ad

More from eSAT Journals (20)

PDF
Mechanical properties of hybrid fiber reinforced concrete for pavements
PDF
Material management in construction – a case study
PDF
Managing drought short term strategies in semi arid regions a case study
PDF
Life cycle cost analysis of overlay for an urban road in bangalore
PDF
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materials
PDF
Laboratory investigation of expansive soil stabilized with natural inorganic ...
PDF
Influence of reinforcement on the behavior of hollow concrete block masonry p...
PDF
Influence of compaction energy on soil stabilized with chemical stabilizer
PDF
Geographical information system (gis) for water resources management
PDF
Forest type mapping of bidar forest division, karnataka using geoinformatics ...
PDF
Factors influencing compressive strength of geopolymer concrete
PDF
Experimental investigation on circular hollow steel columns in filled with li...
PDF
Experimental behavior of circular hsscfrc filled steel tubular columns under ...
PDF
Evaluation of punching shear in flat slabs
PDF
Evaluation of performance of intake tower dam for recent earthquake in india
PDF
Evaluation of operational efficiency of urban road network using travel time ...
PDF
Estimation of surface runoff in nallur amanikere watershed using scs cn method
PDF
Estimation of morphometric parameters and runoff using rs & gis techniques
PDF
Effect of variation of plastic hinge length on the results of non linear anal...
PDF
Effect of use of recycled materials on indirect tensile strength of asphalt c...
Mechanical properties of hybrid fiber reinforced concrete for pavements
Material management in construction – a case study
Managing drought short term strategies in semi arid regions a case study
Life cycle cost analysis of overlay for an urban road in bangalore
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materials
Laboratory investigation of expansive soil stabilized with natural inorganic ...
Influence of reinforcement on the behavior of hollow concrete block masonry p...
Influence of compaction energy on soil stabilized with chemical stabilizer
Geographical information system (gis) for water resources management
Forest type mapping of bidar forest division, karnataka using geoinformatics ...
Factors influencing compressive strength of geopolymer concrete
Experimental investigation on circular hollow steel columns in filled with li...
Experimental behavior of circular hsscfrc filled steel tubular columns under ...
Evaluation of punching shear in flat slabs
Evaluation of performance of intake tower dam for recent earthquake in india
Evaluation of operational efficiency of urban road network using travel time ...
Estimation of surface runoff in nallur amanikere watershed using scs cn method
Estimation of morphometric parameters and runoff using rs & gis techniques
Effect of variation of plastic hinge length on the results of non linear anal...
Effect of use of recycled materials on indirect tensile strength of asphalt c...

Recently uploaded (20)

PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
bas. eng. economics group 4 presentation 1.pptx
PPTX
Welding lecture in detail for understanding
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PDF
PPT on Performance Review to get promotions
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PPTX
Construction Project Organization Group 2.pptx
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
Digital Logic Computer Design lecture notes
PPTX
UNIT 4 Total Quality Management .pptx
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPT
Mechanical Engineering MATERIALS Selection
PDF
composite construction of structures.pdf
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
bas. eng. economics group 4 presentation 1.pptx
Welding lecture in detail for understanding
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PPT on Performance Review to get promotions
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
Construction Project Organization Group 2.pptx
CYBER-CRIMES AND SECURITY A guide to understanding
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
Digital Logic Computer Design lecture notes
UNIT 4 Total Quality Management .pptx
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
Mechanical Engineering MATERIALS Selection
composite construction of structures.pdf
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...

Comparative performance analysis of channel normalization techniques

  • 1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 04 Issue: 05 | May-2015, Available @ http://www.ijret.org 361 COMPARATIVE PERFORMANCE ANALYSIS OF CHANNEL NORMALIZATION TECHNIQUES Rupa Patel1 , Aparna Gurjar2 1 Asst. Professor, Department of Computer Application, Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India 2 Asst. Professor, Department of Computer Application, Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India Abstract A major part of the interaction between humans takes place via speech communication. The speech signal carries both useful and unwanted information. Processing of such signals involve enhancing the useful information. The intelligibility of speech signals is significantly reduced due to the presence of unwanted information such as noise. Channel normalization algorithms suppress such additive noise introduced in the speech signals by transmission channel or by recording environment conditions. Enhancing the quality and intelligibility of speech signals improve the performance of speech systems such as Automatic speech recognition (ASR) , voice communication and hearing aids to name the few. Based on the experimental results the comparative analysis of channel normalization techniques have been presented in this paper to find out the most suitable algorithm for enhancing the speech signals. Keywords: Cepstral Mean Normalization, Spectral Subtraction, Weiner filter, Signal to Noise Ratio --------------------------------------------------------------------***---------------------------------------------------------------------- 1. INTRODUCTION Automatic Speech Recognition (ASR) is a technology that transforms human speech to a symbolic representation. 
Recognizer performs the transformation with the goal that it can handle spontaneous speech from any speaker in any environment. The speech waveform produced by a speaker is transmitted over some channel before it reaches the recording device, and the channel disturbs the original speech signal. Most of the channel normalization techniques either deal with channel transfer characteristics or additive noise. Study reveals that different channel normalization techniques have been developed to minimize the effects of channel noises in general on speech recognition systems. This paper focuses on three channel normalization techniques Cepstral Mean Normalization, Spectral Subtraction and Weiner filtering for reduction of noise in speech signals. The techniques are summarized in table 1. Table -1: Channel Normalization Techniques Cepstral Mean Normalization (CMN)  Noise compensation technique.  Also refereed as Channel Mean Normalization.  Reduce distortion caused by transmission channel.  Applied to cepstral data.  Considered as a high pass filtering process [1].  Based on the observation that a linear channel distortion becomes a constant offset in the cepstral domain [2].  Cepstral mean value is calculated across the whole speech utterance (combination of cepstral vectors), this calculated cepstral mean is subtracted from each frame single cepstral vector. [3]. Spectral Subtraction (SS)  Voice activity detector first detects whether the frame is noisy or speech frame. Noise spectrum is obtained for noisy frame. Clean speech spectrum is obtained by subtracting noise spectrum from corrupted speech spectrum so that signal-to-noise ratio (SNR) is improved [4, 5, 6, 7, 8]. Weiner Filter  Adaptive filter [3].  Useful for additive noise reduction and signal restoration. [3].  Based on tracking a priori SNR using Decision-Directed method, proposed by Scalart et al 96 [9 ]. In this method it is assumed that SNRpost=SNRprior +1[10]. 
 Noise present in the signal is reduced by comparison with an estimate of the desired clean signal with that of noisy speech signals. [11,12]. This estimate is obtained by minimizing the Mean Square Error (MSE) between the desired signal and the estimated signal[11].
  • 2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 04 Issue: 05 | May-2015, Available @ http://www.ijret.org 362 2. EXPERIMENTAL RESULTS The setup to analyze the various algorithms to determine the suitability of the techniques in speech systems has been implemented in MATLAB. Results of seven samples have been discussed here. The seven speech signals are represented as case 1 to case 7 respectively. Testing data is with additive noise. Preprocessing Step The speech signals are analyzed in short time segments, referred as analysis frames [13]. Processing of only voiced speech signals is important. The input signals must be first classified as voiced or unvoiced. Voice activity detection (VAD) algorithm detects the presence of human speech in the signal In VAD algorithms, the parameters used for speech detection are based for voiced /unvoiced classification zero-crossing rate and short time energy [14]. Zero-crossing analysis is a simple kind of voice time- domain analysis. Considering audio data as discrete signals Zero crossing is said to occur if successive samples have different algebraic signs. The rate at which zero crossings occur is the measure of the frequency content of a signal. Zero-crossing rate is the number of times that the sample changes the symbols. Zero-crossing rate is a measure of number of times in a given time interval/frame that the amplitude of the speech signals passes through a value of zero. The zero-crossing rate is one of the useful parameter for estimating whether speech is voiced or unvoiced [15]. Energy provides a representation that reflects the amplitude variations. The amplitude of the speech signal varies with time. The amplitude of unvoiced speech segments is much lower than the amplitude of voiced segments. 
The energy of the speech signal provides a representation that reflects these amplitude variations [16]. To classify input speech signals as voiced or unvoiced signals Voice activity detection algorithm is applied to each of the case samples. The results of pre-processing step are recorded in table 2. Table -2: Results of Voice Activity Detector Signals Energy ZCR Case 1 0.60 0.07 Case 2 0.36 0.10 Case 3 0.68 0.07 Case 4 1.02 0.07 Case 5 0.36 0.10 Case 6 0.47 0.08 Case 7 0.43 0.07 All the seven speech signals are classified as voiced speech waveform. The energy plot and zero crossing plot for case1 is shown in figure below. For Case 1: Fig 1 Energy Plot of Waveform for Case 1 Fig 2. ZCR Plot of Waveform for Case 1 Almost similar energy plot and zero crossing plot for the remaining cases have been obtained. 2.1 Cepstral Mean Normalization All Cepstral features were mean normalized and normalization scheme were performed on the full utterance. Signal to noise ratio (SNR), which is the measure of signal strength relative to background noise of original voiced speech segment is computed. CMN is applied to it and again SNR is computed. Results are tabulated in table 3 and also in comparative chart 1. Table -3: Results of CMN Signal SNR(db) Original CMN Improvement Case 1 2.4565 33.6335 31.177 Case 2 0.7504 25.1226 24.3722 Case 3 7.1191 35.2533 28.1342 Case 4 6.7261 45.1343 38.4069 Case 5 9.4415 37.9693 28.5278 Case 6 1.2681 27.7092 26.4411 Case 7 3.4327 23.2372 19.8045
Chart -1: Comparison of SNR before & after CMN

2.2 Spectral Subtraction
The SNR of the original noisy voiced speech segment is computed. Spectral subtraction is then applied and the SNR is computed again. The results are tabulated in Table 4 and depicted in Chart 2. Spectral subtraction estimates the clean speech spectrum by subtracting the estimated additive noise spectrum from the noisy speech spectrum.

Table -4: Results of Spectral Subtraction
Signal   Original SNR (dB)   SNR after Spectral Subtraction (dB)   Improvement (dB)
Case 1   2.4565              14.5997                               12.1432
Case 2   0.7504              6.7826                                6.0322
Case 3   7.1191              21.6013                               14.4822
Case 4   6.7261              24.5745                               17.8484
Case 5   9.4415              31.2225                               21.781
Case 6   1.268157            5.8731                                4.5921
Case 7   3.4327              11.7850                               8.3523

Chart -2: Comparison of SNR before & after SS
The improvement in the SNR values indicates that the additive noise in the speech signal has been reduced.

2.3 Wiener Filter
Wiener filtering is the most basic approach used for reducing noise in a signal. The signal-to-noise ratio is estimated before and after filtering the signal. Table 5 and Chart 3 show the results of Wiener filtering.

Table -5: Results of Wiener
Signal   Original SNR (dB)   SNR after Wiener (dB)   Improvement (dB)
Case 1   2.4565              24.8676                 22.4111
Case 2   0.7504              6.0803                  5.3299
Case 3   7.1191              23.6743                 16.5552
Case 4   6.7261              23.6149                 16.8888
Case 5   9.4415              23.2383                 14.3616
Case 6   1.268157            12.1107                 10.8425
Case 7   3.4327              16.6735                 13.4208

Chart -3: Comparison of SNR before & after Wiener
From the computed SNR values and the comparative charts one can observe an improvement in SNR, which indicates that the noise in the signal has been reduced after applying the channel normalization techniques.

3. ANALYSIS
Table 3 and Chart 1 summarize the results obtained using the cepstral mean normalization technique. The improvement in SNR lies between roughly 20 dB and 38 dB (Table 3), which shows that the CMN algorithm significantly enhances the signal. The evaluation is repeated with spectral subtraction and the Wiener filter. The SNR improvement for spectral subtraction lies approximately between 4 dB and 21 dB, as can be observed from Table 4 and Chart 2, while for the Wiener filter it lies between roughly 5 dB and 22 dB, as can be seen from Table 5 and Chart 3.
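The spread of the improvements can be recomputed directly from the Improvement columns of Tables 3 to 5. The sketch below is only a convenience for checking those figures: the names are hypothetical, the values are copied from the tables as printed, and the arithmetic mean is an assumed summary statistic rather than one the paper states it used.

```python
# Improvement columns (dB) copied from Tables 3, 4 and 5, Case 1 to Case 7.
IMPROVEMENT_DB = {
    "CMN": [31.177, 24.3722, 28.1342, 38.4069, 28.5278, 26.4411, 19.8045],
    "Spectral subtraction": [12.1432, 6.0322, 14.4822, 17.8484, 21.781, 4.5921, 8.3523],
    "Wiener": [22.4111, 5.3299, 16.5552, 16.8888, 14.3616, 10.8425, 13.4208],
}


def summarize(values):
    """Return (minimum, maximum, arithmetic mean) of a list of dB values."""
    return min(values), max(values), sum(values) / len(values)


def best_technique(table):
    """Name of the technique with the highest mean improvement."""
    return max(table, key=lambda name: summarize(table[name])[2])


if __name__ == "__main__":
    for name, values in IMPROVEMENT_DB.items():
        low, high, mean = summarize(values)
        print(f"{name}: min {low:.2f} dB, max {high:.2f} dB, mean {mean:.2f} dB")
    print("Highest mean improvement:", best_technique(IMPROVEMENT_DB))
```

Keeping the table values in one dictionary makes it easy to add further techniques or cases and to compare them on the same footing.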
Comparative analysis of these techniques is depicted in Chart 4.
Chart -4: Improvements in SNR for CMN, Spectral Subtraction and Wiener
From Chart 4 one can observe that the results obtained for spectral subtraction and the Wiener filter are broadly similar, and that CMN gives the better result. Cepstral mean normalization enhances the original signal by approximately 31%, spectral subtraction by 13% and the Wiener filter by 15%. This indicates that the cepstral mean normalization technique is more effective in reducing the impact of noise than the other two algorithms.

4. CONCLUSION
The effectiveness of different normalization techniques has been evaluated and the results obtained have been summarized. From the results it is clear that the channel normalization technique cepstral mean normalization reduces distortion and proves to be effective.

REFERENCES
[1]. Stern, Richard M., Bhiksha Raj, and Pedro J. Moreno. "Compensation for environmental degradation in automatic speech recognition." Robust Speech Recognition for Unknown Communication Channels. 1997.
[2]. Garner, Philip N. "Cepstral normalization and the signal to noise ratio spectrum in automatic speech recognition." Speech Communication 53.8 (2011): 991-1001.
[3]. http://recognize-speech.com/preprocessing/10-preprocessing
[4]. N.S. Mahlanyane and D.J. Mashao, "Channel Normalization for GSM Speech Recognition", SATNAC conference paper, UCT, 2003.
[5]. J.S. Lim, A.V. Oppenheim, "Enhancement and bandwidth compression of noisy speech", Proc. IEEE, vol. 67, pp. 1586-1604, December 1979.
[6]. S. Boll, "Suppression of Acoustic Noise in Speech using Spectral Subtraction", IEEE Trans. Acoust., Speech, Signal Process., vol. 27, pp. 113-120, Apr. 1979.
[7]. http://sound.eti.pg.gda.pl/denoise/noise.html
[8]. Kim, Chanwoo. Signal processing for robust speech recognition motivated by auditory processing. Diss. Johns Hopkins University, 2010.
[9]. Scalart, Pascal. "Speech enhancement based on a priori signal to noise estimation." Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on. Vol. 2. IEEE, 1996.
[10]. www.mathworks.com
[11]. Md. Jahangir Alam, Md. Faqrul Alam Chowdhury, Md. Fasiul Alam, "Comparative Study of a Priori Signal-To-Noise Ratio (SNR) Estimation Approaches for Speech Enhancement", Journal of electrical and electronics engineering, 2009, vol. 1.
[12]. Rupa Patel, Urmila Shrawankar, Dr. V.M. Thakare, "Hiding speaker Characteristics for Security", ICCCNT'12, 26th-28th July 2012, Coimbatore.
[13]. L. R. Rabiner, R.W. Schafer, "Digital processing of speech signals", Prentice-Hall, Inc., Reprint 2009.
[14]. Wei Li et al., "Voice-Based Recognition System for Non-Semantics Information by Language and Gender", Third International Symposium on Electronic Commerce and Security, 2010.
[15]. http://en.wikipedia.org/wiki/Audio_normalization#Loudness_normalization
[16]. Bachu, R. G., et al. "Separation of Voiced and Unvoiced using Zero crossing rate and Energy of the Speech Signal." American Society for Engineering Education (ASEE) Zone Conference Proceedings. 2008.