The document investigates how dynamic time warping (DTW) and hidden Markov model (HMM) parameters affect speech signal alignment at the phrase, word, and phoneme levels, using recorded Hindi speech samples from six speakers. Results show that HMM-based alignment yields a lower Mahalanobis distance, and therefore better alignment accuracy, than alignment based on mel-frequency cepstral coefficients (MFCC) alone. The study concludes that HMM-based alignment is more effective than MFCC alone and that effective speech alignment can be achieved at the phrase level.
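
As a rough illustration of the kind of alignment described above, the following sketch runs a textbook DTW over two sequences of feature frames and uses the Mahalanobis distance as the local cost. It is not the study's procedure, which this summary does not specify. The function names `mahalanobis` and `dtw_align`, the frame layout (one row per frame, for example MFCC vectors), and the caller-supplied inverse covariance matrix `inv_cov` are all assumptions made for the example.

```python
import numpy as np


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance between two feature vectors (hypothetical helper)."""
    d = x - y
    return float(np.sqrt(d @ inv_cov @ d))


def dtw_align(a, b, inv_cov):
    """Align frame sequences a (n x k) and b (m x k) with basic DTW.

    Returns the accumulated cost and the warping path as a list of
    (index_in_a, index_in_b) pairs. Assumes both sequences are non-empty.
    """
    n, m = len(a), len(b)
    # cost[i, j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = mahalanobis(a[i - 1], b[j - 1], inv_cov)
            cost[i, j] = local + min(
                cost[i - 1, j - 1],  # advance in both sequences
                cost[i - 1, j],      # advance in a only
                cost[i, j - 1],      # advance in b only
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return float(cost[n, m]), path
```

Calling `dtw_align(a, b, np.eye(k))` with an identity matrix reduces the local cost to the Euclidean distance. In practice `inv_cov` would be estimated from the feature data, and a lower accumulated cost would indicate a closer alignment between the two utterances.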