This document gives an overview of speech recognition algorithms and models. It describes preprocessing techniques such as pre-emphasis and mel-frequency cepstral coefficient (MFCC) extraction. The modeling methods covered include Gaussian mixture models (GMM), hidden Markov models (HMM), vector quantization using the Linde-Buzo-Gray (LBG) and k-means algorithms, and distance measures for pattern matching. It also summarizes the physiology of speech production and the representation of speech signals in the time and frequency domains. The aim is a comprehensive survey of this multidisciplinary field, from digital signal processing (DSP) techniques to applications.