Müller, L. and Psutka Josef V.: Building robust PLP-based acoustic module for ASR applications. SPECOM 2005 proceedings, p. 761-764, Moscow State Linguistic University, Moscow, 2005.


This paper summarizes our experience with the PLP parameterization used in the front-end modules of continuous speech recognition systems working with telephone or high-quality speech. This experience is supported by the results of many experiments, which became the basis for tuning our systems in real applications. We discuss the impact of the number of PLP filters and of the number of computed PLP-cepstral coefficients on the word error rate (WER), and compare a monophone-based HMM structure with a growing number of Gaussians against a triphone-based structure. The paper also examines how the number of speakers whose voices were used for training the HMMs influences the WER. All front-end settings, for both telephone and high-quality speech, were tested on tasks with the same perplexity so that the individual results can be compared more reliably.

speech recognition, feature extraction, PLP parameterization


