Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition.
Cepstral normalisation in automatic speech recognition is investigated in the context of robustness to additive noise. It is argued that such normalisation leads naturally to a speech feature based on signal-to-noise ratio (SNR) rather than absolute energy (or power). Explicit calculation of this SNR-cepstrum by means of a noise estimate is shown to have theoretical and practical advantages over the usual energy-based cepstrum. The SNR-cepstrum is shown to be almost identical to the articulation index known in psycho-acoustics. Combination of the SNR-cepstrum with the well-known perceptual linear prediction method is shown to be beneficial in noisy environments.
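The contrast between an energy-based cepstrum and an SNR-based one can be illustrated with a minimal sketch. The abstract does not give the report's formulas, so everything here is an assumption made for illustration: the function names, the floored spectral-subtraction SNR estimate, the log(1 + SNR) compression, and the unnormalised DCT-II are not taken from the report.

```python
import math


def dct_ii(values):
    """Unnormalised DCT-II, a common transform from log spectrum to cepstrum."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def energy_cepstrum(power):
    """Cepstrum of the log power spectrum (the usual energy-based feature)."""
    return dct_ii([math.log(p) for p in power])


def snr_cepstrum(power, noise):
    """Hypothetical SNR-cepstrum: DCT of log(1 + SNR) per frequency bin.

    Assumption: SNR is estimated by spectral subtraction, floored at zero.
    """
    snr = [max(p / n - 1.0, 0.0) for p, n in zip(power, noise)]
    return dct_ii([math.log1p(s) for s in snr])


# Toy four-bin power spectrum and noise estimate (made-up values).
power = [4.0, 9.0, 2.5, 1.2]
noise = [1.0, 1.5, 2.0, 1.0]

# Apply the same gain to the signal and to the noise estimate.
gain = 4.0
scaled_power = [gain * p for p in power]
scaled_noise = [gain * n for n in noise]

print(energy_cepstrum(power)[0], energy_cepstrum(scaled_power)[0])
print(snr_cepstrum(power, noise)[0], snr_cepstrum(scaled_power, scaled_noise)[0])
```

In this sketch a gain applied to both the observation and the noise estimate shifts the zeroth energy-cepstrum coefficient but leaves the SNR-based feature unchanged, which is one way to picture a feature that depends on signal-to-noise ratio rather than absolute energy.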
Garner_Idiap-RR-15-2011.pdf