Modelling auxiliary information (pitch frequency) in hybrid HMM/ANN based ASR systems

Magimai.-Doss, Mathew; Stephenson, Todd Andrew; Bourlard, Hervé

report

2002

Automatic speech recognition (ASR) systems typically use smoothed spectral features as acoustic observations. Recent studies have shown that complementing these standard features with auxiliary information can improve system performance. The previously proposed systems were studied in the framework of GMMs. In this paper, we study and compare different ways of including auxiliary information in a state-of-the-art hybrid HMM/ANN system, focusing on pitch frequency as the auxiliary information. We evaluate the proposed system on two ASR tasks: isolated word recognition and connected word recognition. Our results complement previous efforts to incorporate auxiliary information in ASR systems and show that pitch frequency can indeed be used to improve recognition performance.

Name

rr02-62.pdf

Access type

openaccess

Size

229.76 KB

Format

Adobe PDF

Checksum (MD5)

55d51b60d85d168b0e1b08b21e6236a3