Combining Wavelet-domain Hidden Markov Trees with Hidden Markov Models

This paper introduces Wavelet-domain Hidden Markov Trees (WHMTs) to automatic speech recognition. WHMTs are a convenient means of modeling the structure of wavelet feature vectors, because wavelet coefficients can be interpreted as nodes in a binary tree. Introducing a hidden state at each node makes it possible to model the non-Gaussian statistics inherent in wavelet features. At the same time, the tree accommodates correlations between neighboring coefficients in the time-frequency plane. Phoneme probabilities obtained from the WHMT with wavelet features are then combined at the state level with those obtained from Gaussian distributions over Mel-frequency cepstral coefficients (MFCCs), and the combined probabilities are fed into conventional Hidden Markov Models. Preliminary experiments indicate the potential advantages of this approach.
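
To illustrate how wavelet coefficients can be read as nodes of a binary tree, the following is a minimal sketch, not the authors' implementation. It assumes a Haar wavelet and a frame whose length is a power of two; the paper does not state which wavelet is used, and the names `haar_dwt`, `children`, and `parent` are hypothetical. Each detail coefficient at scale j and position k has two children at scale j + 1 whose time support lies inside its own, which is the parent-child structure along which a hidden Markov tree links its hidden states.

```python
import numpy as np


def haar_dwt(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns (approx, details), where details[j] holds the 2**j detail
    coefficients at scale j and j = 0 is the coarsest scale (tree root).
    The Haar wavelet is an assumption made for this sketch.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two and at least 2")
    details = []
    while x.size > 1:
        even, odd = x[0::2], x[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # detail coefficients
        x = (even + odd) / np.sqrt(2.0)              # approximation for next level
    details.reverse()  # reorder so the coarsest scale comes first
    return x[0], details


def children(scale, k):
    """Indices (scale, position) of the two children of node (scale, k)."""
    return [(scale + 1, 2 * k), (scale + 1, 2 * k + 1)]


def parent(scale, k):
    """Index of the parent of node (scale, k), or None for the root."""
    return None if scale == 0 else (scale - 1, k // 2)


if __name__ == "__main__":
    approx, details = haar_dwt([4, 2, 5, 5, 1, 3, 0, 8])
    for j, level in enumerate(details):
        print(f"scale {j}: {level.size} coefficients")
    print("children of (1, 1):", children(1, 1))
```

In a hidden Markov tree, each node `(scale, k)` would additionally carry a discrete hidden state, with state transitions defined from `parent(scale, k)` to the node itself; that probabilistic part is omitted here.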
