Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures
This paper shows that exemplar-based speech processing using class-conditional posterior probabilities admits a highly effective search strategy that relies on the intrinsic sparsity structures of the posteriors. The posterior probabilities are estimated for phonetic and phonological classes using a deep neural network (DNN). Exploiting the class-specific sparsity leads to a simple quantized posterior hashing procedure that reduces the search space of posterior exemplars. To that end, a small number of quantized posteriors are regarded as representatives of the posterior space and used as hash keys to index subsets of neighboring exemplars. The $k$ nearest neighbor ($k$NN) method is then applied to posterior-based classification problems: phonetic posteriors are used as exemplars for phonetic classification, whereas phonological posteriors are used as exemplars for automatic prosodic event detection. Experimental results demonstrate that posterior hashing drastically improves the efficiency of $k$NN classification. This work encourages the use of posteriors as discriminative exemplars for large-scale speech classification tasks.
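The hashing idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the binarization threshold of 0.5, the Euclidean distance, the fallback to a full search for unseen keys, and the names `quantize`, `build_index`, and `knn_classify` are all assumptions made for illustration. Each posterior vector is quantized to a binary pattern, the pattern serves as a hash key, and $k$NN voting is restricted to the exemplars stored under the query's key.

```python
from collections import Counter, defaultdict
import math


def quantize(posterior, threshold=0.5):
    """Binarize a posterior vector (assumed rule: 1 where p >= threshold)."""
    return tuple(1 if p >= threshold else 0 for p in posterior)


def build_index(exemplars, labels, threshold=0.5):
    """Map each quantized posterior (hash key) to the exemplars sharing it."""
    index = defaultdict(list)
    for vec, label in zip(exemplars, labels):
        index[quantize(vec, threshold)].append((vec, label))
    return index


def knn_classify(query, index, k=3, threshold=0.5):
    """Majority vote among the k nearest exemplars in the query's bucket."""
    bucket = index.get(quantize(query, threshold))
    if not bucket:
        # Assumed fallback: search every exemplar when the key is unseen.
        bucket = [item for items in index.values() for item in items]
    # Euclidean distance is an assumption; any posterior distance could be used.
    nearest = sorted(bucket, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy posteriors over three hypothetical classes "a", "b", "c".
exemplars = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.10, 0.85, 0.05],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
]
labels = ["a", "a", "b", "b", "c"]
index = build_index(exemplars, labels)
prediction = knn_classify([0.85, 0.10, 0.05], index, k=3)
```

In this toy setup the five exemplars fall into three buckets, so a query is compared only against the exemplars sharing its binary pattern rather than against the whole set. Because sparse posteriors concentrate their mass on few classes, the number of distinct keys stays small, which is what makes this kind of indexing practical.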
Record created on 2016-04-19, modified on 2016-08-09