Low-Rank Representation of Nearest Neighbor Phone Posterior Probabilities to Enhance DNN Acoustic Modeling

Luyet, Gil; Dighe, Pranay; Asaei, Afsaneh; Bourlard, Hervé

report

2016

We hypothesize that the optimal class-conditional posterior probabilities of a deep neural network (DNN) live in a union of low-dimensional subspaces. In real test conditions, DNN posteriors encode uncertainties that can be regarded as unstructured sparse noise superimposed on the optimal posteriors. We investigate different ways to structure the DNN outputs by exploiting low-rank representation (LRR) techniques. Using a large number of training posterior vectors, the underlying low-dimensional subspace is identified through nearest neighbor analysis, and low-rank decomposition separates the "optimal" posteriors from the spurious uncertainties at the DNN output. Experiments demonstrate that, by processing subsets of posteriors that possess strong subspace similarity, low-rank representation enhances the posterior probabilities and leads to higher speech recognition accuracy in the hybrid DNN-hidden Markov model (HMM) system.
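
The idea in the abstract can be illustrated with a small Python sketch: pick the training posteriors nearest to a test posterior, stack them, and keep only a low-rank approximation of the stack. This is not the report's implementation. It substitutes a plain truncated SVD for whatever low-rank decomposition the report uses, and the Euclidean distance, the values of `k` and `rank`, the function name `enhance_posterior`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def enhance_posterior(test_post, train_posts, k=20, rank=1):
    """Low-rank estimate of a test posterior from its nearest training posteriors.

    Hypothetical helper: truncated SVD stands in for the report's
    low-rank decomposition, and Euclidean distance is an assumed metric.
    """
    # Distance from the test posterior to every training posterior.
    dists = np.linalg.norm(train_posts - test_post, axis=1)
    nn_idx = np.argsort(dists)[:k]
    # Stack the k neighbors with the test vector: shape (k + 1, n_classes).
    stacked = np.vstack([train_posts[nn_idx], test_post])
    # Keep only the leading `rank` singular components of the stack.
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # The last row is the low-rank estimate of the test posterior;
    # clip and renormalize so it is a valid probability vector again.
    enhanced = np.clip(low_rank[-1], 1e-12, None)
    return enhanced / enhanced.sum()


# Synthetic data (assumption): 5 classes, 30 training posteriors per class,
# each sharply peaked on its own class with a little jitter.
rng = np.random.default_rng(0)
n_classes = 5
prototypes = np.full((n_classes, n_classes), 0.025) + 0.875 * np.eye(n_classes)
train = np.repeat(prototypes, 30, axis=0)
train = train + 0.01 * rng.random(train.shape)
train = train / train.sum(axis=1, keepdims=True)

# A class-2 posterior with part of its mass leaked onto class 3.
noisy = np.array([0.025, 0.025, 0.6, 0.325, 0.025])
enhanced = enhance_posterior(noisy, train, k=20, rank=1)
print(np.round(enhanced, 3))
```

In this toy setup the neighbors all come from the same class, so the rank-1 fit pulls the test vector back toward the shape those neighbors share, and the mass leaked onto class 3 shrinks. Real DNN posteriors are far higher-dimensional, and the choice of neighborhood size and rank would need tuning.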

Name: LRR.pdf

Access type: openaccess

Size: 218.66 KB

Format: Adobe PDF

Checksum (MD5): b5fc77e79a8ad306295da93e218c0202