Abstract

Matching a test signal to a reference word hypothesis forms the core of many speech processing problems, including objective speech intelligibility assessment. This paper first shows that the comparison of two speech signals can be formulated as the matching of two sequences of "uncertain" or probabilistic latent symbols, in the same manner as string matching. Building on this formulation, we propose a pathological speech intelligibility assessment approach that compares a pathological speaker's speech with a control speaker's speech in phone space and in articulatory feature space, and yields a score that is interpretable with respect to human listening tests. Experimental validation of the proposed approach on the UA-Speech corpus yielded a Spearman's correlation coefficient of 0.976 and a Pearson's correlation coefficient of 0.946.
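
The idea of matching two sequences of probabilistic symbols can be illustrated with a minimal sketch. The abstract does not state which alignment algorithm or local distance is used, so the sketch below assumes dynamic time warping with a symmetric Kullback-Leibler divergence as the local cost. Each frame is a posterior distribution over latent symbols (for example, phones). The names `symmetric_kl` and `dtw_posterior_distance` are hypothetical, and the returned value is only an alignment cost, not the intelligibility score reported in the paper.

```python
import math


def symmetric_kl(p, q, eps=1e-10):
    """Symmetric KL divergence, KL(p||q) + KL(q||p), between two discrete
    distributions. Assumed local cost; the paper may use a different one."""
    total = 0.0
    for pi, qi in zip(p, q):
        pi = max(pi, eps)  # floor probabilities to avoid log(0)
        qi = max(qi, eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


def dtw_posterior_distance(test, ref):
    """Align two sequences of posterior vectors with dynamic time warping
    and return the accumulated local cost divided by the path length."""
    n, m = len(test), len(ref)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning test[:i] with ref[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    # steps[i][j]: number of aligned frame pairs on that minimal-cost path
    steps = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = symmetric_kl(test[i - 1], ref[j - 1])
            # Allowed moves: diagonal match, repeat a ref frame, repeat a test frame
            best_cost, best_steps = min(
                (cost[i - 1][j - 1], steps[i - 1][j - 1]),
                (cost[i - 1][j], steps[i - 1][j]),
                (cost[i][j - 1], steps[i][j - 1]),
            )
            cost[i][j] = best_cost + local
            steps[i][j] = best_steps + 1
    return cost[n][m] / steps[n][m]


# Toy posteriors over two latent symbols (illustrative values only)
reference = [[0.9, 0.1], [0.1, 0.9]]
stretched = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]  # same symbols, slower
swapped = [[0.1, 0.9], [0.9, 0.1]]                # symbols in reverse order

same_distance = dtw_posterior_distance(stretched, reference)
swapped_distance = dtw_posterior_distance(swapped, reference)
```

In this toy case, the time-stretched sequence aligns with the reference at zero cost, while the sequence with swapped symbols incurs a positive cost. This mirrors string matching, where repeating a symbol is cheap under warping but substituting one is not.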

Details