Hierarchical approach for spotting keywords

The paper presents a new approach to spotting a particular sound (a keyword) in an acoustic stream. The approach is based on hierarchical processing: posterior probabilities of phoneme classes are first estimated at equally spaced time steps, and matched filtering is then applied to them, yielding non-equally spaced values, one for each phoneme, each indicating the confidence that the underlying phoneme is present. The target keyword is indicated by a particular sequence of high-confidence phonemes.
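The two stages described above can be illustrated with a minimal sketch. It is not the paper's implementation: the rectangular filter template, the peak threshold, the maximum gap between phonemes, the greedy chaining rule, and all function names (matched_filter, pick_peaks, spot_keyword) are assumptions made for illustration, and the posteriors are synthetic toy data rather than the output of a real phoneme classifier.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, str, float]  # (frame index, phoneme label, confidence)


def matched_filter(trajectory: List[float], template: List[float]) -> List[float]:
    """Correlate a frame-level posterior trajectory with a template
    centered on each frame (assumed shape: any non-negative window)."""
    half = len(template) // 2
    norm = sum(template)
    out = []
    for t in range(len(trajectory)):
        acc = 0.0
        for k, w in enumerate(template):
            i = t + k - half
            if 0 <= i < len(trajectory):  # frames outside the signal count as zero
                acc += w * trajectory[i]
        out.append(acc / norm)
    return out


def pick_peaks(filtered: List[float], label: str, threshold: float) -> List[Event]:
    """Keep local maxima above the threshold: one non-equally spaced
    confidence value per hypothesized phoneme."""
    events = []
    for t, value in enumerate(filtered):
        left = filtered[t - 1] if t > 0 else float("-inf")
        right = filtered[t + 1] if t + 1 < len(filtered) else float("-inf")
        # On a flat plateau only the first frame is reported.
        if value >= threshold and value > left and value >= right:
            events.append((t, label, value))
    return events


def spot_keyword(events: List[Event], keyword: List[str],
                 max_gap: int) -> List[Tuple[int, int, float]]:
    """Greedily chain phoneme events in keyword order, allowing at most
    max_gap frames between consecutive phonemes (an assumed constraint)."""
    events = sorted(events)
    detections = []
    for start in events:
        if start[1] != keyword[0]:
            continue
        chain = [start]
        for phoneme in keyword[1:]:
            prev = chain[-1][0]
            nxt = next((e for e in events
                        if e[1] == phoneme and prev < e[0] <= prev + max_gap), None)
            if nxt is None:
                break
            chain.append(nxt)
        if len(chain) == len(keyword):
            mean_conf = sum(e[2] for e in chain) / len(chain)
            detections.append((chain[0][0], chain[-1][0], mean_conf))
    return detections


def toy_trajectory(n_frames: int, start: int, stop: int) -> List[float]:
    """Synthetic posterior: high inside [start, stop], low elsewhere."""
    return [0.9 if start <= t <= stop else 0.05 for t in range(n_frames)]


# Toy frame-level posteriors for three phoneme classes (synthetic data).
posteriors: Dict[str, List[float]] = {
    "k": toy_trajectory(30, 3, 7),
    "ae": toy_trajectory(30, 10, 16),
    "t": toy_trajectory(30, 19, 23),
}
template = [1.0] * 5  # assumed rectangular template, 5 frames long

events: List[Event] = []
for label, trajectory in posteriors.items():
    events += pick_peaks(matched_filter(trajectory, template), label, threshold=0.5)

detections = spot_keyword(events, ["k", "ae", "t"], max_gap=10)
print(detections)  # list of (start frame, end frame, mean confidence)
```

On this toy input the filter reduces each phoneme's 30 frame-level posteriors to a single confidence peak, and the keyword is reported as one chain of three such peaks. A template matched to the typical duration of each phoneme, rather than one fixed window, would be a natural refinement, but the abstract does not specify the filter shape.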

