
Infoscience

 
research article

Phase AutoCorrelation (PAC) features for noise robust speech recognition

Ikbal, Shajith
Misra, Hemant
Hermansky, Hynek
Magimai.-Doss, Mathew
2012
Speech Communication

In this paper, we introduce a new class of noise-robust features derived from an alternative measure of autocorrelation that represents the phase variation of a speech signal frame over time. These features, referred to as Phase AutoCorrelation (PAC) features, include PAC-spectrum and PAC-MFCC, among others. In traditional autocorrelation, the correlation between two time-delayed signal vectors is computed as their dot product. In PAC, by contrast, the angle between the vectors in the signal vector space is used to compute the correlation. PAC features are more noise robust because the angle is typically less affected by noise than the dot product. However, using the angle as the correlation estimate makes the PAC features inferior to the traditional features on clean speech. We circumvent this problem by introducing another set of features in which the complementary information in the PAC features and the traditional features is combined adaptively to retain the best of both. An entropy-based feature combination method in a multi-layer perceptron (MLP) based multi-stream framework is used to derive an adaptively combined representation of the component feature streams. An evaluation of the combined features on the OGI Numbers95 and Aurora-2 databases under various noise conditions and noise levels shows significant improvements in recognition accuracy in clean as well as noisy conditions. © 2012 Elsevier B.V. All rights reserved.
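
The contrast between the dot-product and angle-based correlation measures described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the delayed vector is a circular shift of the frame, so that both vectors have the same norm and the angle is the arccosine of the energy-normalised dot product. The function names and the convention of returning zero angles for an all-zero frame are hypothetical choices made for illustration.

```python
import math


def circular_autocorrelation(frame):
    """Dot product of the frame with each circularly shifted copy of itself.

    Assumption: the delayed vector is a circular shift of the frame.
    """
    n = len(frame)
    return [
        sum(frame[i] * frame[(i + k) % n] for i in range(n))
        for k in range(n)
    ]


def phase_autocorrelation(frame):
    """Angle in radians between the frame and each circularly shifted copy."""
    r = circular_autocorrelation(frame)
    energy = r[0]  # squared norm of the frame, the same for every circular shift
    if energy == 0.0:
        # Assumption for this sketch: treat an all-zero frame as angle 0 everywhere.
        return [0.0] * len(frame)
    # Clamp the cosine because rounding can push it slightly outside [-1, 1].
    return [math.acos(max(-1.0, min(1.0, rk / energy))) for rk in r]


if __name__ == "__main__":
    # One period of a sine wave, and the same frame with a gain of 3.
    frame = [math.sin(2 * math.pi * i / 8) for i in range(8)]
    louder = [3.0 * x for x in frame]
    print("dot-product lag 1:", circular_autocorrelation(frame)[1],
          circular_autocorrelation(louder)[1])
    print("angle lag 1:      ", phase_autocorrelation(frame)[1],
          phase_autocorrelation(louder)[1])
```

In this sketch the dot-product values scale with the square of the gain, whereas the angles are unchanged up to rounding, because the energy normalisation removes the magnitude of the vectors. Applying a spectral or cepstral analysis to the angle sequence in place of the usual autocorrelation would be the analogue of the PAC-spectrum and PAC-MFCC features named above, though the exact processing chain is not given in this record.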

Type: research article
DOI: 10.1016/j.specom.2012.02.005
Author(s): Ikbal, Shajith; Misra, Hemant; Hermansky, Hynek; Magimai.-Doss, Mathew
Date Issued: 2012
Published in: Speech Communication
Volume: 54
Issue: 7
Start page: 867
End page: 880
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LIDIAP
Available on Infoscience: December 19, 2013
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/98229