Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Reports, Documentation, and Standards
  4. Neural Network based Regression for Robust Overlapping Speech Recognition using Microphone Arrays
 
report

Neural Network based Regression for Robust Overlapping Speech Recognition using Microphone Arrays

Li, Weifeng  
•
Dines, John  
•
Magimai.-Doss, Mathew  
Show more
2008

This paper investigates a neural network based acoustic feature mapping to extract robust features for automatic speech recognition (ASR) of overlapping speech. In our preliminary studies, we trained neural networks to learn the mapping from log mel filter bank energies (MFBEs) extracted from the distant microphone recordings, including multiple overlapping speakers, to log MFBEs extracted from the clean speech signal. In this paper, we explore the mapping of higher order mel-filterbank cepstral coefficients (MFCC) to lower order coefficients. We also investigate the mapping of features from both target and interfering distant sound sources to the clean target features. This is achieved by using the microphone array to extract features from both the direction of the target and interfering sound sources. We demonstrate the effectiveness of the proposed approach through extensive evaluations on the MONC corpus, which includes both non-overlapping single speaker and overlapping multi-speaker conditions.

  • Files
  • Details
  • Metrics
Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés