Indexing Audio Documents by using Latent Semantic Analysis and SOM

Kurimo, Mikko

Kurimo, Mikko

1999

Formats

Format
BibTeX
MARCXML
TextMARC
MARC
DublinCore
EndNote
NLM
RefWorks
RIS

Files

Abstract

This paper describes an important application for state-of-art automatic speech recognition, natural language processing and information retrieval systems. Methods for enhancing the indexing of spoken documents by using latent semantic analysis and self-organizing maps are presented, motivated and tested. The idea is to extract extra information from the structure of the document collection and use it for more accurate indexing by generating new index terms and stochastic index weights. Indexing methods are evaluated for two broadcast news databases (one French and one English) using the average document perplexity defined in this paper and test queries analyzed by human experts

Details

Title Indexing Audio Documents by using Latent Semantic Analysis and SOM

Author(s) Kurimo, Mikko

Date 1999

Publisher IDIAP

Keywords

speech

Note Published in ``Kohonen Maps'', Elsevier, 1999

Additional link URL

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Work produced at EPFL
Technical Reports
Published

Record creation date 2006-03-10

Files

Abstract

Details

PDF