Noisy Text Categorization

Vinciarelli, Alessandro

2004

Abstract

This work presents a system for categorizing noisy texts. By noisy we mean any text obtained through an error-prone extraction process from media other than digital text. We show that, even with an average Word Error Rate of around 50%, the loss in categorization performance relative to the clean version of the same documents is negligible.
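
For readers unfamiliar with the metric, Word Error Rate is conventionally the number of word substitutions, deletions and insertions needed to turn the extracted text into the clean reference, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not the evaluation code used in the paper, and the function name and sample strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), using whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion over four words.
clean = "noisy text is categorized"
noisy = "noisy test is"
print(word_error_rate(clean, noisy))
```

Under this definition, a Word Error Rate of around 50% means that roughly one edit is needed for every two words of the clean text.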

Details

Title Noisy Text Categorization

Author(s) Vinciarelli, Alessandro

Published in Proceedings of International Conference on Pattern Recognition (ICPR)

Conference Proceedings of International Conference on Pattern Recognition (ICPR)

Date 2004

Keywords

vision

Note IDIAP-RR 03-61

Laboratories LIDIAP

Record creation date 2006-03-10
