Role of the Information Matrix and Component Weighting in Fisher Kernels for PLSI

ABSTRACT. An information-geometric approach for document similarities in the framework of “Probabilistic Latent Semantic Indexing” was first proposed by T. Hofmann (2000) and later extended (“revisited”) by Nyffenegger et al. (2006). This paper presents an in-depth study and revision of these models by (1) providing a simpler unified description framework, (2) investigating the role of the Fisher Information Matrix G(θ), and (3) analyzing the impact of latent “topic” parameters in such models. It furthermore provides new experimental results on larger collections coming from the TREC–AP evaluation corpus.

Published in:
Actes du colloque Coria 09, 279-294
Presented at:
6e Conférence en Recherche d'Information et Applications, Presqu'île de Giens, 5-7 May 2009

 Record created 2009-07-06, last modified 2018-10-07
