Infoscience

Report

Word Embeddings through Hellinger PCA

Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architectures can be difficult and time-consuming to train. Instead, we propose to drastically simplify the computation of word embeddings through a Hellinger PCA of the word co-occurrence matrix. We compare these new word embeddings with the Collobert and Weston (2008) embeddings on several NLP tasks and show that we can reach similar or even better performance.
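The method described in the abstract can be sketched in a few steps: count word co-occurrences, row-normalize the counts into context distributions, take element-wise square roots (so that Euclidean distance between rows corresponds to Hellinger distance between distributions), and reduce dimensionality with PCA. The sketch below is an illustration under simplifying assumptions (a toy corpus, a symmetric one-word context window, plain SVD-based PCA), not the authors' exact pipeline.

```python
import numpy as np

# Toy corpus; in practice counts would come from a large corpus.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]

# Build vocabulary and a co-occurrence count matrix using a
# +/-1 word context window (window size is an assumption here).
vocab = sorted({w for sent in corpus for w in sent})
index = {w: i for i, w in enumerate(vocab)}
V = len(vocab)
counts = np.zeros((V, V))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 1), min(len(sent), i + 2)):
            if j != i:
                counts[index[w], index[sent[j]]] += 1

# Row-normalize to get P(context | word), then take element-wise
# square roots: Euclidean distance between sqrt-rows equals the
# Hellinger distance (up to a constant) between the distributions.
probs = counts / counts.sum(axis=1, keepdims=True)
hellinger = np.sqrt(probs)

# PCA via SVD of the mean-centered matrix; keep d dimensions.
d = 2
centered = hellinger - hellinger.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
embeddings = U[:, :d] * S[:d]  # one d-dimensional vector per word
print(embeddings.shape)
```

Because the whole computation reduces to counting and one SVD, it avoids the iterative gradient-based training that makes neural language models slow to fit.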

    Reference

    • EPFL-REPORT-192491

    Record created on 2013-12-19, modified on 2016-08-09
