Word Embeddings through Hellinger PCA

Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architectures can be difficult and time-consuming to train. Instead, we propose to drastically simplify the computation of word embeddings through a Hellinger PCA of the word co-occurrence matrix. We compare these new word embeddings with the Collobert and Weston (2008) embeddings on several NLP tasks and show that they reach similar or even better performance.
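The core idea in the abstract can be sketched in a few lines of NumPy: build a word co-occurrence matrix, row-normalize it into context distributions, take elementwise square roots (the Hellinger transform), and run PCA on the result. The toy corpus, window size, and embedding dimension below are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Toy corpus; in practice the co-occurrence counts come from a large corpus.
corpus = [["the", "cat", "sat", "on", "the", "mat"],
          ["the", "dog", "sat", "on", "the", "rug"]]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Symmetric co-occurrence counts within a +/-2-word window (window size assumed).
C = np.zeros((V, V))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if j != i:
                C[idx[w], idx[sent[j]]] += 1

# Row-normalize counts into empirical context distributions P(c | w).
P = C / C.sum(axis=1, keepdims=True)

# Hellinger transform: the Hellinger distance between two distributions is
# (up to a constant) the Euclidean distance between their elementwise square
# roots, so ordinary PCA on sqrt(P) amounts to a Hellinger PCA.
H = np.sqrt(P)

# PCA via SVD of the centered matrix; keep the top-d principal components.
d = 3
Hc = H - H.mean(axis=0)
U, S, Vt = np.linalg.svd(Hc, full_matrices=False)
embeddings = U[:, :d] * S[:d]  # one d-dimensional vector per word
print(embeddings.shape)
```

Because the transform and the PCA are both closed-form linear-algebra operations, this avoids the iterative gradient-based training that makes neural language models costly.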


    • EPFL-REPORT-192491

    Record created on 2013-12-19, modified on 2017-05-10
