Files
Abstract
Word embeddings resulting from neural lan- guage models have been shown to be successful for a large variety of NLP tasks. However, such architecture might be difficult to train and time-consuming. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurence matrix. We compare those new word embeddings with the Collobert and Weston (2008) embeddings on several NLP tasks and show that we can reach similar or even better performance.