Infoscience

conference paper

Better Word Embeddings by Disentangling Contextual n-Gram Information

Gupta, Prakhar • Pagliardini, Matteo • Jaggi, Martin
2019
NAACL 2019 - Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings results in improved unigram embeddings. We claim that training word embeddings along with higher n-gram embeddings helps remove contextual information from the unigrams, resulting in better stand-alone word embeddings. We empirically show the validity of our hypothesis by outperforming other competing word representation models by a significant margin on a wide variety of tasks. We make our models publicly available.
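The abstract describes the training scheme only at a high level. As one way to make it concrete, the sketch below shows a CBOW-style objective with negative sampling in which the context window contributes bigram and trigram vectors alongside unigram vectors, and only the unigram rows are kept as the final word embeddings. This is an illustrative reconstruction of the general idea, not the authors' released implementation; the toy corpus, hyperparameters, and all names are assumptions.

import numpy as np

rng = np.random.default_rng(0)
DIM, LR, WINDOW, EPOCHS, NEG = 32, 0.05, 2, 200, 3

# Toy corpus; illustrative only.
corpus = [
    ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"],
    ["a", "quick", "brown", "dog", "sleeps"],
]
words = sorted({w for s in corpus for w in s})
wid = {w: k for k, w in enumerate(words)}

def context_ngrams(sent, i, n_max=3):
    """Unigrams, bigrams and trigrams inside the window around position i;
    any n-gram covering the target position itself is excluded."""
    lo, hi = max(0, i - WINDOW), min(len(sent), i + WINDOW + 1)
    grams = []
    for n in range(1, n_max + 1):
        for j in range(lo, hi - n + 1):
            if j <= i < j + n:  # n-gram covers the target word
                continue
            grams.append(" ".join(sent[j:j + n]))
    return grams

# One shared input table holds unigram AND bigram/trigram vectors;
# a separate output table holds target-word vectors (word2vec-style).
vocab = {}
for sent in corpus:
    for i, w in enumerate(sent):
        vocab.setdefault(w, len(vocab))
        for g in context_ngrams(sent, i):
            vocab.setdefault(g, len(vocab))

E_in = rng.normal(0, 0.1, (len(vocab), DIM))   # unigram + n-gram vectors
E_out = rng.normal(0, 0.1, (len(words), DIM))  # target-word vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(EPOCHS):
    for sent in corpus:
        for i, target in enumerate(sent):
            ctx = [vocab[g] for g in context_ngrams(sent, i)]
            if not ctx:
                continue
            h = E_in[ctx].mean(axis=0)  # CBOW-style context vector
            # One positive pair plus NEG random negatives
            # (collisions with the target are ignored for brevity).
            pairs = [(wid[target], 1.0)]
            pairs += [(int(rng.integers(len(words))), 0.0) for _ in range(NEG)]
            grad_h = np.zeros(DIM)
            for t, label in pairs:
                g = sigmoid(h @ E_out[t]) - label
                grad_h += g * E_out[t]
                E_out[t] -= LR * g * h
            # Spread the context gradient over every contributing n-gram.
            np.subtract.at(E_in, ctx, LR * grad_h / len(ctx))

# Only the unigram rows are kept as the final word embeddings; the
# bigram/trigram rows existed to absorb phrase-level context.
word_vectors = {w: E_in[vocab[w]] for w in words}
print(word_vectors["quick"][:5])

The design point the sketch mirrors is the claim in the abstract: because a bigram vector such as the one for "quick brown" can carry the phrase-level signal itself, the unigram vectors no longer need to encode it, which is what is meant by disentangling contextual n-gram information.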

Files
Name: Better_word_embeddings.pdf
Access type: open access
License Condition: CC BY
Size: 249.01 KB
Format: Adobe PDF
Checksum (MD5): e3fe4148f5fcb3a9b219ec280e35d820
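The MD5 checksum above lets a reader verify a downloaded copy of the file. A minimal sketch using Python's standard hashlib module, assuming the PDF has been saved locally under the name listed in the record:

import hashlib

# Expected digest as published in the record above.
EXPECTED_MD5 = "e3fe4148f5fcb3a9b219ec280e35d820"

# The local path is an assumption: wherever the PDF was downloaded to.
with open("Better_word_embeddings.pdf", "rb") as f:
    digest = hashlib.md5(f.read()).hexdigest()

print("checksum OK" if digest == EXPECTED_MD5 else "mismatch: " + digest)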

Contact: infoscience@epfl.ch


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.