conference paper

Cross-lingual Linking of Multi-word Entities and their corresponding Acronyms

Jacquet, Guillaume • Ehrmann, Maud • Steinberger, Ralf • Väyrynen, Jaakko
2016
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
10th International Conference on Language Resources and Evaluation

This paper reports on an approach and experiments to automatically build a cross-lingual multi-word entity resource. Starting from a collection of millions of acronym/expansion pairs for 22 languages where expansion variants were grouped into monolingual clusters, we experiment with several aggregation strategies to link these clusters across languages. Aggregation strategies make use of string similarity distances and translation probabilities and they are based on vector space and graph representations. The accuracy of the approach is evaluated against Wikipedia's redirection and cross-lingual linking tables. The resulting multi-word entity resource contains 64,000 multi-word entities with unique identifiers and their 600,000 multilingual lexical variants. We intend to make this new resource publicly available.
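The abstract describes linking monolingual clusters of expansion variants across languages using string similarity and a graph representation. The sketch below illustrates that general idea only and is not the authors' implementation: it assumes `difflib`'s ratio as a stand-in for the unspecified string similarity measure, leaves out translation probabilities entirely, and treats cross-lingual entities as connected components of a similarity graph. The function names, the `(language, cluster_id)` keys, the 0.8 threshold, and the sample data are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (difflib ratio, an assumed stand-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_similarity(variants_a, variants_b) -> float:
    """Best pairwise similarity between the variants of two clusters."""
    return max(similarity(a, b) for a in variants_a for b in variants_b)


def link_clusters(clusters, threshold=0.8):
    """Group monolingual clusters into cross-lingual entities.

    clusters maps (language, cluster_id) to a list of expansion variants.
    Two clusters from different languages are connected when their best
    variant similarity reaches the (hypothetical) threshold. Entities are
    the connected components of that graph, found with union-find.
    """
    parent = {key: key for key in clusters}

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    for a, b in combinations(clusters, 2):
        if a[0] == b[0]:
            continue  # only link clusters across languages
        if cluster_similarity(clusters[a], clusters[b]) >= threshold:
            parent[find(a)] = find(b)

    entities = {}
    for key in clusters:
        entities.setdefault(find(key), []).append(key)
    return sorted(sorted(group) for group in entities.values())


# Hypothetical sample data, not taken from the resource described above.
sample = {
    ("en", "c1"): ["Amnesty International", "Amnesty Intl"],
    ("de", "c7"): ["Amnesty International"],
    ("en", "c2"): ["World Health Organization"],
    ("fr", "c9"): ["Organisation mondiale de la santé"],
}

for entity in link_clusters(sample):
    print(entity)
```

In this sample only the English and German "Amnesty International" clusters are merged. The English and French names of the World Health Organization stay separate because surface similarity alone cannot relate translated names, which is consistent with the abstract's mention of translation probabilities as an additional signal.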

Type
conference paper
Author(s)
Jacquet, Guillaume
Ehrmann, Maud  
Steinberger, Ralf
Väyrynen, Jaakko
Editors
Calzolari, Nicoletta
Choukri, Khalid
Declerck, Thierry
Grobelnik, Marko
Maegaard, Bente
Mariani, Joseph
Moreno, Asuncion
Odijk, Jan
Piperidis, Stelios
Date Issued

2016

Publisher

European Language Resources Association (ELRA)

Published in
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
ISBN of the book

978-2-9517408-9-1

Subjects

multiword named entity
named entity cross-lingual linking
acronyms

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
DHLAB  
Event name
10th International Conference on Language Resources and Evaluation
Event place
Portorož, Slovenia
Event date
May 2016

Available on Infoscience
May 20, 2016
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/126247