Infoscience

working paper

280 Birds with One Stone: Inducing Multilingual Taxonomies from Wikipedia using Character-level Classification

Gupta, Amit  
•
Lebret, Rémi Philippe  
•
Harkous, Hamza  
2017

We propose a novel, fully automated approach to inducing multilingual taxonomies from Wikipedia. Given an English taxonomy, our approach first leverages Wikipedia's interlanguage links to automatically construct training datasets for the is-a relation in the target language. Character-level classifiers are then trained on the constructed datasets and used in an optimal path discovery framework to induce high-precision, high-coverage taxonomies in other languages. Through experiments on six languages, we demonstrate that our approach significantly outperforms state-of-the-art, heuristics-heavy approaches. As a consequence of our work, we release what is presumably the largest and most accurate multilingual taxonomic resource, spanning over 280 languages.
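The first step of the abstract, building is-a training data through interlanguage links, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the labeling rule (a target-language edge is positive only if its English projection is in the English taxonomy), and the character trigram featurizer are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (child, parent) page or category titles


def build_training_set(
    candidate_edges: Iterable[Edge],
    to_english: Dict[str, str],
    english_isa: Set[Edge],
) -> List[Tuple[Edge, int]]:
    """Label target-language edges using an English taxonomy.

    Hypothetical labeling rule (an assumption, not taken from the paper):
    an edge is positive (1) if its projection through the interlanguage
    links is an is-a edge in the English taxonomy, negative (0) otherwise.
    """
    examples: List[Tuple[Edge, int]] = []
    for child, parent in candidate_edges:
        # Edges with an endpoint lacking an interlanguage link cannot be labeled.
        if child not in to_english or parent not in to_english:
            continue
        projected = (to_english[child], to_english[parent])
        examples.append(((child, parent), 1 if projected in english_isa else 0))
    return examples


def char_ngrams(text: str, n: int = 3) -> List[str]:
    """Character n-grams with boundary markers, a simple character-level feature."""
    padded = f"^{text}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


if __name__ == "__main__":
    # Toy French example; the titles and links are made up for illustration.
    links = {"Chien": "Dog", "Mammifère": "Mammal", "France": "France"}
    english = {("Dog", "Mammal")}
    edges = [("Chien", "Mammifère"), ("Chien", "France")]
    for edge, label in build_training_set(edges, links, english):
        print(edge, label, char_ngrams(edge[1])[:3])
```

A classifier trained on such labeled pairs would then score edges that have no interlanguage link; the path discovery step mentioned in the abstract is not sketched here.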

Name: arxiv-aaai.pdf
Access type: openaccess
Size: 426.04 KB
Format: Adobe PDF
Checksum (MD5): e66754c421cb1aad782cbeda9b126e7d
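A downloaded copy of arxiv-aaai.pdf can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names and the local file path are assumptions, and MD5 is suitable here only as an integrity check, not as a security guarantee.

```python
import hashlib

# MD5 checksum as listed on the record page for arxiv-aaai.pdf.
EXPECTED_MD5 = "e66754c421cb1aad782cbeda9b126e7d"


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str) -> bool:
    """Compare a local file's digest with the checksum from the record."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_listed_checksum("arxiv-aaai.pdf")` would return `True` only if the local file (assumed here to be saved under that name in the working directory) is byte-identical to the deposited one.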


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved