Infoscience

conference paper

Small Languages, Big Data: Multilingual Computational Tools and Techniques for the Lexicography of Endangered Languages

Benjamin, Martin  
•
Radetzky, Paula
Good, Jeff
•
Hirschberg, Julia
2014
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages
52nd Annual Meeting of the Association for Computational Linguistics

The Kamusi Project, a multilingual online dictionary website, has as one of its goals to document the lexicons of endangered and less-resourced languages (LRLs). Kamusi.org provides a unified platform and repository for this kind of data that is both simple to use and free to researchers and the public. Since Kamusi has a separate entry for each homophone or polyseme, it can be used to produce sophisticated multilingual dictionaries. We have recently been confronting issues inherent in contact language-based lexicography, especially the elicitation of culturally-specific semantic terms, which cannot be obtained through fieldwork purely reliant on a contact language. To address this, we have designed a system of “balloons.” Based on a variety of factors, balloons raise the likelihood of revealing terms and fields that have particular relevance within a culture, rather than perpetuating linguistic bias toward the concerns and artifacts of more powerful groups. Kamusi has also developed a smartphone application which can be used for crowdsourcing contributions and validation. It will also be invaluable in gathering oral data from speakers of endangered languages for the production of monolingual talking dictionaries. The first of these projects is planned for the Arrernte language in central Australia.

Name: computel.camera_ready.FINAL.submitted.pdf
Access type: openaccess
Size: 2.44 MB
Format: Adobe PDF
Checksum (MD5): 0504cdd859a5075093c59d915614167a
