000200377 001__ 200377
000200377 005__ 20190316235945.0
000200377 037__ $$aCONF
000200377 245__ $$aSmall Languages, Big Data: Multilingual Computational Tools and Techniques for the Lexicography of Endangered Languages
000200377 269__ $$a2014
000200377 260__ $$bAssociation for Computational Linguistics$$c2014$$aStroudsburg, PA, USA
000200377 336__ $$aConference Papers
000200377 520__ $$aThe Kamusi Project, a multilingual online dictionary website, has as one of its goals to document the lexicons of endangered and less-resourced languages (LRLs). Kamusi.org provides a unified platform and repository for this kind of data that is both simple to use and free to researchers and the public. Since Kamusi has a separate entry for each homophone or polyseme, it can be used to produce sophisticated multilingual dictionaries. We have recently been confronting issues inherent in contact language-based lexicography, especially the elicitation of culturally-specific semantic terms, which cannot be obtained through fieldwork purely reliant on a contact language. To address this, we have designed a system of “balloons.” Based on a variety of factors, balloons raise the likelihood of revealing terms and fields that have particular relevance within a culture, rather than perpetuating linguistic bias toward the concerns and artifacts of more powerful groups. Kamusi has also developed a smartphone application which can be used for crowdsourcing contributions and validation. It will also be invaluable in gathering oral data from speakers of endangered languages for the production of monolingual talking dictionaries. The first of these projects is planned for the Arrernte language in central Australia.
000200377 6531_ $$aendangered languages
000200377 6531_ $$amultilingual lexicography
000200377 6531_ $$acrowdsourcing
000200377 6531_ $$atalking dictionaries
000200377 700__ $$0247668$$g241224$$aBenjamin, Martin
000200377 700__ $$aRadetzky, Paula
000200377 7112_ $$dJune 22-27, 2014$$cBaltimore, Maryland, USA$$a52nd Annual Meeting of the Association for Computational Linguistics
000200377 720_1 $$aGood, Jeff$$eed.
000200377 720_1 $$aHirschberg, Julia$$eed.
000200377 720_1 $$aRambow, Owen$$eed.
000200377 773__ $$tProceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages$$q15-23
000200377 8564_ $$uhttps://infoscience.epfl.ch/record/200377/files/computel.camera_ready.FINAL.submitted.pdf$$zn/a$$s2554406$$yn/a
000200377 909C0 $$xU10405$$0252004$$pLSIR
000200377 909CO $$qGLOBAL_SET$$pconf$$ooai:infoscience.tind.io:200377$$pIC
000200377 917Z8 $$x241224
000200377 937__ $$aEPFL-CONF-200377
000200377 973__ $$rREVIEWED$$sPUBLISHED$$aEPFL
000200377 980__ $$aCONF