conference paper

Character-level Chinese-English Translation through ASCII Encoding

Nikolov, Nikola
•
Hu, Yuhuang
•
Tan, Mi Xue
•
Hahnloser, Richard H.R.
October 1, 2018
Proceedings of the Third Conference on Machine Translation: Research Papers
3rd Conference on Machine Translation: Research Papers

Character-level Neural Machine Translation (NMT) models have recently achieved impressive results on many language pairs. They mainly do well for Indo-European language pairs, where the languages share the same writing system. However, for translating between Chinese and English, the gap between the two different writing systems poses a major challenge because of a lack of systematic correspondence between the individual linguistic units. In this paper, we enable character-level NMT for Chinese by breaking down Chinese characters into linguistic units similar to those of Indo-European languages. We use the Wubi encoding scheme, which preserves the original shape and semantic information of the characters, while also being reversible. We show promising results from training Wubi-based models at the character and subword level with recurrent as well as convolutional models.
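
The reversibility property mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-entry table, its codes, and the function names below are assumptions introduced only for illustration, and a real system would load a complete Wubi table and would need a rule for characters that share a code. The sketch maps each Chinese character to an ASCII code, passes unknown characters through unchanged, and inverts the mapping to recover the original text.

```python
# Toy character-to-code table. The entries are illustrative assumptions,
# not a verified or complete Wubi table.
TOY_TABLE = {"中": "khk", "国": "lgyi", "人": "wwww"}


def build_inverse(table):
    """Invert the table, refusing codes shared by several characters."""
    inverse = {}
    for char, code in table.items():
        if code in inverse:
            # A shared code would make decoding ambiguous.
            raise ValueError(f"code {code!r} is not unique")
        inverse[code] = char
    return inverse


def encode(text, table):
    """Map each character to its ASCII code; unknown characters pass through."""
    return [table.get(ch, ch) for ch in text]


def decode(tokens, inverse):
    """Map each code back to its character; unknown tokens pass through."""
    return "".join(inverse.get(tok, tok) for tok in tokens)


tokens = encode("中国人", TOY_TABLE)
print(tokens)  # ['khk', 'lgyi', 'wwww']
print(decode(tokens, build_inverse(TOY_TABLE)))  # 中国人
```

Keeping the encoded text as a list of tokens sidesteps the question of how to mark character boundaries in a flat string. The pass-through rule is only safe here because every toy code is longer than one character, so a single unencoded character can never be mistaken for a code.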

Type
conference paper
DOI
10.18653/v1/W18-6302
Author(s)
Nikolov, Nikola
•
Hu, Yuhuang
•
Tan, Mi Xue
•
Hahnloser, Richard H.R.
Date Issued

2018-10-01

Publisher

Association for Computational Linguistics

Published in
Proceedings of the Third Conference on Machine Translation: Research Papers
Total of pages

6

Start page

10

End page

16

URL

Paper

https://www.aclweb.org/anthology/W18-6302.pdf
Peer reviewed

REVIEWED

Written at

OTHER

EPFL units
NCCR-ROBOTICS  
Event name
3rd Conference on Machine Translation: Research Papers
Event place
Brussels, Belgium
Event date
October 2018

Available on Infoscience
October 31, 2019
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/162570