Multilingual Annotation and Disambiguation of Discourse Connectives for Machine Translation

Many discourse connectives can signal several types of relations between sentences. Their automatic disambiguation, i.e. the labeling of the correct sense of each occurrence is important for discourse parsing, but could also be helpful to machine translation. We describe new approaches for improving the accuracy of manual annotation of three discourse connectives (two English, one French) by using parallel corpora. An appropriate set of labels for each connective can be found using information from their translations. Our results for automatic disambiguation are state-of-the-art, at up to 85% accuracy using surface features. Using feature analysis, contextual features are shown to be useful across languages and connectives.

Presented at:
Association for Computational Linguistics - Proceedings of 12th SIGdial Meeting on Discourse and Dialogue, Portland, OR

 Record created 2011-05-19, last modified 2018-01-28

External link:
Download fulltext
Rate this document:

Rate this document:
(Not yet reviewed)