Evaluating Attention Networks for Anaphora Resolution

In this paper, we evaluate the results of using inter and intra attention mechanisms from two architectures, a Deep Attention Long Short-Term Memory-Network (LSTM-N) (Cheng et al., 2016) and a Decomposable Attention model (Parikh et al., 2016), for anaphora resolution, i.e. detecting coreference relations between a pronoun and a noun (its antecedent). The models are adapted from an entailment task, to address the pronominal coreference resolution task by comparing two pairs of sentences: one with the original sentences containing the antecedent and the pronoun, and another one with the pronoun replaced with a correct or an incorrect antecedent. The goal is thus to detect the correct replacements, assuming the original sentence pair entails the one with the correct replacement, but not one with an incorrect replacement. We use the CoNLL-2012 English dataset (Pradhan et al., 2012) to train the models and evaluate the ability to recognize correct and incorrect pronoun replacements in sentence pairs. We find that the Decomposable Attention Model performs better, while using a much simpler architecture. Furthermore, we focus on two previous studies that use intra- and inter-attention mechanisms, discuss how they relate to each other, and examine how these advances work to identify correct antecedent replacements.

Work done during an internship of the first author at the Idiap Research Institute from March to August 2017.

 Record created 2017-10-19, last modified 2018-09-13

Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)