Neural Language Models with Explicit Coreference Decision

University essay from Uppsala universitet/Institutionen för lingvistik och filologi

Abstract: Coreference is an important and frequent concept in any form of discourse, and Coreference Resolution (CR) is a widely used task in Natural Language Understanding (NLU). In this thesis, we implement and explore two recent models that incorporate coreference into Recurrent Neural Network (RNN)-based Language Models (LMs). These models represent entity and reference decisions explicitly using attention mechanisms. Both models learn to store the previously observed entities in a set and to decide whether the next token generated by the LM is a mention of one of the entities in the set, a mention of an entity that has not been observed yet, or not an entity mention at all. After a theoretical analysis in which we compare the two LMs to each other and to a state-of-the-art Coreference Resolution system, we perform an extensive quantitative and qualitative analysis. For this purpose, we train the two models and a classical RNN-LM as the baseline on the OntoNotes 5.0 corpus with coreference annotation. While we do not reach the baseline in terms of perplexity, we show that the models’ relative performance on entity tokens has the potential to improve when explicit entity modeling is included. We show that the most challenging point in the systems is the decision whether the next token is an entity token, while the decision about which entity the next token refers to performs comparatively well. Our analysis in the context of a text generation task shows that a widespread source of errors in mention generation is the confusion of tokens that refer to related but different real-world entities, presumably a result of the context-based word representations in the models. Our re-implementation of the DeepMind model by Yang et al. (2016) performs notably better than our re-implementation of the EntityNLM model by Ji et al. (2017), with a perplexity of 107 compared to 131.
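
For readers unfamiliar with the perplexity metric reported above, the following is a minimal sketch of its standard definition: the exponential of the average negative log-probability that a language model assigns to each token. The function name `perplexity` and the input format (a plain list of per-token probabilities) are assumptions made for illustration and are not taken from the thesis code.

```python
import math


def perplexity(token_probs):
    """Return exp(mean negative log-probability) over the given tokens.

    `token_probs` is a hypothetical input format: one probability per
    token, as assigned by some language model to the observed text.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Sum of negative log-probabilities (natural log).
    nll = -sum(math.log(p) for p in token_probs)
    # Average per token, then exponentiate.
    return math.exp(nll / len(token_probs))


# A model that spreads its probability uniformly over k choices
# at every step has a perplexity of k.
uniform = perplexity([1 / 107] * 10)
```

Under this definition, lower values are better: a perplexity of 107 roughly corresponds to a model that is, on average, as uncertain as if it were choosing uniformly among 107 tokens at each step.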
