Relation Extraction on Swedish Text by the Use of Semantic Fields and Deep Multi-Channel Convolutional Neural Networks

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Eric Hallström; [2019]

Abstract: This thesis makes two contributions to the research domain of relation extraction (RE), i.e., the automated discovery of semantic links in unstructured text. The first contribution is a method for creating an RE dataset, which is used to build the first Swedish RE dataset, covering nine relationships between persons, locations and vehicles. The second contribution is a set of experiments on this new dataset that provide baselines. The relation extraction systems created in this thesis include deep multi-channel convolutional neural networks and Word2Vec embeddings. Manual labeling of a subset of the data shows an accuracy of 73%. Using a discrete representation of part-of-speech and dependency tags in the multi-channel convolutional network yields the best performance, with a micro-averaged F1-score of 77%. The thesis also discusses a variety of problems and avenues for future research, including the underlying motivation for this work: the automatic summarization of police reports in Sweden.
