Domain Adaptation for Hypernym Discovery via Automatic Collection of Domain-Specific Training Data

University essay from Linköpings universitet/Institutionen för datavetenskap

Author: Johannes Palm Myllylä; [2019]

Keywords: NLP; natural language processing; domain adaptation; hypernym; hyponym

Abstract: Identifying semantic relations in natural language text is an important component of many knowledge extraction systems. This thesis studies the task of hypernym discovery, i.e., discovering terms that are related by the hypernymy (is-a) relation. Specifically, it explores how state-of-the-art methods for hypernym discovery perform when applied in specific language domains. Current state-of-the-art methods are mostly supervised machine learning models that leverage distributional word representations such as word embeddings. These models require labeled training data in the form of term pairs that are known to be related by hypernymy. Such labeled training data is often not available when working with a specific language domain. This thesis presents experiments with an automatic training data collection algorithm. The algorithm leverages a pre-defined domain-specific vocabulary and the lexical resource WordNet to extract training pairs automatically. The contribution of the thesis is a set of experimental results on using such automatically collected domain-specific training data for domain adaptation. Experiments are conducted in two different domains: one where a large amount of text data is available, and another where much less text data is available. Results show that the automatically collected training data has a positive impact on performance in both domains. The performance boost is most pronounced in the domain with a large amount of text data, with mean average precision increasing by up to 8 points.
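
To make the idea of automatic training data collection concrete, the following is a minimal sketch and not the algorithm from the thesis. It assumes that a training pair is kept only when both the term and one of its direct or inherited hypernyms appear in the domain vocabulary; the abstract does not state the actual selection rule. The dictionary `TOY_TAXONOMY` is a hypothetical stand-in for WordNet so that the example runs without external data, and the function names are illustrative.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for WordNet: each term maps to its direct hypernyms.
TOY_TAXONOMY: Dict[str, List[str]] = {
    "aspirin": ["analgesic"],
    "ibuprofen": ["analgesic"],
    "analgesic": ["drug"],
    "drug": ["substance"],
    "guitar": ["instrument"],
}


def all_hypernyms(term: str, taxonomy: Dict[str, List[str]]) -> Set[str]:
    """Return every direct and inherited hypernym of `term`."""
    found: Set[str] = set()
    stack = list(taxonomy.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(taxonomy.get(current, []))
    return found


def collect_training_pairs(
    vocabulary: Set[str], taxonomy: Dict[str, List[str]]
) -> Set[Tuple[str, str]]:
    """Collect (hyponym, hypernym) pairs restricted to the vocabulary.

    Assumption: both members of a pair must be in the domain vocabulary.
    """
    pairs: Set[Tuple[str, str]] = set()
    for term in vocabulary:
        for hypernym in all_hypernyms(term, taxonomy):
            if hypernym in vocabulary:
                pairs.add((term, hypernym))
    return pairs


# Hypothetical domain vocabulary for a medical text collection.
domain_vocabulary = {"aspirin", "ibuprofen", "analgesic", "drug"}
for hyponym, hypernym in sorted(collect_training_pairs(domain_vocabulary, TOY_TAXONOMY)):
    print(f"{hyponym} is-a {hypernym}")
```

In this toy setting, "substance" never appears in a pair because it is outside the vocabulary, and "guitar" is ignored for the same reason. With a real lexical resource, the taxonomy lookup would be replaced by queries against WordNet.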


