A System for Building Corpus Annotated With Semantic Roles

University essay from JTH. Forskningsmiljö Informationsteknik

Abstract: Semantic role labelling (SRL) is a natural language processing (NLP) technique that maps sentences to semantic representations, which can then be used in a range of NLP tasks. The goal of this master's thesis is to investigate how to support the novel method proposed by He Tan for building a corpus annotated with semantic roles. This goal provides the context for developing a general framework and for implementing a supporting system based on that framework. The system is implemented in Java. Its features reflect the use of frame semantics in understanding and explaining the meaning of lexical items. The prototype system was evaluated using a biomedical corpus as the dataset. The supporting environment can create frames with all their related associations through XML, update frames and their related information (definition, elements and example sentences), and annotate the example sentences of a frame. The output of annotation is a semi-structured schema in which the tokens of a sentence are labelled. We evaluated the system by means of two surveys. The results showed that the framework and system fulfilled the users' expectations and satisfied them to a good degree. User feedback also identified new areas of improvement for the supporting environment.
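
To make the described output more concrete, the following is a minimal sketch of how a frame with a definition, frame elements and one annotated example sentence might be serialised to XML. It is not the thesis system, which was implemented in Java and whose actual schema is not given here. The tag names (`frame`, `definition`, `elements`, `examples`, `sentence`, `token`), the `role` attribute, the frame name "Transport" and the example sentence are all assumptions invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_frame(name, definition, elements, examples):
    """Build a hypothetical <frame> element.

    `examples` is a list of (tokens, labels) pairs; each label is either a
    frame element name or None for tokens that carry no semantic role.
    """
    frame = ET.Element("frame", {"name": name})
    ET.SubElement(frame, "definition").text = definition

    elems = ET.SubElement(frame, "elements")
    for el in elements:
        ET.SubElement(elems, "element", {"name": el})

    exs = ET.SubElement(frame, "examples")
    for tokens, labels in examples:
        sent = ET.SubElement(exs, "sentence")
        for tok, role in zip(tokens, labels):
            # Only labelled tokens receive a role attribute.
            attrs = {"role": role} if role else {}
            ET.SubElement(sent, "token", attrs).text = tok
    return frame


# Invented example data (not taken from the thesis or its corpus).
tokens = ["The", "protein", "is", "transported", "to", "the", "nucleus"]
labels = [None, "Entity", None, None, None, None, "Destination"]

frame = build_frame(
    "Transport",
    "An entity is moved to a destination.",
    ["Entity", "Destination"],
    [(tokens, labels)],
)
xml_text = ET.tostring(frame, encoding="unicode")
print(xml_text)
```

Storing the role as an attribute on each token keeps the sentence order intact and leaves unlabelled tokens in place, which is one simple way to obtain a semi-structured, token-level annotation.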
