Pattern Acquisition Methods for Information Extraction Systems

University essay from Blekinge Tekniska Högskola/Avdelningen för programvarusystem

Abstract: This master thesis treats about Event Recognition in the reports of Polish stockholders. Event Recognition is one of the Information Extraction tasks. This thesis provides a comparison of two approaches to Event Recognition: manual and automatic. In the manual approach regular expressions are used. Regular expressions are used as a baseline for the automatic approach. In the automatic approach three Machine Learning methods were applied. In the initial experiment the Decision Trees, naive Bayes and Memory Based Learning methods are compared. A modification of the standard Memory Based Learning method is presented which goal is to create a classifier that uses only positives examples in the classification task. The performance of the modified Memory Based Learning method is presented and compared to the baseline and also to other Machine Learning methods. In the initial experiment one type of annotation is used and it is the meeting date annotation. The final experiment is conducted using three types of annotations: the meeting time, the meeting date and the meeting place annotation. The experiments show that the classification can be performed using only one class of instances with the same level of performance.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)