A Study on Manual and Automatic Evaluation Procedures and Production of Automatic Post-editing Rules for Persian Machine Translation

University essay from Uppsala universitet/Institutionen för lingvistik och filologi

Abstract: Evaluation of machine translation (MT) is an important step towards improving it. One way to evaluate MT output is to focus on the different types of errors occurring in the translation hypotheses and to consider possible ways of fixing them. An error categorization is a useful tool that makes translation errors easier to analyze, and it can also be used to manually write post-editing rules that are then applied automatically to MT output. In this work, we define a categorization for the errors occurring in Swedish-Persian machine translation by analyzing the errors found in three data sets from two websites: 1177.se and Linköping municipality. We define three types of monolingual reference-free (MRF) evaluation, and use two automatic metrics, BLEU and TER, to conduct a bilingual evaluation of Swedish-Persian translation. Then, based on the experience of working with the errors in the corpora, we manually write automatic post-editing (APE) rules and apply them to the MT output. Three sets of results are obtained: (1) The analysis of MT errors shows that the three most common error types in the translation hypotheses are mistranslated words, wrong word order, and extra prepositions. These error types fall into the semantic and syntactic categories, respectively. (2) Comparing the automatic and manual evaluations shows a low correlation between the two. (3) Lastly, applying the APE rules to the MT output increases the BLEU score on the largest data set, while the scores on the other two data sets remain almost unchanged. TER improves on one data set, while the scores on the other two data sets remain unchanged.
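As a rough illustration of what applying manually written APE rules to MT output can look like, the sketch below runs an ordered list of regular-expression substitutions over a hypothesis sentence. The rules, the function name `apply_ape_rules`, and the English toy sentence are hypothetical and chosen only for readability; they are not the rules written for Persian in the essay.

```python
import re

# Hypothetical post-editing rules as (compiled pattern, replacement) pairs.
# These are illustrative assumptions, not the rules from the essay.
RULES = [
    # Collapse an immediately repeated word ("the the" -> "the").
    (re.compile(r"\b(\w+)\s+\1\b"), r"\1"),
    # Remove whitespace before sentence punctuation ("clinic ." -> "clinic.").
    (re.compile(r"\s+([,.;:!?])"), r"\1"),
]


def apply_ape_rules(sentence, rules=RULES):
    """Apply each rule in order to one MT hypothesis and return the result."""
    for pattern, replacement in rules:
        sentence = pattern.sub(replacement, sentence)
    return sentence


hypothesis = "the the patient should contact the clinic ."
print(apply_ape_rules(hypothesis))
```

The post-edited hypotheses could then be rescored with automatic metrics such as BLEU and TER to see whether the rules helped, which is the kind of before-and-after comparison the abstract reports.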
