Spelling Correction To Improve Classification Of Technical Error Reports

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Rithika Harish Kumar; [2019]


Abstract: This master’s thesis investigates whether spelling correction improves the classification of technical error reports. Three spelling-correction approaches were compared to find which best suits this particular dataset. The first two consider only the erroneous word when choosing a correction; the third also considers its context, that is, the words surrounding the erroneous word. The corrected reports were then evaluated with a classifier model. Compared with the baseline, no significant improvement in classifier performance was observed. A likely reason is that most reports contain only a few spelling errors, and the majority of the words detected as spelling errors are not English. The second approach nevertheless performed better than the baseline on this dataset because it is language independent: most of the non-words were non-English words, which are dynamically updated based on input.
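The abstract does not name the three approaches, so the following is only an illustrative sketch of the distinction it draws between correcting a word in isolation and correcting it with context. It assumes a toy corpus, Levenshtein distance for candidate generation, and unigram and bigram counts for ranking; the function names and the ranking scheme are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def candidates(word, vocab, max_dist=1):
    """Vocabulary words within max_dist edits of the given word."""
    return sorted(w for w in vocab if edit_distance(word, w) <= max_dist)


def correct_isolated(word, unigrams, max_dist=1):
    """Isolated-word correction: pick the most frequent nearby word."""
    if word in unigrams:
        return word
    cands = candidates(word, unigrams, max_dist)
    if not cands:
        return word  # leave unknown words (e.g. non-English terms) untouched
    return max(cands, key=lambda w: unigrams[w])


def correct_in_context(prev_word, word, unigrams, bigrams, max_dist=1):
    """Context-aware correction: prefer candidates seen after prev_word."""
    if word in unigrams:
        return word
    cands = candidates(word, unigrams, max_dist)
    if not cands:
        return word
    return max(cands, key=lambda w: (bigrams[(prev_word, w)], unigrams[w]))


# Toy corpus (an assumption for illustration, not the thesis dataset).
corpus = ("the engine fan failed . the engine fan failed . "
          "the delivery van arrived").split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

print(correct_isolated("ban", unigrams))                         # fan
print(correct_in_context("delivery", "ban", unigrams, bigrams))  # van
```

In this toy setting the isolated corrector picks the more frequent candidate, while the context-aware one uses the preceding word to choose differently. Both leave a word unchanged when no candidate is close enough, which is one way words outside the vocabulary could pass through uncorrected.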
