Evolving Text Classifier Using Genetic Programming

University essay from Lunds universitet/Avdelningen för Biomedicinsk teknik

Abstract: Text classification is one of the main tasks in natural language processing, a field that has grown significantly over the last decade and found applications across industries. Although existing approaches to text classification, such as machine learning and deep learning, show good results, their shortcomings motivate further research into alternatives. In this thesis, we propose a genetic programming algorithm, a technique inspired by biological evolution, that produces text classification models built from string matchers and character n-grams. This approach requires neither domain-specific knowledge nor manual feature engineering, and it can yield interpretable models. The classification models produced by the proposed algorithm show promising results, especially on topic detection.
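To make the idea of evolving classifiers from string matchers and character n-grams concrete, the following is a minimal sketch and not the algorithm from the thesis. It swaps in a much simpler evolutionary search over flat lists of substring matchers, rather than tree-based genetic programming. All names (`char_ngrams`, `predict`, `fitness`, `evolve`), the mutation scheme, and the parameter values are hypothetical choices made for illustration, and the sketch assumes a binary task with at least one positive example.

```python
import random


def char_ngrams(text, n):
    """Return the set of character n-grams of length n found in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def predict(matchers, text):
    """Label text as positive if any matcher occurs in it as a substring."""
    return any(m in text for m in matchers)


def fitness(matchers, samples):
    """Accuracy of a matcher list on (text, label) pairs."""
    return sum(predict(matchers, t) == y for t, y in samples) / len(samples)


def evolve(samples, n=3, pop_size=20, generations=30, seed=0):
    """Toy evolutionary search (an assumption, not the thesis's method)."""
    rng = random.Random(seed)
    # Building blocks: character n-grams taken from the positive examples.
    pool = sorted(set().union(*(char_ngrams(t, n) for t, y in samples if y)))
    population = [[rng.choice(pool)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        population.sort(key=lambda m: fitness(m, samples), reverse=True)
        survivors = population[:pop_size // 2]
        children = []
        for parent in survivors:
            child = list(parent)
            # Mutation: drop a matcher or add a new one from the pool.
            if rng.random() < 0.5 and len(child) > 1:
                child.pop(rng.randrange(len(child)))
            else:
                child.append(rng.choice(pool))
            children.append(child)
        population = survivors + children
    return max(population, key=lambda m: fitness(m, samples))


# Made-up toy data for a "sports" topic detector.
samples = [
    ("the football match ended", True),
    ("football season starts", True),
    ("stock markets fell", False),
    ("new phone released", False),
]
best = evolve(samples)
print(best, fitness(best, samples))
```

The evolved model is simply a list of substrings, so it can be read directly, which is the kind of interpretability the abstract refers to. A full genetic programming system would evolve richer program structures and use crossover as well as mutation.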
