Application of transfer learning in text classification for small and medium sized web-based enterprises

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Victor Oldensand; Simon Haglund; [2022]

Abstract: In recent years, the open sourcing of pretrained machine learning models through platforms like Hugging Face has lowered the barriers to entry in big data analysis. This thesis studies how web-based organisations can use such models, with a focus on text classification. The use of pretrained models, known as transfer learning, is evaluated against the traditional supervised machine learning approach. As a case study, this report investigates Feeder, an RSS feed content manager that aims to classify the content its users read into 10 predefined categories. To this end, a naive Bayes model is developed to represent the traditional approach, and a pretrained transformer model is used to represent the transfer learning approach. The two classifiers are then evaluated separately on efficiency and accuracy. The results indicate that the transfer learning approach yields more accurate predictions, whereas the traditional models may be less computationally intensive. Furthermore, this report analyses the business case for transfer learning through the lens of consumer profiling theory and Porter’s five forces. An interview with Feeder’s chief technical officer suggests that the technology has unlimited potential uses, and that advances in processing power and cloud computing have made it substantially more feasible in practice.
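
For readers unfamiliar with the traditional baseline named in the abstract, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing, written in plain Python. It is not the thesis implementation, which is not shown on this page. The class name, the whitespace tokenizer, the two category labels and the toy training sentences are all assumptions made for illustration; the case study itself uses 10 categories.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the thesis or from Feeder.
train_texts = [
    "team wins the football match",
    "player scores a late goal",
    "new phone released with faster chip",
    "software update improves laptop battery",
]
train_labels = ["sports", "sports", "technology", "technology"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("goal in the football match")
print(prediction)
```

A model of this kind only needs token counts, which is consistent with the abstract's remark that traditional models may be less computationally intensive than a pretrained transformer.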
