Primary stage Lung Cancer Prediction with Natural Language Processing-based Machine Learning

University essay from KTH/Skolan för kemi, bioteknologi och hälsa (CBH)

Abstract: Early detection reduces lung cancer mortality, but it remains a challenge for oncologists and healthcare systems. In addition, screening modalities such as CT scans have undesired effects: many patients with suspected lung cancer are wrongly diagnosed with the disease. This thesis contributes to solving the challenge of early lung cancer detection by using a unique data set of self-reported symptoms. The proposed method is a predictive machine learning algorithm based on natural language processing, which treats the data as unstructured. For comparison, a previous study that built a prediction model with conventional multivariate machine learning on the same data is replicated and presented. After evaluation, validation and interpretation, a set of variables was highlighted as early predictors of lung cancer. The proposed approach matched the performance of the conventional approach. This promising result opens the way for further development in which such an approach could be used in clinical decision support systems. Future work could then involve other modalities in a multimodal machine learning approach.
