Multi-view versus single-view machine learning for disease diagnosis in primary healthcare

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Aleksandar Labroski; [2018]

Abstract: This report compares two machine learning approaches to disease diagnosis prediction in primary healthcare: single-view and multi-view machine learning. Disease diagnosis prediction here means predicting a (possible) diagnosis for a given patient based on her past medical history. The problem area is extensive, especially considering that there are over 14,400 unique possible diagnoses (grouped into 22 high-level categories) that can be considered as prediction targets. This work takes the high-level categories as prediction targets and applies the two machine learning techniques in an attempt to get close to an optimal solution. The multi-view machine learning paradigm was chosen because it can improve the predictive performance of classifiers in settings with multiple heterogeneous data sources (different views of the same data), which is exactly the case here. To compare the single-view and multi-view paradigms (both based on supervised learning), several experiments are devised that explore the possible solution space under each paradigm. The work also touches on related machine learning concepts such as ensemble learning, stacked generalization and dimensionality reduction-based learning. The results show that multi-view stacked generalization is a powerful paradigm that can significantly improve predictive performance in a supervised learning setting. The models' performance was evaluated using F1 scores, and we observed an average increase of 0.04 and a maximum increase of 0.114 F1 score points.
The findings also show that the approach of multi-view stacked ensemble learning is particularly well suited as a noise reduction technique and works well in cases where the feature data is expected to contain a notable amount of noise. This can be beneficial and of interest to projects where the features are not manually chosen by domain experts.
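
To make the idea of multi-view stacked generalization concrete, the following is a minimal sketch in Python using scikit-learn. It is not the implementation from the essay. The synthetic data, the split of columns into two views, the choice of logistic regression for both the base and the meta learner, and the helper names fit_multiview_stack and predict_multiview_stack are all assumptions made for illustration. One base classifier is trained per view, their out-of-fold class probabilities are concatenated, and a meta classifier is trained on that concatenation. A single-view model trained on all columns at once serves as the comparison, and both are scored with macro-averaged F1 (the averaging mode is also an assumption).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic stand-in for patient data (assumption: the real data is not available here).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)

# Hypothetical views: two disjoint column ranges standing in for two data sources.
views = [slice(0, 10), slice(10, 20)]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def fit_multiview_stack(X_train, y_train, views):
    """Train one base model per view and a meta model on their predictions."""
    base_models, meta_features = [], []
    for cols in views:
        model = LogisticRegression(max_iter=1000)
        # Out-of-fold probabilities keep the meta model from seeing
        # predictions made on data the base model was fitted on.
        oof = cross_val_predict(model, X_train[:, cols], y_train,
                                cv=5, method="predict_proba")
        meta_features.append(oof)
        # Refit on the full training set for use at prediction time.
        base_models.append(model.fit(X_train[:, cols], y_train))
    meta_model = LogisticRegression(max_iter=1000)
    meta_model.fit(np.hstack(meta_features), y_train)
    return base_models, meta_model


def predict_multiview_stack(base_models, meta_model, X, views):
    """Stack per-view probabilities and let the meta model decide."""
    meta_features = [m.predict_proba(X[:, cols])
                     for m, cols in zip(base_models, views)]
    return meta_model.predict(np.hstack(meta_features))


base_models, meta_model = fit_multiview_stack(X_train, y_train, views)
stack_pred = predict_multiview_stack(base_models, meta_model, X_test, views)

# Single-view baseline: one model on all columns concatenated.
single_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
single_pred = single_model.predict(X_test)

stack_f1 = f1_score(y_test, stack_pred, average="macro")
single_f1 = f1_score(y_test, single_pred, average="macro")
print(f"single-view macro F1: {single_f1:.3f}")
print(f"multi-view stacked macro F1: {stack_f1:.3f}")
```

On synthetic data like this, the two scores may come out close or in either order; the sketch shows only the mechanics of the comparison and does not reproduce the reported gains.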
