Privacy Preserving Survival Prediction With Graph Neural Networks

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: In the development of novel cancer drugs, one important aspect is to identify patient populations with a high risk of early death so that resources can be focused on patients with the highest unmet medical need. Many cancer types are heterogeneous, and there is a need to distinguish patients with aggressive disease, meaning a high risk of early death, from patients with indolent disease, meaning a low risk of early death. Predictive modeling can be a useful tool for risk stratification in clinical practice, enabling healthcare providers to treat high-risk patients early and progressively, while applying a less aggressive watch-and-wait strategy for patients with a lower risk of death. This is important from a clinical perspective, but also from a health economic perspective: society has limited resources, and costly drugs should be given to the patients who can benefit the most from a specific treatment. Thus, the goal of predictive modeling is to ensure that the right patient has access to the right drug at the right time. In the era of personalized medicine, Artificial Intelligence (AI) applied to high-quality data will most likely play an important role, and many techniques have been developed. In particular, Graph Neural Networks (GNNs) are a promising tool since they capture the complexity of high-dimensional data modeled as a graph. In this work, we have applied Network Representation Learning (NRL) techniques to predict survival, using pseudonymized patient-level data from national health registries in Sweden. Over the last decade, more health data of increasing complexity has become available for research, and precision medicine could take advantage of this trend to bring better healthcare to patients. However, it is important to develop reliable prediction models that not only perform well but also take privacy into consideration, avoiding any leakage of personal information.
The present study contributes novel insights into GNN performance in different survival prediction tasks, using unique population-based nationwide data. We also explored how privacy methods affect the performance of the models when applied to the same dataset. We conducted a set of experiments across 6 datasets using 8 models, measuring AUC, precision, and recall. Our evaluation results show that Graph Neural Networks reached performance close to that of the models used in clinical practice and consistently outperformed the traditional machine learning methods by at least 4.5%. Furthermore, the study demonstrated that graph modeling, when based on knowledge from clinical experts, performed well and showed high resilience to the noise introduced for privacy preservation.
