Predicting True Sepsis and Culture-positive Sepsis in Intensive Care Unit with Machine Learning Techniques

University essay from Lunds universitet/Matematisk statistik

Abstract: Sepsis is a serious medical condition that often requires intensive care, and many researchers have applied mathematical techniques to aid its diagnosis. This thesis uses logistic regression and a machine learning technique, XGBoost, to predict true sepsis (as opposed to sepsis mimics) and culture-positive sepsis (among true sepsis cases) in critical care from blood test results, physiological measurements and other patient characteristics. The dataset used to build the prediction models covers 2,667 patients and 105 variables. A considerable portion of these variables have missing values, which are filled in with imputation techniques. The models are evaluated with the area under the receiver operating characteristic curve (AUC), estimated by cross-validation. Because the dataset contains imputed values, a modified cross-validation technique is used: imputed values are used only in the training phase, while the testing phase uses only the original, unaltered data. Variables are selected and analysed with forest plots for the regression models, and with importance plots and SHAP value plots for the XGBoost models. The results show that XGBoost performs better than the regression models. In predicting true sepsis, the XGBoost model achieves an AUC of 0.74 and the regression model an AUC of 0.72. In predicting culture positivity, the XGBoost model achieves an AUC of 0.77 and the regression model an AUC of 0.74. Both XGBoost and the regression models were effective in predicting true sepsis and culture-positive sepsis.
The performance of these prediction models could potentially be improved with a larger dataset. Mathematical models can therefore serve as valuable aids in supporting medical professionals' clinical judgement.
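
The evaluation scheme described in the abstract (imputation applied to the training data only, test data left unaltered, performance summarised by AUC) can be illustrated with a minimal sketch. It is not the thesis code. The function names `auc` and `train_imputed_folds`, the use of column-mean imputation, and the round-robin fold assignment are all assumptions made for illustration; the abstract does not state which imputation method or fold scheme was used, nor how the fitted models score test rows that still contain gaps.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def train_imputed_folds(rows, k):
    """Yield (train_imputed, test_raw) pairs for k-fold cross-validation.

    rows: list of feature lists, with None marking a missing value.
    Assumption: missing training values are replaced by the column mean of
    the training fold. Test rows are returned exactly as they were observed.
    """
    n_cols = len(rows[0])
    for fold in range(k):
        # Hypothetical round-robin fold assignment by row index.
        test = [r for i, r in enumerate(rows) if i % k == fold]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        # Fill values come from the training fold only, so nothing about
        # the test fold leaks into the imputation.
        fill = []
        for j in range(n_cols):
            observed = [r[j] for r in train if r[j] is not None]
            fill.append(mean(observed) if observed else None)
        train_imputed = [
            [fill[j] if v is None else v for j, v in enumerate(r)] for r in train
        ]
        yield train_imputed, test
```

In each fold, a model would be fitted on `train_imputed` and scored on `test_raw`, and the resulting scores passed to `auc` together with the true labels.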
