Risk Stratification of Endometriosis through Machine Learning using Lifestyle Data : An Extensive Analysis on Lifestyle Data to Reveal Patterns in People with Endometriosis

University essay from KTH/Skolan för kemi, bioteknologi och hälsa (CBH)

Abstract: Endometriosis affects 11% of women of reproductive age worldwide. The project used lifestyle data from the Lucy application. The Pearson correlation test was used to find linear correlations between endometriosis and lifestyle factors, while different machine learning models and logistic regression were used to find non-linear correlations. The strongest linear correlation found (-0.23) was with irregular menstruation; however, this score indicates only a weak linear correlation. Decision Tree, Gradient Boosted Decision Tree, XGBoost, Random Forest, and logistic regression were used to find patterns within the dataset. Risk stratification results proved to be unreliable. Decision Tree and its variants show strong evidence of correlation between endometriosis and the following features: weight, irregular menstruation, menstruation length, height, cycle length, irregular cycle, age, pregnancy, and daily symptoms. Additional analysis of those features could give more insight into what may be correlated with, as well as cause, endometriosis.
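
As an illustration of the linear-correlation step described in the abstract, the following minimal sketch computes a Pearson correlation coefficient between a binary diagnosis label and a binary lifestyle feature. It is not the essay's code: the function name `pearson_r`, the variable names, and the toy values are all hypothetical and are not taken from the Lucy dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data for illustration only (NOT from the Lucy dataset):
# 1 = yes, 0 = no for each of eight imaginary users.
has_endometriosis = [1, 1, 1, 0, 0, 0, 0, 1]
irregular_menstruation = [1, 0, 1, 0, 0, 1, 0, 1]

r = pearson_r(has_endometriosis, irregular_menstruation)
print(f"Pearson r = {r:.2f}")
```

The coefficient ranges from -1 to 1, and values near 0 indicate little linear association, which is why a score of magnitude 0.23 is read as weak. The sign depends on how each variable is coded, so the sketch makes no claim about the direction reported in the essay.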
