Predict future software defects through machine learning

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Björn Hickman; Victor Holmqvist; [2021]

Abstract: The thesis aims to investigate the implications of machine-learning-based software defect prediction for project management. In addition, the study aims to examine which features of a code base are useful for making such predictions. The features examined are both organisational and technical in nature, and previous studies indicate that they correlate with the introduction of software defects. The machine learning algorithms used in the study are random forest, logistic regression and naive Bayes. The data was collected from an open-source Git repository, VSCode, and the correct classifications of reported defects were taken from GitHub Issues. The results indicate that both technical features of a code base and organisational factors can be useful when predicting future software defects. All three algorithms showed similar performance. Furthermore, the ML models presented in this study show some promise as a complementary tool in project management decision-making, specifically for decisions regarding planning, risk assessment and resource allocation. However, further studies in this area are of interest in order to confirm the findings of this study and its limitations.
