Static code metrics vs. process metrics for software fault prediction using Bayesian network learners

University essay from Mälardalens högskola/Akademin för innovation, design och teknik

Abstract: Software fault prediction (SFP) has an important role in the process of improving software product quality by identifying fault-prone modules. Constructing quality models includes a usage of metrics that describe real world entities defined by numbers or attributes. Examining the nature of machine learning (ML), researchers proposed its algorithms as suitable for fault prediction. Moreover, information that software metrics contain will be used as statistical data necessary to build models for a certain ML algorithm. One of the most used ML algorithms is a Bayesian network (BN), which is represented as a graph, with a set of variables and relations between them. This thesis will be focused on the usage of process and static code metrics with BN learners for SFP. First, we provided an informal review on non-static code metrics. Furthermore, we created models that contained different combinations of process and static code metrics, and then we used them to conduct an experiment. The results of the experiment were statistically analyzed using a non-parametric test, the Kruskal-Wallis test. The informal review reported that non-static code metrics are beneficial for the prediction process and its usage is highly recommended for industrial projects. Finally, experimental results did not provide a conclusion which process metric gives a statistically significant result; therefore, a further investigation is needed.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)