Just-In-Time Software Defect Prediction using version control tool based software metrics and source code embeddings

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Christopher Dahlén; [2019]


Abstract: Software development is a multifactorial process. Its complexity has made it challenging to study the circumstances that underlie efficient software development. However, a better understanding of these factors will reduce the long-term costs of software development. Just-In-Time Software Defect Prediction (JIT SDP) is defined as the problem of predicting whether a given piece of software is defective based on data from the development process. It is a growing research field because of its potential to increase development efficiency by identifying defective code at an early stage. The objective of this thesis is to test the applicability of Bayesian Neural Networks (BNNs) in JIT SDP and to investigate the application of source code embeddings in JIT SDP. The data consist of two proprietary datasets containing software metrics and source code snippets logged by a version control tool. The results show that the performance of the BNN is comparable to that of the state-of-the-art model. Unlike the state-of-the-art model, the BNN exhibits lower uncertainty in correct predictions than in incorrect predictions. However, more research is needed to establish the applicability of BNNs in JIT SDP. Furthermore, the results show a significant improvement in accuracy (6.7%) and in Matthews Correlation Coefficient (13%) when source code embeddings are used in addition to the software metrics. This indicates that source code embeddings are useful in JIT SDP.
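
The abstract reports both accuracy and Matthews Correlation Coefficient (MCC). As a minimal sketch of how the two metrics relate, the following Python computes each from the four cells of a binary confusion matrix. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the thesis or its datasets. Returning 0.0 when the MCC denominator is zero is a common convention, assumed here.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 10 defective and 90 clean changes,
# with a classifier that labels every change as clean.
always_clean = dict(tp=0, tn=90, fp=0, fn=10)
print(accuracy(**always_clean), mcc(**always_clean))

# Hypothetical perfect classifier on the same data.
perfect = dict(tp=10, tn=90, fp=0, fn=0)
print(accuracy(**perfect), mcc(**perfect))
```

In the first hypothetical case, accuracy is high even though no defective change is found, while MCC is zero. This is one reason to report MCC alongside accuracy when defective changes are the minority class.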
