Using Machine Learning to Learn from Bug Reports : Towards Improved Testing Efficiency

University essay from Linköpings universitet/Institutionen för datavetenskap

Abstract: The evolution of a software system originates from its changes, whether it comes from changed user needs or adaption to its current environment. These changes are as encouraged as they are inevitable, although every change to a software system comes with a risk of introducing an error or a bug. This thesis aimed to investigate the possibilities of using the description of bug reports as a decision basis for detecting the provenance of a bug by using machine learning. K-means and agglomerative clustering have been applied to free text documents by using Natural Language Processing to initially divide the investigated software system into sub parts. Topic labelling is further on performed on the found clusters to find suitable names and get an overall understanding for the clusters.Finally, it was investigated if it was possible to find which cluster that were more likely to cause a bug from certain clusters and should be tested more thoroughly. By evaluating a subset of known causes, it was found that possible direct connections could be found in 50% of the cases, while this number increased to 58% if the cause were attached to clusters. 

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)