An empirical assessment of the predictive quality of internal product metrics to predict software maintainability in practice

University essay from Blekinge Tekniska Högskola/Institutionen för programvaruteknik

Abstract: Background. Maintainability of software products continues to be an area of importance and interest for both practice and research. Maintenance usually accounts for more than 70% of the total time spent in the software development process. A large number of metrics have been suggested as indicators of the maintainability of a software product. However, there is a gap in validating the proposed source code metrics against maintainability as an external quality attribute.

Objectives. In this thesis, we aim to catalog the metrics proposed for software maintainability. From this catalog, we validate a subset of commonly proposed maintainability indicators.

Methods. Through a literature review with a systematic search and selection approach, we collated maintainability metrics from secondary studies on software maintainability. A subset of the commonly proposed metrics identified in the literature review was validated in a retrospective study. The retrospective study used a large open source system, Elasticsearch, as the case. We collected internal source code metrics and a proxy for the maintainability of the system for 911 bug fixes in 14 versions of the product (11 experimental samples and 3 verification samples).

Results. Following a systematic search and selection process, we identified 11 secondary studies on software maintainability. From these studies, we identified 290 source code metrics that are claimed to be indicators of the maintainability of a software product. We used mean time to repair (MTTR) as a proxy for the maintainability of a product. Our analysis reveals that, for Elasticsearch, the values of the four indicators LOC, CC, WMC and RFC have the strongest correlation with MTTR.

Conclusions. In this thesis, we validated a subset of commonly proposed source code metrics for predicting maintainability. The empirical validation using a popular, large-scale open source system reveals that some metrics show a stronger correlation than others with a proxy for maintainability in use. This study provides important empirical evidence towards a better understanding of source code attributes and maintainability in practice. However, a single case and a retrospective study are insufficient to establish a cause-effect relation. Therefore, further replications of our study design with more diverse cases can increase confidence in the predictive ability, and thus the usefulness, of the proposed metrics.
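The kind of analysis described in the abstract, relating a source code metric to MTTR across versions, can be illustrated with a minimal Python sketch. The abstract does not state which correlation coefficient the thesis used, so the Spearman rank correlation here is an assumption, as are all function names and every number in the sample data, which are hypothetical and not taken from the study.

```python
from datetime import datetime
from math import sqrt


def mean_time_to_repair(fixes):
    """Average hours between a bug being opened and being closed.

    `fixes` is a list of (opened, closed) datetime pairs.
    """
    hours = [(closed - opened).total_seconds() / 3600 for opened, closed in fixes]
    return sum(hours) / len(hours)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Hypothetical per-version values, for illustration only.
    loc_per_version = [410_000, 455_000, 470_000, 520_000, 560_000]
    mttr_hours_per_version = [30.5, 41.0, 38.2, 55.9, 61.3]
    rho = spearman(loc_per_version, mttr_hours_per_version)
    print(f"Spearman rho (LOC vs MTTR): {rho:.2f}")
```

A rank-based coefficient is used in this sketch because it does not assume a linear relation between a metric and MTTR. As the abstract notes, a correlation found this way on a single case does not establish a cause-effect relation.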

