EALRTS: A predictive regression test selection tool

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Erik Lundsten; [2019]


Abstract: Regression testing is the process of confirming that a code change did not introduce any test failures into the current build. One commonly used regression testing technique is regression test selection (RTS). RTS identifies all tests affected by a code change by building a dependency graph of the project, and then executes only the selected tests. The purpose of RTS is to reduce development time by lowering the time spent on testing. Machine learning has been used as a test selection tool in recent studies and has shown promising results. In this work, machine learning was combined with an RTS tool to further reduce the number of tests selected. The features are primarily extracted from the dependency graph produced by the RTS tool. The machine learning model then estimates the probability that each test fails, and tests are selected based on that probability. However, training a machine learning model requires a large amount of data, including faulty code changes: code defects need to be run through the RTS tool while data is extracted from the test executions. For open source projects, obtaining a large number of historical code defects is challenging. This paper presents EALRTS, a predictive regression test selection tool. EALRTS uses mutation generation instead of historical code defects. The data for the machine learning model is obtained with the help of STARTS, a static RTS tool. The extracted data comes mainly from two sources: (1) the dependency graph that STARTS creates, and (2) the test result reports. The extracted data is then used to train a Random Forest model whose goal is to predict which tests to select. EALRTS reduced the number of tests selected by 60.3% while finding 95% of all failed tests. The recall rate is interpreted as the number of individual test failures found within a test class.
The results show a trade-off between the number of individual test failures found and the number of tests selected. The trade-off suggests that a machine learning model can drastically lower the number of tests selected at the cost of a slight reduction in recall rate. The results for EALRTS are based on one case study: 725 test runs on a project consisting of 808 Java files.
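
The trade-off between tests selected and failures found can be illustrated with a minimal sketch. It is not the EALRTS implementation: the function names, the threshold values, the test class names, and all probabilities and failure counts below are hypothetical and chosen only for illustration. The sketch assumes a trained model has already assigned each test class a failure probability, and it counts recall over individual test failures inside the selected test classes.

```python
from typing import Dict, List, Tuple


def select_tests(failure_prob: Dict[str, float], threshold: float) -> List[str]:
    """Keep the test classes whose predicted failure probability meets the threshold."""
    return sorted(name for name, prob in failure_prob.items() if prob >= threshold)


def evaluate(selected: List[str],
             candidates: List[str],
             failures: Dict[str, int]) -> Tuple[float, float]:
    """Return (reduction, recall) for one selection.

    reduction: fraction of candidate test classes that were skipped.
    recall: fraction of individual test failures that sit in selected classes.
    """
    reduction = 1.0 - len(selected) / len(candidates)
    total_failures = sum(failures.values())
    if total_failures == 0:
        return reduction, 1.0  # nothing to find, so nothing was missed
    found = sum(failures.get(name, 0) for name in selected)
    return reduction, found / total_failures


# Hypothetical model output: predicted failure probability per test class.
predicted = {"ATest": 0.91, "BTest": 0.05, "CTest": 0.40, "DTest": 0.02, "ETest": 0.15}
# Hypothetical ground truth: number of failing test methods per test class.
observed_failures = {"ATest": 3, "CTest": 1}

for threshold in (0.1, 0.3, 0.5):
    chosen = select_tests(predicted, threshold)
    reduction, recall = evaluate(chosen, list(predicted), observed_failures)
    print(f"threshold={threshold:.1f} selected={chosen} "
          f"reduction={reduction:.0%} recall={recall:.0%}")
```

Raising the threshold skips more test classes but eventually drops a class that contains real failures, which is the kind of trade-off the abstract describes.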
