Subtree Hashing of Tests in Build Systems : Rust Tricorder

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Software applications are built by teams of developers that constantly iterate over the codebase. Software projects rely on a build system, which handles the management of dependencies, compilation, testing, and deployment of the software. The execution of the tests during each build allow developers to validate that their changes do not introduce regressions. However, the execution of the test suite during each build can take a long time, potentially impacting the development process. To facilitate quicker feedback, build systems use incremental building in order to avoid the reprocessing of unmodified artifacts. This is achieved by maintaining a cache of source files, and only rebuilding artifacts that differ from their cached version. Yet, changing any part of a source file invalidates the cache, triggering the re-execution of unmodified tests. This focus over an entire file can be misleading to the build system, as it can not determine whether the actual function being tested has changed, thus invoking redundant re-testing. In this thesis, we propose a finer-grained approach to caching within build systems, by caching components within the Abstract Syntax Tree instead of entire source files. We compare their hashes on subsequent runs, in order to identify components that have changed. The potential advantage of this strategy is that re-running a specific test that has not been modified will leverage the use of caches even if the file that contains it has been modified. We implement our approach in a system called TRICORDER, and integrate it within a build system called WARP. TRICORDER works by analyzing RUST source code in order to identify the test cases that have not been changed, such as through the addition of comments, or modifications of unrelated functions. This can benefit developers by avoiding the re-execution of tests that are unmodified. We evaluate our approach against 4 notable, open-source RUST projects, targeting a set of 16 tests within them. We first analyze the accuracy with which TRICORDER detects the internal dependencies of a test function, which is needed for the code slicing done by TRICORDER, in order to cache code items related to the target test function. We then introduce artificial changes to our study subjects in order to determine whether or not TRICORDER indicates tests that need to be re-run. Finally, we analyze the ability of TRICORDER to identify real changes based on the commit history of our study subjects. Our results show that the more granular approach to caching can avoid the unnecessary recompilation and re-execution of test cases. An important direction for future work is to extend the current implementation to support the entire set of RUST features in order to evaluate TRICORDER on a larger set of study subjects.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)