An empirical investigation on modern code review focus areas

University essay from Blekinge Tekniska Högskola/Institutionen för programvaruteknik

Abstract: Background: In a sustaining, durable project, an effective code review process is key to ensuring the long-term quality of the code base. As the size of the software continues to increase, although the code inspections have many benefits, the time it takes, the manpower makes it not a good method in some larger projects.  Nowadays more and more industry performs modern code reviews for their project in order to increase the quality of the program. Only a few papers have studied the relationship between code reviewers and code review quality. We need to explore the relationships among code review, code complexity, and reviewers. Finding out which part of the code the reviewers pay more attention to in the code review and how much effort it takes to review. This way we can conduct code reviews more effectively. Objectives: The objective of our study is to investigate if code complexity relates to how software developers to review code in terms of code review length, review frequency, review text quality, reviewer’s sentiment. What’s more, we want to research if the reviewer’s experience will have an impact on code review quality. In order to find a suitable way to conduct a code review for different complexity codes.  Methods: We conduct an exploratory case study. The case and unit of analysis is the open-source project, Cassandra. We extract data from Cassandra Jira (a proprietary issue tracking product), the data are the reviewer’s name, review content, review time, reviewer’s comments, reviewer’s sentiment, comment length, and the review file(java file). Then we use CodeMR to calculate the complexity of the file, it uses some coupling and code complexity metrics. The reviewer’s sentiment is analyzed by a text analysis API. After we collect all these data we use SPSS to do a statistic analysis, to find whether there are relationships between code complexity and these factors. What’s more, we have a workshop and send out questionnaires to collect more input from Cassandra developers. Results: The results show that code review frequency is related to code complexity, complex code requires more review. Reviewer’s sentiment is related to code complexity, reviewer’s sentiment towards complex code is more positive or negative rather than neutral. Code review text quality is related to the reviewer’s experience, experienced reviewers leave a comment with higher quality than novice reviewers. On the other hand, the code review length and review text quality are not related to code complexity. Conclusions: According to the results, the code with higher code complexity related to the more frequent review, and the reviewer's emotions are more clear when reviewing more complex code. Training experienced reviewers are also very necessary because the results show that experienced reviewers review the code with higher quality. From the questionnaire, we know developers believe that more complex code needs more iterations of code review and experienced reviewers do have a positive effect on code review, which gives us a guide on how to do code review based on a different level of code complexity. 

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)