Summarizing the Results of a Series of Experiments : Application to the Effectiveness of Three Software Evaluation Techniques

University essay from Blekinge Tekniska Högskola/Sektionen för datavetenskap och kommunikation

Abstract: Software quality has become, and persistently remains, a major concern among software users and developers, so the importance of software evaluation cannot be overemphasized. It is an accepted fact in software engineering that software must undergo an evaluation process during development to ascertain and improve its quality level. There are more evaluation techniques than a single developer could master, yet it is impossible to be certain that software is free of defects. It may therefore not be realistic or cost-effective to remove all software defects prior to product release. It is thus crucial for developers to be able to choose, from the available evaluation techniques, the one most suitable and most likely to yield optimum quality results for a given product; it boils down to choosing the most appropriate technique for each situation. However, not much knowledge is available on the strengths and weaknesses of the available evaluation techniques. Most of the information about them focuses on how to apply the techniques rather than on their applicability conditions: practical information, suitability, strengths, weaknesses, etc. This research focuses on contributing to the available applicability knowledge of software evaluation techniques. More precisely, it focuses on code reading by stepwise abstraction as representative of the static techniques, and on equivalence partitioning (a functional technique) and decision coverage (a structural technique) as representatives of the dynamic techniques. The specific focus of the research is to summarize the results of a series of experiments conducted to investigate the effectiveness of these techniques, among other factors.
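To make the two dynamic techniques concrete, the following is a minimal sketch on a hypothetical function, `shipping_fee`, which is not one of the programs used in the experiments. It assumes the usual textbook reading of the techniques: equivalence partitioning derives one representative input per class from the specification, while decision coverage selects inputs so that every decision in the code evaluates to both true and false.

```python
def shipping_fee(weight_kg: float) -> int:
    """Hypothetical spec: reject non-positive weights; fee is 5 up to 10 kg, 9 above."""
    if weight_kg <= 0:            # decision 1
        raise ValueError("weight must be positive")
    if weight_kg <= 10:           # decision 2
        return 5
    return 9


# Equivalence partitioning (functional): one representative per input class,
# chosen from the specification without looking at the code.
equivalence_cases = {"invalid": -1.0, "light": 4.0, "heavy": 25.0}

# Decision coverage (structural): inputs chosen from the code so that each
# decision takes both outcomes.
#   decision 1 is True for -1.0 and False for 4.0
#   decision 2 is True for 4.0 and False for 25.0
decision_cases = [-1.0, 4.0, 25.0]


def run(weight_kg: float):
    """Execute one test case and record either the fee or the string 'error'."""
    try:
        return shipping_fee(weight_kg)
    except ValueError:
        return "error"


results = {name: run(w) for name, w in equivalence_cases.items()}
```

On a function this small the two techniques happen to select the same inputs; on larger programs they generally diverge, which is one reason their effectiveness is worth comparing empirically.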
By effectiveness in this research, we mean the potential of each dynamic technique to generate test cases capable of revealing software faults, or the ability of the static technique to generate abstractions that aid the detection of faults. The experiments used two versions of three different programs, with seven different faults seeded into each program. This work uses the results of eight different experiments, performed and analyzed separately, to explore this effectiveness. The analysis results were pooled and jointly summarized to extract common knowledge from the experiments, using a qualitative deduction approach created in this work, as it was decided not to use formal aggregation at this stage. Since the experiments were performed by different researchers, in different years, and in some cases at different sites, several problems had to be tackled before the results could be summarized. Among them: the data files exist in different languages, the structure of the files differs, different names are used for data fields, the analyses were done using different confidence levels, etc. The first step, taken at the inception of this research, was to apply all the techniques to the programs used during the experiments in order to detect the faults. The purpose of this hands-on experience was to become familiar with the faults, the failures, the programs, and the experimental situations in general, and to better understand the data as recorded from the experiments. Afterwards, the data files were recreated to conform to a uniform language, data meaning, file style, and structure. A well-structured directory was created to keep all the data, analysis, and experiment files for all the experiments in the series. These steps paved the way for a feasible synthesis of the results.
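The harmonization step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the column names, the languages they are written in, the CSV layout, and the `harmonize` helper are all hypothetical and are not taken from the actual experiment files.

```python
import csv
import io

# Hypothetical mapping from site-specific column names to one canonical schema.
# The real data files and their field names are not described in the abstract.
FIELD_ALIASES = {
    "tecnica": "technique", "technique": "technique",
    "programa": "program", "prog": "program", "program": "program",
    "falta": "fault", "fault_id": "fault", "fault": "fault",
    "detectada": "detected", "found": "detected", "detected": "detected",
}

# Hypothetical spellings of "yes" that different sites might have used.
TRUE_VALUES = {"1", "yes", "si", "true"}


def harmonize(raw_csv: str, experiment_id: str) -> list[dict]:
    """Rename columns to the canonical schema and normalize the detection flag."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        row = {"experiment": experiment_id}
        for name, value in record.items():
            canonical = FIELD_ALIASES.get(name.strip().lower())
            if canonical is None:
                continue  # unknown column: left out of the uniform file
            row[canonical] = value.strip()
        # Convert the textual flag to a boolean; cell values are otherwise
        # kept as recorded (translating them is omitted from this sketch).
        row["detected"] = row.get("detected", "").lower() in TRUE_VALUES
        rows.append(row)
    return rows


site_a = harmonize("Tecnica,Programa,Falta,Detectada\nfuncional,P1,F1,si\n", "exp1")
site_b = harmonize("technique,prog,fault_id,found\nstructural,P2,F2,no\n", "exp2")
```

Once every file shares the same field names and value encoding, records from separate experiments can be placed side by side, which is the precondition for any joint summary.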
Using our method, technique, program, and fault were selected as main effects, and program-technique, program-fault, and technique-fault as interaction effects, as these carry significant knowledge relevant to the summary of the analyses. The results reported in this thesis indicate that the functional technique and the structural technique are equally effective as far as the programs and faults in these experiments are concerned, and that both perform better than code reading. The analysis also revealed that the effectiveness of the techniques is influenced by the fault type and the program type. Some faults were more readily detected in certain programs, some were better detected with certain techniques, and the techniques themselves yielded different results on different programs.
