Improving Estimation Accuracy using Better Similarity Distance in Analogy-based Software Cost Estimation

University essay from Uppsala universitet/Institutionen för informationsteknologi

Abstract: Software cost estimation plays an increasingly important role in practice as modern software projects grow more complex and diverse. To help estimate software development cost accurately, this research systematically analyzes the similarity distances used in analogy-based software cost estimation and, on that basis, proposes a new non-orthogonal space distance (NoSD) as a measure of similarity between real software projects. Unlike currently adopted measures such as the Euclidean distance, NoSD not only lets different features carry different importance for cost estimation, but also assumes that project features depend on one another, that is, that the feature axes are non-orthogonal, whereas the Euclidean distance treats the features as independent. Under these assumptions, NoSD describes the non-orthogonal angles between feature axes using feature redundancy and represents the feature weights using feature relevance, where both redundancy and relevance are defined in terms of mutual information. The resulting distance can better reveal the real dependency relationships between real-life software projects. Experiments show that it reduces MMRE by up to 13.1% and increases PRED(0.25) by up to 12.5% on the ISBSG R8 dataset, and by up to 7.5% and 20.5% respectively on the Desharnais dataset. Furthermore, to better fit the complex distribution of real-life software project data, this research uses the particle swarm optimization (PSO) algorithm to optimize the proposed distance, yielding a PSO-optimized non-orthogonal space distance (PsoNoSD) that further improves estimation accuracy. In the experiments, compared with the commonly used Euclidean distance, PsoNoSD improves estimation accuracy by 38.73% in terms of MMRE and 11.59% in terms of PRED(0.25) on the ISBSG R8 dataset. On the Desharnais dataset, the improvements are 23.38% and 24.94% respectively. In summary, the methods proposed in this research, which rest on theoretical study as well as systematic experiments, address some problems of currently used techniques and notably improve software cost estimation accuracy.
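
To make the ideas in the abstract concrete, the following Python sketch shows one way to measure a weighted distance in a space with non-orthogonal axes, together with the two accuracy metrics quoted above. It is an illustration under stated assumptions, not the thesis's implementation. The abstract does not say how redundancy is mapped to axis angles or how relevance is mapped to weights, so the function `non_orthogonal_distance` and its parameters `weights` and `axis_cosines` are hypothetical and take those quantities as given inputs. MMRE and PRED(0.25) follow their usual definitions: the mean magnitude of relative error, and the fraction of projects whose relative error is at most 0.25.

```python
import numpy as np


def non_orthogonal_distance(x, y, weights, axis_cosines):
    """Weighted distance between two projects when feature axes are not orthogonal.

    weights      -- per-feature weights (assumed stand-in for feature relevance)
    axis_cosines -- symmetric matrix holding the cosine of the angle between
                    each pair of feature axes, with ones on the diagonal
                    (assumed stand-in for feature redundancy); it must be
                    positive semidefinite for the result to be a valid length
    """
    diff = np.asarray(weights, dtype=float) * (
        np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    )
    # Squared length in oblique coordinates: sum_ij d_i * d_j * cos(theta_ij).
    return float(np.sqrt(diff @ np.asarray(axis_cosines, dtype=float) @ diff))


def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|actual - predicted| / actual)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / actual))


def pred(actual, predicted, level=0.25):
    """Fraction of estimates whose relative error is at most `level`."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    relative_error = np.abs(actual - predicted) / actual
    return float(np.mean(relative_error <= level))


if __name__ == "__main__":
    identity = np.eye(2)
    correlated = np.array([[1.0, 0.5], [0.5, 1.0]])  # axes 60 degrees apart

    # With orthogonal axes and unit weights this is the Euclidean distance.
    print(non_orthogonal_distance([1, 1], [0, 0], [1, 1], identity))
    # With correlated axes the same coordinate difference has another length.
    print(non_orthogonal_distance([1, 1], [0, 0], [1, 1], correlated))

    actual_effort = [100, 200, 400]
    estimated_effort = [110, 140, 400]
    print(mmre(actual_effort, estimated_effort))
    print(pred(actual_effort, estimated_effort))
```

When `axis_cosines` is the identity matrix, the function reduces to a weighted Euclidean distance, which is the independence assumption the abstract contrasts NoSD against. Note that a lower MMRE and a higher PRED(0.25) both indicate better estimates.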
