Analyzing user feedback written in multiple languages and automatically identifying requirements from that feedback

University essay from Blekinge Tekniska Högskola/Institutionen för programvaruteknik

Abstract: Background. Requirements gathering is the most important, and often an error-prone, task in the software development cycle. There is no perfect procedure for obtaining the requirements for a product under development. Every organization has its own way of extracting requirements, and the most common is to extract them from user stories. Although requirements from user stories give an idea of what the end user expects, the gathered requirements may be incomplete. In this thesis, we suggest a way to automatically extract requirements from users' feedback tweets, a multilingual dataset. Requirements from feedback, combined with the requirements from user stories, will help developers understand more about what users need and what changes they have to make to improve the product.

Objectives. The objective of this thesis is to perform a systematic literature review (SLR) to identify the state-of-the-art techniques used to automatically classify requirements from multilingual user feedback, then to conduct a case study on Telia users' feedback, which is Twitter data in Swedish and English, and to build a Natural Language Processing (NLP) model that can automatically classify users' feedback.

Methods. The SLR was conducted using forward and backward snowballing with inclusion criteria. To build an automated model, we used a case study approach on Telia's user feedback. We used the libraries SpaCy, sklearn, pandas, and re to build the model. In the developed model, we used tf-idf vectorization for feature extraction and an SVM classifier to classify the tweets into possible requirements. The results were measured using a confusion matrix.

Results. From the SLR, we learned the state-of-the-art techniques and technologies used to classify multilingual data. We identified tf-idf and SVM as well suited to our case study for feature extraction and classification. In the case study, using tf-idf feature extraction and an SVM classifier, we obtained accuracy, precision, and recall scores of 0.73, 0.67, and 0.68 respectively.

Conclusions. The main goal of this thesis is to improve the requirements engineering phase of the software development cycle. This study will help companies that want to develop their products based on users' feedback.
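
The pipeline named in the abstract (tf-idf features fed to an SVM, evaluated with a confusion matrix) can be sketched roughly as below. This is a minimal illustration, not the thesis implementation: the tweets and labels are invented toy data rather than the Telia dataset, the SpaCy preprocessing step is omitted, and the choice of `LinearSVC` with unigram and bigram features is an assumption, since the abstract does not state the SVM variant or the vectorizer settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy tweets in English and Swedish (NOT the Telia data).
# Label 1 = possible requirement, label 0 = not a requirement.
train_texts = [
    "Please add an option to pause my subscription in the app",
    "Snälla lägg till mörkt läge i appen",
    "It would be great if the app showed data usage per day",
    "Jag önskar att man kunde byta abonnemang i appen",
    "Thanks for the quick help today",
    "Tack för bra service i butiken",
    "Good morning everyone, have a nice day",
    "Grattis till alla vinnare i helgen",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

test_texts = [
    "Please add support for paying invoices in the app",
    "Tack för hjälpen idag",
    "Jag önskar att appen visade min surf per dag",
    "Have a nice weekend everyone",
]
test_labels = [1, 0, 1, 0]

# tf-idf turns each tweet into a weighted term vector; the linear SVM
# then separates the two classes in that vector space.
# Assumed settings: lowercase text, unigrams and bigrams.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LinearSVC(),
)
model.fit(train_texts, train_labels)

predicted = model.predict(test_texts)

# Rows are true classes, columns are predicted classes, ordered [0, 1].
cm = confusion_matrix(test_labels, predicted, labels=[0, 1])
accuracy = accuracy_score(test_labels, predicted)
precision = precision_score(test_labels, predicted, average="macro", zero_division=0)
recall = recall_score(test_labels, predicted, average="macro", zero_division=0)

print(cm)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

With so few examples the printed scores say nothing about real performance; the sketch only shows how the three reported metrics relate to the confusion matrix.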
