Detecting Performance Anomalies in a Mobile Application with Unsupervised Machine Learning

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Lukas Saari; [2019]

Abstract: Unsupervised anomaly detection algorithms are applied to identify performance regressions in a mobile application. To evaluate detection performance, a labeled artificial data set is generated that is based on a real data set and aims to reflect its properties. In addition to evaluating multiple classes of anomaly detection algorithms, the data set was manipulated in different ways to reduce variance and yield continuous time series, and several approaches were used to encode categorical features as numerical values. The best results were achieved by the isolation forest algorithm, without any data set manipulations and with randomized encodings for all categorical features as well as the timestamp. Using a randomized encoding for anomaly detection is a previously unexplored research area, and it is shown to improve performance by making anomalies more separable and reducing the effects of masking. In conclusion, the results are deemed to demonstrate that anomalies can be detected in the studied data set, and this report serves as a satisfactory proof of concept. The results are, however, not regarded as sufficient for the outlined methodology to be implemented in a production setting, especially because of low detection rates for anomalies of small magnitude. Suggestions for future work are given regarding the encoding method, feature selection, other algorithms that would be of interest to evaluate, and a clustering and filtering strategy applied to the detected anomalies to reduce the number of false positives.
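
The abstract does not specify how the randomized encoding is constructed. As a minimal sketch of one possible reading, the Python snippet below assigns each distinct category a distinct integer code in a random order. The function names, column names, and sample records are hypothetical and are not taken from the essay; the essay's actual scheme, including its treatment of the timestamp, may differ.

```python
import random


def random_encoding(values, seed=0):
    """Map each distinct category to a distinct integer code in random order.

    Assumption: this is only one way to randomize an encoding; it is not
    confirmed to be the scheme used in the essay.
    """
    categories = sorted(set(values))
    codes = list(range(len(categories)))
    random.Random(seed).shuffle(codes)
    return dict(zip(categories, codes))


def encode_rows(rows, categorical_columns, seed=0):
    """Return rows as lists of numbers plus the per-column mappings used."""
    columns = list(rows[0])  # keep the column order of the first record
    mappings = {}
    for offset, col in enumerate(categorical_columns):
        # Use a different seed per column so the columns are shuffled independently.
        mappings[col] = random_encoding([row[col] for row in rows], seed=seed + offset)
    encoded = [
        [mappings[col][row[col]] if col in mappings else row[col] for col in columns]
        for row in rows
    ]
    return encoded, mappings


# Hypothetical records: column names and values are illustrative only.
rows = [
    {"device": "phone_a", "os": "android", "latency_ms": 120.0},
    {"device": "phone_b", "os": "ios", "latency_ms": 95.0},
    {"device": "phone_a", "os": "ios", "latency_ms": 900.0},
]
encoded, mappings = encode_rows(rows, ["device", "os"])
```

The resulting numeric rows could then be passed to an isolation forest implementation, for example scikit-learn's `IsolationForest`. Repeating the encoding with different seeds would give different numeric layouts of the same categories.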
