Forecasting anomalies in time series data from online production environments

University essay from Linköpings universitet/Institutionen för datavetenskap

Abstract: Anomaly detection on time series forecasts is useful in many industries, especially in early-warning systems that predict anomalies before they happen. Infor (Sweden) AB is a software company that provides Enterprise Resource Planning cloud solutions. Infor is interested in predicting anomalies in its data, which is the motivation for this thesis work. The general idea is first to forecast the time series and then to detect and classify anomalies in the forecasted values. The forecasting is done using two strategies: the recursive strategy and the direct strategy. The recursive strategy is implemented with two methods, AutoRegressive Integrated Moving Average and Neural Network AutoRegression. The direct strategy is implemented with ForecastML-eXtreme Gradient Boosting. The three methods are then compared on forecasting performance. Anomaly detection and classification are done with a decision rule based on a threshold. Since the true anomaly thresholds were not known in advance, an arbitrary initial anomaly threshold is set by combining statistical methods for outlier detection with the human judgement of the company commissioners. The statistical methods are Seasonal and Trend decomposition using Loess + InterQuartile Range, Twitter + InterQuartile Range, and Twitter + GESD (Generalized Extreme Studentized Deviate). Once an anomaly threshold has been defined for the usage context of Infor (Sweden) AB, a decision rule is set and used to classify anomalies in the time series forecasts. The results from comparing the classifications of the forecasts from the three forecasting methods are unsatisfactory, and no recommendation is made concerning which model or algorithm Infor (Sweden) AB should use. However, the thesis work concludes by recommending other methods that can be tried in future research.
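
The threshold-based decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: it applies the InterQuartile Range rule directly to raw historical values, skipping the decomposition step and the human adjustment of the threshold, and all names (`quantile`, `iqr_thresholds`, `classify_forecast`), the multiplier `k = 1.5`, and the sample numbers are assumptions made for illustration.

```python
from typing import List, Tuple


def quantile(sorted_values: List[float], q: float) -> float:
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def iqr_thresholds(history: List[float], k: float = 1.5) -> Tuple[float, float]:
    """Lower and upper anomaly thresholds from the IQR of historical values.

    Assumption: k = 1.5 is the conventional multiplier, not a value
    taken from the thesis.
    """
    ordered = sorted(history)
    q1 = quantile(ordered, 0.25)
    q3 = quantile(ordered, 0.75)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def classify_forecast(forecast: List[float], lower: float, upper: float) -> List[bool]:
    """Decision rule: a forecasted value outside [lower, upper] is an anomaly."""
    return [value < lower or value > upper for value in forecast]


# Hypothetical data: observed history and a three-step forecast.
history = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 12.0]
forecast = [11.5, 12.5, 19.0]

lower, upper = iqr_thresholds(history)
flags = classify_forecast(forecast, lower, upper)
print(lower, upper, flags)
```

In this sketch only the last forecasted value falls outside the thresholds and is flagged. The forecast itself could come from any of the three methods named in the abstract; the decision rule does not depend on which one produced it.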
