An Evaluation of Various On-line Classification Algorithms in Nonstationary Environments

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Authors: Hugo Bellem Westin; Karolina Shi [2020]

Abstract: On-line classification algorithms are useful where data is streamed in large volumes, which is common in today's society. However, when the data starts to drift (i.e., concept drift), prediction accuracy may drop. In weather forecasting, for example, seasonal changes, climate change, and deteriorating weather sensors can all cause drift in the data. The purpose of this study is to evaluate and give an overview of how some current on-line classification algorithms handle concept drift. Dynamic Weighted Majority (DWM), Adaptive Random Forest (ARF), and K-Nearest Neighbor with ADWIN (KNNA) are tested on a Gaussian dataset and on the SEA dataset. Naïve Bayes (NB) is included as a benchmark because it is well known and not adapted to handle concept drift. The results showed that Naïve Bayes had distinguishably lower accuracy on both datasets. KNNA had slightly higher accuracy than the rest on the Gaussian dataset. All on-line algorithms performed similarly on the SEA dataset. Across datasets, the algorithms had measurably lower accuracy on the SEA dataset.
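
The kind of evaluation described in the abstract can be illustrated with a minimal sketch. It is not the essay's code. It assumes a test-then-train (prequential) protocol, a SEA-like stream without label noise whose thresholds (8, 9, 7, 9.5) are taken as the commonly used SEA concepts, and a plain incremental Gaussian Naïve Bayes with no drift handling. The names `sea_stream`, `GaussianNB` and `prequential_accuracy` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def sea_stream(n, thresholds=(8.0, 9.0, 7.0, 9.5), seed=0):
    """Yield (x, y) pairs from a SEA-like stream with abrupt drift (assumed setup).

    Three features are uniform on [0, 10]; the label is 1 when the first two
    sum to at most the current threshold. The threshold switches at equally
    spaced points in the stream, which plays the role of concept drift.
    """
    rng = random.Random(seed)
    block = max(1, n // len(thresholds))
    for i in range(n):
        theta = thresholds[min(i // block, len(thresholds) - 1)]
        x = [rng.uniform(0.0, 10.0) for _ in range(3)]
        yield x, int(x[0] + x[1] <= theta)


class GaussianNB:
    """Incremental Gaussian Naive Bayes with no forgetting mechanism."""

    def __init__(self):
        # label -> [count, per-feature means, per-feature sums of squared deviations]
        self.stats = {}

    def learn(self, x, y):
        if y not in self.stats:
            self.stats[y] = [0, [0.0] * len(x), [0.0] * len(x)]
        s = self.stats[y]
        s[0] += 1
        for j, v in enumerate(x):  # Welford's running mean/variance update
            d = v - s[1][j]
            s[1][j] += d / s[0]
            s[2][j] += d * (v - s[1][j])

    def predict(self, x):
        total = sum(s[0] for s in self.stats.values())
        best, best_score = 0, -math.inf  # predicts 0 before any training
        for y, (n, mean, m2) in self.stats.items():
            score = math.log(n / total)  # log prior
            for j, v in enumerate(x):
                var = m2[j] / n + 1e-9  # small floor avoids division by zero
                score -= 0.5 * (math.log(2 * math.pi * var) + (v - mean[j]) ** 2 / var)
            if score > best_score:
                best, best_score = y, score
        return best


def prequential_accuracy(model, stream):
    """Test on each sample first, then train on it; return overall accuracy."""
    correct = seen = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)
        model.learn(x, y)
        seen += 1
    return correct / seen


if __name__ == "__main__":
    acc = prequential_accuracy(GaussianNB(), sea_stream(4000))
    print(f"prequential accuracy: {acc:.3f}")
```

Because this model never discards old statistics, samples from earlier concepts keep influencing its decision boundary after each threshold change. A drift-aware learner such as those named in the abstract could be passed to the same loop for comparison, provided it exposes the same `predict` and `learn` methods.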
