Optimizations for uncertainty prediction on streams with decision trees

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Mátyás Manninger; [2018]

Abstract: Many companies use machine learning to make predictions that are critical for their everyday operations. In many cases, knowing the uncertainty of the predictions can be crucial, as decisions are made based on calculated risk. Most uncertainty estimation methods are not designed for an online setting and cannot handle unbounded data streams or concept drift. This study proposes optimizations to two decision-tree-based algorithms designed specifically for the online case: conformal prediction and onlineQRF. Two new parameters are introduced for conformal prediction to trade speed against accuracy. For onlineQRF, replacing the histograms at the leaves of the trees with online quantile sketches is also proposed. These modifications are tested on public datasets, and the empirical results are analyzed in terms of speed and accuracy. The two parameters for conformal prediction had no significant effect on the algorithm and did not improve it in a meaningful way. Changing the data aggregation method at the leaves for onlineQRF reduced prediction time significantly and opened the way to further improvements in accuracy. This empirical study shows an improvement to a state-of-the-art online machine learning algorithm that could be adopted across many industries.
