Multivariate Conditional Distribution Estimation and Analysis

University essay from Uppsala universitet/Institutionen för informationsteknologi

Author: Sander Medri; [2014]


The goals of this thesis were to implement different methods for estimating conditional distributions from data and to evaluate the performance of these methods on data sets with different characteristics. The methods were implemented in C++, and several existing software libraries were also used. Tests were run on artificially generated data sets and on some real-world data sets. The accuracy, run time, and memory usage of the methods were measured. Based on the results, the natural spline, smoothing spline, or k-nearest neighbors methods would potentially be a good first choice for a data set about which little is known. In general, the wavelet method did not seem to perform particularly well. In certain cases, the noisy-OR method could be a faster and possibly more accurate alternative to the popular logistic regression.
