A study about Active Semi-Supervised Learning for Generative Models

University essay from Linköpings universitet/Institutionen för datavetenskap

Abstract: In many practical scenarios, unlabeled data is abundant while the labeled data needed to train predictive models is scarce. Semi-Supervised Learning and Active Learning are two distinct approaches to this problem. The former uses the unlabeled data directly to improve the learning of model parameters, while the latter selects unlabeled points to send to an annotator, or oracle, which labels them and thereby enlarges the labeled training set. In this context, Generative Models are highly appropriate: because they internally represent the data-generating process, they benefit from data samples whether or not labels are present. This thesis proposes Expectation-Maximization with Density-Weighted Entropy, a novel active semi-supervised learning framework tailored to generative models. The method is explored theoretically, and experiments are conducted to evaluate its application to Gaussian Mixture Models and Multinomial Mixture Models. Based on its partial success, several questions are raised and discussed in order to identify possible improvements and to decide which shortcomings must be addressed before the method can be considered robust and generally applicable.
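To make the idea of choosing query points concrete, the following is a minimal sketch of one common density-weighted uncertainty heuristic for a Gaussian mixture. It is not taken from the thesis and may differ from the criterion it actually proposes: the function names, the one-dimensional two-component mixture, and the scoring rule (posterior entropy multiplied by the mixture density) are all assumptions made for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def density_weighted_entropy(x, weights, means, stds):
    """Score each point by p(x) * H[p(component | x)].

    Illustrative heuristic only (an assumption, not the thesis's definition):
    a point scores highly when the mixture is unsure which component produced
    it AND the point lies in a dense region of the input space.
    """
    # joint[i, k] = weight_k * N(x_i | mean_k, std_k)
    joint = weights * gaussian_pdf(x[:, None], means, stds)
    density = joint.sum(axis=1)              # mixture density p(x_i)
    posterior = joint / density[:, None]     # component responsibilities
    entropy = -np.sum(posterior * np.log(posterior + 1e-12), axis=1)
    return density * entropy

# Hypothetical mixture parameters, e.g. as an EM fit might produce them.
weights = np.array([0.5, 0.5])
means = np.array([-2.0, 2.0])
stds = np.array([1.0, 1.0])

# Hypothetical pool of unlabeled points.
pool = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])

scores = density_weighted_entropy(pool, weights, means, stds)
query_index = int(np.argmax(scores))  # point that would be sent to the oracle
```

In this toy setting the point midway between the two components receives the highest score: its component posterior is maximally uncertain, and it is far more probable under the mixture than the outlying points at the edges of the pool.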
