Improving Zero-Shot Learning via Distribution Embeddings

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Zero-Shot Learning (ZSL) for image classification aims to recognize images from novel classes for which we have no training examples. A common approach to tackling such a problem is by transferring knowledge from seen to unseen classes using some auxiliary semantic information of class labels in the form of class embeddings. Most of the existing methods represent image features and class embeddings as point vectors, and such vector representation limits the expressivity in terms of modeling the intra-class variability of the image classes. In this thesis, we propose three novel ZSL methods that represent image features and class labels as distributions and learn their corresponding parameters as distribution embeddings. Therefore, the intra-class variability of image classes is better modeled. The first model is a Triplet model, where image features and class embeddings are projected as Gaussian distributions in a common space, and their associations are learned by metric learning. Next, we have a Triplet-VAE model, where two VAEs are trained with triplet based distributional alignment for ZSL. The third model is a simple Probabilistic Classifier for ZSL, which is inspired by energy-based models. When evaluated on the common benchmark ZSL datasets, the proposed methods result in an improvement over the existing state-of-the-art methods for both traditional ZSL and more challenging Generalized-ZSL (GZSL) settings.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)