Designing Variational Autoencoders for Image Retrieval

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Sara Torres Fernandez; [2018]

Abstract: The explosive growth of visual data acquired on the Internet has raised interest in developing advanced image retrieval systems. The main problem lies in searching for a specific image among large collections or databases, an issue shared by many users from a variety of domains, such as crime prevention, medicine, and journalism. To address this, the project focuses on variational autoencoders for image retrieval. Variational autoencoders (VAEs) are neural networks used for the unsupervised learning of complicated distributions by means of stochastic variational inference. Traditionally, they have been used for image reconstruction or generation. The goal of this thesis, however, is to test variational autoencoders for the classification and retrieval of different images from a database. The thesis investigates several methods to achieve the best performance for image retrieval applications. We use the latent variables in the bottleneck stage of the VAE as the learned features for the image retrieval task. To achieve fast retrieval, we focus on discrete latent features. Specifically, the sigmoid function for binarization and the Gumbel-Softmax method for discretization are investigated. The tests show that using the mean of the latent variables as features generally gives better performance than using their stochastic representations. Further, discrete features that use the Gumbel-Softmax method in the latent space show good performance, close to the maximum a posteriori performance achieved with a continuous latent space.
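
The two discretization ideas named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`binarize`, `gumbel_softmax`, `hamming`, `retrieve`), the 0.5 threshold, and the use of Hamming distance for ranking are all assumptions made here for illustration, and the latent logits are taken as given rather than produced by a trained encoder.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binarize(logits, threshold=0.5):
    """Deterministic binary code: threshold the sigmoid of each latent logit.

    The threshold value is an assumption of this sketch.
    """
    return [1 if sigmoid(v) >= threshold else 0 for v in logits]


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed one-hot sample from a categorical with these logits."""
    noisy = []
    for v in logits:
        # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)).
        # u is clamped away from 0 and 1 to keep both logarithms finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        g = -math.log(-math.log(u))
        noisy.append((v + g) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    m = max(noisy)
    exps = [math.exp(n - m) for n in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code, database_codes, k=1):
    """Indices of the k database codes nearest to the query (Hamming distance)."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return order[:k]
```

Lower temperatures push the Gumbel-Softmax sample toward a one-hot vector, while the thresholded sigmoid gives a fixed binary code for a given input. Binary codes of this kind can be compared with cheap bitwise operations, which is one common reason discrete features are attractive for fast retrieval.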
