Analysis of the effect of latent dimensions on disentanglement in Variational Autoencoders

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Joakim Dahl; [2021]

Abstract: Disentanglement is a subfield of representation learning in which we not only believe that useful properties can be extracted from the data in a more compact form, but also envision that the data itself is generated from a lower-dimensional set of explanatory factors. Explanatory factors are an ambiguous concept, and what they represent varies with the dataset. A dataset of flowers may have stem size and color as explanatory factors, while for another dataset they may be location or position. The explanatory factors often interact in complex ways to generate the data. Disentanglement can be summarized as breaking the potentially complex interaction between the explanatory factors to liberate them from one another. The liberated explanatory factors can then form the foundation of the representations, a procedure that is believed to benefit downstream machine learning tasks. Disentangling the explanatory factors in an unsupervised setting has proven difficult for many reasons, perhaps most notably the lack of knowledge of how many there are and what they reflect. To be able to evaluate the degree of disentanglement attained, we consider a dataset annotated with target labels corresponding to the explanatory factors that generated the data. Knowing the number of explanatory factors gives an indication of what dimensionality the representation should have in order to at least be able to capture all of them. Many of the empirical studies considered in this paper treat the dimensionality of the representations as a constant when evaluating the degree of disentanglement achieved. The purpose of this paper is to extend the discussion of disentanglement by treating the dimensionality of the representations as a variable to be varied and investigating how this affects the degree of disentanglement achieved.
The experiments performed in this paper suggest, however, that disentanglement attained in a high-dimensional representation space is difficult for the human eye to interpret and evaluate by visual inspection. One is therefore even more reliant on the disentanglement scores, which do not require any human interaction for the evaluation. The disentanglement scores seem to exhibit a static behaviour, not changing as much as one would expect given the visual inspection. Therefore, investigating how the representation dimensionality affects the disentanglement attained among the representations is a delicate matter. Many of the empirical studies considered in this paper suggest that it is mostly the regularization that impacts disentanglement. It does, however, seem that there are far more parameters than originally expected that need further evaluation to determine their impact on disentanglement.
