An empirical comparison of generative capabilities of GAN vs VAE

University essay from KTH/Datavetenskap

Author: Norma Cristina Cueto Ceilis; Hanna Peters; [2022]


Abstract: Generative models are a family of machine learning algorithms that aspire to enable computers to understand the real world. Their ability to capture the underlying distribution of data allows them to generate synthetic data resembling the data they are trained on. One field in which these networks are particularly useful is image synthesis, from which the medical field could benefit when images of diseases are scarce. In this report, a Convolutional Variational Autoencoder (CVAE) and a Deep Convolutional Generative Adversarial Network (DCGAN) were compared with respect to their ability to generate synthetic images. The models were trained on two different datasets, the MNIST digits dataset and the Fashion-MNIST dataset, and their performance was measured using the Fréchet inception distance (FID), a recently proposed evaluation metric. Our results showed that the DCGAN was superior to the CVAE at generating synthetic images: it achieved lower, i.e. better, FID scores by a significant margin on the great majority of tests. However, the DCGAN showed a larger variation in FID scores, whereas the CVAE's scores were more consistent. These findings are in line with previous work in the field, both regarding the relative performance of the models and the characteristics of FID. Although no formal visual inspection of the images was conducted in this report, we present the generated images for both models for the curious reader and for further research.
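
The abstract evaluates image quality with the Fréchet inception distance. As a rough illustration only, not the authors' implementation, the sketch below computes FID from two matrices of Inception-v3 activations (one for real images, one for generated images); the function name, variable names, and the assumption that activations are already extracted are all illustrative.

    # Minimal FID sketch, assuming precomputed activation matrices of shape
    # [n_samples, n_features] for real and generated images.
    import numpy as np
    from scipy.linalg import sqrtm

    def frechet_inception_distance(real_acts: np.ndarray, fake_acts: np.ndarray) -> float:
        # Mean and covariance of each set of activations.
        mu_r, sigma_r = real_acts.mean(axis=0), np.cov(real_acts, rowvar=False)
        mu_f, sigma_f = fake_acts.mean(axis=0), np.cov(fake_acts, rowvar=False)

        # Squared distance between the means.
        diff = mu_r - mu_f

        # Matrix square root of the covariance product; drop small imaginary
        # parts caused by numerical error.
        covmean = sqrtm(sigma_r @ sigma_f)
        if np.iscomplexobj(covmean):
            covmean = covmean.real

        # FID = ||mu_r - mu_f||^2 + Tr(sigma_r + sigma_f - 2 * sqrt(sigma_r sigma_f))
        return float(diff @ diff + np.trace(sigma_r + sigma_f - 2.0 * covmean))

Lower values indicate that the statistics of generated and real images are closer, which is why the report treats lower FID scores as better.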
