Conditional Generative Flow for Street Scene Generation

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Moein Sorkhei; [2020]


Abstract: Generative modeling is a major branch of machine learning concerned with designing models that learn how data are generated and can therefore synthesize novel data. With recent advances in deep learning, generative models have improved significantly and have been applied successfully in a variety of domains, including computer vision, video generation, audio generation, and medical applications. Among the different categories of generative models, GAN-based models [1] are by far the best known and have been applied to a variety of computer vision tasks because of their ability to synthesize large images. Recent advances in likelihood-based models [2, 3], however, suggest that these models could be used in place of GAN-based methods while offering substantial benefits such as stable training and useful learned representations. These advances enable likelihood-based models to generate images that are as realistic as those of GAN-based models. In this project, we study modern generative models and their categorization, with a focus on computer vision tasks. We study Glow [2], a recent flow-based generative model, in detail and extend its architecture for conditional image generation tasks. We evaluate the model against its most popular GAN-based counterpart [4] and show that it could be an alternative to GAN-based models while offering advantages that GANs inherently lack. We also show that by generalizing the operations in Glow [2] so that they are all conditioned on features of the condition input, we can generate more visually appealing results than recent Glow-based conditional models [5, 6].
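
To make the idea of conditioning a flow operation concrete, the following is a minimal sketch of a conditional affine coupling step, one of the operation types used in Glow-style flows. It is an illustration under stated assumptions, not the essay's architecture: the function names (`make_conditioner`, `coupling_forward`, `coupling_inverse`) are hypothetical, a small fixed random linear map stands in for the deep network that would normally extract features from the condition input, and the log-scale is squashed with `tanh` as a simplification.

```python
import math
import random


def make_conditioner(in_dim, cond_dim, out_dim, seed=0):
    """Hypothetical stand-in for a conditioning network: a fixed random linear map."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim + cond_dim)]
               for _ in range(2 * out_dim)]

    def conditioner(x_a, cond):
        # The condition features are concatenated with the untouched half of the input.
        inputs = list(x_a) + list(cond)
        raw = [sum(w * v for w, v in zip(row, inputs)) for row in weights]
        log_s = [math.tanh(r) for r in raw[:out_dim]]  # bounded log-scales
        shift = raw[out_dim:]                          # unconstrained shifts
        return log_s, shift

    return conditioner


def coupling_forward(x, cond, conditioner):
    """Transform the second half of x using parameters computed from x_a and cond."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, shift = conditioner(x_a, cond)
    y_b = [xb * math.exp(ls) + t for xb, ls, t in zip(x_b, log_s, shift)]
    log_det = sum(log_s)  # log |det Jacobian| of the affine map
    return x_a + y_b, log_det


def coupling_inverse(y, cond, conditioner):
    """Exactly undo coupling_forward given the same condition."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_s, shift = conditioner(y_a, cond)
    x_b = [(yb - t) * math.exp(-ls) for yb, ls, t in zip(y_b, log_s, shift)]
    return y_a + x_b


conditioner = make_conditioner(in_dim=2, cond_dim=3, out_dim=2)
x = [0.3, -1.2, 0.7, 2.0]
cond = [1.0, 0.0, 0.5]
y, log_det = coupling_forward(x, cond, conditioner)
x_rec = coupling_inverse(y, cond, conditioner)
```

Because the first half of the input passes through unchanged, the inverse can recompute the same scale and shift from it and the condition, so the step stays exactly invertible and its log-determinant is just the sum of the log-scales. That is the property that lets a flow be trained by exact likelihood even when its parameters depend on a condition.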
