Generative AI for Synthetic Data

University essay from Lunds universitet/Institutionen för reglerteknik

Abstract: Synthetic data generation has emerged as a valuable technique for addressing data scarcity, mitigating privacy concerns, and improving machine learning algorithms. This thesis aims to advance the field of synthetic data generation, which may play a crucial role in AI-heavy industries such as telecommunications. Generative Adversarial Networks (GANs) successfully generate various types of synthetic data but fall short when modelling the temporal patterns and conditional distributions of time series data. The state-of-the-art TimeGAN has shown promise but leaves room for refinement. We propose T2GAN, which builds on TimeGAN’s framework of combining unsupervised and supervised training and extends it with state-of-the-art machine learning techniques such as Transformers. Through experimental evaluation on various benchmark data sets, we quantify the effectiveness of T2GAN and find that it significantly surpasses TimeGAN in both discriminative and predictive capacities. Our results demonstrate a 38% enhancement in similarity measures and a 55% reduction in relative prediction error when using synthetic training data. Furthermore, the thesis presents a comprehensive literature study and analysis of generative models, and details the potential of T2GAN in various domains: enabling privacy-preserving data analysis, facilitating research and development, and enhancing machine learning algorithms.
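The abstract does not spell out how the discriminative and predictive figures are computed. As a rough illustration only, the sketch below assumes the evaluation convention commonly used with TimeGAN: a discriminative score defined as the distance of a real-versus-synthetic classifier's accuracy from chance (0.5), and a relative reduction in prediction error between two models. The function names and all numbers are hypothetical and are not taken from the thesis.

```python
def discriminative_score(accuracy: float) -> float:
    """Distance of a real-vs-synthetic classifier's accuracy from chance.

    Assumed convention: 0.0 means the classifier cannot tell synthetic
    from real data; 0.5 means it separates them perfectly.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    return abs(accuracy - 0.5)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of an error metric relative to a baseline."""
    if baseline <= 0.0:
        raise ValueError("baseline error must be positive")
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical prediction errors (e.g. mean absolute error) for models
    # trained on synthetic data from a baseline and an improved generator.
    baseline_error = 0.20
    improved_error = 0.09
    reduction = relative_reduction(baseline_error, improved_error)
    print(f"relative reduction in prediction error: {reduction:.0%}")

    # Hypothetical classifier accuracy on a held-out real/synthetic mix.
    print(f"discriminative score: {discriminative_score(0.62):.2f}")
```

Under this convention, lower values are better for both metrics, so an improvement appears as a reduction relative to the baseline generator.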
