Can Large Language Models Enhance Fake News Detection? : Improving Fake News Detection With Data Augmentation

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Emil Ahlbäck; Max Dougly; [2023]

Abstract: In recent years, the proliferation of fake news has become a significant concern due to its potential to cause harm and sow discord in societies worldwide. To address this issue, machine learning techniques have been employed in a task referred to as fake news detection (FND) to assess the veracity of textual news content. However, the scarcity of annotated fake news data poses a challenge in developing effective supervised models. This study seeks to address this limitation by examining the impact of various data augmentation techniques on the performance of a state-of-the-art deep learning FND classifier. We investigate the efficacy of different data augmentation techniques, including novel ones based on generating augmentations from pre-trained large language models (Vicuna and ChatGPT) and a method referred to as “greyscaling”, which involves replacing words with less intense synonyms. We evaluate the effectiveness of these data augmentation techniques in combination with BERT-base, a widely used classifier for FND, using small random subsets of two popular datasets: WELFake and LIAR. The results reveal modest but statistically non-significant improvements in classification performance when employing data augmentation techniques. Our findings suggest that textual data augmentation might not always lead to substantial improvements in FND performance and that alternative strategies, such as knowledge-based, propagation-based, and source-based methods, could prove more fruitful. For future research, we recommend investigating data augmentation techniques based on GPT-4, examining the relationship between fake news generation and detection, and exploring fake news detection using Swedish news data to contribute to the existing body of research conducted on non-English datasets.
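
As a rough illustration of the “greyscaling” idea mentioned in the abstract (replacing words with less intense synonyms), the following minimal Python sketch may help. It is not the authors' implementation: the `MILDER` lexicon, the `greyscale` and `augment` function names, and the word pairs are all hypothetical, chosen only to show how a label-preserving augmented copy of a training example could be produced.

```python
import re

# Hypothetical lexicon mapping intense words to milder synonyms.
# These pairs are illustrative assumptions, not taken from the thesis.
MILDER = {
    "furious": "annoyed",
    "slammed": "criticized",
    "disaster": "problem",
    "shocking": "surprising",
}


def greyscale(text: str, lexicon: dict[str, str] = MILDER) -> str:
    """Replace each word found in the lexicon with its milder synonym."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = lexicon.get(word.lower())
        if replacement is None:
            return word  # word is not in the lexicon: keep it unchanged
        # Preserve a leading capital letter, e.g. at the start of a headline.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


def augment(samples: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Return the original samples plus greyscaled copies with the same label."""
    augmented = list(samples)
    for text, label in samples:
        softened = greyscale(text)
        if softened != text:  # only add a copy if something actually changed
            augmented.append((softened, label))
    return augmented
```

Calling `augment` on a list of `(text, label)` pairs returns the originals followed by any softened variants, which could then be passed to a classifier's training loop. A real system would presumably need a much larger synonym resource and some handling of word sense, which this sketch ignores.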
