Data Fusion and Text Mining for Supporting Journalistic Work

University essay from Linnéuniversitetet / Department of Computer Science and Media Technology (DM)

Abstract: During the past several decades, journalists have been struggling with the ever-growing amount of data on the internet. Investigating the validity of sources or finding similar articles for a story can consume a lot of time and effort. These issues are further amplified by the shrinking staff of news agencies. The solution is to empower the remaining professional journalists with digital tools created by computer scientists. This thesis project is inspired by the idea of providing software support for journalistic work through interactive visual interfaces and artificial intelligence. More specifically, within the scope of this thesis project, we created a backend module that supports several text mining methods, such as keyword extraction, named entity recognition, sentiment analysis, and fake news classification, as well as data collection from various data sources, to help professionals in the field of journalism. To implement our system, we first gathered requirements from several researchers and practitioners in journalism, media studies, and computer science, and then acquired domain knowledge by reviewing the literature on current approaches. The results are evaluated both quantitatively, through individual component benchmarks, and qualitatively, by analyzing the outcomes of semi-structured interviews with collaborating and external domain experts. Our results show that the domain experts' perceived value of the components corresponds to the components' performance in the individual evaluations. This indicates that there is potential in this research area and that future work would be welcomed by the journalistic community.
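To illustrate the kind of backend module the abstract describes, the sketch below bundles three of the named text mining steps (keyword extraction, named entity recognition, and sentiment analysis) behind one class. This is only a minimal illustration: the thesis does not publish its code, so the class name TextMiningBackend, the choice of spaCy and NLTK's VADER analyzer, and all parameters are assumptions made for demonstration.

```python
# Illustrative sketch only; not the thesis implementation.
# Assumes: pip install spacy nltk, python -m spacy download en_core_web_sm,
# and nltk.download("vader_lexicon") have been run.
from collections import Counter

import spacy
from nltk.sentiment import SentimentIntensityAnalyzer


class TextMiningBackend:
    """Hypothetical backend bundling text mining steps named in the abstract."""

    def __init__(self):
        self.nlp = spacy.load("en_core_web_sm")
        self.sentiment = SentimentIntensityAnalyzer()

    def keywords(self, text, top_n=5):
        # Simple frequency-based keyword extraction over lemmatized content words.
        doc = self.nlp(text)
        tokens = [t.lemma_.lower() for t in doc
                  if t.is_alpha and not t.is_stop
                  and t.pos_ in {"NOUN", "PROPN", "ADJ"}]
        return [word for word, _ in Counter(tokens).most_common(top_n)]

    def named_entities(self, text):
        # Named entity recognition via spaCy's pretrained pipeline.
        return [(ent.text, ent.label_) for ent in self.nlp(text).ents]

    def sentiment_scores(self, text):
        # Rule-based sentiment analysis (VADER): neg/neu/pos/compound scores.
        return self.sentiment.polarity_scores(text)


if __name__ == "__main__":
    backend = TextMiningBackend()
    article = "The news agency in Stockholm praised the new fact-checking tool."
    print(backend.keywords(article))
    print(backend.named_entities(article))
    print(backend.sentiment_scores(article))
```

A fake news classifier or crawlers for external data sources could be exposed through the same interface; the thesis itself evaluates each such component with individual benchmarks rather than prescribing a particular library.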
