Can Wizards be Polyglots: Towards a Multilingual Knowledge-grounded Dialogue System

University essay from Uppsala universitet/Institutionen för lingvistik och filologi

Abstract: The research of open-domain, knowledge-grounded dialogue systems has been advancing rapidly due to the paradigm shift introduced by large language models (LLMs). While the strides have improved the performance of the dialogue systems, the scope is mostly monolingual and English-centric. The lack of multilingual in-task dialogue data further discourages research in this direction. This thesis explores the use of transfer learning techniques to extend the English-centric dialogue systems to multiple languages. In particular, this work focuses on five typologically diverse languages, of which well-performing models could generalize to the languages that are part of the language family as the target languages, hence widening the accessibility of the systems to speakers of various languages. I propose two approaches: Multilingual Retrieval-Augmented Dialogue Model (xRAD) and Multilingual Generative Dialogue Model (xGenD). xRAD is adopted from a pre-trained multilingual question answering (QA) system and comprises a neural retriever and a multilingual generation model. Prior to the response generation, the retriever fetches relevant knowledge and conditions the retrievals to the generator as part of the dialogue context. This approach can incorporate knowledge into conversational agents, thus improving the factual accuracy of a dialogue model. In addition, xRAD has advantages over xGenD because of its modularity, which allows the fusion of QA and dialogue systems so long as appropriate pre-trained models are employed. On the other hand, xGenD takes advantage of an existing English dialogue model and performs a zero-shot cross-lingual transfer by training sequentially on English dialogue and multilingual QA datasets. Both automated and human evaluation were carried out to measure the models' performance against the machine translation baseline. The result showed that xRAD outperformed xGenD significantly and surpassed the baseline in most metrics, particularly in terms of relevance and engagingness. Whilst xRAD performance was promising to some extent, a detailed analysis revealed that the generated responses were not actually grounded in the retrieved paragraphs. Suggestions were offered to mitigate the issue, which hopefully could lead to significant progress of multilingual knowledge-grounded dialogue systems in the future. 

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)