Machine Learning to minimize the human efforts in annotating the RDG-Map dialogue acts
Abstract: Annotation is the process of labelling data. When done manually, it can be very time-consuming and mentally straining, depending on the type of data. Dialogue acts are conversational interactions, and annotating them manually is a difficult task. This thesis investigates how including machine learning in the annotation process can improve it. The dialogue acts were taken from the RDG-Map game, which was designed to improve people's knowledge of geography. The dataset uses 24 labels. The work has two parts. First, the type of machine learning task is chosen between text classification and token classification. Second, machine learning is integrated into the backend of the annotation process to measure its effect. BERT and DistilBERT models were used for both tasks, and token classification with the DistilBERT model gave the best performance. Label Studio is an open-source annotation application that also supports a machine learning backend. Two experiments were set up in Label Studio, one with a machine learning backend and one without. The average time taken to annotate a single dialogue act was 10.7 seconds with the machine learning backend and 12.5 seconds without it. On these averages, the process with machine learning is about 1.8 seconds faster per dialogue act, roughly 15% faster than the manual process. The inter-annotator agreement was also better in the annotation with the machine learning backend. The thesis shows that including machine learning can improve the process of dialogue act annotation.
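The size of the speed-up can be checked directly from the two reported averages. The short sketch below does only that arithmetic; the variable names are illustrative and not taken from the thesis, and the result depends on whether the gain is measured as time saved per dialogue act or as additional dialogue acts annotated per unit of time.

```python
# Average annotation time per dialogue act, as reported in the abstract (seconds).
manual_seconds = 12.5    # without machine learning backend
assisted_seconds = 10.7  # with machine learning backend

# Absolute time saved per dialogue act.
saved_seconds = manual_seconds - assisted_seconds

# Relative reduction in time per dialogue act, measured against the manual process.
time_reduction = saved_seconds / manual_seconds

# Relative increase in dialogue acts annotated per unit of time.
throughput_gain = manual_seconds / assisted_seconds - 1

print(f"Saved per dialogue act: {saved_seconds:.1f} s")
print(f"Reduction in annotation time: {time_reduction:.1%}")
print(f"Increase in annotation throughput: {throughput_gain:.1%}")
```

On the reported averages this gives a saving of 1.8 seconds, a 14.4% reduction in annotation time, and a 16.8% increase in throughput.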