aiLangu - Real-time Transcription and Translation to Reduce Language Barriers : An Engineering Project to Develop an Application for Enhancing Human Verbal Communication

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: This report concerns real-time automatic transcription and translation. The purpose of the work is to reduce perceived language barriers online by building a user-friendly application that uses the latest deep learning technology to transcribe and translate in real time. The application could be used in a work environment (especially when working from home) and for leisure activities such as watching videos. There is currently most likely no application that uses automatic speech recognition in this way. The most similar applications found were mainly comparable to Google Translate, which are not meant for real-time use on a computer but instead wait for a complete input and write out the result only once it is finished. The application created for this purpose is a desktop application that combines OpenAI's Whisper model for transcription with Argos Translate for translation, behind a user-friendly GUI created with Java Swing. An iterative and incremental methodology was used for both the GUI design and the software development. In the end, the development was successful, resulting in a working desktop application that accomplishes the goals of transcribing and translating in real time through a user-friendly interface, and which could for example easily be used for digital meetings or online videos.
