Language Classification Using Neural Networks

University essay from

Author: Andreas Lindgren; Gustav Lind; [2019]

Abstract: In this project, a model was created that takes an audio sequence as input and classifies the spoken language as either English or French. The project focused on experimenting with different ways of processing audio files and on designing a neural network that maximizes performance on the language classification task. The purpose was to investigate the highest reachable accuracy and to examine what sample length would be appropriate for use in voice control applications. The signal processing part dealt mainly with how enveloping, Mel frequency cepstral coefficients (MFCC) and Mel frequency spectral coefficients (MFSC) could be used to improve the accuracy of the model. The neural network design focused on how the width and depth of the network and the use of dropout could be used to increase performance. The experiments resulted in a model with a maximum accuracy of 92.30% that could outperform humans on samples of approximately 1.2 seconds or shorter. A suitable sample length for use in other applications was concluded to lie in the interval of 0.7 to 1.5 seconds.
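
The abstract names MFSC and MFCC without showing how the two are related. As a rough illustration, the following Python sketch computes log mel filterbank energies (MFSC) for a single power-spectrum frame and then applies a discrete cosine transform to obtain MFCC. It is not the authors' implementation: the function names, the 26-filter bank, the 16 kHz sample rate, the 129-bin frame and the unnormalized DCT-II are all assumptions chosen for the example, and the essay's actual settings are not given in the abstract.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Build triangular filters spaced evenly on the mel scale.

    n_bins is the number of power-spectrum bins covering 0 Hz to Nyquist.
    Returns a list of n_filters rows, each holding n_bins weights.
    """
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    # n_filters + 2 edge points: left edge, the filter centers, right edge.
    mel_points = [i * mel_max / (n_filters + 1) for i in range(n_filters + 2)]
    bin_points = [mel_to_hz(m) / nyquist * (n_bins - 1) for m in mel_points]

    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bin_points[j - 1], bin_points[j], bin_points[j + 1]
        row = []
        for k in range(n_bins):
            if left < k <= center:
                weight = (k - left) / (center - left)    # rising slope
            elif center < k < right:
                weight = (right - k) / (right - center)  # falling slope
            else:
                weight = 0.0
            row.append(weight)
        bank.append(row)
    return bank


def mfsc(power_spectrum, bank, eps=1e-10):
    """Log mel filterbank energies (MFSC) for one frame."""
    return [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
        for row in bank
    ]


def mfcc(mfsc_frame, n_coeffs):
    """Unnormalized DCT-II of the MFSC values, keeping the first n_coeffs."""
    n = len(mfsc_frame)
    return [
        sum(e * math.cos(math.pi * k * (i + 0.5) / n)
            for i, e in enumerate(mfsc_frame))
        for k in range(n_coeffs)
    ]


# Hypothetical input: a flat power spectrum with 129 bins (a 256-point FFT).
bank = mel_filterbank(n_filters=26, n_bins=129, sample_rate=16000)
frame_mfsc = mfsc([1.0] * 129, bank)
frame_mfcc = mfcc(frame_mfsc, n_coeffs=13)
```

In this sketch the MFSC vector keeps one value per mel band, while the MFCC vector is a decorrelated and usually shorter summary of the same frame. Either representation, stacked over the frames of a sample, could serve as network input.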
