Artificial Neural Networks in Swedish Speech Synthesis

University essay from KTH/Tal-kommunikation

Author: Per Näslund; [2018]

Keywords: Speech Synthesis; neural; LSTM; Speech Technology; Tacotron; Attention; CNN; Neural Networks; RNN;

Abstract: Text-to-speech (TTS) systems have entered our daily lives in the form of smart assistants and many other applications. Contemporary research applies machine learning and artificial neural networks (ANNs) to synthesize speech. It has been shown that these systems outperform the older concatenative and parametric methods. In this paper, ANN-based methods for speech synthesis are explored and one of the methods is implemented for the Swedish language. The implemented method is dubbed “Tacotron” and is a first step towards end-to-end ANN-based TTS which puts many different ANN techniques to work. The resulting system is compared to a parametric TTS through a strength-of-preference test that is carried out with 20 Swedish-speaking subjects. A statistically significant preference for the ANN-based TTS is found. Test subjects indicate that the ANN-based TTS performs better than the parametric TTS when it comes to audio quality and naturalness but sometimes lacks in intelligibility.

