Point Process Based Phoneme Recognition Acceleration

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Hongliang Qiu; [2019]

Abstract: Stochastic gradient descent (SGD) is the core technique for training deep learning models, and it is well known that SGD suffers from the variance of the gradients in each iteration. Deep learning is already widely used in many applications because of its strong performance in tasks such as image recognition. However, deep learning techniques require long training times, which impedes their broader use. Recent research shows that point process based active mini-batch selection (PP-SGD) can accelerate the training of deep learning models by reducing the variance of the gradients, and it has been successfully tested on image recognition and isolated word recognition tasks. It remains in doubt, however, whether this method benefits deep learning scenarios in general. To address this doubt, we extended PP-SGD to a new deep learning scenario: phoneme recognition.

The open problem with PP-SGD is how general its acceleration ability is across deep learning scenarios. We therefore raise the research question of how well PP-SGD can accelerate training in the phoneme recognition task, which is one of the major deep learning scenarios. The purpose of this research is to investigate the acceleration ability of PP-SGD in the phoneme recognition task. The goal is to extend the PP-SGD acceleration method to general deep learning scenarios. In this research, we adopt a quantitative research methodology.

We implemented PP-SGD on a bi-directional long short-term memory (Bi-LSTM) network for single phoneme recognition and sequenced phoneme recognition tasks. We tested our method on TIMIT, a speech corpus with phoneme labels. The results showed that PP-SGD was able to accelerate training for phoneme recognition models. However, the effectiveness of the acceleration can be weakened by the intrinsically high scatter of the data, which inherently introduces gradient diversity in mini-batches. Based on our results, this research fulfilled the purpose of extending PP-SGD one step further toward general deep learning scenarios.
