Automatic Voice Trading Surveillance : Achieving Speech and Named Entity Recognition in Voice Trade Calls Using Language Model Interpolation and Named Entity Abstraction

University essay from Uppsala universitet/Avdelningen Vi3

Abstract: This master thesis explores the effectiveness of interpolating a larger generic speech recognition model with smaller domain-specific models to enable transcription of domain-specific conversations. The study uses a corpus within the financial domain collected from the web and processed by abstracting named entities such as financial instruments, numbers, as well as names of people and companies. By substituting each named entity with a tag representing the entity type in the domain-specific corpus, each named entity can be replaced during the hypothesis search by words added to the systems pronunciation dictionary. Thus making instruments and other domain-specific terms a matter of extension by configuration.  A proof-of-concept automatic speech recognition system with the ability to transcribe and extract named entities within the constantly changing domain of voice trading was created. The system achieved a 25.08 Word Error Rate and 0.9091 F1-score using stochastic and neural net based language models. The best configuration proved to be a combination of both stochastic and neural net based domain-specific models interpolated with a generic model. This shows that even though the models were trained using the same corpus, different models learned different aspects of the material. The study was deemed successful by the authors as the Word Error Rate was improved by model interpolation and all but one named entities were found in the test recordings by all configurations. By adjusting the amount of influence the domain-specific models had against the generic model, the results improved the transcription accuracy at the cost of named entity recognition, and vice versa. Ultimately, the choice of configuration depends on the business case and the importance of named entity recognition versus accurate transcriptions. 

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)