Essays about: "audio segmentation"
Showing results 1-5 of 12 essays containing the words audio segmentation.
-
1. Analysis of speaking time and content of the various debates of the presidential campaign : Automated AI analysis of speech time and content of presidential debates based on the audio using speaker detection and topic detection
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: The field of artificial intelligence (AI) has grown rapidly in recent years and its applications are becoming more widespread in various fields, including politics. In particular, presidential debates have become a crucial aspect of election campaigns and it is important to analyze the information exchanged in these debates in an objective way to let voters choose without being influenced by biased data.
-
2. Swedish Language End-to-End Automatic Speech Recognition for Media Monitoring using Deep Learning
University essay from Luleå tekniska universitet/Institutionen för system- och rymdteknik
Abstract: In order to extract relevant information from speech recordings, the general approach is to first convert the audio into transcribed text. The text can then be analysed using well-researched methods. NewsMachine AB provides customers with an overview of how they are represented in media by analysing articles in text form.
-
3. Automatic Podcast Chapter Segmentation : A Framework for Implementing and Evaluating Chapter Boundary Models for Transcribed Audio Documents
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Podcasts are an exponentially growing audio medium where useful and relevant content should be served, which requires new methods of information sorting. This thesis is the first to look into the state-of-the-art problem of segmenting podcasts into chapters (structurally and topically coherent sections).
-
4. Speaker Diarization System for Call-center data
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: To answer the question "who spoke when", speaker diarization (SD) is a critical step for many speech applications in practice. The task of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application that checks the customer's identity from a phone call.
-
5. Text and Speech Alignment Methods for Speech Translation Corpora Creation : Augmenting English LibriVox Recordings with Italian Textual Translations
University essay from Uppsala universitet/Institutionen för lingvistik och filologi
Abstract: The recent rise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large amount of source language speech utterances aligned with their target language textual translations. We hereby show a pipeline and a set of methods to collect hundreds of hours of English audio-book recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.