Essays about: "Statistical Machine Translation"
Showing results 1-5 of 12 essays containing the words Statistical Machine Translation.
-
1. Syntax-based Concept Alignment for Machine Translation
University essay from Göteborgs universitet/Institutionen för data- och informationsteknik. Abstract: This thesis presents a syntax-based approach to Concept Alignment (CA), the task of finding semantic correspondences between parts of multilingual parallel texts, with a focus on Machine Translation (MT). Two variants of CA are considered: Concept Extraction (CE), which aims to identify new concepts by means of linguistic comparison alone, and Concept Propagation (CP), which consists of looking for the translation equivalents of a set of known concepts in a new language.
-
2. Text and Speech Alignment Methods for Speech Translation Corpora Creation : Augmenting English LibriVox Recordings with Italian Textual Translations
University essay from Uppsala universitet/Institutionen för lingvistik och filologi. Abstract: The recent rise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large number of source-language speech utterances aligned with their target-language textual translations. We present a pipeline and a set of methods to collect hundreds of hours of English audiobook recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.
-
3. Spelling Normalization of English Student Writings
University essay from Uppsala universitet/Institutionen för lingvistik och filologi. Abstract: Spelling normalization is the task of converting non-standard words in texts into standard words. This reduces the number of out-of-vocabulary (OOV) words for natural language processing (NLP) tasks such as information retrieval, machine translation, and opinion mining, and improves the performance of NLP applications on the normalized texts. In this thesis, we explore different methods for spelling normalization of English student writings, including traditional Levenshtein edit distance comparison, phonetic similarity comparison, character-based Statistical Machine Translation (SMT), and character-based Neural Machine Translation (NMT).
-
4. Automatic Identification of Duplicates in Literature in Multiple Languages
University essay from Linköpings universitet/Statistik och maskininlärning. Abstract: As the number of books available online grows, the collections that hold them are growing at the same pace and more commonly span multiple languages. Many of these corpora contain duplicates in the form of various editions or translations of books.
-
5. Hybrid Machine Translation : Choosing the best translation with Support Vector Machines
University essay from Uppsala universitet/Institutionen för informationsteknologi. Abstract: In the field of machine translation, various systems are available, each with different strengths and weaknesses. This thesis investigates the combination of two systems, a rule-based one and a statistical one, to see whether such a hybrid system can provide higher-quality translations.