Essays about: "bilingual embeddings"
Found 5 essays containing the words bilingual embeddings.
-
1. Extending a Text Classifier to Multiple Languages
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: This thesis explores the possibility of extending monolingual and bilingual text classifiers to multiple languages. Two different language models are explored: language-aligned word embeddings and a transformer model. The goal was to take a classifier based on Swedish and English samples and extend it to Danish, German, and Finnish samples.
-
2. Exploring Cross-lingual Sublanguage Classification with Multi-lingual Word Embeddings
University essay from Linköpings universitet/Statistik och maskininlärning
Abstract: Cross-lingual text classification is an important task due to globalization and the increased availability of multilingual data. This thesis explores a method for implementing cross-lingual classification on Swedish and English medical corpora.
-
3. Text and Speech Alignment Methods for Speech Translation Corpora Creation : Augmenting English LibriVox Recordings with Italian Textual Translations
University essay from Uppsala universitet/Institutionen för lingvistik och filologi
Abstract: The recent rise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large number of source-language speech utterances aligned with their target-language textual translations. We present a pipeline and a set of methods to collect hundreds of hours of English audiobook recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.
-
4. Low Supervision, Low Corpus size, Low Similarity! Challenges in cross-lingual alignment of word embeddings : An exploration of the limitations of cross-lingual word embedding alignment in truly low resource scenarios
University essay from Uppsala universitet/Institutionen för lingvistik och filologi
Abstract: Cross-lingual word embeddings are an increasingly important resource in cross-lingual methods for NLP, particularly for their role in transfer learning and unsupervised machine translation, purportedly opening up the opportunity for NLP applications for low-resource languages. However, most research in this area implicitly expects the availability of vast monolingual corpora for training embeddings, a scenario which is not realistic for many of the world's languages.
-
5. Word embeddings for monolingual and cross-language domain-specific information retrieval
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Various studies have shown the usefulness of word embedding models for a wide variety of natural language processing tasks. This thesis examines how word embeddings can be incorporated into domain-specific search engines for both monolingual and cross-language search.
