Essays about: "corpora creation"

Found 5 essays containing the words corpora creation.

  1. IŻ SWÓJ JĘZYK MAJĄ! An exploration of the computational methods for identifying language variation in Polish

    University essay from Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori

    Author : Maria Irena Szawerna; [2023-06-19]
    Keywords : language variation; Polish; diachronic linguistics; part-of-speech tagging; lemmatization; corpus linguistics;

    Abstract : Computational approaches to language variation continue to contribute in a relevant way to various fields, including Natural Language Processing (NLP) and linguistics. Being able to accommodate variation within natural language increases the robustness of NLP models and their usefulness in real-life applications; simultaneously, detecting and describing variation and trends that govern it is one of the main goals of sociolinguistics and historical linguistics, meaning that some of the advances in NLP can contribute to these fields as well.

  2. Exploring Human-Robot Interaction Through Explainable AI Poetry Generation

    University essay from Mälardalens högskola/Akademin för innovation, design och teknik

    Author : Philippe Strineholm; [2021]
    Keywords : explainable AI poetry generation human robot interaction HRI;

    Abstract : As the field of Artificial Intelligence continues to evolve into a tool of societal impact, a need of breaking its initial boundaries as a computer science discipline arises to also include different humanistic fields. The work presented in this thesis revolves around the role that explainable artificial intelligence has in human-robot interaction through the study of poetry generators.

  3. A comparative study of the grammatical gender systems of languages by means of analysing word embeddings

    University essay from Uppsala universitet/Institutionen för lingvistik och filologi

    Author : Hartger Veeman; [2020]
    Keywords : word embeddings; grammatical gender; computational linguistics; language representations;

    Abstract : The creation of word embeddings is one of the key breakthroughs in natural language processing. Word embeddings allow for words to be represented semantically, opening the way to many new deep learning methods.

  4. Text and Speech Alignment Methods for Speech Translation Corpora Creation: Augmenting English LibriVox Recordings with Italian Textual Translations

    University essay from Uppsala universitet/Institutionen för lingvistik och filologi

    Author : Giuseppe Della Corte; [2020]
    Keywords : speech translation; parallel corpora; bilingual sentence alignment; sentence embeddings; cosine similarity; forced alignment; text collection; corpora creation; audio signal processing;

    Abstract : The recent uprise of end-to-end speech translation models requires a new generation of parallel corpora, composed of a large amount of source language speech utterances aligned with their target language textual translations. We hereby show a pipeline and a set of methods to collect hundreds of hours of English audio-book recordings and align them with their Italian textual translations, using exclusively public domain resources gathered semi-automatically from the web.

  5. Creating a coreference solver for Swedish and German using distant supervision

    University essay from Lunds universitet/Institutionen för datavetenskap

    Author : Alexander Wallin; [2017]
    Keywords : coreference resolution; distance supervision; machine-learning; multilingual; Swedish; German; Technology and Engineering;

    Abstract : It is said that coreference is difficult to explain, but easy to comprehend; everyone knows coreference, they just don’t know that they do. We trained a computer to know it too! Coreference resolution is the identification of phrases that refer to the same entity in a text.