Essays about: "byte pair encoding BPE"

Found 2 essays containing the words byte pair encoding BPE.

  1. Incremental Re-tokenization in BPE-trained SentencePiece Models

    University essay from Umeå universitet/Institutionen för datavetenskap

    Author : Simon Hellsten; [2024]
    Keywords : BPE; Byte Pair Encoding; SentencePiece; NLP; Natural Language Processing; Tokenization; Re-tokenization;

    Abstract : This bachelor's thesis in Computer Science explores the efficiency of an incremental re-tokenization algorithm in the context of BPE-trained SentencePiece models used in natural language processing. The thesis begins by underscoring the critical role of tokenization in NLP, particularly highlighting the complexities introduced by modifications in tokenized text.

  2. Bidirectional LSTM-CNNs-CRF Models for POS Tagging

    University essay from Uppsala universitet/Institutionen för lingvistik och filologi

    Author : Hao Tang; [2018]
    Keywords : bidirectional LSTM; part of speech; CNNs; CRF; byte pair encoding BPE;

    Abstract : In order to achieve state-of-the-art performance for part-of-speech (POS) tagging, traditional systems require a significant amount of hand-crafted features and data pre-processing. In this thesis, we present a discriminative word embedding, character embedding and byte pair encoding (BPE) hybrid neural network architecture to implement a true end-to-end system without feature engineering and data pre-processing.