Essays about: "Document Preprocessing"

Showing results 1-5 of 11 essays containing the words Document Preprocessing.

  1. Recommendation of Text Properties for Short Texts with the Use of Machine Learning : A Comparative Study of State-of-the-Art Techniques Including BERT and GPT-2

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Luciano Zapata; [2023]
    Keywords : Text classification; Short texts; Deep Learning; BERT; GPT; GPT-2; Transformers; Natural Language Processing;

    Abstract : Text mining has gained considerable attention due to the extensive usage of electronic documents. The significant increase in electronic document usage has created a necessity to process and analyze them effectively.

  2. Automatic Handwritten Text Detection and Classification

    University essay from Uppsala universitet/Avdelningen för visuell information och interaktion

    Author : Olle Dahlstedt; [2021]

    Abstract : As more and more organizations digitize their records, the need for automatic document processing software increases. In particular, the rise of ‘digital humanities’ precedes a new set of problems concerning how to digitize historical archival material in an efficient and accurate manner.

  3. Semantic Topic Modeling and Trend Analysis

    University essay from Linköpings universitet/Statistik och maskininlärning

    Author : Jasleen Kaur Mann; [2021]
    Keywords : NLP; unsupervised topic modelling; trend analysis; LDA; BERT; Sentence-BERT; TF-IDF; transformer based language models; document clustering;

    Abstract : This thesis focuses on finding an end-to-end unsupervised solution to a two-step problem: extracting semantically meaningful topics from a large temporal text corpus and analyzing the trends of these topics. To achieve this, the focus is on using the latest developments in Natural Language Processing (NLP) related to pre-trained language models like Google’s Bidirectional Encoder Representations from Transformers (BERT) and other BERT-based models.

  4. How can a module for sentiment analysis be designed to classify tweets about covid19

    University essay from

    Author : Denny Ly; Tamara Saad Abdul Malik; [2021]
    Keywords : Sentiment Analysis; Machine Learning; Lexicon technique; Kaggle; Preprocessing;

    Abstract : Sentiment analysis of text is receiving growing attention from different entities for a variety of reasons. Emotion mining (sentiment analysis) is an interesting subject to explore; thus the research question is: How can a module for sentiment analysis be designed to classify tweets about Covid-19?

  5. Evaluation of text classification techniques for log file classification

    University essay from Linköpings universitet/Institutionen för datavetenskap

    Author : Per Olin; [2020]
    Keywords : Text classification; machine learning; NLP; natural language processing; log file; doc2vec; CNN; LSTM; LSTM-CNN;

    Abstract : System log files are filled with logged events, status codes, and other messages. By analyzing the log files, the system's current state can be determined, and it can be established whether something went wrong during its execution.