Essays about: "document clustering"

Showing results 1-5 of 30 essays containing the words document clustering.

  1. Help Document Recommendation System

    University essay from Malmö universitet/Fakulteten för teknik och samhälle (TS)

    Author : Keerthi Vijay Kumar; Pinky Mary Stanly; [2023]
    Keywords : Document similarity; Recommender systems; content-based filtering; collaborative filtering; Term Frequency-Inverse Document Frequency (TF-IDF); Bidirectional Encoder Representations from Transformers (BERT); Non-Negative Matrix Factorisation (NMF); cosine similarity; K-means clustering;

    Abstract : Help documents are important in an organization to use the technology applications licensed from a vendor. Customers and internal employees frequently use and interact with the help documents section to use the applications and know about the new features and developments in them.

  2. Descriptive Labeling of Document Clusters

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Adam Österberg; [2022]
    Keywords : Natural Language Processing; Wikipedia; Topic Modeling; Labeling; Language Technology;

    Abstract : Labeling is the process of giving a set of data a descriptive name. This thesis dealt with documents with no additional information and aimed at clustering them using topic modeling and labeling them using Wikipedia as a second source. Labeling documents is a new field with many potential solutions.

  3. Anomaly Detection in Log Files Using Machine Learning Techniques

    University essay from Blekinge Tekniska Högskola/Fakulteten för datavetenskaper

    Author : Lakshmi Geethanjali Mandagondi; [2021]
    Keywords : Anomaly Detection; Log Files; Machine Learning; Clustering; Outlier Detection;

    Abstract : Context: Log files are produced in most larger computer systems today. They contain highly valuable information about the behavior of the system and are therefore consulted fairly often to analyze behavioral aspects of the system. Because of the very high number of log entries produced in some systems, however, it is extremely difficult to seek out relevant information in these files.

  4. Semantic Topic Modeling and Trend Analysis

    University essay from Linköpings universitet/Statistik och maskininlärning

    Author : Jasleen Kaur Mann; [2021]
    Keywords : NLP; unsupervised topic modelling; trend analysis; LDA; BERT; Sentence-BERT; TF-IDF; transformer based language models; document clustering;

    Abstract : This thesis focuses on finding an end-to-end unsupervised solution to a two-step problem: extracting semantically meaningful topics from a large temporal text corpus and analyzing the trends of these topics. To achieve this, the focus is on using the latest developments in Natural Language Processing (NLP) related to pre-trained language models like Google’s Bidirectional Encoder Representations from Transformers (BERT) and other BERT-based models.

  5. Automated error matching system using machine learning and data clustering: Evaluating unsupervised learning methods for categorizing error types, capturing bugs, and detecting outliers.

    University essay from Linköpings universitet/Programvara och system

    Author : Jonatan Bjurenfalk; August Johnson; [2021]
    Keywords : Unsupervised learning; machine learning; clustering; DBSCAN; HDBSCAN; X-Means; outlier detection; error log clustering;

    Abstract : For large and complex software systems, it is a time-consuming process to manually inspect error logs produced from the test suites of such systems. Whether it is for identifying abnormal faults or finding bugs, it is a process that limits development progress and requires experience.