Essays about: "Document similarity"

Showing results 1 - 5 of 30 essays containing the words Document similarity.

  1. Automated Extraction of Insurance Policy Information : Natural Language Processing techniques to automate the process of extracting information about the insurance coverage from unstructured insurance policy documents.

    University essay from Uppsala universitet/Datalogi

    Author : Jacob Hedberg; Erik Furberg; [2023]
    Keywords : NLP; SBERT; AI; Insurance; Semantic similarity;

    Abstract : This thesis investigates Natural Language Processing (NLP) techniques to extract relevant information from long and unstructured insurance policy documents. The goal is to reduce the amount of time required by readers to understand the coverage within the documents.

  2. Help Document Recommendation System

    University essay from Malmö universitet/Fakulteten för teknik och samhälle (TS)

    Author : Keerthi Vijay Kumar; Pinky Mary Stanly; [2023]
    Keywords : Document similarity; Recommender systems; content-based filtering; collaborative filtering; Term Frequency-Inverse Document Frequency (TF-IDF); Bidirectional Encoder Representations from Transformers (BERT); Non-Negative Matrix Factorisation (NMF); cosine similarity; K-means clustering;

    Abstract : Help documents are important for an organization's use of technology applications licensed from a vendor. Customers and internal employees frequently consult the help documents section to use the applications and to learn about new features and developments in them.

  3. Distilling Multilingual Transformer Models for Efficient Document Retrieval : Distilling multi-Transformer models with distillation losses involving multi-Transformer interactions

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Xuecong Liu; [2022]
    Keywords : Dense Passage Retrieval; Knowledge Distillation; Multilingual Transformer; Document Retrieval; Open Domain Question Answering;

    Abstract : Open Domain Question Answering (OpenQA) is a task concerning automatically finding answers to a query from a given set of documents. Language-agnostic OpenQA is an increasingly important research area in the globalised world, where the answers can be in a different language from the question.

  4. Duplicate detection of multimodal and domain-specific trouble reports when having few samples : An evaluation of models using natural language processing, machine learning, and Siamese networks pre-trained on automatically labeled data

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Viktor Karlstrand; [2022]
    Keywords : Duplicate detection; Bug reports; Trouble reports; Natural language processing; Information retrieval; Machine learning; Siamese neural network; Transformers; Automated data labeling; Shapley values;

    Abstract : Trouble and bug reports are essential in software maintenance and for identifying faults, a challenging and time-consuming task. When the fault and reports are similar or identical to previous, already resolved ones, the effort can be reduced significantly, which makes the prospect of automatically detecting duplicates very compelling.

  5. Trade-offs between Quality and Efficiency in Multilingual Dense Retrieval

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Emma Schüldt; [2022]
    Keywords : Dense retrieval; Binary Retrieval; Semantic search; ColBERT; Multilingual; MS MARCO;

    Abstract : As the amount of content online grows, information retrieval becomes increasingly crucial. Traditional information retrieval does not take the text order into account and is also dependent on exact text matching between the query and the document.
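
The two limitations named in this abstract can be illustrated with a minimal sketch. It assumes a plain bag-of-words cosine similarity as a stand-in for "traditional information retrieval"; it is not the method evaluated in the essay. The function name `bow_cosine` and the example sentences are hypothetical, chosen only for illustration.

```python
from collections import Counter
import math


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between two texts under a bag-of-words model (illustrative only)."""
    # Count whitespace-separated, lowercased tokens; word order is discarded here.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


# Same words in a different order: the model cannot tell the texts apart.
same_words = bow_cosine("dog bites man", "man bites dog")

# Same meaning but no shared terms: exact matching finds no similarity.
no_overlap = bow_cosine("cheap automobile", "inexpensive car")

print(same_words, no_overlap)
```

In this sketch the reordered pair scores as (numerically) identical, while the paraphrased pair scores zero, which is the gap that dense and semantic retrieval approaches aim to close.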