Essays about: "text extraction"

Showing results 16 - 20 of 123 essays containing the words text extraction.

  16. Generic Data Harvester

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : William Asp; Johannes Valck; [2022]
    Keywords : News; Articles; Newspapers; Web crawler; Web site parsing; Optimization; Web robot; Web spider; Web data extraction; HTML; Scrapy;

    Abstract : This report goes through the process of developing a generic article scraper which shall extract relevant information from an arbitrary web article. The extraction is implemented by searching and examining the HTML of the article, by using Python and XPath.

  17. Compressing Deep Learning models for Natural Language Understanding

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Nadir Ait Lahmouch; [2022]
    Keywords : Natural Language Processing; Deep learning; BERT; Knowledge Distillation; Pruning;

    Abstract : In recent years, natural language processing (NLP) tasks have proven to be handled particularly effectively by pre-trained language models such as BERT. However, the enormous computational resources required to train such models make them difficult to use in practice.

  18. Extracting information about arms deals from news articles

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Fredrik Hernqvist; [2022]
    Keywords : Natural Language Processing; Machine Learning; Deep Learning; BERT; ALBERT; Arms Transfers; Information Extraction;

    Abstract : The Stockholm International Peace Research Institute (SIPRI) maintains the most comprehensive publicly available database on international arms deals. Updating this database requires humans to sift through large amounts of news articles, only some of which contain information relevant to the database.

  19. Information Extraction and Document Similarity: Bag-of-Concepts based approach

    University essay from Uppsala universitet/Institutionen för informationsteknologi

    Author : Shubhomoy Biswas; [2022]

    Abstract : People in many organizations develop rich-text files, such as Microsoft Word (MS-Word) and Microsoft PowerPoint (MS-PowerPoint), which contain textual content in a variety of domains, from product presentations to confidential paperwork. This thesis examines information extraction methods, provides a concept-based strategy for computationally representing documents, and determines the degree of similarity between documents based on the information contained in them.

  20. Automatic generation of robot targets: A first step towards a flexible robotic solution for cutting customized mesh tray

    University essay from Högskolan Väst/Institutionen för ingenjörsvetenskap

    Author : Dennis Lindberget; [2022]
    Keywords : Automatic robot programming; flexible automation; AutoLisp; Rapid;

    Abstract : The increased demands for customization in manufacturing industries require new automation methods. This thesis presents the development of such a method for a cutting procedure of customized mesh trays at WIBE Group.