Essays about: "Webbskrapning"

Showing results 1 - 5 of 10 essays containing the word Webbskrapning.

  1. The One Spider To Rule Them All : Web Scraping Simplified: Improving Analyst Productivity and Reducing Development Time with A Generalized Spider

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Rikard Johansson; [2023]
    Keywords : Web scraping; Web crawlers; HTML; Scrapy; Optimization; Web data extraction; Webbskrapning; Webbsökrobotar; HTML; Scrapy; Optimering; Webbdataextraktion;

    Abstract : This thesis addresses the process of developing a generalized spider for web scraping, which can be applied to multiple sources, thereby reducing the time and cost involved in creating and maintaining individual spiders for each website or URL. The project aims to improve analyst productivity, reduce development time for developers, and ensure high-quality and accurate data extraction.

  2. Neural Cleaning of Swedish Textual Data : Using BERT-based methods for Token Classification of Running and Non-Running Text

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Andreas Ericsson; [2023]
    Keywords : Natural Language Processing; Text Cleaning; Transformers; BERT; Token Classification; Deep Learning; Språkteknologi; Textrensning; Transformers; BERT; Token-klassificering; Djupinlärning;

    Abstract : Modern natural language processing methods require big textual datasets to function well. A common method is to scrape the internet to acquire the needed data. This does, however, come with the issue that some of the data may be unwanted – for instance, spam websites.

  3. Evaluating and comparing different key phrase-based web scraping methods for training domain-specific fasttext models

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Love Book; [2023]
    Keywords : Machine Learning; Natural Language Processing; Word2vec; fasttext; KeyBERT; Web scraping; Transformers; Embeddings.; Maskininlärning; språkteknologi; Word2vec; fasttext; KeyBERT; Webbskrapning; Transformatorer; Inbäddningar.;

    Abstract : The demand for automation of simple tasks is constantly increasing. While some tasks are easy to automate because the logic is fixed and the process is streamlined, other tasks are harder because the performance of the task is heavily reliant on the judgment of a human expert.

  4. Data mining historical insights for a software keyword from GitHub and Libraries.io; GraphQL

    University essay from Linköpings universitet/Institutionen för datavetenskap

    Author : Gustaf Bodemar; [2022]
    Keywords : Data mining; Web scraping; Historical data analysis; GitHub; Libraries.io; GraphQL; Datautvinning; Webbskrapning; Historisk dataanalys; GitHub; Libraries.io; GraphQL;

    Abstract : This paper explores an approach to extracting historical insights into a software keyword by data mining GitHub and Libraries.io. We test our method using the keyword GraphQL to see what insights we can gain. We managed to plot several timelines of how repositories and software libraries related to our keyword were created over time.

  5. Less Detectable Web Scraping Techniques

    University essay from Linnéuniversitetet/Institutionen för datavetenskap och medieteknik (DM)

    Author : Fredric Färholt; [2021]
    Keywords : web scraping; data mining; javascript; puppeteer; algorithms; data collection; security mechanisms; honeypot; security tools; undetectability; webbskrapning; data mining; javascript; puppeteer; algoritmer; datakollektion; säkerhetsmekanismer; honeypot; säkerhetsverktyg; oupptäckbar;

    Abstract : Web scraping is an efficient way of gathering data, and it has also become much easier to perform and offers a high success rate. People no longer need to be tech-savvy when scraping data since several easy-to-use platform services exist.