Essays about: "data ingestion"

Showing results 1-5 of 11 essays containing the words data ingestion.

  1. A COMPARISON OF DATA INGESTION PLATFORMS IN REAL-TIME STREAM PROCESSING PIPELINES

    University essay from Mälardalens högskola/Akademin för innovation, design och teknik

    Author : Sebastian Tallberg; [2020]
    Keywords : stream processing; data ingestion; Redis Streams; Apache Kafka; Apache Pulsar; performance benchmark; real-time streaming;

    Abstract : In recent years there has been an increasing demand for real-time streaming applications that handle large volumes of data with low latency. Examples of such applications include real-time monitoring and analytics, electronic trading, advertising, fraud detection, and more.

  2. Detecting changes in word associations over short time periods : Analysing Twitter data with Word2Vec over time

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Andreas Kärrby; Maja Tennander; [2020]

    Abstract : The meanings and connotations of words are constantly changing. Traditionally, one way to track such changes over relatively long time periods is by analysing variations in word usage in written records such as books and newspapers, and by comparing dictionary entries for the words of interest.

  3. Hudi on Hops : Incremental Processing and Fast Data Ingestion for Hops

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Netsanet Gebretsadkan Kidane; [2019]
    Keywords : Hudi; Hadoop; Hops; Upsert; SQL; Spark; Kafka;

    Abstract : In the era of big data, data is flooding in from numerous sources, and many companies use different types of tools to load and process data from various sources in a data lake. The major challenge these companies face today is how to update data in an existing dataset without having to read the entire dataset and overwrite it to accommodate the changes, which has a negative impact on performance.

  4. Verification of linear scalability of a business Big Data platform against the Queueing Networks model

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Magdalena Matczak; [2019]

    Abstract : Ensuring that software built on top of distributed systems for Big Data has good scalability properties is crucial for the design of long-lasting and reliable products. The purpose of this Master Thesis is to investigate and characterize scalability of a business Big Data platform, URights, developed by IBM in cooperation with a French association, SACEM.

  5. Kylo Data Lakes Configuration deployed in Public Cloud environments in Single Node Mode

    University essay from Blekinge Tekniska Högskola/Institutionen för datavetenskap

    Author : Rong Peng; [2019]
    Keywords : Data Lake; Kylo; Public Cloud;

    Abstract : This master's thesis introduces the Kylo data lake as deployed in a public cloud environment and provides a perspective on data lake configuration and a data ingestion experiment. It reveals the underlying architecture of the Kylo data lake. ...