Essays about: "in store execution"

Showing results 1 - 5 of 33 essays containing the words in store execution.

  1. Code Generation for Accelerating Data Flow : Enhancing Pentaho Data Integration Performance

    University essay from Umeå universitet/Institutionen för fysik

    Author : Alexander Svensson; [2023]

    Abstract : Pentaho Data Integration, called Kettle, is an ETL tool that functions as a no-code program. The tool, implemented in Java, enables users to create data flow structures via a graphical user interface and store them as XML files, which can be edited or executed.

  2. Faster Reading with DuckDB and Arrow Flight on Hopsworks : Benchmark and Performance Evaluation of Offline Feature Stores

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Ayushman Khazanchi; [2023]
    Keywords : Machine Learning; Feature Store; Distributed Systems; MLOps;

    Abstract : Over the last few years, Machine Learning has become a huge field with “Big Tech” companies sharing their experiences building machine learning infrastructure. Feature Stores, used as centralized data repositories for machine learning features, are seen as a central component to operational and scalable machine learning.

  3. Highly Available Task Scheduling in Distinctly Branched Directed Acyclic Graphs

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Patrik Zhong; [2023]
    Keywords : Distributed Scheduling; Fault-tolerance; Graph Partitioning; Task Graphs; Dask; Dask Distributed; Data Processing;

    Abstract : Big data processing frameworks utilizing distributed frameworks to parallelize the computing of datasets have become a staple part of the data engineering and data science pipelines. One of the more known frameworks is Dask, a widely utilized distributed framework used for parallelizing data processing jobs.

  4. Scaling Apache Hudi by boosting query performance with RonDB as a Global Index : Adopting a LATS data store for indexing

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Ralfs Zangis; [2022]
    Keywords : Apache Hudi; Lakehouse; RonDB; Performance; Index; Key-value store;

    Abstract : The storage and use of voluminous data are perplexing issues, the resolution of which has become more pressing with the exponential growth of information. Lakehouses are relatively new approaches that try to accomplish this while hiding the complexity from the user.

  5. Data Build Tool (DBT) Jobs in Hopsworks

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Zidi Chen; [2022]
    Keywords : feature engineering; Structured Query Language (SQL);

    Abstract : Feature engineering at scale is always critical and challenging in the machine learning pipeline. Modern data warehouses enable data analysts to do feature engineering by transforming, validating and aggregating data in Structured Query Language (SQL).