Essays about: "GEMM"

Found 4 essays containing the word GEMM.

  1. Evaluation of FPGA-based High Performance Computing Platforms

    University essay from Linköpings universitet/Datorteknik

    Author : Martin Frick-Lundgren; [2023]
    Keywords : FPGA; High performance computing; BUDE; GEMM; CPU; GPU;

    Abstract : High performance computing is a topic that has risen to the top in the era of digitalization, AI and automation. Therefore, the search for more cost- and time-effective ways to implement HPC work is always a subject extensively researched. One part of this is to have hardware that is capable of improving on these criteria.

  2. Register Caching for Energy Efficient GPGPU Tensor Core Computing

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Qiran Qian; [2023]
    Keywords : Computer Architecture; GPGPU; Tensor Core; GEMM; Energy Efficiency; Register File; Cache; Instruction Scheduling;

    Abstract : The General-Purpose GPU (GPGPU) has emerged as the predominant computing device for extensive parallel workloads in the fields of Artificial Intelligence (AI) and Scientific Computing, primarily owing to its adoption of the Single Instruction Multiple Thread architecture, which not only provides a wealth of thread context but also effectively hides the latencies exposed in single-thread execution. As computational demands have evolved, modern GPGPUs have incorporated specialized matrix engines, e.

  3. AXI-PACK : Near-memory Bus Packing for Bandwidth-Efficient Irregular Workloads

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Chi Zhang; [2022]
    Keywords : General purpose processor; on-chip bus protocol; irregular memory access; ASIC digital circuit design;

    Abstract : General purpose processors (GPPs) are expected to deliver high performance in data-intensive applications, such as deep learning and high performance computation (HPC), where algorithm kernels like GEMM (general matrix-matrix multiply) and SPMV (sparse matrix-vector multiply) are intensively used. The performance of these data-intensive applications is bounded by memory bandwidth, which is limited by the coupling of computation and memory access and by the memory wall effect.

  4. Efficient LU Factorization for Texas Instruments Keystone Architecture Digital Signal Processors

    University essay from KTH/Skolan för datavetenskap och kommunikation (CSC)

    Author : Gilbert Netzer; [2015]
    Keywords : LU factorization; digital signal processors; Texas Instruments; Keystone architecture; high-performance LINPACK; benchmark; performance; energy efficiency; software-pipelined loops; direct memory access; optimization;

    Abstract : The energy consumption of large-scale high-performance computer (HPC) systems has become one of the foremost concerns of both data-center operators and computer manufacturers. This has renewed interest in alternative computer architectures that could offer substantially better energy-efficiency.