Essays about: "shared-memory"

Showing results 1 - 5 of 65 essays containing the word shared-memory.

  1. A Conjugate Residual Solver with Kernel Fusion for massive MIMO Detection

    University essay from Högskolan i Halmstad/Centrum för forskning om tillämpade intelligenta system (CAISR)

    Author : Ioannis Broumas; [2023]
    Keywords : MIMO; massive MIMO; GPU; CUDA; Software Defined Radio; SDR; MMSE; ZF; zero-forcing; parallel detection; iterative methods; conjugate residual; parallel computing; kernel fusion;

    Abstract : This thesis presents a comparison of a GPU implementation of the Conjugate Residual method as a sequence of generic library kernels against implementations of the method with custom kernels to expose the performance gains of a key optimization strategy, kernel fusion, for memory-bound operations which is to make efficient reuse of the processed data. For massive MIMO the iterative solver is to be employed at the linear detection stage to overcome the computational bottleneck of the matrix inversion required in the equalization process, which is O(n^3) for direct solvers.

  2. Exploring Ethernet Switching Architectures for Area-Efficient Low-End Switches

    University essay from Lunds universitet/Institutionen för elektro- och informationsteknik

    Author : Jon Swedberg; Felix Ghosh; [2023]
    Keywords : Ethernet Switch; Architecture; Silicon Area; Area Optimization; ASIC; FPGA; Technology and Engineering;

    Abstract : The aim of this thesis project has been to develop an architecture for L2 Ethernet switches that would be optimized for silicon area, targeting smaller low-end switches. A selection was made of three different switching architectures, which were compared and analyzed to explore the benefits and drawbacks of different approaches.

  3. Time-Triggered Execution of 3-Phase Tasks on the RP2040 — A Framework Avoiding Memory Contention by Design

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Everita Annemarija Samusa; [2023]
    Keywords : Commercial-off-the-shelf; Execution framework; Multi-core; Real-time; 3-phase tasks;

    Abstract : Multi-core processors have emerged as an effective solution for handling complex tasks that cannot be efficiently processed by unicore processors. Their usage is driven by the potential to achieve high processing power while minimizing power consumption.

  4. Offloading Workloads from CPU of Multiplayer Game Server to FPGA : SmartNIC implementation with UDP Communication

    University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

    Author : Junwen Bao; [2022]
    Keywords : FPGA; UDP; Multiple-connection Server; Network Communication; Integrated Circuit Design;

    Abstract : For multiplayer games, the performance of the server’s Central Processing Unit (CPU) is the main factor that limits the number of players on the server at the same time. Compared with the CPU, the Field-Programmable Gate Array (FPGA) architecture has no instruction set and no shared memory.

  5. Extending applicability of symbolic execution to uncover possible shared memory transactions in GPU programs

    University essay from Linköpings universitet/Institutionen för datavetenskap

    Author : Jonathan Hjort; [2022]

    Abstract : General-purpose computing on the graphics processing unit has become popular since the cost-to-power ratio is lower for GPUs (compared to CPUs) and the programmability of the GPU has increased. CUDA is an extension of the C/C++ programming languages which enables software developers to more easily make use of the computational power of the GPUs.