Essays about: "shared-memory"
Showing results 1-5 of 65 essays containing the word shared-memory.
-
1. A Conjugate Residual Solver with Kernel Fusion for massive MIMO Detection
University essay from Högskolan i Halmstad/Centrum för forskning om tillämpade intelligenta system (CAISR)
Abstract: This thesis compares a GPU implementation of the Conjugate Residual method built as a sequence of generic library kernels against implementations of the method with custom kernels, to expose the performance gains of a key optimization strategy for memory-bound operations, kernel fusion, which makes efficient reuse of the processed data. For massive MIMO, the iterative solver is to be employed at the linear detection stage to overcome the computational bottleneck of the matrix inversion required in the equalization process, which is 𝒪(𝑛³) for direct solvers.
-
2. Exploring Ethernet Switching Architectures for Area-Efficient Low-End Switches
University essay from Lunds universitet/Institutionen för elektro- och informationsteknik
Abstract: The aim of this thesis project has been to develop an architecture for L2 Ethernet switches that would be optimized for silicon area, targeting smaller low-end switches. A selection was made of three different switching architectures, which were compared and analyzed to explore the benefits and drawbacks of different approaches.
-
3. Time-Triggered Execution of 3-Phase Tasks on the RP2040 — A Framework Avoiding Memory Contention by Design
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: Multi-core processors have emerged as an effective solution for handling complex tasks that cannot be efficiently processed by unicore processors. Their usage is driven by the potential to achieve high processing power while minimizing power consumption.
-
4. Offloading Workloads from CPU of Multiplayer Game Server to FPGA : SmartNIC implementation with UDP Communication
University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)
Abstract: For multiplayer games, the performance of the server's Central Processing Unit (CPU) is the main factor that limits the number of players on the server at the same time. Compared with the CPU, the Field-Programmable Gate Array (FPGA) architecture has no instruction set and no shared memory.
-
5. Extending applicability of symbolic execution to uncover possible shared memory transactions in GPU programs
University essay from Linköpings universitet/Institutionen för datavetenskap
Abstract: General-purpose computing on the graphics processing unit has become popular since the cost-to-power ratio is lower for GPUs (compared to CPUs) and the programmability of the GPU has increased. CUDA is an extension of the C/C++ programming languages which enables software developers to more easily make use of the computational power of the GPUs.