Benchmarking linear-algebra algorithms on CPU- and FPGA-based platforms

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Moore’s law has been the main driving factor behind the rapid evolution of computers over the past 50 years. However, the law is expected to end soon because of heat- and size-related limits. One way to continue this evolution is to use alternative computer hardware, of which parallel hardware is especially interesting. The Field Programmable Gate Array (FPGA) is one such piece of hardware. This study compares the runtime of two linear-algebra benchmarks, cholesky and durbin, executed on a traditional CPU-based platform and on an FPGA-based platform. The cholesky benchmark performs Cholesky decomposition, and the durbin benchmark computes the solution to a Yule-Walker equation. The CPU implementations of the benchmarks were provided in the C programming language, and the FPGA implementations were written using OpenCL, which is a High-Level Synthesis framework. The results showed a clear advantage for the CPU implementations, which had a shorter runtime than the FPGA implementations in both benchmarks for every test case. The cause was that both benchmarks contain data dependencies, which required them to be executed sequentially. Since the CPU operates at a clock frequency more than ten times higher than the FPGA’s, it executed sequential instructions faster than the FPGA.
