Efficient reduction over threads

University essay from KTH/Teoretisk fysik

Author: Patrik Falkman; [2011]

Abstract: The increasing number of cores in both desktops and servers creates a demand for efficient parallel algorithms. This project focuses on the fundamental collective operation reduce, which merges several arrays into one by applying a binary operation element-wise. Several reduce algorithms are evaluated in terms of performance and scalability, and a novel algorithm is introduced that takes advantage of shared memory and exploits load imbalance. To do so, the concept of dynamic pair generation is introduced: a binary reduce tree is constructed dynamically based on the order of thread arrival, with pairs formed in a lock-free manner. We conclude that the dynamic algorithm, given enough spread in the arrival times, can outperform the reference algorithms for some or all array sizes.
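The idea of pairing threads in their order of arrival can be illustrated with a small sketch. This is not the algorithm from the essay, whose details the abstract does not give; it is a minimal illustration under stated assumptions. The names `AtomicSlot`, `try_deposit`, `take`, `dynamic_reduce` and the `delays` parameter are hypothetical. Python has no user-level compare-and-swap, so a small internal lock stands in for the two atomic operations (compare-and-swap and exchange) that a lock-free implementation would use. The binary operation is assumed to be associative and commutative, since the pairing order depends on timing.

```python
import threading
import time
from operator import add


class AtomicSlot:
    """Hypothetical stand-in for a single atomic pointer.

    The internal lock only emulates atomic compare-and-swap and exchange;
    a real lock-free version would use hardware atomics instead.
    """

    def __init__(self):
        self._guard = threading.Lock()
        self._value = None

    def try_deposit(self, buf):
        # Emulates compare-and-swap(slot, None -> buf).
        with self._guard:
            if self._value is None:
                self._value = buf
                return True
            return False

    def take(self):
        # Emulates atomic exchange(slot, None): returns the old content.
        with self._guard:
            buf, self._value = self._value, None
            return buf


def dynamic_reduce(arrays, op=add, delays=None):
    """Reduce equally sized arrays element-wise, pairing threads as they arrive.

    Assumes `op` is associative and commutative. `delays` (seconds per
    thread) only simulates load imbalance before each thread arrives.
    """
    slot = AtomicSlot()
    delays = delays or [0.0] * len(arrays)

    def worker(buf, delay):
        time.sleep(delay)
        buf = list(buf)  # private copy; inputs are left untouched
        while True:
            partner = slot.take()
            if partner is not None:
                # A partial result was waiting: form a pair and merge it.
                buf = [op(a, b) for a, b in zip(buf, partner)]
            elif slot.try_deposit(buf):
                # Nobody was waiting: leave our partial result and finish.
                return

    threads = [
        threading.Thread(target=worker, args=(a, d))
        for a, d in zip(arrays, delays)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Exactly one buffer remains once every thread has deposited and exited.
    return slot.take()


if __name__ == "__main__":
    data = [[1, 2, 3], [10, 20, 30], [100, 200, 300], [1000, 2000, 3000]]
    print(dynamic_reduce(data, delays=[0.0, 0.01, 0.0, 0.02]))
    # prints [1111, 2222, 3333]
```

Each merge joins two partial results, so the merges together form a binary reduce tree whose shape is decided at run time by who arrives when. A thread that arrives early can hand off its partial result and leave instead of waiting for a fixed partner, which is where a spread in arrival times may help.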
