Improving performance of sequential code through automatic parallelization
Abstract: Automatic parallelization is the conversion of sequential code into multi-threaded code with little or no supervision. An ideal implementation of automatic parallelization would allow programmers to fully utilize available hardware resources to deliver optimal performance when writing code. Automatic parallelization has been studied for a long time, with one result being that modern compilers support vectorization without any input. In the study, contemporary parallelizing compilers are studied in order to determine whether or not they can easily be used in modern software development, and how code generated by them compares to manually parallelized code. Five compilers, ICC, Cetus, autoPar, PLUTO, and TC Optimizing Compiler are included in the study. Benchmarks are used to measure speedup of parallelized code, these benchmarks are executed on three different sets of hardware. The NAS Parallel Benchmarks (NPB) suite is used for ICC, Cetus, and autoPar, and PolyBench for the previously mentioned compilers in addition to PLUTO and TC Optimizing Compiler. Results show that parallelizing compilers outperform serial code in most cases, with certain coding styles hindering the capability of them to parallelize code. In the NPB suite, manually parallelized code is outperformed by Cetus and ICC for one benchmark. In the PolyBench suite, PLUTO outperforms the other compilers to a great extent, producing code not only optimized for parallel execution, but also for vectorization. Limitations in code generated by Cetus and autoPar prevent them from being used in legacy projects, while PLUTO and TC do not offer fully automated parallelization. ICC was found to offer the most complete automatic parallelization solution, although offered speedups were not as great as ones offered by other tools.
AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)