OpenCL has been proposed as a means of accelerating functional computation using FPGA and GPU accelerators. Although it provides ease of programmability and code portability, questions remain about its performance portability and about the ability of the underlying vendor compilers to generate efficient implementations without user-defined, platform-specific optimizations. In this work, we systematically evaluate these questions by formalizing a design space exploration strategy that uses only platform-independent micro-architectural and application-specific optimizations. The optimizations are then applied across Altera FPGA, NVIDIA GPU and ARM Mali GPU platforms for three computing examples: matrix-matrix multiplication, binomial-tree option pricing and three-dimensional finite-difference time-domain. Because the same design effort is spent on every platform, our strategy enables a fair comparison in terms of throughput and energy efficiency. Our results indicate that the FPGA provides better performance portability, measured as the achieved percentage of the device's peak performance (68%), than the NVIDIA GPU (20%), and also achieves better energy efficiency (up to 1.4×) for some of the considered cases, without requiring in-depth hardware design expertise.
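The two comparison metrics named in the abstract can be illustrated with a minimal sketch. The function names and the input numbers below are hypothetical placeholders, not values or code from the paper, and treating energy efficiency as throughput per watt is an assumption about how the metric is defined.

```python
def percent_of_peak(achieved_gflops: float, peak_gflops: float) -> float:
    """Achieved throughput as a percentage of the device's theoretical peak."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return 100.0 * achieved_gflops / peak_gflops


def energy_efficiency(achieved_gflops: float, power_watts: float) -> float:
    """Throughput per watt (GFLOPS/W); assumed definition, not taken from the paper."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return achieved_gflops / power_watts


# Hypothetical placeholder measurements for two devices (not results from the paper).
device_a = {"achieved": 50.0, "peak": 200.0, "power": 25.0}
device_b = {"achieved": 80.0, "peak": 800.0, "power": 80.0}

portability_a = percent_of_peak(device_a["achieved"], device_a["peak"])
portability_b = percent_of_peak(device_b["achieved"], device_b["peak"])

# Relative energy efficiency of device A over device B.
efficiency_ratio = (
    energy_efficiency(device_a["achieved"], device_a["power"])
    / energy_efficiency(device_b["achieved"], device_b["power"])
)
```

In this sketch a device with lower absolute throughput can still score higher on both metrics, which is the kind of comparison the abstract describes.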
|Title of host publication||ARC 2018: Applied Reconfigurable Computing: Architectures, Tools, and Applications - 14th International Symposium: Proceedings|
|Number of pages||13|
|Publication status||Published - 08 Apr 2018|
|Event||14th International Symposium on Applied Reconfigurable Computing, ARC 2018 - Santorini, Greece (02 May 2018 → 04 May 2018)|
|Publication series||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||14th International Symposium on Applied Reconfigurable Computing, ARC 2018|
|Period||02/05/2018 → 04/05/2018|
ASJC Scopus subject areas
- Theoretical Computer Science
- Computer Science (all)
|Title||Exploring Functional Acceleration of OpenCL on FPGAs and GPUs Through Platform-Independent Optimizations|