Abstract
Using heterogeneous accelerators to obtain high performance for mathematical kernels remains an active research frontier in computational science. Accelerators have compute architectures that differ from those of CPUs and, in addition, have memory spaces independent of the CPU systems to which they are connected. It follows that accelerators require a different approach to writing optimal code than that needed on a multi-CPU system. Taken together, these issues have represented a significant barrier to the widespread adoption of accelerators for large legacy code bases.
OpenCL has emerged as a common programming language for implementing code that runs across a range of parallel architectures, including multi-core CPUs. This paper is a case study on how the instruction-level parallelism offered by FPGAs and GPUs through OpenCL can be exploited in molecular physics. The algorithm we study is the evaluation of tail integrals between Gaussian-type basis functions for the R-matrix method, a task that arises in the study of the scattering of low-energy electrons by molecular targets.
The results of our productivity study, which is the first application of OpenCL in this problem domain, show that significant performance can be obtained from both FPGA and GPU accelerators for this application. We discuss suitable transformations unique to each accelerator architecture for the integrals studied and present performance results comparing the FPGA and GPU with execution on Intel multi-core systems.
Original language | English |
---|---|
Article number | e5984 |
Number of pages | 18 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 33 |
Issue number | 5 |
Early online date | 24 Sept 2020 |
Publication status | Published - 10 Mar 2021 |
Keywords
- Integrals, heterogeneous accelerator, GPU, FPGA, OpenCL, multi-core CPU
ASJC Scopus subject areas
- Software
- Computer Science Applications
- Atomic and Molecular Physics, and Optics