Abstract
Dynamic analysis techniques help programmers find the root cause of bugs in large-scale parallel applications for high-performance computing (HPC). These applications run detailed numerical simulations that model the real world. Their numerical correctness and software reliability are a major concern for scientists because of the public importance of the scientific advances that depend on them. A set of techniques that build on one another can help programmers debug at large scale: discovery of scaling bugs, which manifest only when the application is deployed at large scale; behavioral debugging, which models the control-flow behavior of tasks; and software-defect detection at the communication layer.
| Original language | English |
|---|---|
| Pages (from-to) | 72-81 |
| Number of pages | 10 |
| Journal | Communications of the ACM |
| Volume | 58 |
| Issue number | 9 |
| DOIs | |
| Publication status | Published - 01 Sept 2015 |
ASJC Scopus subject areas
- General Computer Science