TY - CHAP
T1 - Large Scale Verification of MPI Programs Using Lamport Clocks with Lazy Update
AU - Vo, Anh
AU - Gopalakrishnan, Ganesh
AU - Kirby, Robert M.
AU - De Supinski, Bronis R.
AU - Schulz, Martin
AU - Bronevetsky, Greg
PY - 2011
Y1 - 2011
N2 - We propose a dynamic verification approach for large-scale message passing programs to locate correctness bugs caused by unforeseen nondeterministic interactions. This approach hinges on an efficient protocol to track the causality between nondeterministic message receive operations and potentially matching send operations. We show that causality tracking protocols that rely solely on logical clocks fail to capture all nuances of MPI program behavior, including the variety of ways in which nonblocking calls can complete. Our approach is hinged on formally defining the matches-before relation underlying the MPI standard, and devising lazy update logical clock based algorithms that can correctly discover all potential outcomes of nondeterministic receives in practice. can achieve the same coverage as a vector clock based algorithm while maintaining good scalability. LLCP allows us to analyze realistic MPI programs involving a thousand MPI processes, incurring only modest overheads in terms of communication bandwidth, latency, and memory consumption.
AB - We propose a dynamic verification approach for large-scale message passing programs to locate correctness bugs caused by unforeseen nondeterministic interactions. This approach hinges on an efficient protocol to track the causality between nondeterministic message receive operations and potentially matching send operations. We show that causality tracking protocols that rely solely on logical clocks fail to capture all nuances of MPI program behavior, including the variety of ways in which nonblocking calls can complete. Our approach is hinged on formally defining the matches-before relation underlying the MPI standard, and devising lazy update logical clock based algorithms that can correctly discover all potential outcomes of nondeterministic receives in practice. can achieve the same coverage as a vector clock based algorithm while maintaining good scalability. LLCP allows us to analyze realistic MPI programs involving a thousand MPI processes, incurring only modest overheads in terms of communication bandwidth, latency, and memory consumption.
UR - http://www.scopus.com/inward/record.url?partnerID=yv4JPVwI&eid=2-s2.0-84856552231&md5=9f506955d3f1e80c96d36c7735b79f23
U2 - 10.1109/PACT.2011.64
DO - 10.1109/PACT.2011.64
M3 - Other chapter contribution
AN - SCOPUS:84856552231
SN - 978-1-4577-1794-9
SP - 330
EP - 339
BT - 2011 International Conference on Parallel Architectures and Compilation Techniques (PACT)
PB - Institute of Electrical and Electronics Engineers Inc.
ER -