TY - GEN
T1 - PINT
T2 - 36th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2022
AU - Xu, Yifan
AU - Zhou, Anchengcheng
AU - Agrawal, Kunal
AU - Lee, I. Ting Angelina
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - A race detector for task-parallel code typically consists of two main components: a reachability analysis component that checks whether two instructions are logically in parallel, and an access history component that keeps track of memory locations accessed by previous instructions. Race detectors from prior work typically utilize a hashmap to maintain the access history, which provides asymptotically optimal overhead per operation but can incur significant overhead in practice, since the detector needs to insert into and query the hashmap for every memory access. An exception is STINT by Xu et al., which race detects task-parallel code by coalescing memory accesses into intervals, i.e., contiguous memory locations accessed within a sequence of instructions without any parallel construct. STINT utilizes a treap to manage the access history, which allows for insertions and queries of non-overlapping intervals. While a treap incurs higher asymptotic overhead per operation, this strategy works well in practice, as the race detector performs operations on the access history much less frequently than a strategy that utilizes a hashmap. STINT only executes task-parallel code sequentially, however, due to the unique design of its treap, which ensures that no overlapping intervals exist in the tree. Parallelizing STINT efficiently is non-trivial, as it would require a concurrent treap that ensures no overlapping intervals, which is challenging to design and would likely incur high synchronization overhead. This work proposes PINT, a race detector that, like STINT, race detects task-parallel code at the interval granularity and utilizes the same treap design to maintain the access history. PINT, however, executes the computation in parallel while keeping the parallelization/synchronization overhead low. A key insight is that PINT separates the operations needed for race detection into a core part (e.g., reachability maintenance) and an access history part. Doing so allows PINT to parallelize the core part efficiently and perform the access history part asynchronously, thereby incurring low overhead.
AB - A race detector for task-parallel code typically consists of two main components: a reachability analysis component that checks whether two instructions are logically in parallel, and an access history component that keeps track of memory locations accessed by previous instructions. Race detectors from prior work typically utilize a hashmap to maintain the access history, which provides asymptotically optimal overhead per operation but can incur significant overhead in practice, since the detector needs to insert into and query the hashmap for every memory access. An exception is STINT by Xu et al., which race detects task-parallel code by coalescing memory accesses into intervals, i.e., contiguous memory locations accessed within a sequence of instructions without any parallel construct. STINT utilizes a treap to manage the access history, which allows for insertions and queries of non-overlapping intervals. While a treap incurs higher asymptotic overhead per operation, this strategy works well in practice, as the race detector performs operations on the access history much less frequently than a strategy that utilizes a hashmap. STINT only executes task-parallel code sequentially, however, due to the unique design of its treap, which ensures that no overlapping intervals exist in the tree. Parallelizing STINT efficiently is non-trivial, as it would require a concurrent treap that ensures no overlapping intervals, which is challenging to design and would likely incur high synchronization overhead. This work proposes PINT, a race detector that, like STINT, race detects task-parallel code at the interval granularity and utilizes the same treap design to maintain the access history. PINT, however, executes the computation in parallel while keeping the parallelization/synchronization overhead low. A key insight is that PINT separates the operations needed for race detection into a core part (e.g., reachability maintenance) and an access history part. Doing so allows PINT to parallelize the core part efficiently and perform the access history part asynchronously, thereby incurring low overhead.
KW - access history
KW - determinacy race
KW - shadow memory
KW - task parallelism
UR - https://www.scopus.com/pages/publications/85136338108
U2 - 10.1109/IPDPS53621.2022.00087
DO - 10.1109/IPDPS53621.2022.00087
M3 - Conference contribution
AN - SCOPUS:85136338108
T3 - Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium, IPDPS 2022
SP - 850
EP - 861
BT - Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium, IPDPS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 30 May 2022 through 3 June 2022
ER -