TY - GEN
T1 - Using memory mapping to support cactus stacks in work-stealing runtime systems
AU - Lee, I. Ting Angelina
AU - Boyd-Wickizer, Silas
AU - Huang, Zhiyi
AU - Leiserson, Charles E.
PY - 2010
Y1 - 2010
AB - Many multithreaded concurrency platforms that use a work-stealing runtime system incorporate a "cactus stack," wherein a function's accesses to stack variables properly respect the function's calling ancestry, even when many of the functions operate in parallel. Unfortunately, such existing concurrency platforms fail to satisfy at least one of the following three desirable criteria: (1) full interoperability with legacy or third-party serial binaries that have been compiled to use an ordinary linear stack; (2) a scheduler that provides near-perfect linear speedup on applications with sufficient parallelism; and (3) bounded and efficient use of memory for the cactus stack. We have addressed this cactus-stack problem by modifying the Linux operating system kernel to provide support for thread-local memory mapping (TLMM). We have used TLMM to reimplement the cactus stack in the open-source Cilk-5 runtime system. The resulting Cilk-M runtime system removes the linguistic distinction imposed by Cilk-5 between serial code and parallel code, erases Cilk-5's limitation that serial code cannot call parallel code, and provides full compatibility with existing serial calling conventions. The Cilk-M runtime system provides strong guarantees on scheduler performance and stack space. Benchmark results indicate that the performance of the prototype Cilk-M 1.0 is comparable to the Cilk 5.4.6 system, and the consumption of stack space is modest.
KW - cactus stack
KW - cilk
KW - interoperability
KW - memory mapping
KW - serial-parallel reciprocity
KW - work stealing
UR - https://www.scopus.com/pages/publications/78149275789
U2 - 10.1145/1854273.1854324
DO - 10.1145/1854273.1854324
M3 - Conference contribution
AN - SCOPUS:78149275789
SN - 9781450301787
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SP - 411
EP - 420
BT - PACT'10 - Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th International Conference on Parallel Architectures and Compilation Techniques, PACT 2010
Y2 - 11 September 2010 through 15 September 2010
ER -