{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T07:15:06Z","timestamp":1757574906179,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,6,20]],"date-time":"2024-06-20T00:00:00Z","timestamp":1718841600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,6,20]]},"DOI":"10.1145\/3652586.3663313","type":"proceedings-article","created":{"date-parts":[[2024,6,20]],"date-time":"2024-06-20T16:39:11Z","timestamp":1718901551000},"page":"13-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Work Assisting: Linking Task-Parallel Work Stealing with Data-Parallel Self Scheduling"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4731-2234","authenticated-orcid":false,"given":"Ivo Gabe","family":"de Wolff","sequence":"first","affiliation":[{"name":"Utrecht University, Utrecht, Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1442-5387","authenticated-orcid":false,"given":"Gabriele","family":"Keller","sequence":"additional","affiliation":[{"name":"Utrecht University, Utrecht, Netherlands"}]}],"member":"320","published-online":{"date-parts":[[2024,6,20]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"2021. OpenMP Application Programming Interface version 5.2. https:\/\/www.openmp.org\/specifications\/"},{"key":"e_1_3_2_1_2_1","volume-title":"Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming. 219\u2013228","author":"Acar Umut A","year":"2013","unstructured":"Umut A Acar, Arthur Chargu\u00e9raud, and Mike Rainey. 2013. Scheduling parallel programs by work stealing with private deques. In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming. 219\u2013228."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1145\/227234.227246","article-title":"Programming parallel algorithms","volume":"39","author":"Blelloch Guy E","year":"1996","unstructured":"Guy E Blelloch. 1996. Programming parallel algorithms. Communications of the ACM, 39, 3 (1996), 85\u201397.","journal-title":"Communications of the ACM"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"crossref","first-page":"720","DOI":"10.1145\/324133.324234","article-title":"Scheduling multithreaded computations by work stealing","volume":"46","author":"Blumofe Robert D","year":"1999","unstructured":"Robert D Blumofe and Charles E Leiserson. 1999. Scheduling multithreaded computations by work stealing. Journal of the ACM (JACM), 46, 5 (1999), 720\u2013748.","journal-title":"Journal of the ACM (JACM)"},{"key":"e_1_3_2_1_5_1","volume-title":"Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing. 261\u2013270","author":"Brown Trevor Alexander","year":"2015","unstructured":"Trevor Alexander Brown. 2015. Reclaiming memory for lock-free data structures: There has to be a better way. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing. 261\u2013270."},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures. 21\u201328","author":"Chase David","year":"2005","unstructured":"David Chase and Yossi Lev. 2005. Dynamic circular work-stealing deque. In Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures. 21\u201328."},{"key":"e_1_3_2_1_7_1","volume-title":"2009 IEEE international symposium on workload characterization (IISWC). 44\u201354","author":"Che Shuai","year":"2009","unstructured":"Shuai Che, Michael Boyer, Jiayuan Meng, David Tarjan, Jeremy W Sheaffer, Sang-Ha Lee, and Kevin Skadron. 2009. Rodinia: A benchmark suite for heterogeneous computing. In 2009 IEEE international symposium on workload characterization (IISWC). 44\u201354."},{"key":"e_1_3_2_1_8_1","volume-title":"International Workshop on OpenMP. 21\u201336","author":"Ciorba Florina M","year":"2018","unstructured":"Florina M Ciorba, Christian Iwainsky, and Patrick Buder. 2018. OpenMP loop scheduling revisited: making a case for more schedules. In International Workshop on OpenMP. 21\u201336."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/0167-8191(88)90094-4","article-title":"A single-program-multiple-data computational model for EPEX\/FORTRAN","volume":"7","author":"Darema Frederica","year":"1988","unstructured":"Frederica Darema, David A George, V Alan Norton, and Gregory F Pfister. 1988. A single-program-multiple-data computational model for EPEX\/FORTRAN. Parallel Comput., 7, 1 (1988), 11\u201324.","journal-title":"Parallel Comput."},{"key":"e_1_3_2_1_10_1","volume-title":"European Conference on Parallel Processing.","author":"de Wolff Ivo Gabe","year":"2024","unstructured":"Ivo Gabe de Wolff, Daniel Anderson, Gabriele Keller, and Aleksei Seletskiy. 2024. A Fast Wait-Free Solution to Read-Reclaim Races in Reference Counting. In European Conference on Parallel Processing. to appear."},{"key":"e_1_3_2_1_11_1","volume-title":"Proceedings of the 15th International Workshop on Programming Models and Applications for Multicores and Manycores. 52\u201361","author":"de Wolff Ivo Gabe","year":"2024","unstructured":"Ivo Gabe de Wolff, David P van Balen, Gabriele K Keller, and Trevor L McDonell. 2024. Zero-Overhead Parallel Scans for Multi-Core CPUs. In Proceedings of the 15th International Workshop on Programming Models and Applications for Multicores and Manycores. 52\u201361."},{"key":"e_1_3_2_1_12_1","volume-title":"International Workshop on OpenMP. 100\u2013110","author":"Duran Alejandro","year":"2008","unstructured":"Alejandro Duran, Julita Corbal\u00e1n, and Eduard Ayguad\u00e9. 2008. Evaluation of OpenMP task scheduling strategies. In International Workshop on OpenMP. 100\u2013110."},{"key":"e_1_3_2_1_13_1","volume-title":"Proceedings of the 2007 workshop on Declarative aspects of multicore programming. 37\u201344","author":"Fluet Matthew","year":"2007","unstructured":"Matthew Fluet, Mike Rainey, John Reppy, Adam Shaw, and Yingqi Xiao. 2007. Manticore: A heterogeneous parallel language. In Proceedings of the 2007 workshop on Declarative aspects of multicore programming. 37\u201344."},{"key":"e_1_3_2_1_14_1","volume-title":"International Conference on Vector and Parallel Processing. 121\u2013134","author":"Fonseca Alcides","year":"2016","unstructured":"Alcides Fonseca and Bruno Cabral. 2016. Evaluation of runtime cut-off approaches for parallel programs. In International Conference on Vector and Parallel Processing. 121\u2013134."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1007\/s10766-006-0018-x","article-title":"SAC\u2014a functional array language for efficient multi-threaded execution","volume":"34","author":"Grelck Clemens","year":"2006","unstructured":"Clemens Grelck and Sven-Bodo Scholz. 2006. SAC\u2014a functional array language for efficient multi-threaded execution. International Journal of Parallel Programming, 34 (2006), 383\u2013427.","journal-title":"International Journal of Parallel Programming"},{"key":"e_1_3_2_1_16_1","article-title":"Performance of memory reclamation for lockless synchronization","volume":"67","author":"Hart Thomas E.","year":"2007","unstructured":"Thomas E. Hart, Paul E. McKenney, Angela Demke Brown, and Jonathan Walpole. 2007. Performance of memory reclamation for lockless synchronization. J. Parallel and Distrib. Comput., 67, 12 (2007).","journal-title":"J. Parallel and Distrib. Comput."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1109\/12.46289","article-title":"Parallel quicksort using fetch-and-add","volume":"39","author":"Heidelberger Philip","year":"1990","unstructured":"Philip Heidelberger, Alan Norton, and John T. Robinson. 1990. Parallel quicksort using fetch-and-add. IEEE Transactions on Computers, 39, 1 (1990), 133\u2013138.","journal-title":"IEEE Transactions on Computers"},{"key":"e_1_3_2_1_18_1","volume-title":"The computer journal, 5, 1","author":"Hoare Charles AR","year":"1962","unstructured":"Charles AR Hoare. 1962. Quicksort. The computer journal, 5, 1 (1962), 10\u201316."},{"volume-title":"Programming languages \u2013 C++","key":"e_1_3_2_1_19_1","unstructured":"2020. Programming languages \u2013 C++. International Organization for Standardization, Geneva, CH."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"crossref","first-page":"1001","DOI":"10.1109\/TSE.1985.231547","article-title":"Allocating independent subtasks on parallel processors","author":"Kruskal Clyde P.","year":"1985","unstructured":"Clyde P. Kruskal and Alan Weiss. 1985. Allocating independent subtasks on parallel processors. IEEE Transactions on Software engineering, 1001\u20131016.","journal-title":"IEEE Transactions on Software engineering"},{"key":"e_1_3_2_1_21_1","volume-title":"Proceedings of the 1995 Glasgow Workshop on Functional Programming. 1\u201310","author":"Loidl Hans-Wolfgang","year":"1995","unstructured":"Hans-Wolfgang Loidl and Kevin Hammond. 1995. On the granularity of divide-and-conquer parallelism. In Proceedings of the 1995 Glasgow Workshop on Functional Programming. 1\u201310."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1145\/2692956.2663188","article-title":"The Rust language","volume":"34","author":"Matsakis Nicholas D","year":"2014","unstructured":"Nicholas D Matsakis and Felix S Klock. 2014. The Rust language. ACM SIGAda Ada Letters, 34, 3 (2014), 103\u2013104.","journal-title":"ACM SIGAda Ada Letters"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1145\/2887747.2804313","article-title":"Type-safe runtime code generation: Accelerate to LLVM","volume":"50","author":"McDonell Trevor L","year":"2015","unstructured":"Trevor L McDonell, Manuel MT Chakravarty, Vinod Grover, and Ryan R Newton. 2015. Type-safe runtime code generation: Accelerate to LLVM. ACM SIGPLAN Notices, 50, 12 (2015), 201\u2013212.","journal-title":"ACM SIGPLAN Notices"},{"key":"e_1_3_2_1_24_1","volume-title":"IWOMP 2016, Nara, Japan, October 5-7, 2016, Proceedings 12","author":"Meadows Larry","year":"2016","unstructured":"Larry Meadows, Simon J Pennycook, Alex Duran, Terry Wilmarth, and Jim Cownie. 2016. Workstealing and nested parallelism in SMP systems. In OpenMP: Memory, Devices, and Tasks: 12th International Workshop on OpenMP, IWOMP 2016, Nara, Japan, October 5-7, 2016, Proceedings 12. 47\u201360."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1145\/3150211","article-title":"Halide: Decoupling algorithms from schedules for high-performance image processing","volume":"61","author":"Ragan-Kelley Jonathan","year":"2017","unstructured":"Jonathan Ragan-Kelley, Andrew Adams, Dillon Sharlet, Connelly Barnes, Sylvain Paris, Marc Levoy, Saman Amarasinghe, and Fr\u00e9do Durand. 2017. Halide: Decoupling algorithms from schedules for high-performance image processing. Communications of the ACM, 61, 1 (2017), 106\u2013115.","journal-title":"Communications of the ACM"},{"key":"e_1_3_2_1_26_1","volume-title":"2008 IEEE International Symposium on Parallel and Distributed Processing. 1\u20138.","author":"Robison Arch","year":"2008","unstructured":"Arch Robison, Michael Voss, and Alexey Kukanov. 2008. Optimization via reflection on work stealing in TBB. In 2008 IEEE International Symposium on Parallel and Distributed Processing. 1\u20138."},{"key":"e_1_3_2_1_27_1","volume-title":"European Conference on Parallel Processing. 491\u2013503","author":"Sbirlea Alina","year":"2015","unstructured":"Alina Sbirlea, Kunal Agrawal, and Vivek Sarkar. 2015. Elastic tasks: Unifying task parallelism and SPMD parallelism with an adaptive runtime. In European Conference on Parallel Processing. 491\u2013503."},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming. 13\u201322","author":"Subhlok Jaspal","year":"1993","unstructured":"Jaspal Subhlok, James M Stichnoth, David R O\u2019hallaron, and Thomas Gross. 1993. Exploiting task and data parallelism on a multicomputer. In Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming. 13\u201322."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1145\/1837853.1693479","article-title":"Lazy binary-splitting: a run-time adaptive work-stealing scheduler","volume":"45","author":"Tzannes Alexandros","year":"2010","unstructured":"Alexandros Tzannes, George C Caragea, Rajeev Barua, and Uzi Vishkin. 2010. Lazy binary-splitting: a run-time adaptive work-stealing scheduler. ACM Sigplan Notices, 45, 5 (2010), 179\u2013190.","journal-title":"ACM Sigplan Notices"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1109\/71.205655","article-title":"Trapezoid self-scheduling: A practical scheduling scheme for parallel compilers","volume":"4","author":"Tzen Ten H","year":"1993","unstructured":"Ten H Tzen and Lionel M Ni. 1993. Trapezoid self-scheduling: A practical scheduling scheme for parallel compilers. IEEE Transactions on parallel and distributed systems, 4, 1 (1993), 87\u201398.","journal-title":"IEEE Transactions on parallel and distributed systems"},{"key":"e_1_3_2_1_31_1","volume-title":"Proceedings of the 8th annual IEEE\/ACM international symposium on Code generation and optimization. 266\u2013277","author":"Wang Lei","year":"2010","unstructured":"Lei Wang, Huimin Cui, Yuelu Duan, Fang Lu, Xiaobing Feng, and Pen-Chung Yew. 2010. An adaptive task creation strategy for work-stealing scheduling. In Proceedings of the 8th annual IEEE\/ACM international symposium on Code generation and optimization. 266\u2013277."},{"key":"e_1_3_2_1_32_1","unstructured":"Anthony Williams. 2012. C++ Concurrency in Action - Practical Multithreading."},{"key":"e_1_3_2_1_33_1","volume-title":"Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures. 105\u2013116","author":"Wimmer Martin","year":"2011","unstructured":"Martin Wimmer and Jesper Larsson Tr\u00e4ff. 2011. Work-stealing for mixed-mode parallelism by deterministic team-building. In Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures. 105\u2013116."}],"event":{"name":"ARRAY '24: 10th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages"],"location":"Copenhagen Denmark","acronym":"ARRAY '24"},"container-title":["Proceedings of the 10th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3652586.3663313","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3652586.3663313","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:03:30Z","timestamp":1750291410000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3652586.3663313"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,20]]},"references-count":33,"alternative-id":["10.1145\/3652586.3663313","10.1145\/3652586"],"URL":"https:\/\/doi.org\/10.1145\/3652586.3663313","relation":{},"subject":[],"published":{"date-parts":[[2024,6,20]]},"assertion":[{"value":"2024-06-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}