{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T03:31:30Z","timestamp":1767929490140,"version":"3.49.0"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"OOPSLA","license":[{"start":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T00:00:00Z","timestamp":1634256000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2021,10,20]]},"abstract":"<jats:p>As GPU availability has increased and programming support has matured, a wider variety of applications are being ported to these platforms. Many parallel applications contain fine-grained synchronization idioms; as such, their correct execution depends on a degree of relative forward progress between threads (or thread groups). Unfortunately, many GPU programming specifications (e.g. Vulkan and Metal) say almost nothing about relative forward progress guarantees between workgroups. Although prior work has proposed a spectrum of plausible progress models for GPUs, cross-vendor specifications have yet to commit to any model.<\/jats:p>\n          <jats:p>This work is a collection of tools and experimental data to aid specification designers when considering forward progress guarantees in programming frameworks. As a foundation, we formalize a small parallel programming language that captures the essence of fine-grained synchronization. We then provide a means of formally specifying a progress model, and develop a termination oracle that decides whether a given program is guaranteed to eventually terminate with respect to a given progress model. Next, we formalize a set of constraints that describe concurrent programs that require forward progress to terminate. This allows us to synthesize a large set of 483 progress litmus tests. Combined with the termination oracle, we can determine the expected status of each litmus test -- i.e. whether it is guaranteed to eventually terminate -- under various progress models. We present a large experimental campaign running the litmus tests across 8 GPUs from 5 different vendors. Our results highlight that GPUs have significantly different termination behaviors under our test suite. Most notably, we find that Apple and ARM GPUs do not support the linear occupancy-bound model, as was hypothesized by prior work.<\/jats:p>","DOI":"10.1145\/3485508","type":"journal-article","created":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T19:18:28Z","timestamp":1634325508000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Specifying and testing GPU workgroup progress models"],"prefix":"10.1145","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1646-7935","authenticated-orcid":false,"given":"Tyler","family":"Sorensen","sequence":"first","affiliation":[{"name":"University of California at Santa Cruz, USA"}]},{"given":"Lucas F.","family":"Salvador","sequence":"additional","affiliation":[{"name":"Princeton University, USA"}]},{"given":"Harmit","family":"Raval","sequence":"additional","affiliation":[{"name":"Princeton University, USA"}]},{"given":"Hugues","family":"Evrard","sequence":"additional","affiliation":[{"name":"Google, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6735-5533","authenticated-orcid":false,"given":"John","family":"Wickerson","sequence":"additional","affiliation":[{"name":"Imperial College London, UK"}]},{"given":"Margaret","family":"Martonosi","sequence":"additional","affiliation":[{"name":"Princeton University, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7448-7961","authenticated-orcid":false,"given":"Alastair F.","family":"Donaldson","sequence":"additional","affiliation":[{"name":"Imperial College London, UK"}]}],"member":"320","published-online":{"date-parts":[[2021,10,15]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694391"},{"key":"e_1_2_2_2_1","unstructured":"AMD. 2019. RDNA Architecture. Whitepaper https:\/\/www.amd.com\/system\/files\/documents\/rdna-whitepaper.pdf  AMD. 2019. RDNA Architecture. Whitepaper https:\/\/www.amd.com\/system\/files\/documents\/rdna-whitepaper.pdf"},{"key":"e_1_2_2_3_1","unstructured":"Apple. 2020. Metal Shading Language Specification v2.3. https:\/\/developer.apple.com\/metal\/  Apple. 2020. Metal Shading Language Specification v2.3. https:\/\/developer.apple.com\/metal\/"},{"key":"e_1_2_2_4_1","volume-title":"Principles of model checking","author":"Baier Christel"},{"key":"e_1_2_2_5_1","volume-title":"Patterson","author":"Beamer Scott","year":"2015"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2743017"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2384616.2384625"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/12.42122"},{"key":"e_1_2_2_9_1","volume-title":"On Dynamic Load Balancing on Graphics Processors","author":"Cederman Daniel"},{"key":"e_1_2_2_10_1","volume-title":"Reference Manual of the LNT to LOTOS Translator (Version 7.0). Aug., https:\/\/cadp.inria.fr\/ftp\/publications\/cadp\/Champelovier-Clerc-Garavel-et-al-10.pdf INRIA","author":"Champelovier David"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192373"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-37036-6_16"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133917"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ECOOP.2020.22"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA45697.2020.00087"},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2016.7783714"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/567446.567462"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10009-012-0244-z"},{"key":"e_1_2_2_19_1","unstructured":"Google. 2020. Amber. https:\/\/github.com\/google\/amber  Google. 2020. Amber. https:\/\/github.com\/google\/amber"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/InPar.2012.6339596"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28869-2_16"},{"key":"e_1_2_2_22_1","unstructured":"HSA Foundation. 2017. HSA Programmer\u2019s Reference Manual: HSAIL Virtual ISA and Programming Model Compiler Writer and Object Format (BRIG). (rev 1.1.1).  HSA Foundation. 2017. HSA Programmer\u2019s Reference Manual: HSAIL Virtual ISA and Programming Model Compiler Writer and Object Format (BRIG). (rev 1.1.1)."},{"key":"e_1_2_2_23_1","volume-title":"Wen-mei W"},{"key":"e_1_2_2_24_1","unstructured":"Intel. 2021. oneAPI GPU Optimization Guide. https:\/\/software.intel.com\/content\/dam\/develop\/external\/us\/en\/documents\/oneapi-gpu-optimization-guide.pdf  Intel. 2021. oneAPI GPU Optimization Guide. https:\/\/software.intel.com\/content\/dam\/develop\/external\/us\/en\/documents\/oneapi-gpu-optimization-guide.pdf"},{"key":"e_1_2_2_25_1","unstructured":"ISO\/IEC. 2017. Working Draft Standard for Programming Language C++ v. N4659. https:\/\/doi.org\/jtc1\/sc22\/wg21\/docs\/papers\/2017\/n4659.pdf  ISO\/IEC. 2017. Working Draft Standard for Programming Language C++ v. N4659. https:\/\/doi.org\/jtc1\/sc22\/wg21\/docs\/papers\/2017\/n4659.pdf"},{"key":"e_1_2_2_26_1","volume-title":"Software Abstractions: Logic, Language, and Analysis","author":"Jackson Daniel","year":"2012"},{"key":"e_1_2_2_27_1","unstructured":"Khronos Group. 2020. Khronos Group Releases OpenCL 3.0. https:\/\/khr.io\/ocl3pressrelease  Khronos Group. 2020. Khronos Group Releases OpenCL 3.0. https:\/\/khr.io\/ocl3pressrelease"},{"key":"e_1_2_2_28_1","unstructured":"Khronos Group. 2020. Vulkan 1.2.174 - A Specification. https:\/\/www.khronos.org\/registry\/vulkan\/specs\/1.2-extensions\/html\/vkspec.html  Khronos Group. 2020. Vulkan 1.2.174 - A Specification. https:\/\/www.khronos.org\/registry\/vulkan\/specs\/1.2-extensions\/html\/vkspec.html"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2019.8661172"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(82)90125-6"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3485475"},{"key":"e_1_2_2_32_1","unstructured":"Raph Levien. 2020. Prefix sum on Vulkan. https:\/\/raphlinus.github.io\/gpu\/2020\/04\/30\/prefix-sum.html  Raph Levien. 2020. Prefix sum on Vulkan. https:\/\/raphlinus.github.io\/gpu\/2020\/04\/30\/prefix-sum.html"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2145816.2145844"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3037697.3037723"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908089"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPADS.2011.48"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15898-8_12"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-68237-0_12"},{"key":"e_1_2_2_39_1","doi-asserted-by":"crossref","unstructured":"MCL. 2008. MCL manual page. http:\/\/cadp.inria.fr\/man\/mcl4.html  MCL. 2008. MCL manual page. http:\/\/cadp.inria.fr\/man\/mcl4.html","DOI":"10.1016\/S0969-6210(08)70279-0"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626415500073"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3303084.3309488"},{"key":"e_1_2_2_42_1","unstructured":"Nvidia. 2017. Nvidia Tesla V100 GPU ARCHITECTURE. https:\/\/images.nvidia.com\/content\/volta-architecture\/pdf\/volta-architecture-whitepaper.pdf Whitepaper WP-08608-001_v1.1.  Nvidia. 2017. Nvidia Tesla V100 GPU ARCHITECTURE. https:\/\/images.nvidia.com\/content\/volta-architecture\/pdf\/volta-architecture-whitepaper.pdf Whitepaper WP-08608-001_v1.1."},{"key":"e_1_2_2_43_1","unstructured":"Nvidia. 2021. CUB v1.12.0. https:\/\/github.com\/NVlabs\/cub  Nvidia. 2021. CUB v1.12.0. https:\/\/github.com\/NVlabs\/cub"},{"issue":"2","key":"e_1_2_2_44_1","first-page":"1","article-title":"CUDA C++ Programming Guide","volume":"11","year":"2021","journal-title":"Version"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2908080.2908114"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3022671.2984032"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.CONCUR.2018.23"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC47752.2019.9042139"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.5501522"},{"key":"e_1_2_2_50_1","volume-title":"Donaldson","author":"Sorensen Tyler","year":"2021"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC47752.2019.9042019"},{"key":"e_1_2_2_52_1","volume-title":"Owens","author":"Tzeng Stanley","year":"2010"},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/2851141.2851145"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3093333.3009838"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2010.5452013"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2010.5470477"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3485508","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3485508","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:40Z","timestamp":1750191520000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3485508"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,15]]},"references-count":56,"journal-issue":{"issue":"OOPSLA","published-print":{"date-parts":[[2021,10,20]]}},"alternative-id":["10.1145\/3485508"],"URL":"https:\/\/doi.org\/10.1145\/3485508","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,15]]},"assertion":[{"value":"2021-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}