{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T03:31:55Z","timestamp":1767929515324,"version":"3.49.0"},"reference-count":23,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","license":[{"start":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T00:00:00Z","timestamp":1694217600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2023,10,31]]},"abstract":"<jats:p>We present a predictable wavefront splitting (PWS) technique for graphics processing units (GPUs). PWS improves the performance of GPU applications by reducing the impact of branch divergence while ensuring that worst-case execution time (WCET) estimates can be computed. This makes PWS an appropriate technique to use in safety-critical applications, such as autonomous driving systems, avionics, and space, that require strict temporal guarantees. In developing PWS on an AMD-based GPU, we propose microarchitectural enhancements to the GPU, and a compiler pass that eliminates branch serializations to reduce the WCET of a wavefront. 
Our analysis of PWS exhibits a performance improvement of 11% over existing architectures with a lower WCET than prior works in wavefront splitting.<\/jats:p>","DOI":"10.1145\/3609102","type":"journal-article","created":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T13:33:18Z","timestamp":1694266398000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Predictable GPU Wavefront Splitting for Safety-Critical Systems"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-8458-8512","authenticated-orcid":false,"given":"Artem","family":"Klashtorny","sequence":"first","affiliation":[{"name":"University of Waterloo, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3272-062X","authenticated-orcid":false,"given":"Zhuanhao","family":"Wu","sequence":"additional","affiliation":[{"name":"University of Waterloo, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8347-0109","authenticated-orcid":false,"given":"Anirudh Mohan","family":"Kaushik","sequence":"additional","affiliation":[{"name":"Intel of Canada, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2750-4471","authenticated-orcid":false,"given":"Hiren","family":"Patel","sequence":"additional","affiliation":[{"name":"University of Waterloo, Canada"}]}],"member":"320","published-online":{"date-parts":[[2023,9,9]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.5555\/3236002"},{"key":"e_1_3_1_3_2","unstructured":"Advanced Micro Devices. 2016. Graphics Core Next Architecture Reference Guide. (2016)."},{"key":"e_1_3_1_4_2","unstructured":"Advanced Micro Devices. 2019. Introducing RDNA Architecture. 
(2019)."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/RTSS.2017.00017"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/HOTCHIPS.2019.8875645"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/ECRTS.2013.29"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3479876.3481590"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2070337.2070341"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2012.6237005"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA53966.2022.00090"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2007.30"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISORC.2017.24"},{"key":"e_1_3_1_14_2","article-title":"The gem5 simulator: Version 20.0+","volume":"2007","author":"Lowe-Power Jason","year":"2020","unstructured":"Jason Lowe-Power et\u00a0al. 2020. The gem5 simulator: Version 20.0+. CoRR abs\/2007.03152 (2020). arXiv:2007.03152 https:\/\/arxiv.org\/abs\/2007.03152","journal-title":"CoRR"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVE45908.2019.8965235"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815992"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3453417.3453432"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ECRTS.2021.1"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522352"},{"key":"e_1_3_1_20_2","article-title":"ROCm Developer Tools: HIP Examples","author":"Robeck Corbin","year":"2016","unstructured":"Corbin Robeck and Aryan Salmanpour. 2016. ROCm Developer Tools: HIP Examples. (2016). 
https:\/\/github.com\/ROCm-Developer-Tools\/HIP-Examples","journal-title":"https:\/\/github.com\/ROCm-Developer-Tools\/HIP-Examples"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2021.3064290"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.4230\/LIPIcs.ECRTS.2018.20"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/DAC.1995.249991"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507721"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609102","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3609102","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:58Z","timestamp":1750182538000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3609102"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,9]]},"references-count":23,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2023,10,31]]}},"alternative-id":["10.1145\/3609102"],"URL":"https:\/\/doi.org\/10.1145\/3609102","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,9]]},"assertion":[{"value":"2023-03-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-06-30","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-09-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}