{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T08:41:30Z","timestamp":1780994490798,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2015,12,5]],"date-time":"2015-12-05T00:00:00Z","timestamp":1449273600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Qualcomm Innovation Fellowship"},{"DOI":"10.13039\/501100001809","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CCF-1302641,CCF-1018796"],"award-info":[{"award-number":["CCF-1302641,CCF-1018796"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2015,12,5]]},"DOI":"10.1145\/2830772.2830821","type":"proceedings-article","created":{"date-parts":[[2016,1,11]],"date-time":"2016-01-11T13:38:13Z","timestamp":1452519493000},"page":"647-659","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":42,"title":["Efficient GPU synchronization without scopes"],"prefix":"10.1145","author":[{"given":"Matthew D.","family":"Sinclair","sequence":"first","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Johnathan","family":"Alsop","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sarita V.","family":"Adve","sequence":"additional","affiliation":[{"name":"University of Illinois at Urbana-Champaign and \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2015,12,5]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"\"HSA Platform System Architecture Specification.\" http:\/\/www.hsafoundation.com\/?ddownload=4944 2015.  \"HSA Platform System Architecture Specification.\" http:\/\/www.hsafoundation.com\/?ddownload=4944 2015."},{"key":"e_1_3_2_1_2_1","unstructured":"IntelPR \"Intel Discloses Newest Microarchitecture and 14 Nanometer Manufacturing Process Technical Details \" Intel Newsroom 2014.  IntelPR \"Intel Discloses Newest Microarchitecture and 14 Nanometer Manufacturing Process Technical Details \" Intel Newsroom 2014."},{"key":"e_1_3_2_1_3_1","volume-title":"QuickRelease: A Throughput-Oriented Approach to Release Consistency on GPUs,\" in IEEE 20th International Symposium on High Performance Computer Architecture","author":"Hechtman B.","year":"2014","unstructured":"B. Hechtman , S. Che , D. Hower , Y. Tian , B. Beckmann , M. Hill , S. Reinhardt , and D. Wood , \" QuickRelease: A Throughput-Oriented Approach to Release Consistency on GPUs,\" in IEEE 20th International Symposium on High Performance Computer Architecture , 2014 . B. Hechtman, S. Che, D. Hower, Y. Tian, B. Beckmann, M. Hill, S. Reinhardt, and D. Wood, \"QuickRelease: A Throughput-Oriented Approach to Release Consistency on GPUs,\" in IEEE 20th International Symposium on High Performance Computer Architecture, 2014."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2464996.2467280"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694391"},{"key":"e_1_3_2_1_6_1","first-page":"4623","volume":"1110","author":"Stuart J. A.","year":"2011","unstructured":"J. A. Stuart and J. D. Owens , \"Efficient Synchronization Primitives for GPUs,\" CoRR , vol. abs\/ 1110 . 4623 , 2011 . J. A. Stuart and J. D. Owens, \"Efficient Synchronization Primitives for GPUs,\" CoRR, vol. abs\/1110.4623, 2011.","journal-title":"\"Efficient Synchronization Primitives for GPUs,\" CoRR"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2012.6402918"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541981"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.24"},{"key":"e_1_3_2_1_10_1","volume-title":"Pannotia: Understanding Irregular GPGPU Graph Applications,\" in IEEE International Symposium on Workload Characterization","author":"Che S.","year":"2013","unstructured":"S. Che , B. Beckmann , S. Reinhardt , and K. Skadron , \" Pannotia: Understanding Irregular GPGPU Graph Applications,\" in IEEE International Symposium on Workload Characterization , 2013 . S. Che, B. Beckmann, S. Reinhardt, and K. Skadron, \"Pannotia: Understanding Irregular GPGPU Graph Applications,\" in IEEE International Symposium on Workload Characterization, 2013."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694350"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2701618"},{"key":"e_1_3_2_1_13_1","volume-title":"Version 2.0.\"","author":"Howes L.","year":"2015","unstructured":"L. Howes and A. Munshi , \" The OpenCL Specification , Version 2.0.\" Khronos Group , 2015 . L. Howes and A. Munshi, \"The OpenCL Specification, Version 2.0.\" Khronos Group, 2015."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/325164.325100"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1787234.1787255"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2011.21"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451119"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694356"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2013.6522351"},{"key":"e_1_3_2_1_20_1","unstructured":"NVIDIA \"CUDA SDK 3.1.\" http:\/\/developer.nvidia.com\/object\/cuda_3_1_downloads.html.  NVIDIA \"CUDA SDK 3.1.\" http:\/\/developer.nvidia.com\/object\/cuda_3_1_downloads.html."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750374"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105747"},{"key":"e_1_3_2_1_23_1","volume-title":"Analyzing CUDA Workloads Using a Detailed GPU Simulator,\" in IEEE International Symposium on Performance Analysis of Systems and Software","author":"Bakhoda A.","year":"2009","unstructured":"A. Bakhoda , G. L. Yuan , W. W. L. Fung , H. Wong , and T. M. Aamodt , \" Analyzing CUDA Workloads Using a Detailed GPU Simulator,\" in IEEE International Symposium on Performance Analysis of Systems and Software , 2009 . A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, \"Analyzing CUDA Workloads Using a Detailed GPU Simulator,\" in IEEE International Symposium on Performance Analysis of Systems and Software, 2009."},{"key":"e_1_3_2_1_24_1","volume-title":"GARNET: A Detailed On-chip Network Model Inside a Full-system Simulator,\" in IEEE International Symposium on Performance Analysis of Systems and Software","author":"Agarwal N.","year":"2009","unstructured":"N. Agarwal , T. Krishna , L.-S. Peh , and N. Jha , \" GARNET: A Detailed On-chip Network Model Inside a Full-system Simulator,\" in IEEE International Symposium on Performance Analysis of Systems and Software , 2009 . N. Agarwal, T. Krishna, L.-S. Peh, and N. Jha, \"GARNET: A Detailed On-chip Network Model Inside a Full-system Simulator,\" in IEEE International Symposium on Performance Analysis of Systems and Software, 2009."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485964"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669172"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2618128.2618134"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_3_2_1_29_1","volume-title":"Department of ECE and CS","author":"Stratton J. A.","year":"2012","unstructured":"J. A. Stratton , C. Rodrigues , I.-J. Sung , N. Obeid , L.-W. Chang , N. Anssari , G. D. Liu , and W. Hwu , \" Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing,\" tech. rep ., Department of ECE and CS , University of Illinois at Urbana-Champaign , 2012 . J. A. Stratton, C. Rodrigues, I.-J. Sung, N. Obeid, L.-W. Chang, N. Anssari, G. D. Liu, and W. Hwu, \"Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing,\" tech. rep., Department of ECE and CS, University of Illinois at Urbana-Champaign, 2012."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2010.5650274"},{"key":"e_1_3_2_1_31_1","volume-title":"Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips,\" in IEEE International Symposium on Performance Analysis of Systems and Software","author":"Hechtman B.","year":"2013","unstructured":"B. Hechtman and D. Sorin , \" Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips,\" in IEEE International Symposium on Performance Analysis of Systems and Software , 2013 . B. Hechtman and D. Sorin, \"Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips,\" in IEEE International Symposium on Performance Analysis of Systems and Software, 2013."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485940"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540747"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750421"}],"event":{"name":"MICRO-48: The 48th Annual IEEE\/ACM International Symposium of Microarchitecture","location":"Waikiki Hawaii","acronym":"MICRO-48","sponsor":["IEEE Computer Society TC-uARCH","SIGMICRO ACM Special Interest Group on Microarchitectural Research and Processing"]},"container-title":["Proceedings of the 48th International Symposium on Microarchitecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2830772.2830821","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2830772.2830821","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T05:48:40Z","timestamp":1750225720000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2830772.2830821"}},"subtitle":["saying no to complex consistency models"],"short-title":[],"issued":{"date-parts":[[2015,12,5]]},"references-count":34,"alternative-id":["10.1145\/2830772.2830821","10.1145\/2830772"],"URL":"https:\/\/doi.org\/10.1145\/2830772.2830821","relation":{},"subject":[],"published":{"date-parts":[[2015,12,5]]},"assertion":[{"value":"2015-12-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}