{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,23]],"date-time":"2025-10-23T05:31:42Z","timestamp":1761197502637,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":51,"publisher":"ACM","license":[{"start":{"date-parts":[[2017,10,14]],"date-time":"2017-10-14T00:00:00Z","timestamp":1507939200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1455404,1525609,1455733"],"award-info":[{"award-number":["1455404,1525609,1455733"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Tsinghua University Initiative Scientific Research Program"},{"name":"National Key Research and Development Program","award":["2016YFB0200100"],"award-info":[{"award-number":["2016YFB0200100"]}]},{"name":"DOE","award":["DE-SC0013700"],"award-info":[{"award-number":["DE-SC0013700"]}]},{"name":"NSFC","award":["No.61232008"],"award-info":[{"award-number":["No.61232008"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2017,10,14]]},"DOI":"10.1145\/3123939.3123978","type":"proceedings-article","created":{"date-parts":[[2017,11,20]],"date-time":"2017-11-20T14:31:12Z","timestamp":1511188272000},"page":"587-599","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":31,"title":["Versapipe"],"prefix":"10.1145","author":[{"given":"Zhen","family":"Zheng","sequence":"first","affiliation":[{"name":"Tsinghua University"}]},{"given":"Chanyoung","family":"Oh","sequence":"additional","affiliation":[{"name":"University of Seoul"}]},{"given":"Jidong","family":"Zhai","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Xipeng","family":"Shen","sequence":"additional","affiliation":[{"name":"North Carolina State University"}]},{"given":"Youngmin","family":"Yi","sequence":"additional","affiliation":[{"name":"University of Seoul"}]},{"given":"Wenguang","family":"Chen","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]}],"member":"320","published-online":{"date-parts":[[2017,10,14]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"33","article-title":"Pyramid Methods in Image Processing","volume":"29","author":"Adelson Edward","year":"1984","unstructured":"Edward Adelson , Charles Anderson , James Bergen , Peter Burt , and Joan Ogden . 1984 . Pyramid Methods in Image Processing . RCA Engineer 29 , 6 (1984), 33 -- 41 . Edward Adelson, Charles Anderson, James Bergen, Peter Burt, and Joan Ogden. 1984. Pyramid Methods in Image Processing. RCA Engineer 29, 6 (1984), 33--41.","journal-title":"RCA Engineer"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2006.244"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1572769.1572792"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/11744023_32"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-24322-6_14"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5160984"},{"key":"e_1_3_2_1_7_1","volume-title":"Eurographics\/acm SIGGRAPH Conference on Graphics Hardware","author":"Cederman Daniel","year":"2008","unstructured":"Daniel Cederman and Philippas Tsigas . 2008. On Dynamic Load Balancing on Graphics Processors . In Eurographics\/acm SIGGRAPH Conference on Graphics Hardware 2008 , Sarajevo, Bosnia and Herzegovina . 57--64. Daniel Cederman and Philippas Tsigas. 2008. On Dynamic Load Balancing on Graphics Processors. In Eurographics\/acm SIGGRAPH Conference on Graphics Hardware 2008, Sarajevo, Bosnia and Herzegovina. 57--64."},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830818"},{"key":"e_1_3_2_1_10_1","volume-title":"Dynamic Load Balancing on Single-and Multi-GPU Systems. In IEEE International Symposium on Parallel & Distributed Processing. IEEE, 1--12","author":"Chen Long","year":"2010","unstructured":"Long Chen , Oreste Villa , Sriram Krishnamoorthy , and Guang R Gao . 2010 . Dynamic Load Balancing on Single-and Multi-GPU Systems. In IEEE International Symposium on Parallel & Distributed Processing. IEEE, 1--12 . Long Chen, Oreste Villa, Sriram Krishnamoorthy, and Guang R Gao. 2010. Dynamic Load Balancing on Single-and Multi-GPU Systems. In IEEE International Symposium on Parallel & Distributed Processing. IEEE, 1--12."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2009.935416"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/37402.37414"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2830541"},{"key":"e_1_3_2_1_14_1","volume-title":"KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism. In 49th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 1--12","author":"Hajj Izzat El","year":"2016","unstructured":"Izzat El Hajj , Juan G\u00f3mez-Luna , Cheng Li , Li-Wen Chang , Dejan Milojicic , and Wen-mei Hwu. 2016 . KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism. In 49th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 1--12 . Izzat El Hajj, Juan G\u00f3mez-Luna, Cheng Li, Li-Wen Chang, Dejan Milojicic, and Wen-mei Hwu. 2016. KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism. In 49th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE, 1--12."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1962.1057683"},{"key":"e_1_3_2_1_16_1","volume-title":"Innovative Parallel Computing (InPar)","author":"Gupta Kshitij","year":"2012","unstructured":"Kshitij Gupta , Jeff A Stuart , and John D Owens . 2012. A Study of Persistent Threads Style GPU Programming for GPGPU Workloads . In Innovative Parallel Computing (InPar) , 2012 . IEEE , 1--14. Kshitij Gupta, Jeff A Stuart, and John D Owens. 2012. A Study of Persistent Threads Style GPU Programming for GPGPU Workloads. In Innovative Parallel Computing (InPar), 2012. IEEE, 1--14."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/0376-0421(67)90003-6"},{"key":"e_1_3_2_1_18_1","unstructured":"Jiwei Liang. 2016. LDPC OOK Decoder. https:\/\/github.com\/BibbyLiang\/LDPC-OOK-Decoder-on-GPU. (2016).  Jiwei Liang. 2016. LDPC OOK Decoder. https:\/\/github.com\/BibbyLiang\/LDPC-OOK-Decoder-on-GPU. (2016)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2499368.2451158"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485951"},{"key":"e_1_3_2_1_21_1","volume-title":"Das","author":"Kayiran Onur","year":"2013","unstructured":"Onur Kayiran , Adwait Jog , Mahmut T. Kandemir , and Chita R . Das . 2013 . Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs. In PACT. Onur Kayiran, Adwait Jog, Mahmut T. Kandemir, and Chita R. Das. 2013. Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs. In PACT."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/40.918001"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2967938.2967952"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2492045.2492060"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446079"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/62597.62617"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/3014904.3015007"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155656"},{"key":"e_1_3_2_1_30_1","unstructured":"NVIDIA Corporation. 2016. NVIDIA CUDA. http:\/\/www.nvidia.com\/object\/cuda_home_new.html. (2016).  NVIDIA Corporation. 2016. NVIDIA CUDA. http:\/\/www.nvidia.com\/object\/cuda_home_new.html. (2016)."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2015.7177522"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983990.2984015"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778803"},{"key":"e_1_3_2_1_34_1","volume-title":"Real-time Reyes: Programmable Pipelines and Research Challenges. ACM SIGGRAPH Asia 2008 Course Notes","author":"Patney Anjul","year":"2008","unstructured":"Anjul Patney and John D Owens . 2008 . Real-time Reyes: Programmable Pipelines and Research Challenges. ACM SIGGRAPH Asia 2008 Course Notes (2008). Anjul Patney and John D Owens. 2008. Real-time Reyes: Programmable Pipelines and Research Challenges. ACM SIGGRAPH Asia 2008 Course Notes (2008)."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766973"},{"key":"e_1_3_2_1_36_1","unstructured":"Pixar. 2016. Pixar's RenderMan. https:\/\/renderman.pixar.com\/view\/renderman. (2016).  Pixar. 2016. Pixar's RenderMan. https:\/\/renderman.pixar.com\/view\/renderman. (2016)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/566654.566640"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2499370.2462176"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.16"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540718"},{"key":"e_1_3_2_1_41_1","volume-title":"Automation and Systems (ICCAS), 2014 14th International Conference on. IEEE, 1046--1051","author":"Shirai Keigo","year":"2014","unstructured":"Keigo Shirai , Hirokazu Madokoro , Satoshi Takahashi , and Kazuhito Sato . 2014 . Parallel Implementation of Saliency Maps for Real-time Robot Vision. In Control , Automation and Systems (ICCAS), 2014 14th International Conference on. IEEE, 1046--1051 . Keigo Shirai, Hirokazu Madokoro, Satoshi Takahashi, and Kazuhito Sato. 2014. Parallel Implementation of Saliency Maps for Real-time Robot Vision. In Control, Automation and Systems (ICCAS), 2014 14th International Conference on. IEEE, 1046--1051."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTARS.2011.2159962"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366180"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661229.2661250"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1477926.1477930"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.5555\/2537857.2537861"},{"key":"e_1_3_2_1_47_1","volume-title":"Proceedings of the Conference on High Performance Graphics. Eurographics Association, 29--37","author":"Tzeng Stanley","year":"2010","unstructured":"Stanley Tzeng , Anjul Patney , and John D Owens . 2010 . Task Management for Irregular-parallel Workloads on the GPU . In Proceedings of the Conference on High Performance Graphics. Eurographics Association, 29--37 . Stanley Tzeng, Anjul Patney, and John D Owens. 2010. Task Management for Irregular-parallel Workloads on the GPU. In Proceedings of the Conference on High Performance Graphics. Eurographics Association, 29--37."},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2001.990517"},{"key":"e_1_3_2_1_49_1","volume-title":"Workload Characterization (IISWC), 2014 IEEE International Symposium on. IEEE, 51--60","author":"Yalamanchili Jin","year":"2014","unstructured":"Wang, Jin and Yalamanchili , Sudhakar. 2014 . Characterization and Analysis of Dynamic Parallelism in Unstructured GPU Applications . In Workload Characterization (IISWC), 2014 IEEE International Symposium on. IEEE, 51--60 . Wang, Jin and Yalamanchili, Sudhakar. 2014. Characterization and Analysis of Dynamic Parallelism in Unstructured GPU Applications. In Workload Characterization (IISWC), 2014 IEEE International Symposium on. IEEE, 51--60."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751213"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2016.2586074"}],"event":{"name":"MICRO-50: The 50th Annual IEEE\/ACM International Symposium on Microarchitecture","sponsor":["SIGMICRO ACM Special Interest Group on Microarchitectural Research and Processing","IEEE-CS\\DATC IEEE Computer Society"],"location":"Cambridge Massachusetts","acronym":"MICRO-50"},"container-title":["Proceedings of the 50th Annual IEEE\/ACM International Symposium on Microarchitecture"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3123939.3123978","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3123939.3123978","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3123939.3123978","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:31Z","timestamp":1750217431000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3123939.3123978"}},"subtitle":["a versatile programming framework for pipelined computing on GPU"],"short-title":[],"issued":{"date-parts":[[2017,10,14]]},"references-count":51,"alternative-id":["10.1145\/3123939.3123978","10.1145\/3123939"],"URL":"https:\/\/doi.org\/10.1145\/3123939.3123978","relation":{},"subject":[],"published":{"date-parts":[[2017,10,14]]},"assertion":[{"value":"2017-10-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}