{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:26:53Z","timestamp":1750307213425,"version":"3.41.0"},"reference-count":18,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2011,12,19]],"date-time":"2011-12-19T00:00:00Z","timestamp":1324252800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGARCH Comput. Archit. News"],"published-print":{"date-parts":[[2011,12,19]]},"abstract":"<jats:p>With advances in manycore and accelerator architectures, the high performance and embedded spaces are rapidly converging. Emerging architectures feature different forms of parallelism. The Polyhedral Processes Networks (PPNs) are a proven model of choice for automated generation of pipeline and task parallel programs from sequential source code, however data parallelism is not addressed. In this paper, we present a systematic approach for identification and extraction of fine-grain data parallelism from the PPN specification. The approach is implemented in a tool, called kpn2gpu, which produces fine-grain data parallel CUDA kernels for graphics processing units (GPUs). 
First experiments indicate that generated applications have a potential to exploit different forms of parallelism provided by the architecture and that kernels feature a highly regular structure that allows subsequent optimizations.<\/jats:p>","DOI":"10.1145\/2082156.2082173","type":"journal-article","created":{"date-parts":[[2011,12,27]],"date-time":"2011-12-27T15:22:22Z","timestamp":1324999342000},"page":"66-71","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["KPN2GPU"],"prefix":"10.1145","volume":"39","author":[{"given":"Ana","family":"Balevic","sequence":"first","affiliation":[{"name":"University of Leiden, Leiden, The Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bart","family":"Kienhuis","sequence":"additional","affiliation":[{"name":"University of Leiden, Leiden, The Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2011,12,19]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Parallelization using polyhedral analysis","author":"Associated Compiler ACE","year":"2008","unstructured":"ACE Associated Compiler Experts bv. Parallelization using polyhedral analysis . 2008 . ACE Associated Compiler Experts bv. Parallelization using polyhedral analysis. 2008."},{"volume-title":"Proc of CPC'10","author":"Baghdadi S.","key":"e_1_2_1_2_1","unstructured":"S. Baghdadi , A. Gr\u00f6linger , and A. Cohen . Putting automatic polyhedral compilation for GPGPU to work . Proc of CPC'10 . S. Baghdadi, A. Gr\u00f6linger, and A. Cohen. Putting automatic polyhedral compilation for GPGPU to work. 
Proc of CPC'10."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1988932.1988939"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-11970-5_14"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375595"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/556139"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/49418"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01407835"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10766-006-0011-4"},{"key":"e_1_2_1_10_1","first-page":"993","volume-title":"Proceedings of IFIP Congress 77","author":"Kahn G.","year":"1977","unstructured":"G. Kahn and D. MacQueen . Coroutines and Networks of Parallel Processes . In Proceedings of IFIP Congress 77 , pages 993 -- 998 , 1977 . G. Kahn and D. MacQueen. Coroutines and Networks of Parallel Processes. In Proceedings of IFIP Congress 77, pages 993--998, 1977."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/334012.334015"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.381846"},{"key":"e_1_2_1_13_1","first-page":"398","volume-title":"LECTURE NOTES IN COMPUTER SCIENCE","author":"Lengauer C.","year":"1993","unstructured":"C. Lengauer . Loop parallelization in the polytope model . LECTURE NOTES IN COMPUTER SCIENCE , pages 398 -- 398 , 1993 . C. Lengauer. Loop parallelization in the polytope model. LECTURE NOTES IN COMPUTER SCIENCE, pages 398--398, 1993."},{"volume-title":"Proc. ESTIMedia'10","author":"Meijer S.","key":"e_1_2_1_14_1","unstructured":"S. Meijer , H. Nikolov , and T. Stefanov . Combining process splitting and merging transformations for polyhedral process networks . Proc. ESTIMedia'10 . S. Meijer, H. Nikolov, and T. Stefanov. Combining process splitting and merging transformations for polyhedral process networks. Proc. 
ESTIMedia'10."},{"key":"e_1_2_1_15_1","volume-title":"Sept.","author":"NVIDIA Corp.","year":"2010","unstructured":"NVIDIA Corp. NVIDIA CUDA Technical Documentation: Programming and Best Practices Guide V3.2. Technical report , Sept. 2010 . NVIDIA Corp. NVIDIA CUDA Technical Documentation: Programming and Best Practices Guide V3.2. Technical report, Sept. 2010."},{"key":"e_1_2_1_16_1","volume-title":"Proc. of DATE'04","volume":"1","author":"Stefanov T.","year":"2004","unstructured":"T. Stefanov System design using Kahn process networks: the Compaan\/Laura approach . In Proc. of DATE'04 , volume 1 , 2004 . T. Stefanov et al. System design using Kahn process networks: the Compaan\/Laura approach. In Proc. of DATE'04, volume 1, 2004."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4419-6345-1_33"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1809028.1806606"}],"container-title":["ACM SIGARCH Computer Architecture News"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2082156.2082173","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2082156.2082173","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:06:42Z","timestamp":1750241202000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2082156.2082173"}},"subtitle":["an approach for discovery and exploitation of fine-grain data parallelism in process 
networks"],"short-title":[],"issued":{"date-parts":[[2011,12,19]]},"references-count":18,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2011,12,19]]}},"alternative-id":["10.1145\/2082156.2082173"],"URL":"https:\/\/doi.org\/10.1145\/2082156.2082173","relation":{},"ISSN":["0163-5964"],"issn-type":[{"type":"print","value":"0163-5964"}],"subject":[],"published":{"date-parts":[[2011,12,19]]},"assertion":[{"value":"2011-12-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}