{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:37:11Z","timestamp":1750307831040,"version":"3.41.0"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2008,7,1]],"date-time":"2008-07-01T00:00:00Z","timestamp":1214870400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Program. Lang. Syst."],"published-print":{"date-parts":[[2008,7]]},"abstract":"<jats:p>\n            This article investigates register allocation for software pipelined multidimensional loops where the execution of successive iterations from an\n            <jats:italic>n<\/jats:italic>\n            -dimensional loop is overlapped. For single loop software pipelining, the lifetimes of a loop variable in successive iterations of the loop form a repetitive pattern. An effective register allocation method is to represent the pattern as a vector of lifetimes (or a vector lifetime using Rau's terminology [Rau 1992]) and map it to rotating registers. Unfortunately, the software pipelined schedule of a multidimensional loop is considerably more complex and so are the vector lifetimes in it.\n          <\/jats:p>\n          <jats:p>In this article, we develop a way to normalize and represent the vector lifetimes, which captures their complexity, while exposing their regularity that enables a simple solution. The problem is formulated as bin-packing of the multidimensional vector lifetimes on the surface of a space-time cylinder. A metric, called distance, is calculated either conservatively or aggressively to guide the bin-packing process, so that there is no overlapping between any two vector lifetimes, and the register requirement is minimized. This approach subsumes the classical register allocation for software pipelined single loops as a special case. The method has been implemented in the ORC compiler and produced code for the IA-64 architecture. Experimental results show the effectiveness. Several strategies for register allocation are compared and analyzed.<\/jats:p>","DOI":"10.1145\/1377492.1377498","type":"journal-article","created":{"date-parts":[[2008,8,5]],"date-time":"2008-08-05T13:35:10Z","timestamp":1217943310000},"page":"1-68","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Register allocation for software pipelined multidimensional loops"],"prefix":"10.1145","volume":"30","author":[{"given":"Hongbo","family":"Rong","sequence":"first","affiliation":[{"name":"Microsoft Corporation, Redmond, WA"}]},{"given":"Alban","family":"Douillet","sequence":"additional","affiliation":[{"name":"Hewlett-Packard Co., Palo Alto, CA"}]},{"given":"Guang R.","family":"Gao","sequence":"additional","affiliation":[{"name":"University of Delaware, Newark, DE"}]}],"member":"320","published-online":{"date-parts":[[2008,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.476167"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/212094.212131"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/567067.567085"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/989393.989400"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/113445.113462"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS'96)","volume":"1","author":"Carr S.","unstructured":"Carr , S. , Ding , C. , and Sweany , P . 1996. Improving software pipelining with unroll-and-jam . In Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS'96) , Software Technology and Architecture , vol. 1 . IEEE Computer Society, 183.]] Carr, S., Ding, C., and Sweany, P. 1996. Improving software pipelining with unroll-and-jam. In Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS'96), Software Technology and Architecture, vol. 1. IEEE Computer Society, 183.]]"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/989393.989403"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/315773.315776"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/115372.115320"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.298207"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01205184"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-69330-7_2"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/1299042.1299109"},{"volume-title":"Selected Papers of the Second Workshop on Languages and Compilers for Parallel Computing. Pitman Publishing","author":"Ebcioglu K.","key":"e_1_2_1_14_1","unstructured":"Ebcioglu , K. and Nakatani , T . 1990. A new compilation technique for parallelizing loops with unpredictable branches on a vliw architecture . In Selected Papers of the Second Workshop on Languages and Compilers for Parallel Computing. Pitman Publishing , London, UK, 213--229.]] Ebcioglu, K. and Nakatani, T. 1990. A new compilation technique for parallelizing loops with unpredictable branches on a vliw architecture. In Selected Papers of the Second Workshop on Languages and Compilers for Parallel Computing. Pitman Publishing, London, UK, 213--229.]]"},{"key":"e_1_2_1_15_1","unstructured":"Gao G. R. Ning Q. and Dongen V. V. 1993. Software pipelining for nested loops. ACAPS Tech Memo 53 School of Computer Science McGill Univ. Montr\u00e9al Qu\u00e9bec.]]  Gao G. R. Ning Q. and Dongen V. V. 1993. Software pipelining for nested loops. ACAPS Tech Memo 53 School of Computer Science McGill Univ. Montr\u00e9al Qu\u00e9bec.]]"},{"volume-title":"Proceedings of the 4th International Conference on Compiler Construction (CC '92)","author":"Hendren L. J.","key":"e_1_2_1_16_1","unstructured":"Hendren , L. J. , Gao , G. R. , Altman , E. R. , and Mukerji , C . 1992. A register allocation framework based on hierarchical cyclic interval graphs . In Proceedings of the 4th International Conference on Compiler Construction (CC '92) . Springer-Verlag, 176--191.]] Hendren, L. J., Gao, G. R., Altman, E. R., and Mukerji, C. 1992. A register allocation framework based on hierarchical cyclic interval graphs. In Proceedings of the 4th International Conference on Compiler Construction (CC '92). Springer-Verlag, 176--191.]]"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/155090.155115"},{"volume-title":"Intel IA-64 Architecture Software Developer's Manual","author":"Intel","key":"e_1_2_1_18_1","unstructured":"Intel . 2001. Intel IA-64 Architecture Software Developer's Manual . Vol. 1: IA-64 Application Architecture. Intel Corporation , Santa Clara, CA.]] Intel. 2001. Intel IA-64 Architecture Software Developer's Manual. Vol. 1: IA-64 Application Architecture. Intel Corporation, Santa Clara, CA.]]"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/53990.54022"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/360827.360844"},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Lawler E. L. Lenstra J. K. Khan A. H. G. R. and Shmoys D. B. 1985. The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization. John Wiley & Sons.]]  Lawler E. L. Lenstra J. K. Khan A. H. G. R. and Shmoys D. B. 1985. The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization. John Wiley & Sons.]]","DOI":"10.2307\/2582681"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/267959.269966"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/647477.727775"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/645604.662584"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/192724.192731"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01205181"},{"key":"e_1_2_1_27_1","unstructured":"Rau B. R. Lee M. Tirumalai P. P. and Schlansker M. S. 1992. Register allocation for modulo scheduled loops: Strategies algorithms and heuristics. HP Labs Tech. rep. HPL-92-48 Hewlett-Packard Laboratories Palo Alto CA.]]  Rau B. R. Lee M. Tirumalai P. P. and Schlansker M. S. 1992. Register allocation for modulo scheduled loops: Strategies algorithms and heuristics. HP Labs Tech. rep. HPL-92-48 Hewlett-Packard Laboratories Palo Alto CA.]]"},{"volume-title":"Proceedings of the International Symposium on Code Generation and Optimization (CGO'04)","author":"Rong H.","key":"e_1_2_1_28_1","unstructured":"Rong , H. , Douillet , A. , Govindarajan , R. , and Gao , G. R . 2004. Code generation for single-dimension software pipelining of multi-dimensional loops . In Proceedings of the International Symposium on Code Generation and Optimization (CGO'04) . IEEE Computer Society, 175--186.]] Rong, H., Douillet, A., Govindarajan, R., and Gao, G. R. 2004. Code generation for single-dimension software pipelining of multi-dimensional loops. In Proceedings of the International Symposium on Code Generation and Optimization (CGO'04). IEEE Computer Society, 175--186.]]"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Rong H. and Govindarajan R. 2007. Advances in software piplining. In The Compiler Design Handbook: Optimization and Machine Code Generation 2nd Ed. Y. N. Srikant and P. Shankar Eds. CRC Chapter 20.]]  Rong H. and Govindarajan R. 2007. Advances in software piplining. In The Compiler Design Handbook: Optimization and Machine Code Generation 2nd Ed. Y. N. Srikant and P. Shankar Eds. CRC Chapter 20.]]","DOI":"10.1201\/9781420043839.ch20"},{"key":"e_1_2_1_30_1","unstructured":"Rong H. Tang Z. Govindarajan R. Douillet A. and Gao G. R. 2007a. Single-dimension software pipelining for multi-dimensional loops. CAPSL Technical Memo Department of Electrical and Computer Engineering University of Delaware Newark DE. In ftp:\/\/ftp.capsl.udel.edu\/pub\/doc\/memos\/memo049.ps.gz.]]  Rong H. Tang Z. Govindarajan R. Douillet A. and Gao G. R. 2007a. Single-dimension software pipelining for multi-dimensional loops. CAPSL Technical Memo Department of Electrical and Computer Engineering University of Delaware Newark DE. In ftp:\/\/ftp.capsl.udel.edu\/pub\/doc\/memos\/memo049.ps.gz.]]"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1216544.1216550"},{"volume-title":"Proceedings of the International Conference on Field Programmable Logic and Applications (FPL)","author":"Turkington K.","key":"e_1_2_1_32_1","unstructured":"Turkington , K. , Masselos , K. , Constantinides , G. A. , and Leong , P . 2006. FPGA based acceleration of the linpack benchmark: A high level code transformation approach . In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL) . Madrid, Spain. IEEE, 1--6.]] Turkington, K., Masselos, K., Constantinides, G. A., and Leong, P. 2006. FPGA based acceleration of the linpack benchmark: A high level code transformation approach. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL). Madrid, Spain. IEEE, 1--6.]]"},{"volume-title":"Proceedings of the 6th International Conference on Compiler Construction (CC '96)","author":"Wang J.","key":"e_1_2_1_33_1","unstructured":"Wang , J. and Gao , G. R . 1996. Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops . In Proceedings of the 6th International Conference on Compiler Construction (CC '96) . Springer-Verlag, London, UK, 1--17.]] Wang, J. and Gao, G. R. 1996. Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops. In Proceedings of the 6th International Conference on Compiler Construction (CC '96). Springer-Verlag, London, UK, 1--17.]]"}],"container-title":["ACM Transactions on Programming Languages and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1377492.1377498","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1377492.1377498","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:57:55Z","timestamp":1750255075000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1377492.1377498"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7]]},"references-count":33,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2008,7]]}},"alternative-id":["10.1145\/1377492.1377498"],"URL":"https:\/\/doi.org\/10.1145\/1377492.1377498","relation":{},"ISSN":["0164-0925","1558-4593"],"issn-type":[{"type":"print","value":"0164-0925"},{"type":"electronic","value":"1558-4593"}],"subject":[],"published":{"date-parts":[[2008,7]]},"assertion":[{"value":"2006-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2007-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2008-08-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}