{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,4,2]],"date-time":"2023-04-02T16:17:28Z","timestamp":1680452248472},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2007,3]]},"abstract":"<jats:p>\n            Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or from the innermost loop to outer loops. This paper proposes a three-step approach, called\n            <jats:italic>single-dimension software pipelining (SSP)<\/jats:italic>\n            , to software pipeline a loop nest at an arbitrary loop level that has a rectangular iteration space and contains no sibling inner loops in it. The first step identifies the most profitable loop level for software pipelining in terms of initiation rate, data reuse potential, or any other optimization criteria. The second step simplifies the multidimensional data-dependence graph (DDG) of the selected loop level into a one-dimensional DDG and constructs a one-dimensional (1D) schedule. Based on the one-dimensional schedule, the third step derives a simple mapping function that specifies the schedule time for the operation instances in the multidimensional loop. The classical modulo scheduling is subsumed by SSP as a special case. SSP is also closely related to hyperplane scheduling, and, in fact, extends it to be resource constrained. We prove that SSP schedules are correct and at least as efficient as those schedules generated by traditional modulo scheduling methods. We extend SSP to schedule imperfect loop nests, which are most common at the instruction level. Multiple initiation intervals are naturally allowed to improve execution efficiency. 
Feasibility and correctness of our approach are verified by a prototype implementation in the ORC compiler for the IA-64 architecture, tested with loop nests from Livermore and SPEC2000 floating-point benchmarks. Preliminary experimental results reveal that, compared to modulo scheduling, software pipelining at an appropriate loop level results in significant performance improvement. Software pipelining is beneficial even with prior loop transformations.\n          <\/jats:p>","DOI":"10.1145\/1216544.1216550","type":"journal-article","created":{"date-parts":[[2007,4,5]],"date-time":"2007-04-05T19:20:08Z","timestamp":1175800808000},"page":"7","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":20,"title":["Single-dimension software pipelining for multidimensional loops"],"prefix":"10.1145","volume":"4","author":[{"given":"Hongbo","family":"Rong","sequence":"first","affiliation":[{"name":"Microsoft Corporation, Redmond, Washington"}]},{"given":"Zhizhong","family":"Tang","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"R.","family":"Govindarajan","sequence":"additional","affiliation":[{"name":"Indian Institute of Science, Bangalore, India"}]},{"given":"Alban","family":"Douillet","sequence":"additional","affiliation":[{"name":"Hewlett-Packard Company, Palo Alto, California"}]},{"given":"Guang R.","family":"Gao","sequence":"additional","affiliation":[{"name":"University of Delaware, Newark, Delaware"}]}],"member":"320","published-online":{"date-parts":[[2007,3]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Selected Papers of the 2nd Workshop on Languages and Compilers for Parallel Computing. Pitman Publishing","author":"Aiken A.","unstructured":"Aiken , A. and Nicolau , A . 1990. Fine-grain parallelization and the wavefront method . In Selected Papers of the 2nd Workshop on Languages and Compilers for Parallel Computing. Pitman Publishing , London. 1--16. 
Aiken, A. and Nicolau, A. 1990. Fine-grain parallelization and the wavefront method. In Selected Papers of the 2nd Workshop on Languages and Compilers for Parallel Computing. Pitman Publishing, London. 1--16."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/212094.212131"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/567067.567085"},{"key":"e_1_2_1_4_1","volume-title":"Loop Transformations for Restructuring Compilers: The Foundations","author":"Banerjee U. K.","unstructured":"Banerjee , U. K. 1993. Loop Transformations for Restructuring Compilers: The Foundations . Kluwer Academic Publ ., Norwell, MA. Banerjee, U. K. 1993. Loop Transformations for Restructuring Compilers: The Foundations. Kluwer Academic Publ., Norwell, MA."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/197320.197366"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/195473.195557"},{"key":"e_1_2_1_7_1","volume-title":"HICSS'96: Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS'96)","volume":"183","author":"Carr S.","unstructured":"Carr , S. , Ding , C. , and Sweany , P . 1996. Improving software pipelining with unroll-and-jam . In HICSS'96: Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS'96) Volume 1: Software Technology and Architecture. IEEE Computer Society, Washington, D.C. 183 . Carr, S., Ding, C., and Sweany, P. 1996. Improving software pipelining with unroll-and-jam. In HICSS'96: Proceedings of the 29th Hawaii International Conference on System Sciences (HICSS'96) Volume 1: Software Technology and Architecture. IEEE Computer Society, Washington, D.C. 183."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.298207"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/504914.504921"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/647429.723579"},{"key":"e_1_2_1_11_1","unstructured":"Gao G. R. 
Ning Q. and Van Dongen V. 1993. Software pipelining for nested loops. ACAPS Tech Memo 53 School of Computer Science McGill Univ. Montr\u00e9al Qu\u00e9bec.  Gao G. R. Ning Q. and Van Dongen V. 1993. Software pipelining for nested loops. ACAPS Tech Memo 53 School of Computer Science McGill Univ. Montr\u00e9al Qu\u00e9bec."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/325478.325479"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.544355"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/155090.155115"},{"key":"e_1_2_1_15_1","volume-title":"Intel IA-64 Architecture Software Developer's Manual","author":"Intel","unstructured":"Intel . 2001. Intel IA-64 Architecture Software Developer's Manual , Vol. 1: IA-64 Application Architecture. Intel Corporation , Santa Clara, CA. Intel. 2001. Intel IA-64 Architecture Software Developer's Manual, Vol. 1: IA-64 Application Architecture. Intel Corporation, Santa Clara, CA."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/143369.143427"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/53990.54022"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/360827.360844"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/267959.269966"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/647477.727775"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.544356"},{"key":"e_1_2_1_22_1","volume-title":"16th Intl. Parallel and Distributed Processing Symposium (IPDPS '02)","author":"Petkov D.","unstructured":"Petkov , D. , Harr , R. , and Amarasinghe , S . 2002. Efficient pipelining of nested loops: unroll-and-squash . In 16th Intl. Parallel and Distributed Processing Symposium (IPDPS '02) . IEEE, Washington, D.C. Petkov, D., Harr, R., and Amarasinghe, S. 2002. Efficient pipelining of nested loops: unroll-and-squash. In 16th Intl. 
Parallel and Distributed Processing Symposium (IPDPS '02). IEEE, Washington, D.C."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/645604.662584"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/192724.192731"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01205181"},{"key":"e_1_2_1_26_1","volume-title":"CGO '04: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, D.C. 175--186","author":"Rong H.","unstructured":"Rong , H. , Douillet , A. , Govindarajan , R. , and Gao , G. R . 2004a. Code generation for single-dimension software pipelining of multi-dimensional loops . In CGO '04: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, D.C. 175--186 . Rong, H., Douillet, A., Govindarajan, R., and Gao, G. R. 2004a. Code generation for single-dimension software pipelining of multi-dimensional loops. In CGO '04: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, D.C. 175--186."},{"key":"e_1_2_1_27_1","volume-title":"CGO '04: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, D.C. 163--174","author":"Rong H.","unstructured":"Rong , H. , Tang , Z. , Govindarajan , R. , Douillet , A. , and Gao , G. R . 2004b. Single-dimension software pipelining for multi-dimensional loops . In CGO '04: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, D.C. 163--174 . Rong, H., Tang, Z., Govindarajan, R., Douillet, A., and Gao, G. R. 2004b. Single-dimension software pipelining for multi-dimensional loops. In CGO '04: Proceedings of the International Symposium on Code Generation and Optimization. IEEE Computer Society, Washington, D.C. 
163--174."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065030"},{"key":"e_1_2_1_29_1","unstructured":"Rong H. Tang Z. Govindarajan R. Douillet A. and Gao G. R. 2007. Single-dimension software pipelining for multi-dimensional loops. CAPSL technical memo Department of Electrical and Computer Engineering University of Delaware Newark Delaware. January. In ftp:\/\/ftp.capsl.udel.edu\/pub\/doc\/memos\/memo049.ps.gz.  Rong H. Tang Z. Govindarajan R. Douillet A. and Gao G. R. 2007. Single-dimension software pipelining for multi-dimensional loops. CAPSL technical memo Department of Electrical and Computer Engineering University of Delaware Newark Delaware. January. In ftp:\/\/ftp.capsl.udel.edu\/pub\/doc\/memos\/memo049.ps.gz."},{"key":"e_1_2_1_30_1","volume-title":"CC '96: Proceedings of the 6th International Conference on Compiler Construction. Springer-Verlag","author":"Wang J.","unstructured":"Wang , J. and Gao , G. R . 1996. Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops . In CC '96: Proceedings of the 6th International Conference on Compiler Construction. Springer-Verlag , New York. 1--17. Wang, J. and Gao, G. R. 1996. Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops. In CC '96: Proceedings of the 6th International Conference on Compiler Construction. Springer-Verlag, New York. 
1--17."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/113445.113449"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/243846.243895"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1216544.1216550","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T20:52:32Z","timestamp":1672260752000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1216544.1216550"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,3]]},"references-count":32,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2007,3]]}},"alternative-id":["10.1145\/1216544.1216550"],"URL":"https:\/\/doi.org\/10.1145\/1216544.1216550","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,3]]},"assertion":[{"value":"2007-03-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}