{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:26:35Z","timestamp":1750307195701,"version":"3.41.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2012,3,1]],"date-time":"2012-03-01T00:00:00Z","timestamp":1330560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Funds for Creative Research Groups of China","award":["60921062"],"award-info":[{"award-number":["60921062"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61003081"],"award-info":[{"award-number":["61003081"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"publisher","award":["DP0881330DP110104628"],"award-info":[{"award-number":["DP0881330DP110104628"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2012,3]]},"abstract":"<jats:p>The stream processors represent a promising alternative to traditional cache-based general-purpose processors in achieving high performance in stream applications (media and some scientific applications). In a stream programming model for stream processors, an application is decomposed into a sequence of kernels operating on streams of data. During the execution of a kernel on a stream processor, all streams accessed must be communicated through a nonbypassing software-managed on-chip memory, the SRF (Stream Register File). Optimizing utilization of the scarce on-chip memory is crucial for good performance. The key insight is that the interference graphs (IGs) formed by the streams in stream applications tend to be comparability graphs or decomposable into a set of comparability graphs. We present a compiler algorithm for finding optimal or near-optimal colorings, that is, SRF allocations in stream IGs, by computing a maximum spanning forest of the sub-IG formed by long live ranges, if necessary. Our experimental results validate the optimality and near-optimality of our algorithm by comparing it with an ILP solver, and show that our algorithm yields improved SRF utilization over the First-Fit bin-packing algorithm, the best in the literature.<\/jats:p>","DOI":"10.1145\/2133382.2133387","type":"journal-article","created":{"date-parts":[[2012,4,3]],"date-time":"2012-04-03T14:56:22Z","timestamp":1333464982000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors"],"prefix":"10.1145","volume":"9","author":[{"given":"Xuejun","family":"Yang","sequence":"first","affiliation":[{"name":"National University of Defense Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li","family":"Wang","sequence":"additional","affiliation":[{"name":"National University of Defense Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jingling","family":"Xue","sequence":"additional","affiliation":[{"name":"University of New South Wales"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qingbo","family":"Wu","sequence":"additional","affiliation":[{"name":"National University of Defense Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2012,3]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/774789.774805"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0020-0190(99)00120-9"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/177492.177575"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/780542.780624"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015706.1015800"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/800230.806984"},{"volume-title":"Proceedings of the Workshop on Modeling, Benchmarking, and Simulation (MoBS\u201905)","author":"Cuvillo J.","key":"e_1_2_1_7_1","unstructured":"Cuvillo , J. , Zhu , W. , Ziang , H. , and Gao , G . 2005. Fast: A functionally accurate simulation toolset for the cyclops64 cellular architecture . In Proceedings of the Workshop on Modeling, Benchmarking, and Simulation (MoBS\u201905) . ACM Press, 11--20. Cuvillo, J., Zhu, W., Ziang, H., and Gao, G. 2005. Fast: A functionally accurate simulation toolset for the cyclops64 cellular architecture. In Proceedings of the Workshop on Modeling, Benchmarking, and Simulation (MoBS\u201905). ACM Press, 11--20."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1048935.1050187"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1152154.1152164"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/872732.806957"},{"key":"e_1_2_1_11_1","unstructured":"Garey M. R. and Johnson D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co. New York NY. Garey M. R. and Johnson D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co. New York NY."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/229542.229546"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/647906.739657"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201999)","author":"Gergov J.","year":"1999","unstructured":"Gergov , J. 1999 . Algorithms for compile-time memory optimization . In Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201999) . SIAM, Philadelphia, PA, 907--908. Gergov, J. 1999. Algorithms for compile-time memory optimization. In Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA\u201999). SIAM, Philadelphia, PA, 907--908."},{"volume-title":"Vol 57)","author":"Golumbic M. C.","key":"e_1_2_1_15_1","unstructured":"Golumbic , M. C. 2004. Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics , Vol 57) . North-Holland Publishing Co. , Amsterdam, Netherlands . Golumbic, M. C. 2004. Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57). North-Holland Publishing Co., Amsterdam, Netherlands."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2005.32"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1346281.1346319"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/368996.369025"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1137\/0401048"},{"key":"e_1_2_1_20_1","first-page":"2","article-title":"A polynomial time approximation algorithm for dynamic storage allocation","volume":"81","author":"Kierstead H. A.","year":"1991","unstructured":"Kierstead , H. A. 1991 . A polynomial time approximation algorithm for dynamic storage allocation . Discrete Math. 81 , 2 -- 3 , 231--237. Kierstead, H. A. 1991. A polynomial time approximation algorithm for dynamic storage allocation. Discrete Math. 81, 2--3, 231--237.","journal-title":"Discrete Math."},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the International Conference on High Performance Computing.","author":"Koch K.","year":"2006","unstructured":"Koch , K. 2006 . The new roadrunner supercomputer: What, when, and how . In Proceedings of the International Conference on High Performance Computing. Koch, K. 2006. The new roadrunner supercomputer: What, when, and how. In Proceedings of the International Conference on High Performance Computing."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375581.1375596"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/1025127.1026015"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(98)00029-5"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250707"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2005.27"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1254766.1254805"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1582710.1582711"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1137\/0607036"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362647"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/989995.989999"},{"volume-title":"Proceedings of the IEEE International Conference on Computer Design. 295--302","author":"Owens J. D.","key":"e_1_2_1_32_1","unstructured":"Owens , J. D. , Kapasi , U. J. , Mattson , P. , Towles , B. , Serebrin , B. , Rixner , S. , and Dally , W. J . 2002. Media processing applications on the imagine stream processor . In Proceedings of the IEEE International Conference on Computer Design. 295--302 . Owens, J. D., Kapasi, U. J., Mattson, P., Towles, B., Serebrin, B., Rixner, S., and Dally, W. J. 2002. Media processing applications on the imagine stream processor. In Proceedings of the IEEE International Conference on Computer Design. 295--302."},{"volume-title":"Proceedings of the 31st Annual International Symposium on Microarchitecture. 3--13","author":"Rixner S.","key":"e_1_2_1_33_1","unstructured":"Rixner , S. , Dally , W. J. , Kapasi , U. J. , Khailany , B. , Lopez-Lagunas , A. , Mattson , P. R. , and Owens , J. D . 1998. A bandwidth-efficient architecture for media processing . In Proceedings of the 31st Annual International Symposium on Microarchitecture. 3--13 . Rixner, S., Dally, W. J., Kapasi, U. J., Khailany, B., Lopez-Lagunas, A., Mattson, P. R., and Owens, J. D. 1998. A bandwidth-efficient architecture for media processing. In Proceedings of the 31st Annual International Symposium on Microarchitecture. 3--13."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/996841.996875"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1006\/jagm.1997.0891"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/0012-365X(73)90108-8"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2002.997877"},{"key":"e_1_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Thies W. Karczmarek M. Gordon M. Maze D. Wong J. Ho H. Brown M. and Amarasinghe S. 2001. StreamIt: A compiler for streaming applications MIT-LCS Tech. memo TM-622. Thies W. Karczmarek M. Gordon M. Maze D. Wong J. Ho H. Brown M. and Amarasinghe S. 2001. StreamIt: A compiler for streaming applications MIT-LCS Tech. memo TM-622.","DOI":"10.1007\/3-540-45937-5_14"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1054010"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375657.1375679"},{"volume-title":"Introduction to Graph Theory","author":"West D. B.","key":"e_1_2_1_41_1","unstructured":"West , D. B. 1996. Introduction to Graph Theory . Prentice Hall . West, D. B. 1996. Introduction to Graph Theory. Prentice Hall."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1128022.1128027"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/11859802_56"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250689"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454121"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1504176.1504195"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2133382.2133387","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2133382.2133387","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T10:06:05Z","timestamp":1750241165000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2133382.2133387"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,3]]},"references-count":46,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,3]]}},"alternative-id":["10.1145\/2133382.2133387"],"URL":"https:\/\/doi.org\/10.1145\/2133382.2133387","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2012,3]]},"assertion":[{"value":"2010-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2011-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-03-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}