{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:33:44Z","timestamp":1772724824241,"version":"3.50.1"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2009,9,1]],"date-time":"2009-09-01T00:00:00Z","timestamp":1251763200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia e Innovaci\u00f3n","doi-asserted-by":"publisher","award":["TIN2007-61763"],"award-info":[{"award-number":["TIN2007-61763"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002809","name":"Generalitat de Catalunya","doi-asserted-by":"publisher","award":["2009 SGR 1250"],"award-info":[{"award-number":["2009 SGR 1250"]}],"id":[{"id":"10.13039\/501100002809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2009,9]]},"abstract":"<jats:p>Register pressure in modern superscalar processors can be reduced by releasing registers early and by copying their contents to cheap back-up storage. This article quantifies the potential benefits of register occupancy reduction and shows that existing hardware-based schemes typically achieve only a small fraction of this potential. This is because they are unable to accurately determine the last use of a register and must wait until the redefining instruction enters the pipeline. On the other hand, compilers have a global view of the program and, using simple dataflow analysis, can determine the last use. 
This article evaluates the extent to which compiler analysis can aid early releasing, explores the design space, and introduces commit and issue-based early releasing schemes, quantifying their benefits. Using simple compiler analysis and microarchitecture changes, we achieve 70% of the potential register file occupancy reduction. By adding more hardware support, we can increase this to 94%. Our schemes are compared to state-of-the-art approaches for varying register file sizes and are shown to outperform these existing techniques.<\/jats:p>","DOI":"10.1145\/1582710.1582714","type":"journal-article","created":{"date-parts":[[2009,10,6]],"date-time":"2009-10-06T18:18:59Z","timestamp":1254853139000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Exploring the limits of early register release"],"prefix":"10.1145","volume":"6","author":[{"given":"Timothy M.","family":"Jones","sequence":"first","affiliation":[{"name":"University of Edinburgh, Edinburgh, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael F. 
P.","family":"O'Boyle","sequence":"additional","affiliation":[{"name":"University of Edinburgh, Edinburgh, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jaume","family":"Abella","sequence":"additional","affiliation":[{"name":"Intel Labs Barcelona\u2014UPC"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Antonio","family":"Gonz\u00e1lez","sequence":"additional","affiliation":[{"name":"Intel Labs Barcelona\u2014UPC"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"O\u011fuz","family":"Ergin","sequence":"additional","affiliation":[{"name":"TOBB University of Economics and Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2009,10,2]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 21st International Conference on Computer Design (ICCD-21)","author":"Abella J.","unstructured":"Abella , J. and Gonz\u00e1lez , A . 2003. On reducing register pressure and energy in multiple-banked register files . In Proceedings of the 21st International Conference on Computer Design (ICCD-21) . IEEE, Los Alamitos, CA. Abella, J. and Gonz\u00e1lez, A. 2003. On reducing register pressure and energy in multiple-banked register files. In Proceedings of the 21st International Conference on Computer Design (ICCD-21). IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_2_1","volume-title":"Modern Compiler Implementation in Java","author":"Appel A. W.","unstructured":"Appel , A. W. 2002. Modern Compiler Implementation in Java . Cambridge University Press , Cambridge, UK . Appel, A. W. 2002. Modern Compiler Implementation in Java. Cambridge University Press, Cambridge, UK."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 34th International Symposium on Microarchitecture (MICRO-34)","author":"Balasubramonian R.","unstructured":"Balasubramonian , R. , Dwarkadas , S. , and Albonesi , D. H . 2001. 
Reducing the complexity of the register file in dynamic superscalar processors . In Proceedings of the 34th International Symposium on Microarchitecture (MICRO-34) . ACM, New York. Balasubramonian, R., Dwarkadas, S., and Albonesi, D. H. 2001. Reducing the complexity of the register file in dynamic superscalar processors. In Proceedings of the 34th International Symposium on Microarchitecture (MICRO-34). ACM, New York."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 8th International Symposium on High-Performance Computer Architecture (HPCA-8). IEEE","author":"Borch E.","unstructured":"Borch , E. , Manne , S. , Emer , J. , and Tune , E . 2002. Loose loops sink chips . In Proceedings of the 8th International Symposium on High-Performance Computer Architecture (HPCA-8). IEEE , Los Alamitos, CA. Borch, E., Manne, S., Emer, J., and Tune, E. 2002. Loose loops sink chips. In Proceedings of the 8th International Symposium on High-Performance Computer Architecture (HPCA-8). IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339657"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Burger D. and Austin T. 1997. The SimpleScalar tool set version 2.0. Tech. rep. TR1342 University of Wisconsin-Madison.  Burger D. and Austin T. 1997. The SimpleScalar tool set version 2.0. Tech. rep. TR1342 University of Wisconsin-Madison.","DOI":"10.1145\/268806.268810"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 31st International Symposium on Computer Architecture (ISCA-31)","author":"Butts J. A.","unstructured":"Butts , J. A. and Sohi , G. S . 2004. Use-based register caching with decoupled indexing . In Proceedings of the 31st International Symposium on Computer Architecture (ISCA-31) . ACM, New York. Butts, J. A. and Sohi, G. S. 2004. Use-based register caching with decoupled indexing. In Proceedings of the 31st International Symposium on Computer Architecture (ISCA-31). 
ACM, New York."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/377792.377854"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339708"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques (PACT'01)","author":"Emer J.","year":"2001","unstructured":"Emer , J. 2001 . Ev8: The post-ultimate alpha . In Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques (PACT'01) . (Keynote.) ACM, New York. Emer, J. 2001. Ev8: The post-ultimate alpha. In Proceedings of the 10th International Conference on Parallel Architectures and Compilation Techniques (PACT'01). (Keynote.) ACM, New York."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2004.29"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 22nd International Conference on Computer Design (ICCD-22)","author":"Ergin O.","unstructured":"Ergin , O. , Balkan , D. , Ponomarev , D. , and Ghose , K . 2004. Increasing processor performance through early register release . In Proceedings of the 22nd International Conference on Computer Design (ICCD-22) . IEEE, Los Alamitos, CA. Ergin, O., Balkan, D., Ponomarev, D., and Ghose, K. 2004. Increasing processor performance through early register release. In Proceedings of the 22nd International Conference on Computer Design (ICCD-22). IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the 25th International Symposium on Microarchitecture (MICRO-25)","author":"Franklin M.","unstructured":"Franklin , M. and Sohi , G. S . 1992. Register traffic analysis for streamlining inter-operation communication in fine-grain parallel processors . In Proceedings of the 25th International Symposium on Microarchitecture (MICRO-25) . ACM, New York. Franklin, M. and Sohi, G. S. 1992. 
Register traffic analysis for streamlining inter-operation communication in fine-grain parallel processors. In Proceedings of the 25th International Symposium on Microarchitecture (MICRO-25). ACM, New York."},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 4th International Symposium on High Performance Computer Architecture (HPCA-4). IEEE","author":"Gonz\u00e1lez A.","unstructured":"Gonz\u00e1lez , A. , Gonz\u00e1lez , J. , and Valero , M . 1998. Virtual-physical registers . In Proceedings of the 4th International Symposium on High Performance Computer Architecture (HPCA-4). IEEE , Los Alamitos, CA. Gonz\u00e1lez, A., Gonz\u00e1lez, J., and Valero, M. 1998. Virtual-physical registers. In Proceedings of the 4th International Symposium on High Performance Computer Architecture (HPCA-4). IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_16_1","unstructured":"Gunther S. H. Binns F. Carmean D. M. and Hall J. C. 2001. Managing the impact of increasing microprocessor power consumption. Intel Tech. J. Q1.  Gunther S. H. Binns F. Carmean D. M. and Hall J. C. 2001. Managing the impact of increasing microprocessor power consumption. Intel Tech. J. Q1."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the Workshop on Complexity Effective Design (WCED) in Conjunction with the 27th International Symposium on Computer Architecture (ISCA-27)","author":"Hu Z.","unstructured":"Hu , Z. and Martonosi , M . 2000. Reducing register file power consumption by exploiting value lifetime . In Proceedings of the Workshop on Complexity Effective Design (WCED) in Conjunction with the 27th International Symposium on Computer Architecture (ISCA-27) . ACM, New York. Hu, Z. and Martonosi, M. 2000. Reducing register file power consumption by exploiting value lifetime. In Proceedings of the Workshop on Complexity Effective Design (WCED) in Conjunction with the 27th International Symposium on Computer Architecture (ISCA-27). 
ACM, New York."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2005.32"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/PACT.2005.14"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013254"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/871506.871602"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 31st International Symposium on Computer Architecture (ISCA-31)","author":"Lipasti M. H.","unstructured":"Lipasti , M. H. , Mestan , B. R. , and Gunadi , E . 2004. Physical register inlining . In Proceedings of the 31st International Symposium on Computer Architecture (ISCA-31) . ACM, New York. Lipasti, M. H., Mestan, B. R., and Gunadi, E. 2004. Physical register inlining. In Proceedings of the 31st International Symposium on Computer Architecture (ISCA-31). ACM, New York."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.798316"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 30th International Symposium on Microarchitecture (MICRO-30)","author":"Martin M. M.","unstructured":"Martin , M. M. , Roth , A. , and Fischer , C. N . 1997. Exploiting dead value information . In Proceedings of the 30th International Symposium on Microarchitecture (MICRO-30) . ACM, New York. Martin, M. M., Roth, A., and Fischer, C. N. 1997. Exploiting dead value information. In Proceedings of the 30th International Symposium on Microarchitecture (MICRO-30). ACM, New York."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 35th International Symposium on Microarchitecture (MICRO-35)","author":"Martinez J. F.","unstructured":"Martinez , J. F. , Renau , J. , Huang , M. C. , Prvulovic , M. , and Torrellas , J . 2002. Cherry: Check-pointed early resource recycling in out-of-order microprocessors . In Proceedings of the 35th International Symposium on Microarchitecture (MICRO-35) . ACM, New York. Martinez, J. F., Renau, J., Huang, M. 
C., Prvulovic, M., and Torrellas, J. 2002. Cherry: Check-pointed early resource recycling in out-of-order microprocessors. In Proceedings of the 35th International Symposium on Microarchitecture (MICRO-35). ACM, New York."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the International Conference on Parallel Processing (ICPP). IEEE","author":"Monreal T.","unstructured":"Monreal , T. , Vi\u00f1als , V. , Gonz\u00e1lez , A. , and Valero , M . 2002. Hardware schemes for early register release . In Proceedings of the International Conference on Parallel Processing (ICPP). IEEE , Los Alamitos, CA. Monreal, T., Vi\u00f1als, V., Gonz\u00e1lez, A., and Valero, M. 2002. Hardware schemes for early register release. In Proceedings of the International Conference on Parallel Processing (ICPP). IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 26th International Symposium on Microarchitecture (MICRO-26)","author":"Moudgill M.","unstructured":"Moudgill , M. , Pingali , K. , and Vassiliadis , S . 1993. Register renaming and dynamic speculation: An alternative approach . In Proceedings of the 26th International Symposium on Microarchitecture (MICRO-26) . ACM, New York. Moudgill, M., Pingali, K., and Vassiliadis, S. 1993. Register renaming and dynamic speculation: An alternative approach. In Proceedings of the 26th International Symposium on Microarchitecture (MICRO-26). ACM, New York."},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the 35th International Symposium on Microarchitecture (MICRO-35)","author":"Park I.","unstructured":"Park , I. , Powell , M. D. , and Vijaykumar , T. N . 2002. Reducing register ports for higher speed and lower energy . In Proceedings of the 35th International Symposium on Microarchitecture (MICRO-35) . ACM, New York. Park, I., Powell, M. D., and Vijaykumar, T. N. 2002. Reducing register ports for higher speed and lower energy. 
In Proceedings of the 35th International Symposium on Microarchitecture (MICRO-35). ACM, New York."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the Workshop on Complexity Effective Design (WCED) in Conjunction with the 27th International Symposium on Computer Architecture (ISCA-27)","author":"Savransky G.","unstructured":"Savransky , G. , Ronen , R. , and Gonz\u00e1lez , A . 2004. Lazy retirement: A power aware register management mechanism . In Proceedings of the Workshop on Complexity Effective Design (WCED) in Conjunction with the 27th International Symposium on Computer Architecture (ISCA-27) . ACM, New York. Savransky, G., Ronen, R., and Gonz\u00e1lez, A. 2004. Lazy retirement: A power aware register management mechanism. In Proceedings of the Workshop on Complexity Effective Design (WCED) in Conjunction with the 27th International Symposium on Computer Architecture (ISCA-27). ACM, New York."},{"key":"e_1_2_1_30_1","unstructured":"Smith M. D. and Holloway G. 2000. The Machine-SUIF documentation set. http:\/\/www.eecs.harvard.edu\/machsuif\/software\/software.html.  Smith M. D. and Holloway G. 2000. The Machine-SUIF documentation set. http:\/\/www.eecs.harvard.edu\/machsuif\/software\/software.html."},{"key":"e_1_2_1_31_1","unstructured":"Tarjan D. Thoziyoor S. and Jouppi N. P. 2006. CACTI 4.0. Tech. rep. HPL-2006-86 HP Laboratories Palo Alto.  Tarjan D. Thoziyoor S. and Jouppi N. P. 2006. CACTI 4.0. Tech. rep. HPL-2006-86 HP Laboratories Palo Alto."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the International Symposium on Performance Analysis of Systems and Software. IEEE","author":"Tran L.","unstructured":"Tran , L. , Nelson , N. , Ngai , F. , Dropsho , S. , and Huang , M . 2004. Dynamically reducing pressure on the physical register file through simple register sharing . In Proceedings of the International Symposium on Performance Analysis of Systems and Software. IEEE , Los Alamitos, CA. 
Tran, L., Nelson, N., Ngai, F., Dropsho, S., and Huang, M. 2004. Dynamically reducing pressure on the physical register file through simple register sharing. In Proceedings of the International Symposium on Performance Analysis of Systems and Software. IEEE, Los Alamitos, CA."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/859618.859627"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 5th International Conference on Parallel Architectures and Compilation Techniques (PACT). ACM","author":"Wallace S.","unstructured":"Wallace , S. and Bagherzadeh , N . 1996. A scalable register file architecture for dynamically scheduled processors . In Proceedings of the 5th International Conference on Parallel Architectures and Compilation Techniques (PACT). ACM , New York. Wallace, S. and Bagherzadeh, N. 1996. A scalable register file architecture for dynamically scheduled processors. In Proceedings of the 5th International Conference on Parallel Architectures and Compilation Techniques (PACT). 
ACM, New York."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1582710.1582714","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/1582710.1582714","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T13:30:08Z","timestamp":1750253408000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/1582710.1582714"}},"subtitle":["Exploiting compiler analysis"],"short-title":[],"issued":{"date-parts":[[2009,9]]},"references-count":33,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2009,9]]}},"alternative-id":["10.1145\/1582710.1582714"],"URL":"https:\/\/doi.org\/10.1145\/1582710.1582714","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,9]]},"assertion":[{"value":"2008-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2009-10-02","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}