{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:17:32Z","timestamp":1763468252875,"version":"3.41.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2015,5,11]],"date-time":"2015-05-11T00:00:00Z","timestamp":1431302400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000185","name":"DARPA","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Science Foundation","award":["CCF-0916971"],"award-info":[{"award-number":["CCF-0916971"]}]},{"name":"C-FAR"},{"name":"one of six centers of STARnet"},{"DOI":"10.13039\/100007245","name":"MARCO","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100007245","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2015,7,8]]},"abstract":"<jats:p>GPU performance and power tuning is difficult, requiring extensive user expertise and time-consuming trial and error. To accelerate design tuning, statistical design space exploration methods have been proposed. This article presents Starchart, a novel design space partitioning tool that uses regression trees to approach GPU tuning problems. Improving on prior work, Starchart offers more automation in identifying key design trade-offs and models design subspaces with distinctly different behaviors. Starchart achieves good model accuracy using very few random samples: less than 0.3% of a given design space; iterative sampling can more quickly target subspaces of interest.<\/jats:p>","DOI":"10.1145\/2736287","type":"journal-article","created":{"date-parts":[[2015,5,11]],"date-time":"2015-05-11T16:30:57Z","timestamp":1431361857000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["GPU Performance and Power Tuning Using Regression Trees"],"prefix":"10.1145","volume":"12","author":[{"given":"Wenhao","family":"Jia","sequence":"first","affiliation":[{"name":"Princeton University, Princeton, NJ"}]},{"given":"Elba","family":"Garza","sequence":"additional","affiliation":[{"name":"Princeton University, Princeton, NJ"}]},{"given":"Kelly A.","family":"Shaw","sequence":"additional","affiliation":[{"name":"University of Richmond, Richmond, VA"}]},{"given":"Margaret","family":"Martonosi","sequence":"additional","affiliation":[{"name":"Princeton University, Princeton, NJ"}]}],"member":"320","published-online":{"date-parts":[[2015,5,11]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Retrieved","author":"AMD.","year":"2012","unstructured":"AMD. 2012 . APP Profiler . Retrieved March 27, 2015, from http:\/\/developer.amd.com\/tools-and-sdks\/archive\/amd-app-profiler. AMD. 2012. APP Profiler. Retrieved March 27, 2015, from http:\/\/developer.amd.com\/tools-and-sdks\/archive\/amd-app-profiler."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2628071.2628092"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/InPar.2012.6339587"},{"key":"e_1_2_1_4_1","volume-title":"Olshen","author":"Breiman Leo","year":"1984","unstructured":"Leo Breiman , Jerome Friedman , Charles J. Stone , and Richard A . Olshen . 1984 . Classification and Regression Trees. Chapman and Hall\/CRC. Leo Breiman, Jerome Friedman, Charles J. Stone, and Richard A. Olshen. 1984. Classification and Regression Trees. Chapman and Hall\/CRC."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2011.6081376"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1693453.1693471"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/1413370.1413375"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1941553.1941589"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1958.10501479"},{"key":"e_1_2_1_11_1","volume-title":"Statistics","author":"Freedman David","unstructured":"David Freedman , Robert Pisani , and Roger Purves . 2007. Statistics ( 4 th ed.). W. W. Norton and Company . David Freedman, Robert Pisani, and Roger Purves. 2007. Statistics (4th ed.). W. W. Norton and Company.","edition":"4"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/1880043.1880047"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 1st USENIX Conference on Hot Topics in Parallelism. 1.","author":"Ganapathi Archana","year":"2009","unstructured":"Archana Ganapathi , Kaushik Datta , Armando Fox , and David Patterson . 2009 . A case for machine learning to optimize multicore performance . In Proceedings of the 1st USENIX Conference on Hot Topics in Parallelism. 1. Archana Ganapathi, Kaushik Datta, Armando Fox, and David Patterson. 2009. A case for machine learning to optimize multicore performance. In Proceedings of the 1st USENIX Conference on Hot Topics in Parallelism. 1."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555775"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815998"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(76)90095-8"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2012.6189201"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. 257--267","author":"Jia Wenhao","year":"2013","unstructured":"Wenhao Jia , Kelly A. Shaw , and Margaret Martonosi . 2013 . Starchart: Hardware and software optimization using recursive partitioning regression trees . In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. 257--267 . Wenhao Jia, Kelly A. Shaw, and Margaret Martonosi. 2013. Starchart: Hardware and software optimization using recursive partitioning regression trees. In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. 257--267."},{"volume-title":"Proceedings of the 12th International Symposium on High Performance Computer Architecture. 99--108","author":"Joseph P. J.","key":"e_1_2_1_19_1","unstructured":"P. J. Joseph , Kapil Vaswani , and Matthew J. Thazhuthaveetil . 2006. Construction and use of linear regression models for processor performance analysis . In Proceedings of the 12th International Symposium on High Performance Computer Architecture. 99--108 . P. J. Joseph, Kapil Vaswani, and Matthew J. Thazhuthaveetil. 2006. Construction and use of linear regression models for processor performance analysis. In Proceedings of the 12th International Symposium on High Performance Computer Architecture. 99--108."},{"key":"e_1_2_1_20_1","volume-title":"Applied Linear Regression Models","author":"Kutner Michael H.","unstructured":"Michael H. Kutner , Christopher J. Nachtsheim , John Neter , and William Li. 2005. Applied Linear Regression Models ( 5 th ed.). McGraw-Hill . Michael H. Kutner, Christopher J. Nachtsheim, John Neter, and William Li. 2005. Applied Linear Regression Models (5th ed.). McGraw-Hill.","edition":"5"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1168857.1168881"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-01970-8_89"},{"volume-title":"Encyclopedia of Statistics in Quality and Reliability","author":"Loh Wei-Yin","key":"e_1_2_1_23_1","unstructured":"Wei-Yin Loh . 2008. Classification and regression tree methods . In Encyclopedia of Statistics in Quality and Reliability , F. Ruggeri, R. S. Kenett, and F. W. Faltin (Eds.). Wiley , 315--323. Wei-Yin Loh. 2008. Classification and regression tree methods. In Encyclopedia of Statistics in Quality and Reliability, F. Ruggeri, R. S. Kenett, and F. W. Faltin (Eds.). Wiley, 315--323."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503268"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the Linux Symposium.","author":"Moilanen Jake","year":"2005","unstructured":"Jake Moilanen and Peter Williams . 2005 . Using genetic algorithms to automatically tune the kernel . In Proceedings of the Linux Symposium. Jake Moilanen and Peter Williams. 2005. Using genetic algorithms to automatically tune the kernel. In Proceedings of the Linux Symposium."},{"volume-title":"Decision tree induction: How effective is the greedy heuristic&quest","author":"Murthy Sreerama","key":"e_1_2_1_26_1","unstructured":"Sreerama Murthy and Steven Salzberg . 1995. Decision tree induction: How effective is the greedy heuristic&quest ; In Proceedings of the 1st International Conference on Knowledge Discovery and Data Mining . 222--227. Sreerama Murthy and Steven Salzberg. 1995. Decision tree induction: How effective is the greedy heuristic&quest; In Proceedings of the 1st International Conference on Knowledge Discovery and Data Mining. 222--227."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/GREENCOMP.2010.5598315"},{"key":"e_1_2_1_28_1","volume-title":"Retrieved","author":"NVIDIA.","year":"2009","unstructured":"NVIDIA. 2009 . NVIDIA\u2019s Next Generation CUDA Compute Architecture: Fermi . Retrieved March 27, 2015, from http:\/\/www.nvidia.com\/content\/pdf\/fermi_white_papers\/nvidia_fermi_compute_architecture_whitpaper.pdf. NVIDIA. 2009. NVIDIA\u2019s Next Generation CUDA Compute Architecture: Fermi. Retrieved March 27, 2015, from http:\/\/www.nvidia.com\/content\/pdf\/fermi_white_papers\/nvidia_fermi_compute_architecture_whitpaper.pdf."},{"key":"e_1_2_1_29_1","volume-title":"Retrieved","author":"NVIDIA.","year":"2011","unstructured":"NVIDIA. 2011 . Tuning CUDA Applications for Fermi . Retrieved March 27, 2015, from http:\/\/hpc.oit.uci.edu\/nvidia-doc\/sdk-cuda-doc\/C\/doc\/Fermi_Tuning_Guide.pdf. NVIDIA. 2011. Tuning CUDA Applications for Fermi. Retrieved March 27, 2015, from http:\/\/hpc.oit.uci.edu\/nvidia-doc\/sdk-cuda-doc\/C\/doc\/Fermi_Tuning_Guide.pdf."},{"key":"e_1_2_1_30_1","volume-title":"Retrieved","author":"NVIDIA.","year":"2012","unstructured":"NVIDIA. 2012 . NVIDIA\u2019s Next Generation CUDA Compute Architecture: Kepler GK110 . Retrieved March 27, 2015, from http:\/\/www.nvidia.com\/content\/PDF\/kepler\/NVIDIA-kepler-GK110-Architecture-Whitepaper.pdf. NVIDIA. 2012. NVIDIA\u2019s Next Generation CUDA Compute Architecture: Kepler GK110. Retrieved March 27, 2015, from http:\/\/www.nvidia.com\/content\/PDF\/kepler\/NVIDIA-kepler-GK110-Architecture-Whitepaper.pdf."},{"key":"e_1_2_1_31_1","unstructured":"NVIDIA. 2014a. GPU Computing SDK. Available at https:\/\/developer.nvidia.com\/gpu-computing-sdk.  NVIDIA. 2014a. GPU Computing SDK. Available at https:\/\/developer.nvidia.com\/gpu-computing-sdk."},{"key":"e_1_2_1_32_1","volume-title":"Retrieved","author":"NVIDIA.","year":"2014","unstructured":"NVIDIA. 2014 b. NVIDIA Visual Profiler . Retrieved March 27, 2015, from https:\/\/developer.nvidia.com\/nvidia-visual-profiler. NVIDIA. 2014b. NVIDIA Visual Profiler. Retrieved March 27, 2015, from https:\/\/developer.nvidia.com\/nvidia-visual-profiler."},{"key":"e_1_2_1_33_1","volume-title":"Retrieved","author":"NVIDIA.","year":"2014","unstructured":"NVIDIA. 2014 c. Tuning CUDA Applications for Kepler . Retrieved March 27, 2015, from http:\/\/docs.nvidia.com\/cuda\/kepler-tuning-guide\/#axzz3Ve1bH0Vr. NVIDIA. 2014c. Tuning CUDA Applications for Kepler. Retrieved March 27, 2015, from http:\/\/docs.nvidia.com\/cuda\/kepler-tuning-guide\/#axzz3Ve1bH0Vr."},{"key":"e_1_2_1_34_1","volume-title":"R: A Language and Environment for Statistical Computing","author":"Team R Core","year":"2013","unstructured":"R Core Team . 2013 . R: A Language and Environment for Statistical Computing . R Foundation for Statistical Computing, Vienna , Austria . http:\/\/www.R-project.org\/. R Core Team. 2013. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http:\/\/www.R-project.org\/."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1356058.1356084"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2145816.2145819"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCSim.2011.5999886"},{"volume-title":"Proceedings of the International Symposium on Code Generation and Optimization. 204--215","author":"Triantafyllis Spyridon","key":"e_1_2_1_39_1","unstructured":"Spyridon Triantafyllis , Manish Vachharajani , Neil Vachharajani , and David I. August . 2003. Compiler optimization-space exploration . In Proceedings of the International Symposium on Code Generation and Optimization. 204--215 . Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, and David I. August. 2003. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization. 204--215."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2736287","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2736287","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:12:30Z","timestamp":1750227150000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2736287"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,5,11]]},"references-count":38,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,7,8]]}},"alternative-id":["10.1145\/2736287"],"URL":"https:\/\/doi.org\/10.1145\/2736287","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2015,5,11]]},"assertion":[{"value":"2014-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-05-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}