{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:06:08Z","timestamp":1766066768206,"version":"3.41.0"},"reference-count":27,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2017,10,11]],"date-time":"2017-10-11T00:00:00Z","timestamp":1507680000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGMETRICS Perform. Eval. Rev."],"published-print":{"date-parts":[[2017,10,11]]},"abstract":"<jats:p>With the increasing installation of Graphics Processing Units (GPUs) in supercomputers and data centers, their huge electricity cost brings new environmental and economic concerns. Although Dynamic Voltage and Frequency Scaling (DVFS) techniques have been successfully applied on traditional CPUs to reserve energy, the impact of GPU DVFS on application performance and power consumption is not yet fully understood, mainly due to the complicated GPU memory system. This paper proposes a fast prediction model based on Support Vector Regression (SVR), which can estimate the average runtime power of a given GPU kernel using a set of profiling parameters under different GPU core and memory frequencies. Our experimental data set includes 931 samples obtained from 19 GPU kernels running on a real GPU platform with the core and memory frequencies ranging between 400MHz and 1000MHz. We evaluate the accuracy of the SVR-based prediction model by ten-fold cross validation. We achieve greater accuracy than prior models, being Mean Square Error (MSE) of 0.797 Watt and Mean Absolute Percentage Error (MAPE) of 3.08% on average. Combined with an existing performance prediction model, we can find the optimal GPU frequency settings that can save an average of 13.2% energy across those GPU kernels with no more than 10% performance penalty compared to applying the default setting.<\/jats:p>","DOI":"10.1145\/3152042.3152066","type":"journal-article","created":{"date-parts":[[2017,10,12]],"date-time":"2017-10-12T12:52:50Z","timestamp":1507812770000},"page":"73-78","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["GPGPU Power Estimation with Core and Memory Frequency Scaling"],"prefix":"10.1145","volume":"45","author":[{"given":"Qiang","family":"Wang","sequence":"first","affiliation":[{"name":"Hong Kong Baptist University"}]},{"given":"Xiaowen","family":"Chu","sequence":"additional","affiliation":[{"name":"Hong Kong Baptist University"}]}],"member":"320","published-online":{"date-parts":[[2017,10,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.23"},{"key":"e_1_2_1_2_1","first-page":"203","article-title":"Support vector regression","volume":"11","author":"Basak Debasish","year":"2007","unstructured":"Debasish Basak , Srimanta Pal , and Dipak Chandra Patranabis . 2007 . Support vector regression . Neural Information Processing-Letters and Reviews 11 , 10 (2007), 203 -- 224 . Debasish Basak, Srimanta Pal, and Dipak Chandra Patranabis. 2007. Support vector regression. Neural Information Processing-Letters and Reviews 11, 10 (2007), 203--224.","journal-title":"Neural Information Processing-Letters and Reviews"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1961189.1961199"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077839.3077855"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2593069.2593208"},{"volume-title":"2015 IEEE International Conference on Communications (ICC). 436--441","author":"Chu X.","key":"e_1_2_1_7_1","unstructured":"X. Chu , C. Liu , K. Ouyang , L. S. Yung , H. Liu , and Y.W. Leung . 2015. PErasure: A parallel Cauchy Reed-Solomon coding library for GPUs . In 2015 IEEE International Conference on Communications (ICC). 436--441 . X. Chu, C. Liu, K. Ouyang, L. S. Yung, H. Liu, and Y.W. Leung. 2015. PErasure: A parallel Cauchy Reed-Solomon coding library for GPUs. In 2015 IEEE International Conference on Communications (ICC). 436--441."},{"key":"e_1_2_1_8_1","unstructured":"Wu chun Feng and Tom Scoglands. 2016. GREEN500. {Online} https:\/\/www.top500.org\/green500\/lists\/2016\/11\/. (2016).  Wu chun Feng and Tom Scoglands. 2016. GREEN500. {Online} https:\/\/www.top500.org\/green500\/lists\/2016\/11\/. (2016)."},{"volume-title":"2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 1190--1199","author":"Coplin J.","key":"e_1_2_1_9_1","unstructured":"J. Coplin and M. Burtscher . 2016. Energy, Power, and Performance Characterization of GPGPU Benchmark Programs . In 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 1190--1199 . J. Coplin and M. Burtscher. 2016. Energy, Power, and Performance Characterization of GPGPU Benchmark Programs. In 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 1190--1199."},{"volume-title":"Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS'12)","author":"Dean Jeffrey","key":"e_1_2_1_10_1","unstructured":"Jeffrey Dean , Greg S. Corrado , Rajat Monga , Kai Chen , Matthieu Devin , Quoc V. Le , Mark Z. Mao , Marc'Aurelio Ranzato , Andrew Senior , Paul Tucker , Ke Yang , and Andrew Y. Ng . 2012. Large Scale Distributed Deep Networks . In Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS'12) . 1223--1231. Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. 2012. Large Scale Distributed Deep Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS'12). 1223--1231."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1815961.1815998"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CPSNA.2015.23"},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"V. Kursun and E. G. Friedman. 2006. Supply and Threshold Voltage Scaling Techniques. Multi-Voltage CMOS Circuit Design (2006) 45--84.  V. Kursun and E. G. Friedman. 2006. Supply and Threshold Voltage Scaling Techniques. Multi-Voltage CMOS Circuit Design (2006) 45--84.","DOI":"10.1002\/0470033371.ch3"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485964"},{"key":"e_1_2_1_15_1","volume-title":"Proceeding of ACM SOSP Workshop on Power Aware Computing and Systems (HotPower).","author":"Ma Xiaohan","year":"2009","unstructured":"Xiaohan Ma , Mian Dong , Lin Zhong , and Zhigang Deng . 2009 . Statistical power consumption analysis and modeling for GPU-based computing . In Proceeding of ACM SOSP Workshop on Power Aware Computing and Systems (HotPower). Xiaohan Ma, Mian Dong, Lin Zhong, and Zhigang Deng. 2009. Statistical power consumption analysis and modeling for GPU-based computing. In Proceeding of ACM SOSP Workshop on Power Aware Computing and Systems (HotPower)."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2017.8057205"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dcan.2016.10.001"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/GREENCOMP.2010.5598315"},{"key":"e_1_2_1_19_1","unstructured":"NVIDIA. 2014. GeForce GTX 980 Whitepaper. {Online} http:\/\/www.geforce.com\/hardware\/notebook-gpus\/geforce-gtx-980\/specifications. (2014).  NVIDIA. 2014. GeForce GTX 980 Whitepaper. {Online} http:\/\/www.geforce.com\/hardware\/notebook-gpus\/geforce-gtx-980\/specifications. (2014)."},{"key":"e_1_2_1_20_1","unstructured":"NVIDIA. 2016. GPU Computing SDK. {Online} https:\/\/developer.nvidia.com\/gpucomputing-sdk. (2016).  NVIDIA. 2016. GPU Computing SDK. {Online} https:\/\/developer.nvidia.com\/gpucomputing-sdk. (2016)."},{"key":"e_1_2_1_21_1","unstructured":"NVIDIA. 2016. NVIDIA Profiler. {Online} http:\/\/docs.nvidia.com\/cuda\/profilerusers-guide. (2016).  NVIDIA. 2016. NVIDIA Profiler. {Online} http:\/\/docs.nvidia.com\/cuda\/profilerusers-guide. (2016)."},{"key":"e_1_2_1_22_1","unstructured":"NVIDIA. 2016. NVIDIA System Management Interface (nvidia-smi). {Online} https:\/\/developer.nvidia.com\/nvidia-system-management-interface. (2016).  NVIDIA. 2016. NVIDIA System Management Interface (nvidia-smi). {Online} https:\/\/developer.nvidia.com\/nvidia-system-management-interface. (2016)."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2013.73"},{"key":"e_1_2_1_24_1","volume-title":"GPGPU Performance Estimation with Core and Memory Frequency Scaling. arXiv preprint arXiv:1701.05308","author":"Wang Qiang","year":"2017","unstructured":"Qiang Wang and Xiaowen Chu . 2017. GPGPU Performance Estimation with Core and Memory Frequency Scaling. arXiv preprint arXiv:1701.05308 ( 2017 ). Qiang Wang and Xiaowen Chu. 2017. GPGPU Performance Estimation with Core and Memory Frequency Scaling. arXiv preprint arXiv:1701.05308 (2017)."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3077839.3077858"},{"volume-title":"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA). 564--576","author":"Wu G.","key":"e_1_2_1_26_1","unstructured":"G. Wu , J. L. Greathouse , A. Lyashevsky , N. Jayasena , and D. Chiou . 2015. GPGPU performance and power estimation using machine learning . In 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA). 564--576 . G.Wu, J. L. Greathouse, A. Lyashevsky, N. Jayasena, and D. Chiou. 2015. GPGPU performance and power estimation using machine learning. In 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA). 564--576."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btu047"}],"container-title":["ACM SIGMETRICS Performance Evaluation Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3152042.3152066","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3152042.3152066","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:26:26Z","timestamp":1750213586000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3152042.3152066"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,10,11]]},"references-count":27,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2017,10,11]]}},"alternative-id":["10.1145\/3152042.3152066"],"URL":"https:\/\/doi.org\/10.1145\/3152042.3152066","relation":{},"ISSN":["0163-5999"],"issn-type":[{"type":"print","value":"0163-5999"}],"subject":[],"published":{"date-parts":[[2017,10,11]]},"assertion":[{"value":"2017-10-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}