{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T18:26:24Z","timestamp":1769192784198,"version":"3.49.0"},"reference-count":45,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2013,3,1]],"date-time":"2013-03-01T00:00:00Z","timestamp":1362096000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000144","name":"Division of Computer and Network Systems","doi-asserted-by":"publisher","award":["CNS-0845947"],"award-info":[{"award-number":["CNS-0845947"]}],"id":[{"id":"10.13039\/100000144","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2013,3]]},"abstract":"<jats:p>System level power management must consider the uncertainty and variability that come from the environment, the application and the hardware. A robust power management technique must be able to learn the optimal decision from past events and improve itself as the environment changes. This article presents a novel on-line power management technique based on model-free constrained reinforcement learning (Q-learning). The proposed learning algorithm requires no prior information of the workload and dynamically adapts to the environment to achieve autonomous power management. We focus on the power management of the peripheral device and the microprocessor, two of the basic components of a computer. Due to their different operating behaviors and performance considerations, these two types of devices require different designs of Q-learning agent. The article discusses system modeling and cost function construction for both types of Q-learning agent. 
Enhancement techniques are also proposed to speed up the convergence and better maintain the required performance (or power) constraint in a dynamic system with large variations. Compared with the existing machine learning based power management techniques, the Q-learning based power management is more flexible in adapting to different workload and hardware and provides a wider range of power-performance tradeoff.<\/jats:p>","DOI":"10.1145\/2442087.2442095","type":"journal-article","created":{"date-parts":[[2013,4,9]],"date-time":"2013-04-09T12:17:58Z","timestamp":1365509878000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":85,"title":["Achieving autonomous power management using reinforcement learning"],"prefix":"10.1145","volume":"18","author":[{"given":"Hao","family":"Shen","sequence":"first","affiliation":[{"name":"Syracuse University"}]},{"given":"Ying","family":"Tan","sequence":"additional","affiliation":[{"name":"Binghamton University"}]},{"given":"Jun","family":"Lu","sequence":"additional","affiliation":[{"name":"Binghamton University"}]},{"given":"Qing","family":"Wu","sequence":"additional","affiliation":[{"name":"Air Force Research Laboratory"}]},{"given":"Qinru","family":"Qiu","sequence":"additional","affiliation":[{"name":"Syracuse University"}]}],"member":"320","published-online":{"date-parts":[[2013,4,11]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Abdelzaher T. Diao Y. Hellerstein J. L. Lu C. and Zhu X. 2008. Introduction to control theory and its application to computing systems. 
SIGMETRICS Tutorial Annapolis MD.","DOI":"10.1007\/978-0-387-79361-0_7"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the USENIX Annual Technical Conference (USENIX ATC'10)","author":"Agarwal Y."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the IEEE International Symposium on Parallel and Distributed Processing. 1--6.","author":"Ahmad I."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1594233.1594256"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2011.2153852"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2012.6169035"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2006.882587"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 11th International Conference on Machine Learning","author":"Caruana R.","year":"1994"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2004.839485"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2010.2049045"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2008.2005309"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE'07)","author":"Coskun A. K."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC'08)","author":"Coskun A. 
K."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1391469.1391693"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2009.2026357"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2009.2015740"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1324969.1324971"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1347375.1347380"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2006.87"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/335043.335046"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.21"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/996566.996650"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1274858.1274870"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS'06)","author":"Langen P."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISQED.2007.158"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE'10)","author":"Liu W."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2009.77"},{"key":"e_1_2_1_28_1","unstructured":"Pendrith M. 1994. On reinforcement learning of control actions in noisy and nonMarkovian domains. Tech. rep. 
NSW-CSE-TR-9410 School of Computer Science and Engineering University of New South Wales Sydney Australia."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2008.180"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe (DATE'07)","author":"Qiu Q."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/43.931003"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the USENIX Winter Conference. 405--420","author":"Ruemmler C."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the Agents-2001 Workshop on Learning Agents.","author":"Sikorski K."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/GREENCOMP.2010.5598310"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1687399.1687486"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1403375.1403402"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1118299.1118505"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICAC.2006.1662383"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 21st Annual Conference on Neural Information Processing Systems (NIPS'07)","author":"Tesauro G."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1535\/itj.1004.05"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2003.820523"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1555754.1555794"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/1289720.1289721"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/1391469.1391658"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ECCTD.2009.5275073"}],"container-title":["ACM Transactions on Design Automation of Electronic 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2442087.2442095","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2442087.2442095","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T08:35:25Z","timestamp":1750235725000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2442087.2442095"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,3]]},"references-count":45,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2013,3]]}},"alternative-id":["10.1145\/2442087.2442095"],"URL":"https:\/\/doi.org\/10.1145\/2442087.2442095","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"value":"1084-4309","type":"print"},{"value":"1557-7309","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,3]]},"assertion":[{"value":"2012-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2012-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-04-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}