{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T19:23:22Z","timestamp":1774121002320,"version":"3.50.1"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2020,12,30]],"date-time":"2020-12-30T00:00:00Z","timestamp":1609286400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2021,3,31]]},"abstract":"<jats:p>\n            Chip multiprocessors (CMPs) are ubiquitous in all computing systems ranging from high-end servers to mobile devices. In these systems, energy consumption is a critical design constraint as it constitutes the most significant operating cost for computing clouds. Analogous to this, longer battery life continues to be an essential user concern in mobile devices. To optimize on power consumption, modern processors are designed with Dynamic Voltage and Frequency Scaling (DVFS) support at the individual core as well as the uncore level. This allows fine-grained control of performance and energy. For an\n            <jats:italic>n<\/jats:italic>\n            core processor with\n            <jats:italic>m<\/jats:italic>\n            core and uncore frequency choices, the total DVFS configuration space is now\n            <jats:italic>m<\/jats:italic>\n            <jats:sup>(n+1)<\/jats:sup>\n            (with the uncore accounting for the + 1). In addition to that, in CMPs, the performance-energy trade-off due to core\/uncore frequency scaling concerning a single application cannot be determined independently as cores share critical resources like the last level cache (LLC) and the memory. 
Thus, unlike the uni-processor environment, the energy consumption of an application running on a CMP depends not only on its characteristics but also on those of its co-runners (applications running on other cores). The key objective of our work is to select a suitable core and uncore frequency that minimizes power consumption while limiting application performance degradation within certain pre-defined limits (can be termed as QoS requirements). The key contribution of our work is a learning-based model that is able to capture the interference due to shared cache, bus bandwidth, and memory bandwidth between applications running on multiple cores and predict near-optimal frequencies for core and uncore.\n          <\/jats:p>","DOI":"10.1145\/3427092","type":"journal-article","created":{"date-parts":[[2020,12,30]],"date-time":"2020-12-30T12:30:51Z","timestamp":1609331451000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Performance-Energy Trade-off in Modern CMPs"],"prefix":"10.1145","volume":"18","author":[{"given":"Solomon","family":"Abera","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Delhi, India"}]},{"given":"M.","family":"Balakrishnan","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, India"}]},{"given":"Anshul","family":"Kumar","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, India"}]}],"member":"320","published-online":{"date-parts":[[2020,12,30]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"PLSS: A scheduler for multi-core embedded systems. In Architecture of Computing Systems (ARCS\u201917)","author":"Abera Solomon","year":"2017","unstructured":"Solomon Abera , M. Balakrishnan , and Anshul Kumar . 2017 . PLSS: A scheduler for multi-core embedded systems. In Architecture of Computing Systems (ARCS\u201917) . 
Springer International Publishing , Cham , 164--176 Solomon Abera, M. Balakrishnan, and Anshul Kumar. 2017. PLSS: A scheduler for multi-core embedded systems. In Architecture of Computing Systems (ARCS\u201917). Springer International Publishing, Cham, 164--176"},{"key":"e_1_2_1_2_1","volume-title":"Architecture of Computing Systems (ARCS\u201918)","author":"Abera Solomon","unstructured":"Solomon Abera , M. Balakrishnan , and Anshul Kumar . 2018. Performance-energy trade-off in CMPs with per-core DVFS . In Architecture of Computing Systems (ARCS\u201918) . Springer International Publishing , Cham , 225--238. Solomon Abera, M. Balakrishnan, and Anshul Kumar. 2018. Performance-energy trade-off in CMPs with per-core DVFS. In Architecture of Computing Systems (ARCS\u201918). Springer International Publishing, Cham, 225--238."},{"key":"e_1_2_1_3_1","volume-title":"2019 10th International Green and Sustainable Computing Conference (IGSC\u201919)","author":"Acun B.","unstructured":"B. Acun , K. Chandrasekar , and L. V. Kale . 2019. Fine-grained energy efficiency using per-core DVFS with an adaptive runtime system . In 2019 10th International Green and Sustainable Computing Conference (IGSC\u201919) . 1--8. B. Acun, K. Chandrasekar, and L. V. Kale. 2019. Fine-grained energy efficiency using per-core DVFS with an adaptive runtime system. In 2019 10th International Green and Sustainable Computing Conference (IGSC\u201919). 1--8."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/3338075.3338081"},{"key":"e_1_2_1_5_1","volume-title":"Noise Reduction in Speech Processing","author":"\u00a0al Jacob Benesty","unstructured":"Jacob Benesty et \u00a0al . 2009. Pearson correlation coefficient . In Noise Reduction in Speech Processing . Springer , 37--40. Jacob Benesty et\u00a0al. 2009. Pearson correlation coefficient. In Noise Reduction in Speech Processing. 
Springer, 37--40."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"e_1_2_1_7_1","unstructured":"D. Brodowski N. Golde R. J. Wysocki and V. Kumar. 2017. Linux CPUFreq Governors - Information for Users and Developers. Linux Kernel. Retrieved from https:\/\/www.kernel.org\/doc\/Documentation\/cpu-freq\/governors.txt.  D. Brodowski N. Golde R. J. Wysocki and V. Kumar. 2017. Linux CPUFreq Governors - Information for Users and Developers. Linux Kernel. Retrieved from https:\/\/www.kernel.org\/doc\/Documentation\/cpu-freq\/governors.txt."},{"key":"e_1_2_1_8_1","volume-title":"SPEC CPU2017: Next-generation compute benchmark. In Companion of the 2018 ACM\/SPEC ICPE\u201918","author":"James","year":"2018","unstructured":"James Bucek et al. 2018 . SPEC CPU2017: Next-generation compute benchmark. In Companion of the 2018 ACM\/SPEC ICPE\u201918 . ACM, 41--42. DOI:https:\/\/doi.org\/10.1145\/3185768.3185771 James Bucek et al. 2018. SPEC CPU2017: Next-generation compute benchmark. In Companion of the 2018 ACM\/SPEC ICPE\u201918. ACM, 41--42. DOI:https:\/\/doi.org\/10.1145\/3185768.3185771"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Rajkumar Buyya et al. [n.d.]. Cloud computing and emerging IT platforms: Vision hype and reality for delivering computing as the 5th utility. Future Gener. Comput. Syst. ([n.d.]) 599--616. DOI:https:\/\/doi.org\/10.1016\/j.future.2008.12.001  Rajkumar Buyya et al. [n.d.]. Cloud computing and emerging IT platforms: Vision hype and reality for delivering computing as the 5th utility. Future Gener. Comput. Syst. ([n.d.]) 599--616. DOI:https:\/\/doi.org\/10.1016\/j.future.2008.12.001","DOI":"10.1016\/j.future.2008.12.001"},{"key":"e_1_2_1_10_1","volume-title":"2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications.","author":"Chang M.","unstructured":"M. Chang and W. Liang . 2011. 
Learning-directed dynamic voltage and frequency scaling for computation time prediction . In 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications. M. Chang and W. Liang. 2011. Learning-directed dynamic voltage and frequency scaling for computation time prediction. In 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of DAC\u201913","author":"Xi","unstructured":"Xi Chen et al. 2013. Dynamic voltage and frequency scaling for shared resources in multicore processor designs . In Proceedings of DAC\u201913 . Article 114, 7 pages. DOI:https:\/\/doi.org\/10.1145\/2463209.2488874 Xi Chen et al. 2013. Dynamic voltage and frequency scaling for shared resources in multicore processor designs. In Proceedings of DAC\u201913. Article 114, 7 pages. DOI:https:\/\/doi.org\/10.1145\/2463209.2488874"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of DATE\u201904 -","volume":"1","author":"Choi Kihwan","year":"2004","unstructured":"Kihwan Choi , Ramakrishna Soma , and Massoud Pedram . 2004 . Fine-grained dynamic voltage and frequency scaling for precise energy and performance trade-off based on the ratio of off-chip access to on-chip computation times . In Proceedings of DATE\u201904 - Volume 1 . IEEE Computer Society, 10004. Kihwan Choi, Ramakrishna Soma, and Massoud Pedram. 2004. Fine-grained dynamic voltage and frequency scaling for precise energy and performance trade-off based on the ratio of off-chip access to on-chip computation times. In Proceedings of DATE\u201904 - Volume 1. IEEE Computer Society, 10004."},{"key":"e_1_2_1_13_1","volume-title":"2011 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911)","author":"Cochran R.","unstructured":"R. Cochran , C. Hankendi , A. K. Coskun , and S. Reda . 2011. Pack cap: Adaptive DVFS and thread packing under power caps . 
In 2011 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911) . 175--185. R. Cochran, C. Hankendi, A. K. Coskun, and S. Reda. 2011. Pack cap: Adaptive DVFS and thread packing under power caps. In 2011 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911). 175--185."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1870109.1870115"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1283780.1283825"},{"key":"e_1_2_1_16_1","volume-title":"2012 USENIX Annual Technical Conference (USENIX ATC\u201912)","author":"Vishal","unstructured":"Vishal Gupta et al. 2012. The forgotten \u2018Uncore\u2019: On the energy-efficiency of heterogeneous cores. Presented as part of the 2012 USENIX Annual Technical Conference (USENIX ATC\u201912) . Vishal Gupta et al. 2012. The forgotten \u2018Uncore\u2019: On the energy-efficiency of heterogeneous cores. Presented as part of the 2012 USENIX Annual Technical Conference (USENIX ATC\u201912)."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2017.7858403"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250665"},{"key":"e_1_2_1_19_1","volume-title":"2011 48th ACM\/EDAC\/IEEE Design Automation Conference (DAC\u201911)","author":"Ge Y.","unstructured":"Y. Ge and Q. Qiu . 2011. Dynamic thermal management for multimedia applications using machine learning . In 2011 48th ACM\/EDAC\/IEEE Design Automation Conference (DAC\u201911) . 95--100. Y. Ge and Q. Qiu. 2011. Dynamic thermal management for multimedia applications using machine learning. In 2011 48th ACM\/EDAC\/IEEE Design Automation Conference (DAC\u201911). 95--100."},{"key":"e_1_2_1_20_1","first-page":"1","article-title":"The WEKA data mining software: An update","volume":"11","author":"Mark Hall","year":"2009","unstructured":"Mark Hall et al. 2009 . The WEKA data mining software: An update . SIGKDD Explor. Newsl. 11 , 1 (Nov. 2009), 10--18. 
DOI:https:\/\/doi.org\/10.1145\/1656274.1656278 Mark Hall et al. 2009. The WEKA data mining software: An update. SIGKDD Explor. Newsl. 11, 1 (Nov. 2009), 10--18. DOI:https:\/\/doi.org\/10.1145\/1656274.1656278","journal-title":"SIGKDD Explor. Newsl."},{"key":"e_1_2_1_21_1","unstructured":"Intel. 2007. Intel 64 and IA-32 Architectures Software Developer\u2019s Manual - Volume 3B. Intel Corporation.  Intel. 2007. Intel 64 and IA-32 Architectures Software Developer\u2019s Manual - Volume 3B. Intel Corporation."},{"key":"e_1_2_1_22_1","volume-title":"2007 IEEE Asian Solid-State Circuits Conference. 360--363","author":"Lee Jeabin","year":"2007","unstructured":"Jeabin Lee , Byeong-Gyu Nam , and Hoi-Jun Yoo . 2007 . Dynamic voltage and frequency scaling (DVFS) scheme for multi-domains power management . In 2007 IEEE Asian Solid-State Circuits Conference. 360--363 . Jeabin Lee, Byeong-Gyu Nam, and Hoi-Jun Yoo. 2007. Dynamic voltage and frequency scaling (DVFS) scheme for multi-domains power management. In 2007 IEEE Asian Solid-State Circuits Conference. 360--363."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2333660.2333686"},{"key":"e_1_2_1_24_1","unstructured":"Linux Kernel. 2011. Profiling with perf. Retrieved from https:\/\/perf.wiki.kernel.org\/index.php\/Tutorial.  Linux Kernel. 2011. Profiling with perf. Retrieved from https:\/\/perf.wiki.kernel.org\/index.php\/Tutorial."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of EEHPDC\u201913","author":"Il Sung","unstructured":"Sung Il Kim et al. 2013. Using DVFS and task scheduling algorithms for a hard real-time heterogeneous multicore processor environment . In Proceedings of EEHPDC\u201913 . ACM, 23--30. DOI:https:\/\/doi.org\/10.1145\/2480347.2480350 Sung Il Kim et al. 2013. Using DVFS and task scheduling algorithms for a hard real-time heterogeneous multicore processor environment. In Proceedings of EEHPDC\u201913. ACM, 23--30. 
DOI:https:\/\/doi.org\/10.1145\/2480347.2480350"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2009.136"},{"key":"e_1_2_1_27_1","volume-title":"2008 14th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications.","author":"Liang W.","unstructured":"W. Liang , S. Chen , Y. Chang , and J. Fang . 2008. Memory-aware dynamic voltage and frequency prediction for portable devices . In 2008 14th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications. W. Liang, S. Chen, Y. Chang, and J. Fang. 2008. Memory-aware dynamic voltage and frequency prediction for portable devices. In 2008 14th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1755913.1755930"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1787275.1787336"},{"key":"e_1_2_1_30_1","volume-title":"Proceeding of Linux Symposium.","author":"Pallipadi V.","unstructured":"V. Pallipadi and A. Starikovskiy . 2006. The ondemand governor . In Proceeding of Linux Symposium. V. Pallipadi and A. Starikovskiy. 2006. The ondemand governor. In Proceeding of Linux Symposium."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jnca.2016.08.010"},{"key":"e_1_2_1_32_1","volume-title":"Automation Test in Europe Conference Exhibition (DATE\u201914)","author":"Shen H.","year":"2014","unstructured":"H. Shen and Q. Qiu . 2014. Contention aware frequency scaling on CMPs with guaranteed quality of service. In 2014 Design , Automation Test in Europe Conference Exhibition (DATE\u201914) . 1--6. DOI:https:\/\/doi.org\/10.7873\/DATE.2014.291 H. Shen and Q. Qiu. 2014. Contention aware frequency scaling on CMPs with guaranteed quality of service. In 2014 Design, Automation Test in Europe Conference Exhibition (DATE\u201914). 1--6. 
DOI:https:\/\/doi.org\/10.7873\/DATE.2014.291"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Sheng Yang et al. 2015. Adaptive energy minimization of embedded heterogeneous systems using regression-based learning. In 2015 25th PATMOS. 103--110. DOI:https:\/\/doi.org\/10.1109\/PATMOS.2015.7347594  Sheng Yang et al. 2015. Adaptive energy minimization of embedded heterogeneous systems using regression-based learning. In 2015 25th PATMOS. 103--110. DOI:https:\/\/doi.org\/10.1109\/PATMOS.2015.7347594","DOI":"10.1109\/PATMOS.2015.7347594"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of HPC\u201918","author":"Sundriyal Vaibhav","year":"2018","unstructured":"Vaibhav Sundriyal , Masha Sosonkina , Bryce M. Westheimer , and Mark Gordon . 2018 . Comparisons of core and uncore frequency scaling modes in quantum chemistry application GAMESS . In Proceedings of HPC\u201918 . Vaibhav Sundriyal, Masha Sosonkina, Bryce M. Westheimer, and Mark Gordon. 2018. Comparisons of core and uncore frequency scaling modes in quantum chemistry application GAMESS. In Proceedings of HPC\u201918."},{"key":"e_1_2_1_35_1","volume-title":"2015 IEEE 17th International Conference on High Performance Computing and Communications. 721--726","author":"Islam F. M. M. u.","year":"2015","unstructured":"F. M. M. u. Islam and M. Lin . 2015. A framework for learning based DVFS technique selection and frequency scaling for multi-core real-time systems . In 2015 IEEE 17th International Conference on High Performance Computing and Communications. 721--726 . DOI:https:\/\/doi.org\/10.1109\/HPCC-CSS-ICESS.2015.313 F. M. M. u. Islam and M. Lin. 2015. A framework for learning based DVFS technique selection and frequency scaling for multi-core real-time systems. In 2015 IEEE 17th International Conference on High Performance Computing and Communications. 721--726. 
DOI:https:\/\/doi.org\/10.1109\/HPCC-CSS-ICESS.2015.313"},{"key":"e_1_2_1_36_1","volume-title":"2015 IEEE 17th International Conference on High Performance Computing and Communications. 721--726","author":"Islam F. M. M. u.","year":"2015","unstructured":"F. M. M. u. Islam and M. Lin . 2015. A framework for learning based DVFS technique selection and frequency scaling for multi-core real-time systems . In 2015 IEEE 17th International Conference on High Performance Computing and Communications. 721--726 . DOI:https:\/\/doi.org\/10.1109\/HPCC-CSS-ICESS.2015.313 F. M. M. u. Islam and M. Lin. 2015. A framework for learning based DVFS technique selection and frequency scaling for multi-core real-time systems. In 2015 IEEE 17th International Conference on High Performance Computing and Communications. 721--726. DOI:https:\/\/doi.org\/10.1109\/HPCC-CSS-ICESS.2015.313"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 1st USENIX (OSDI\u201994)","author":"Mark","unstructured":"Mark Weiser et al. 1994. Scheduling for reduced CPU energy . In Proceedings of the 1st USENIX (OSDI\u201994) . USENIX Association, Berkeley, CA, Article 2. http:\/\/dl.acm.org\/citation.cfm?id=1267638.1267640 Mark Weiser et al. 1994. Scheduling for reduced CPU energy. In Proceedings of the 1st USENIX (OSDI\u201994). USENIX Association, Berkeley, CA, Article 2. http:\/\/dl.acm.org\/citation.cfm?id=1267638.1267640"},{"key":"e_1_2_1_38_1","doi-asserted-by":"crossref","unstructured":"J. Won X. Chen P. Gratz J. Hu and V. Soteriou. 2014. Up by their bootstraps: Online learning in Artificial Neural Networks for CMP uncore power management. In HPCA\u201914. 308--319.  J. Won X. Chen P. Gratz J. Hu and V. Soteriou. 2014. Up by their bootstraps: Online learning in Artificial Neural Networks for CMP uncore power management. In HPCA\u201914. 308--319.","DOI":"10.1109\/HPCA.2014.6835941"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 38th MICRO. 
IEEE Computer Society, 271--282","author":"Qiang","year":"2005","unstructured":"Qiang Wu et al. 2005. A dynamic compilation framework for controlling microprocessor energy and performance . In Proceedings of the 38th MICRO. IEEE Computer Society, 271--282 . DOI:https:\/\/doi.org\/10.1109\/MICRO.2005.7 Qiang Wu et al. 2005. A dynamic compilation framework for controlling microprocessor energy and performance. In Proceedings of the 38th MICRO. IEEE Computer Society, 271--282. DOI:https:\/\/doi.org\/10.1109\/MICRO.2005.7"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/781131.781138"},{"key":"e_1_2_1_41_1","volume-title":"[n.d.]. Scheduling with dynamic voltage\/speed adjustment using slack reclamation in multiprocessor real-time systems","author":"Dakai Zhu","unstructured":"Dakai Zhu et al. [n.d.]. Scheduling with dynamic voltage\/speed adjustment using slack reclamation in multiprocessor real-time systems . IEEE Trans. Parallel Distrib. Syst . 14, 7 ([n.d.]), 686--700. DOI:https:\/\/doi.org\/10.1109\/TPDS.2003.1214320 Dakai Zhu et al. [n.d.]. Scheduling with dynamic voltage\/speed adjustment using slack reclamation in multiprocessor real-time systems. IEEE Trans. Parallel Distrib. Syst. 14, 7 ([n.d.]), 686--700. 
DOI:https:\/\/doi.org\/10.1109\/TPDS.2003.1214320"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3427092","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3427092","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:24Z","timestamp":1750197744000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3427092"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,30]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,3,31]]}},"alternative-id":["10.1145\/3427092"],"URL":"https:\/\/doi.org\/10.1145\/3427092","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12,30]]},"assertion":[{"value":"2019-12-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-12-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}