{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T11:27:06Z","timestamp":1750505226630,"version":"3.41.0"},"reference-count":43,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2016,9,17]],"date-time":"2016-09-17T00:00:00Z","timestamp":1474070400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2016,9,17]]},"abstract":"<jats:p>Heterogeneous processors (e.g., ARM\u2019s big.LITTLE) improve performance in power-constrained environments by executing applications on the \u2018little\u2019 low-power core and move them to the \u2018big\u2019 high-performance core when there is available power budget. The total time spent on the big core depends on the rate at which the application dissipates the available power budget. When applications with different big-core power consumption characteristics concurrently execute on a heterogeneous processor, it is best to give a larger share of the power budget to applications that can run longer on the big core, and a smaller share to applications that run for a very short duration on the big core.<\/jats:p>\n          <jats:p>This article investigates mechanisms to manage the available power budget on power-constrained heterogeneous processors. We show that existing proposals that schedule applications onto a big core based on various performance metrics are not high performing, as these strategies do not optimize over an entire power period and are unaware of the applications\u2019 power\/performance characteristics. 
We use linear programming to design the DPDP power management technique, which guarantees optimal performance on heterogeneous processors. We mathematically derive a metric (Delta Performance by Delta Power) that takes into account the power\/performance characteristics of each running application and allows our power-management technique to decide how best to distribute the available power budget among the co-running applications at minimal overhead. Our evaluations with a 4-core heterogeneous processor consisting of big.LITTLE pairs show that DPDP improves performance by 16% on average and up to 40% compared to a strategy that globally and greedily optimizes the power budget. We also show that DPDP outperforms existing heterogeneous scheduling policies that use performance metrics to decide how best to schedule applications on the big core.<\/jats:p>","DOI":"10.1145\/2976739","type":"journal-article","created":{"date-parts":[[2016,9,19]],"date-time":"2016-09-19T20:11:45Z","timestamp":1474315905000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Maximizing Heterogeneous Processor Performance Under Power Constraints"],"prefix":"10.1145","volume":"13","author":[{"given":"Almutaz","family":"Adileh","sequence":"first","affiliation":[{"name":"Ghent University, Zwijnaarde, Belgium"}]},{"given":"Stijn","family":"Eyerman","sequence":"additional","affiliation":[{"name":"Intel Belgium, Kontich, Belgium"}]},{"given":"Aamer","family":"Jaleel","sequence":"additional","affiliation":[{"name":"Nvidia Research"}]},{"given":"Lieven","family":"Eeckhout","sequence":"additional","affiliation":[{"name":"Ghent University, Zwijnaarde, 
Belgium"}]}],"member":"320","published-online":{"date-parts":[[2016,9,17]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_2_1_1_1","DOI":"10.1145\/1128022.1128029"},{"doi-asserted-by":"publisher","key":"e_1_2_1_2_1","DOI":"10.5555\/580550.876439"},{"doi-asserted-by":"publisher","key":"e_1_2_1_3_1","DOI":"10.1145\/2629677"},{"doi-asserted-by":"publisher","key":"e_1_2_1_4_1","DOI":"10.1145\/1629911.1630149"},{"doi-asserted-by":"publisher","key":"e_1_2_1_5_1","DOI":"10.1109\/HPCA.2012.6169046"},{"doi-asserted-by":"publisher","key":"e_1_2_1_6_1","DOI":"10.1145\/2155620.2155641"},{"doi-asserted-by":"publisher","key":"e_1_2_1_7_1","DOI":"10.1109\/ISCA.2006.39"},{"doi-asserted-by":"publisher","key":"e_1_2_1_8_1","DOI":"10.1145\/2000064.2000108"},{"doi-asserted-by":"publisher","key":"e_1_2_1_9_1","DOI":"10.1109\/MM.2008.44"},{"doi-asserted-by":"publisher","key":"e_1_2_1_10_1","DOI":"10.1145\/2872362.2872383"},{"doi-asserted-by":"publisher","key":"e_1_2_1_11_1","DOI":"10.1145\/1062261.1062295"},{"unstructured":"Peter Greenhalgh. 2011. big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7. ARM White paper.","key":"e_1_2_1_12_1"},{"key":"e_1_2_1_13_1","first-page":"3","article-title":"Energy-efficient computing: Power management system on the Nehalem family of processors","volume":"14","author":"Gunther Steve","year":"2010","unstructured":"Steve Gunther, Anant Deval, Ted Burton, and Rajesh Kumar. 2010. Energy-efficient computing: Power management system on the Nehalem family of processors. Intel Technology Journal 14, 3.","journal-title":"Intel Technology Journal"},{"doi-asserted-by":"publisher","key":"e_1_2_1_14_1","DOI":"10.1109\/MM.2011.77"},{"unstructured":"Scott Huck. 2011. 
Measuring processor power. Intel white paper.","key":"e_1_2_1_15_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_16_1","DOI":"10.1109\/MICRO.2006.8"},{"unstructured":"Brian Jeff. 2013. big.LITTLE Technology moves towards fully heterogeneous global task scheduling. ARM White paper.","key":"e_1_2_1_17_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_18_1","DOI":"10.1145\/1755913.1755928"},{"volume-title":"36th International Symposium on Microarchitecture (MICRO\u201903)","author":"Kumar Rakesh","unstructured":"Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, Parthasarathy Ranganathan, and Dean M. Tullsen. 2003. Single-ISA heterogeneous multi-core architectures: The potential for processor power reduction. In 36th International Symposium on Microarchitecture (MICRO\u201903). 81--92.","key":"e_1_2_1_19_1"},{"key":"e_1_2_1_20_1","volume-title":"Technology Insight: Intel Silvermont Microarchitecture. Intel Developer Forum.","author":"Kuttana Belli","year":"2013","unstructured":"Belli Kuttana. 2013. Technology Insight: Intel Silvermont Microarchitecture. 
Intel Developer Forum."},{"doi-asserted-by":"publisher","key":"e_1_2_1_21_1","DOI":"10.1145\/1654059.1654085"},{"doi-asserted-by":"publisher","key":"e_1_2_1_22_1","DOI":"10.1007\/s10586-007-0045-4"},{"doi-asserted-by":"publisher","key":"e_1_2_1_23_1","DOI":"10.1145\/1669112.1669172"},{"doi-asserted-by":"publisher","key":"e_1_2_1_24_1","DOI":"10.1145\/2628071.2628078"},{"doi-asserted-by":"publisher","key":"e_1_2_1_25_1","DOI":"10.1109\/MICRO.2012.37"},{"doi-asserted-by":"publisher","key":"e_1_2_1_26_1","DOI":"10.1145\/2000064.2000117"},{"doi-asserted-by":"publisher","key":"e_1_2_1_27_1","DOI":"10.1145\/2541940.2541974"},{"unstructured":"NVIDIA. 2011. Variable SMP -- A multi-core CPU architecture for low power and high performance. White paper.","key":"e_1_2_1_28_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_29_1","DOI":"10.1109\/MICRO.2004.28"},{"doi-asserted-by":"publisher","key":"e_1_2_1_30_1","DOI":"10.1145\/2485922.2485947"},{"doi-asserted-by":"publisher","key":"e_1_2_1_31_1","DOI":"10.1109\/MM.2013.76"},{"doi-asserted-by":"publisher","key":"e_1_2_1_32_1","DOI":"10.1145\/2451116.2451135"},{"doi-asserted-by":"publisher","key":"e_1_2_1_33_1","DOI":"10.1109\/HPCA.2012.6169031"},{"doi-asserted-by":"publisher","key":"e_1_2_1_34_1","DOI":"10.1109\/MM.2012.12"},{"volume-title":"Samsung Primes Exynos 5 Octa for ARM big.LITTLE Technology with Heterogeneous Multi-Processing Capability","author":"Electronics Samsung","unstructured":"Samsung Electronics. 2013. Samsung Primes Exynos 5 Octa for ARM big.LITTLE Technology with Heterogeneous Multi-Processing Capability. 
Press release.","key":"e_1_2_1_35_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_36_1","DOI":"10.1145\/1531793.1531804"},{"doi-asserted-by":"publisher","key":"e_1_2_1_37_1","DOI":"10.1145\/2228360.2228567"},{"doi-asserted-by":"publisher","key":"e_1_2_1_38_1","DOI":"10.1109\/MM.2013.90"},{"key":"e_1_2_1_39_1","volume-title":"22nd International Conference on Parallel Architectures and Compilation Techniques (PACT\u201913)","author":"Craeynest Kenzo Van","year":"2013","unstructured":"Kenzo Van Craeynest, Shoaib Akram, Wim Heirman, Aamer Jaleel, and Lieven Eeckhout. 2013. Fairness-aware scheduling on single-ISA heterogeneous multi-cores. In 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT\u201913). 
177--187."},{"doi-asserted-by":"publisher","key":"e_1_2_1_40_1","DOI":"10.5555\/2337159.2337184"},{"doi-asserted-by":"publisher","key":"e_1_2_1_41_1","DOI":"10.1145\/1555754.1555794"},{"doi-asserted-by":"publisher","key":"e_1_2_1_42_1","DOI":"10.1145\/1854273.1854283"},{"doi-asserted-by":"publisher","key":"e_1_2_1_43_1","DOI":"10.1109\/HPCA.2015.7056028"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2976739","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2976739","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:56:16Z","timestamp":1750222576000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2976739"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,9,17]]},"references-count":43,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,9,17]]}},"alternative-id":["10.1145\/2976739"],"URL":"https:\/\/doi.org\/10.1145\/2976739","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2016,9,17]]},"assertion":[{"value":"2016-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-09-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}