{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T03:25:12Z","timestamp":1780543512357,"version":"3.54.1"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2022,3,7]],"date-time":"2022-03-07T00:00:00Z","timestamp":1646611200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union Horizon 2020 research and innovation programme","award":["780681"],"award-info":[{"award-number":["780681"]}]},{"name":"European High-Performance Computing Joint Undertaking","award":["956702"],"award-info":[{"award-number":["956702"]}]},{"name":"European Union\u2019s Horizon 2020 research and innovation programme and Spain, Sweden, Greece, Italy, France, Germany"},{"name":"Swedish National Infrastructure for Computing"},{"DOI":"10.13039\/501100004359","name":"Swedish Research Council","doi-asserted-by":"crossref","award":["2018-05973"],"award-info":[{"award-number":["2018-05973"]}],"id":[{"id":"10.13039\/501100004359","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2022,6,30]]},"abstract":"<jats:p>Parallel applications often rely on work stealing schedulers in combination with fine-grained tasking to achieve high performance and scalability. However, reducing the total energy consumption in the context of work stealing runtimes is still challenging, particularly when using asymmetric architectures with different types of CPU cores. A common approach for energy savings involves dynamic voltage and frequency scaling (DVFS) wherein throttling is carried out based on factors like task parallelism, stealing relations, and task criticality. 
This article makes the following observations: (i) leveraging DVFS on a per-task basis is impractical when using fine-grained tasking and in environments with cluster\/chip-level DVFS; (ii) task moldability, wherein a single task can execute on multiple threads\/cores via work-sharing, can help to reduce energy consumption; and (iii) mismatch between tasks and assigned resources (i.e.,\u00a0core type and number of cores) can detrimentally impact energy consumption. In this article, we propose EneRgy Aware SchedulEr (ERASE), an intra-application task scheduler on top of work stealing runtimes that aims to reduce the total energy consumption of parallel applications. It achieves energy savings by guiding scheduling decisions based on per-task energy consumption predictions of different resource configurations. In addition, ERASE is capable of adapting to both given static frequency settings and externally controlled DVFS. Overall, ERASE achieves up to 31% energy savings and improves performance by 44% on average, compared to the state-of-the-art DVFS-based schedulers.<\/jats:p>","DOI":"10.1145\/3510422","type":"journal-article","created":{"date-parts":[[2022,3,8]],"date-time":"2022-03-08T07:19:56Z","timestamp":1646723996000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3409-8651","authenticated-orcid":false,"given":"Jing","family":"Chen","sequence":"first","affiliation":[{"name":"Chalmers University of Technology, Gothenburg, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Madhavan","family":"Manivannan","sequence":"additional","affiliation":[{"name":"Chalmers University of Technology, Gothenburg, 
Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mustafa","family":"Abduljabbar","sequence":"additional","affiliation":[{"name":"Chalmers University of Technology, Gothenburg, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Miquel","family":"Peric\u00e0s","sequence":"additional","affiliation":[{"name":"Chalmers University of Technology, Gothenburg, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2022,3,7]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"2014. NVIDIA Jetson. Retrieved from https:\/\/developer.nvidia.com\/EMBEDDED\/Jetson-modules."},{"key":"e_1_3_2_3_2","unstructured":"2014. ODROID XU3. Retrieved from https:\/\/www.hardkernel.com\/shop\/odroid-xu3\/."},{"key":"e_1_3_2_4_2","unstructured":"2018. Cilk scheduler. Retrieved from https:\/\/github.com\/OpenCilk\/cilkrts\/tree\/main\/runtime."},{"key":"e_1_3_2_5_2","unstructured":"2020. Biomarker Discovery. Retrieved from https:\/\/legato-project.eu\/use-cases\/healthcare."},{"key":"e_1_3_2_6_2","unstructured":"2021. XiTAO. Retrieved February 2 2021 from https:\/\/github.com\/CHART-Team\/xitao."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-009-0119-6"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/324133.324234"},{"key":"e_1_3_2_9_2","unstructured":"Dominik Brodowski. [n.d.]. CPU Frequency and Voltage Scaling Code in the Linux(TM) Kernel. 
Retrieved from https:\/\/www.kernel.org\/doc\/Documentation\/cpu-freq\/governors.txt."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2005.845533"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2016.49"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3409390.3409408"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2012.32"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-013-0870-6"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751235"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2008.4636091"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCS.2017.67"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2017.38"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2009.64"},{"key":"e_1_3_2_20_2","volume-title":"High Performance Computing","author":"Eastep Jonathan","year":"2017","unstructured":"Jonathan Eastep, Steve Sylvester, Christopher Cantalupo, Brad Geltz, Federico Ardanaz, Asma Al-Rawi, Kelly Livingston, Fuat Keceli, Matthias Maiterth, and Siddhartha Jana. 2017. Global extensible open power manager: A vehicle for HPC community collaboration on co-designed energy management solutions. In High Performance Computing."},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/277650.277725"},{"key":"e_1_3_2_22_2","unstructured":"Houzeaux Guillaume and Vazquez Mariano. [n.d.]. Alya Application. Retrieved from https:\/\/www.bsc.es\/research-development\/research-areas\/engineering-simulations\/alya-high-performance-computational."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/1283780.1283790"},{"key":"e_1_3_2_24_2","unstructured":"Brian Jeff. 2012. Advances in big.LITTLE technology for power and energy savings improving energy efficiency in high-performance mobile platforms. 
White Paper released by ARM."},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-92990-1_4"},{"key":"e_1_3_2_26_2","unstructured":"kangalow. 2017. NVPModel\u2014NVIDIA Jetson TX2 Development Kit. Retrieved March 25 2017 from https:\/\/www.jetsonhacks.com\/2017\/03\/25\/nvpmodel-nvidia-jetson-tx2-development-kit\/."},{"key":"e_1_3_2_27_2","unstructured":"The kernel development community. [n.d.]. Energy Aware Scheduling. Retrieved from https:\/\/www.kernel.org\/doc\/html\/latest\/scheduler\/sched-energy.html."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3177754"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2011.5763052"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/1250662.1250683"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/PADSW.2018.8644537"},{"key":"e_1_3_2_32_2","unstructured":"Linux Programmer\u2019s Manual. [n.d.]. perf_event_open()\u2014Set up Performance Monitoring. Retrieved from https:\/\/man7.org\/linux\/man-pages\/man2\/perf_event_open.2.html."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/2687653"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISPASS.2014.6844463"},{"key":"e_1_3_2_35_2","unstructured":"OpenMP Architecture Review Board. 2018. OpenMP Application Program Interface. 
Version 5.0."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2012.2235126"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3185458"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/1995896.1995933"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/2925426.2926260"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541971"},{"key":"e_1_3_2_41_2","volume-title":"Lakefield: Hybrid Cores in 3D Package","author":"Khushu Wilfred Gomes Sanjeev","year":"2019","unstructured":"Wilfred Gomes Sanjeev Khushu. August 2019. Lakefield: Hybrid Cores in 3D Package. Retrieved from https:\/\/newsroom.intel.com\/wp-content\/uploads\/sites\/11\/2019\/08\/Intel-Lakefield-HotChips-presentation.pdf."},{"key":"e_1_3_2_42_2","doi-asserted-by":"crossref","unstructured":"Robert Sch\u00f6ne Thomas Ilsche Mario Bielert Markus Velten Markus Schmidl and Daniel Hackenberg. 2021. Energy efficiency aspects of the AMD Zen 2 architecture. arXiv:2108.00808. Retrieved from https:\/\/arxiv.org\/abs\/2108.00808.","DOI":"10.1109\/Cluster48925.2021.00087"},{"key":"e_1_3_2_43_2","volume-title":"iPhone XS A12 Bionic Chip Is Industry-first 7nm CPU","author":"Shankland Stephen","year":"2018","unstructured":"Stephen Shankland. September 2018. iPhone XS A12 Bionic Chip Is Industry-first 7nm CPU. Retrieved from https:\/\/www.cnet.com\/news\/iphone-xs-a12-bionic-chip-is-industry-first-7nm-cpu\/."},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISSCC.2017.7870256"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2016.14"},{"key":"e_1_3_2_46_2","unstructured":"Link\u00f6ping University. [n.d.]. Tetralith. 
Retrieved from https:\/\/www.nsc.liu.se\/systems\/tetralith\/."},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2008.4536359"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/2742854.2742870"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2015.56"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3510422","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3510422","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:45Z","timestamp":1750183785000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3510422"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,7]]},"references-count":48,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,6,30]]}},"alternative-id":["10.1145\/3510422"],"URL":"https:\/\/doi.org\/10.1145\/3510422","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,7]]},"assertion":[{"value":"2021-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-03-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}