{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T03:55:10Z","timestamp":1776916510482,"version":"3.51.2"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1-2","license":[{"start":{"date-parts":[[2024,2,26]],"date-time":"2024-02-26T00:00:00Z","timestamp":1708905600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,26]],"date-time":"2024-02-26T00:00:00Z","timestamp":1708905600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Parallel Prog"],"published-print":{"date-parts":[[2024,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>High-performance computing (HPC) processors are nowadays integrated cyber-physical systems demanding complex and high-bandwidth closed-loop power and thermal control strategies. To efficiently satisfy real-time multi-input multi-output (MIMO) optimal power requirements, high-end processors integrate an on-die power controller system (PCS). While traditional PCSs are based on a simple microcontroller (MCU)-class core, more scalable and flexible PCS architectures are required to support advanced MIMO control algorithms for managing the ever-increasing number of cores, power states, and process, voltage, and temperature variability. This paper presents ControlPULP, an open-source, HW\/SW RISC-V parallel PCS platform consisting of a single-core MCU with fast interrupt handling coupled with a scalable multi-core programmable cluster accelerator and a specialized DMA engine for the parallel acceleration of real-time power management policies. ControlPULP relies on FreeRTOS to schedule a reactive power control firmware (PCF) application layer. 
We demonstrate ControlPULP in a power management use-case targeting a next-generation 72-core HPC processor. We first show that the multi-core cluster accelerates the PCF, achieving 4.9x speedup compared to single-core execution, enabling more advanced power management algorithms within the control hyper-period at a shallow area overhead, about 0.1% the area of a modern HPC CPU die. We then assess the PCS and PCF by designing an FPGA-based, closed-loop emulation framework that leverages the heterogeneous SoCs paradigm, achieving DVFS tracking with a mean deviation within 3% the plant\u2019s thermal design power (TDP) against a software-equivalent model-in-the-loop approach. Finally, we show that the proposed PCF compares favorably with an industry-grade control algorithm under computational-intensive workloads.<\/jats:p>","DOI":"10.1007\/s10766-024-00761-4","type":"journal-article","created":{"date-parts":[[2024,2,26]],"date-time":"2024-02-26T06:02:32Z","timestamp":1708927352000},"page":"93-123","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation"],"prefix":"10.1007","volume":"52","author":[{"given":"Alessandro","family":"Ottaviano","sequence":"first","affiliation":[]},{"given":"Robert","family":"Balas","sequence":"additional","affiliation":[]},{"given":"Giovanni","family":"Bambini","sequence":"additional","affiliation":[]},{"given":"Antonio","family":"Del 
Vecchio","sequence":"additional","affiliation":[]},{"given":"Maicol","family":"Ciani","sequence":"additional","affiliation":[]},{"given":"Davide","family":"Rossi","sequence":"additional","affiliation":[]},{"given":"Luca","family":"Benini","sequence":"additional","affiliation":[]},{"given":"Andrea","family":"Bartolini","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,26]]},"reference":[{"issue":"6495","key":"761_CR1","doi-asserted-by":"publisher","first-page":"eaam9744","DOI":"10.1126\/science.aam9744","volume":"368","author":"CE Leiserson","year":"2020","unstructured":"Leiserson, C.E., Thompson, N.C., Emer, J.S., Kuszmaul, B.C., Lampson, B.W., Sanchez, D., Schardl, T.B.: There\u2019s plenty of room at the top: What will drive computer performance after Moore\u2019s law? Science 368(6495), eaam9744 (2020)","journal-title":"Science"},{"key":"761_CR2","doi-asserted-by":"publisher","DOI":"10.1016\/j.conengprac.2022.105099","author":"A Tilli","year":"2022","unstructured":"Tilli, A., Garone, E., Conficoni, C., Cacciari, M., Bosso, A., Bartolini, A.: A two-layer distributed mpc approach to thermal control of multiprocessor systems-on-chip. Control Eng. Pract. (2022). https:\/\/doi.org\/10.1016\/j.conengprac.2022.105099","journal-title":"Control Eng. Pract."},{"key":"761_CR3","unstructured":"Labs, A.: AWS Graviton 2. https:\/\/en.wikichip.org\/wiki\/annapurna_labs\/alpine\/alc12b00 (2020)"},{"key":"761_CR4","unstructured":"Intel: Raptor Lake. https:\/\/en.wikichip.org\/wiki\/intel\/microarchitectures\/raptor_lake (2022)"},{"key":"761_CR5","unstructured":"AMD: EPYC 7004 Genoa. https:\/\/en.wikichip.org\/wiki\/amd\/cores\/genoa (2022)"},{"key":"761_CR6","unstructured":"Group, T.L.: SiPearl Develops ARM HPC Chip. 
https:\/\/www.linleygroup.com\/newsletters\/newsletter_detail.php?num=6227&year=2020&tag=3 (2020)"},{"key":"761_CR7","first-page":"682","volume":"1","author":"D Cesarini","year":"2020","unstructured":"Cesarini, D., Bartolini, A., Bonfa, P., Cavazzoni, C., Benini, L.: COUNTDOWN: a run-time library for performance-neutral energy saving in MPI applications. IEEE Trans. Comput. 1, 682\u2013695 (2020)","journal-title":"IEEE Trans. Comput."},{"issue":"2","key":"761_CR8","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1109\/MM.2012.12","volume":"32","author":"E Rotem","year":"2012","unstructured":"Rotem, E., Naveh, A., Ananthakrishnan, A., Weissmann, E., Rajwan, D.: Power-management architecture of the intel microarchitecture code-named Sandy Bridge. IEEE Micro 32(2), 20\u201327 (2012). https:\/\/doi.org\/10.1109\/MM.2012.12","journal-title":"IEEE Micro"},{"key":"761_CR9","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1109\/TPDS.2012.117","volume":"24","author":"A Bartolini","year":"2013","unstructured":"Bartolini, A., Cacciari, M., Tilli, A., Benini, L.: Thermal and energy management of high-performance multicores: distributed and self-calibrating model-predictive controller. IEEE Trans. Parallel Distrib. Syst. 24, 170\u2013183 (2013). https:\/\/doi.org\/10.1109\/TPDS.2012.117","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"761_CR10","doi-asserted-by":"publisher","unstructured":"Beneventi, F., Bartolini, A., Benini, L.: On-line thermal emulation: how to speed-up your thermal controller design. In: 2013 23rd International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), pp. 99\u2013106 (2013). https:\/\/doi.org\/10.1109\/PATMOS.2013.6662161","DOI":"10.1109\/PATMOS.2013.6662161"},{"key":"761_CR11","unstructured":"Ltd., A.: Power and performance management using arm SCMI specification. Technical report (2019)"},{"key":"761_CR12","unstructured":"LLC, G.: Power management for multiple processor cores (U.S. 
Patent US8402290B2, 2020)"},{"key":"761_CR13","unstructured":"Ripoll, I., Ballester, R.: Period selection for minimal hyper-period in real-time systems (2014)"},{"key":"761_CR14","doi-asserted-by":"publisher","first-page":"943","DOI":"10.1002\/spe.v40:11","volume":"40","author":"Z Liu","year":"2010","unstructured":"Liu, Z., Zhu, H.: A survey of the research on power management techniques for high-performance systems. Softw. Pract. Exp. 40, 943\u2013964 (2010). https:\/\/doi.org\/10.1002\/spe.v40:11","journal-title":"Softw. Pract. Exp."},{"key":"761_CR15","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1007\/978-3-319-67630-2_21","volume-title":"High Performance Computing","author":"T Rosedahl","year":"2017","unstructured":"Rosedahl, T., Broyles, M., Lefurgy, C., Christensen, B., Feng, W.: Power\/performance controlling techniques in OpenPOWER. In: Kunkel, J.M., Yokota, R., Taufer, M., Shalf, J. (eds.) High Performance Computing, pp. 275\u2013289. Springer, Cham (2017)"},{"key":"761_CR16","doi-asserted-by":"publisher","unstructured":"Schlager, M., Obermaisser, R., Elmenreich, W.: A framework for hardware-in-the-loop testing of an integrated architecture, vol. 4761, pp. 159\u2013170 (2007). https:\/\/doi.org\/10.1007\/978-3-540-75664-4_16","DOI":"10.1007\/978-3-540-75664-4_16"},{"key":"761_CR17","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1007\/978-3-031-15074-6_8","volume-title":"Embedded Computer Systems: Architectures, Modeling, and Simulation","author":"A Ottaviano","year":"2022","unstructured":"Ottaviano, A., Balas, R., Bambini, G., Bonfanti, C., Benatti, S., Rossi, D., Benini, L., Bartolini, A.: ControlPULP: a RISC-V power controller for HPC processors with parallel control-law computation acceleration. In: Orailoglu, A., Reichenbach, M., Jung, M. (eds.) Embedded Computer Systems: Architectures, Modeling, and Simulation, pp. 120\u2013135. 
Springer, Cham (2022)"},{"key":"761_CR18","doi-asserted-by":"crossref","unstructured":"Rossi, D., Conti, F., Marongiu, A., Pullini, A., Loi, I., Gautschi, M., Tagliavini, G., Capotondi, A., Flatresse, P., Benini, L.: PULP: a parallel ultra low power platform for next generation IoT applications. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1\u201339 (2015)","DOI":"10.1109\/HOTCHIPS.2015.7477325"},{"key":"761_CR19","unstructured":"RISC-V: \u201cSmclic\u201d core-local interrupt controller (CLIC) RISC-V privileged architecture extension. https:\/\/github.com\/riscv\/riscv-fast-interrupt\/blob\/master\/clic.adoc"},{"key":"761_CR20","doi-asserted-by":"crossref","unstructured":"Bambini, G., Balas, R., Conficoni, C., Tilli, A., Benini, L., Benatti, S., Bartolini, A.: An open-source scalable thermal and power controller for HPC processors. In: 2020 IEEE 38th International Conference on Computer Design (ICCD), pp. 364\u2013367 (2020)","DOI":"10.1109\/ICCD50377.2020.00067"},{"issue":"3","key":"761_CR21","first-page":"50","volume":"14","author":"S Gunther","year":"2010","unstructured":"Gunther, S., Deval, A., Burton, T., Kumar, R.: Energy-efficient computing: power management system on the Nehalem family of processors. Intel Technol. J. 14(3), 50\u201366 (2010)","journal-title":"Intel Technol. J."},{"key":"761_CR22","doi-asserted-by":"publisher","unstructured":"Sch\u00f6ne, R., Ilsche, T., Bielert, M., Gocht, A., Hackenberg, D.: Energy efficiency features of the Intel Skylake-SP processor and their impact on performance. In: 2019 International Conference on High Performance Computing Simulation (HPCS), pp. 399\u2013406 (2019). 
https:\/\/doi.org\/10.1109\/HPCS48598.2019.9188239","DOI":"10.1109\/HPCS48598.2019.9188239"},{"issue":"1","key":"761_CR23","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1109\/JSSC.2018.2873584","volume":"54","author":"T Burd","year":"2019","unstructured":"Burd, T., Beck, N., White, S., Paraschou, M., Kalyanasundharam, N., Donley, G., Smith, A., Hewitt, L., Naffziger, S.: \u201cZeppelin\u201d: an SoC for multichip architectures. IEEE J. Solid-State Circuits 54(1), 133\u2013143 (2019). https:\/\/doi.org\/10.1109\/JSSC.2018.2873584","journal-title":"IEEE J. Solid-State Circuits"},{"key":"761_CR24","unstructured":"ARM Ltd.: Arm system control and management interface V3.0. ARM Ltd. https:\/\/developer.arm.com\/documentation\/den0056\/latest"},{"key":"761_CR25","doi-asserted-by":"publisher","unstructured":"Atienza, D., Del\u00a0Valle, P.G., Paci, G., Poletti, F., Benini, L., De\u00a0Micheli, G., Mendias, J.M.: A fast HW\/SW FPGA-based thermal emulation framework for multi-processor system-on-chip. In: 2006 43rd ACM\/IEEE Design Automation Conference, pp. 618\u2013623 (2006). https:\/\/doi.org\/10.1145\/1146909.1147068","DOI":"10.1145\/1146909.1147068"},{"key":"761_CR26","unstructured":"Atienza, D.: Emulation-based transient thermal modeling of 2D\/3D systems-on-chip with active cooling. In: 2009 15th International Workshop on Thermal Investigations of ICs and Systems, pp. 50\u201355 (2009)"},{"key":"761_CR27","doi-asserted-by":"publisher","unstructured":"Brayanov, N., Eichberger, A.: Automation in hardware-in-the-loop units development and integration. In: 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C), pp. 191\u2013197 (2019). 
https:\/\/doi.org\/10.1109\/QRS-C.2019.00047","DOI":"10.1109\/QRS-C.2019.00047"},{"key":"761_CR28","doi-asserted-by":"publisher","unstructured":"Tan, Z., Waterman, A., Cook, H., Bird, S., Asanovi\u0107, K., Patterson, D.: A case for fame: FPGA architecture model execution. In: Proceedings of the 37th Annual International Symposium on Computer Architecture. ISCA \u201910, pp. 290\u2013301. Association for Computing Machinery, New York, NY, USA (2010). https:\/\/doi.org\/10.1145\/1815961.1815999","DOI":"10.1145\/1815961.1815999"},{"key":"761_CR29","unstructured":"RISC-V: the RISC-V instruction set manual volume II: privileged architecture. RISC-V. https:\/\/riscv.org\/technical\/specifications\/"},{"issue":"5","key":"761_CR30","doi-asserted-by":"publisher","first-page":"1038","DOI":"10.1109\/TPDS.2021.3101764","volume":"33","author":"F Montagna","year":"2022","unstructured":"Montagna, F., Mach, S., Benatti, S., Garofalo, A., Ottavi, G., Benini, L., Rossi, D., Tagliavini, G.: A low-power transprecision floating-point cluster for efficient near-sensor data analytics. IEEE Trans. Parallel Distrib. Syst. 33(5), 1038\u20131053 (2022). https:\/\/doi.org\/10.1109\/TPDS.2021.3101764","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"761_CR31","doi-asserted-by":"crossref","unstructured":"Rossi, D., Loi, I., Haugou, G., Benini, L.: Ultra-low-latency lightweight DMA for tightly coupled multi-core clusters. In: Proceedings of the 11th ACM Conference on Computing Frontiers. CF \u201914. Association for Computing Machinery, New York, NY, USA (2014)","DOI":"10.1145\/2597917.2597922"},{"issue":"12","key":"761_CR32","doi-asserted-by":"publisher","first-page":"4368","DOI":"10.1109\/TPDS.2022.3189390","volume":"33","author":"A Kurth","year":"2022","unstructured":"Kurth, A., Forsberg, B., Benini, L.: HEROv2: full-stack open-source research platform for heterogeneous computing. IEEE Trans. Parallel Distrib. Syst. 33(12), 4368\u20134382 (2022). 
https:\/\/doi.org\/10.1109\/TPDS.2022.3189390","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"issue":"5","key":"761_CR33","doi-asserted-by":"publisher","first-page":"1097","DOI":"10.1109\/TC.2012.293","volume":"63","author":"F Beneventi","year":"2014","unstructured":"Beneventi, F., Bartolini, A., Tilli, A., Benini, L.: An effective gray-box identification procedure for multicore thermal modeling. IEEE Trans. Comput. 63(5), 1097\u20131110 (2014)","journal-title":"IEEE Trans. Comput."},{"key":"761_CR34","doi-asserted-by":"publisher","unstructured":"Bambini, G., Conficoni, C., Tilli, A., Benini, L., Bartolini, A.: Modeling the thermal and power control subsystem in HPC processors. In: 2022 IEEE Conference on Control Technology and Applications (CCTA), pp. 397\u2013402 (2022). https:\/\/doi.org\/10.1109\/CCTA49430.2022.9966082","DOI":"10.1109\/CCTA49430.2022.9966082"},{"key":"761_CR35","doi-asserted-by":"publisher","unstructured":"Das, S., Whatmough, P., Bull, D.: Modeling and characterization of the system-level power delivery network for a dual-core ARM Cortex-A57 cluster in 28nm CMOS. In: 2015 IEEE\/ACM International Symposium on Low Power Electronics and Design (ISLPED), pp. 146\u2013151 (2015). https:\/\/doi.org\/10.1109\/ISLPED.2015.7273505","DOI":"10.1109\/ISLPED.2015.7273505"},{"key":"761_CR36","doi-asserted-by":"publisher","unstructured":"Bartolini, A., Ficarelli, F., Parisi, E., Beneventi, F., Barchi, F., Gregori, D., Magugliani, F., Cicala, M., Gianfreda, C., Cesarini, D., Acquaviva, A., Benini, L.: Monte Cimone: paving the road for the first generation of RISC-V high-performance computers. arXiv (2022). https:\/\/doi.org\/10.48550\/ARXIV.2205.03725. https:\/\/arxiv.org\/abs\/2205.03725","DOI":"10.48550\/ARXIV.2205.03725"},{"key":"761_CR37","doi-asserted-by":"publisher","unstructured":"Bienia, C., Kumar, S., Singh, J.P., Li, K.: The PARSEC benchmark suite: characterization and architectural implications. 
In: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques. PACT \u201908, pp. 72\u201381. Association for Computing Machinery, New York, NY, USA (2008). https:\/\/doi.org\/10.1145\/1454115.1454128","DOI":"10.1145\/1454115.1454128"},{"key":"761_CR38","doi-asserted-by":"publisher","unstructured":"M\u00fcller, M., Whitney, B., Henschel, R., Kumaran, K.: In: Padua, D. (ed.) SPEC Benchmarks, pp. 1886\u20131893. Springer, Boston, MA (2011). https:\/\/doi.org\/10.1007\/978-0-387-09766-4_370","DOI":"10.1007\/978-0-387-09766-4_370"},{"key":"761_CR39","volume-title":"Computer Architecture: A Quantitative Approach","author":"JL Hennessy","year":"2012","unstructured":"Hennessy, J.L., Patterson, D.A.: Computer Architecture: A Quantitative Approach, 5th edn. Morgan Kaufmann, Amsterdam (2012)","edition":"5"},{"key":"761_CR40","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2021.3125885","author":"A Borghesi","year":"2021","unstructured":"Borghesi, A., Burrello, A., Bartolini, A.: ExaMon-X: a predictive maintenance framework for automatic monitoring in industrial IoT systems. IEEE Internet Things J. (2021). 
https:\/\/doi.org\/10.1109\/JIOT.2021.3125885","journal-title":"IEEE Internet Things J."}],"container-title":["International Journal of Parallel Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10766-024-00761-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10766-024-00761-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10766-024-00761-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,29]],"date-time":"2024-03-29T09:06:34Z","timestamp":1711703194000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10766-024-00761-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,26]]},"references-count":40,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2024,4]]}},"alternative-id":["761"],"URL":"https:\/\/doi.org\/10.1007\/s10766-024-00761-4","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-2525734\/v1","asserted-by":"object"}]},"ISSN":["0885-7458","1573-7640"],"issn-type":[{"value":"0885-7458","type":"print"},{"value":"1573-7640","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,26]]},"assertion":[{"value":"29 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and\/or discussion reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}