{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T22:54:14Z","timestamp":1777676054946,"version":"3.51.4"},"reference-count":24,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2023,10,21]],"date-time":"2023-10-21T00:00:00Z","timestamp":1697846400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"name":"DEEP Projects, at the European Commission\u2019s FP7","award":["H2020"],"award-info":[{"award-number":["H2020"]}]},{"name":"Spanish State Research Agency - Ministry of Science and Innovation.","award":["PCI2021-121958"],"award-info":[{"award-number":["PCI2021-121958"]}]},{"DOI":"10.13039\/501100004837","name":"Spanish Ministry of Science and Innovation","doi-asserted-by":"crossref","award":["PID2019-107255GB"],"award-info":[{"award-number":["PID2019-107255GB"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"crossref"}]},{"name":"EuroHPC Programmes, under Grant Agreements","award":["287530, 610476, 754304, and 955606"],"award-info":[{"award-number":["287530, 610476, 754304, and 955606"]}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:p>This paper presents the evolution of the free agent threads for OpenMP to the new role-shifting threads model and their integration with the Dynamic Load Balancing (DLB) library. We demonstrate how free agent threads can improve resource utilization in OpenMP applications with load imbalance in their nested parallel regions. We also demonstrate how DLB efficiently manages the malleability exposed by the role-shifting threads to address load imbalance issues. We use three real-world scientific applications, one of them to demonstrate that free agents alone can improve the OpenMP model without external tools, and two other MPI+OpenMP applications, one of them with a coupling case, to illustrate the potential of the free agent threads\u2019 malleability with an external resource manager to increase the efficiency of the system. In addition, we demonstrate that the new implementation is more usable than the former one, letting the runtime system automatically make decisions that were made by the programmer previously. All software is released open-source.<\/jats:p>","DOI":"10.1177\/10943420231201153","type":"journal-article","created":{"date-parts":[[2023,10,21]],"date-time":"2023-10-21T06:48:32Z","timestamp":1697870912000},"page":"94-107","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":3,"title":["Role-shifting threads: Increasing OpenMP malleability to address load imbalance at MPI and OpenMP"],"prefix":"10.1177","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6482-0214","authenticated-orcid":false,"given":"Joel","family":"Criado","sequence":"first","affiliation":[{"name":"Department of Computer Science, Barcelona Supercomputing Center, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3113-9166","authenticated-orcid":false,"given":"Victor","family":"Lopez","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Barcelona Supercomputing Center, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4711-6815","authenticated-orcid":false,"given":"Joan","family":"Vinyals-Ylla-Catala","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Barcelona Supercomputing Center, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2741-3705","authenticated-orcid":false,"given":"Guillem","family":"Ramirez-Miranda","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Barcelona Supercomputing Center, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5181-7545","authenticated-orcid":false,"given":"Xavier","family":"Teruel","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Barcelona Supercomputing Center, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3682-9905","authenticated-orcid":false,"given":"Marta","family":"Garcia-Gasulla","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Barcelona Supercomputing Center, Barcelona, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2023,10,21]]},"reference":[{"key":"bibr1-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2009.02.016"},{"key":"bibr2-10943420231201153","unstructured":"Barcelona Supercomputing Center (2009) DLB repository. https:\/\/github.com\/bsc-pm\/dlb\/commit\/c4642f8 (Accessed 29 06 2022)."},{"key":"bibr3-10943420231201153","unstructured":"Barcelona Supercomputing Center (2011) OmpSs specification. URL https:\/\/pm.bsc.es\/ompss (Accessed 03 01 2022)."},{"key":"bibr4-10943420231201153","unstructured":"Barcelona Supercomputing Center (2016) LLVM repository. URL https:\/\/github.com\/bsc-pm\/llvm\/commit\/21f396fde4a9 (Accessed 29 06.2022)."},{"key":"bibr5-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1007\/s10494-013-9470-z"},{"key":"bibr6-10943420231201153","unstructured":"Cirrottola L, Froehly A (2019) Parallel unstructured mesh adaptation using iterative remeshing and repartitioning. Research Report RR-9307, INRIA Bordeaux, \u00e9quipe CARDAMOM. https:\/\/hal.inria.fr\/hal-02386837"},{"key":"bibr7-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1177\/1094342017701278"},{"key":"bibr8-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-28596-8_20"},{"key":"bibr11-10943420231201153","doi-asserted-by":"crossref","unstructured":"D\u2019Amico M, Garcia-Gasulla M, L\u00f3pez V, et al. (2018) DROM: enabling efficient and effortless malleability for resource managers. In: Proceedings of the 47th International Conference on Parallel Processing Companion. ACM: 41.","DOI":"10.1145\/3229710.3229752"},{"key":"bibr9-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1007\/s10586-007-0032-9"},{"key":"bibr10-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1142\/S0129626411000151"},{"key":"bibr12-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1109\/CCGRID.2007.45"},{"key":"bibr13-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2018.00075"},{"key":"bibr14-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2014.05.004"},{"key":"bibr15-10943420231201153","unstructured":"Intel Corporation (2009) Intel Cilk++ SDK programmer\u2019s guide. https:\/\/www.clear.rice.edu\/comp422\/resources\/Intel_Cilk++_Programmers_Guide.pdf"},{"key":"bibr16-10943420231201153","unstructured":"Intel Corporation (2011) Intel threading building Blocks. https:\/\/www.inf.ed.ac.uk\/teaching\/courses\/ppls\/TBBtutorial.pdf"},{"key":"bibr17-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1109\/CCGRID.2007.96"},{"key":"bibr18-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-85262-7_15"},{"key":"bibr19-10943420231201153","unstructured":"Massachusetts Institute of Technology (2021) OpenCilk Language Extension Specification Version 1.0. https:\/\/cilk.mit.edu\/docs\/OpenCilkLanguageExtensionSpecification.htm."},{"key":"bibr20-10943420231201153","unstructured":"OpenMP ARB (2008) OpenMP application programming interface, https:\/\/www.openmp.org\/wp-content\/uploads\/spec30.pdf (Accessed 29 11 2022)."},{"key":"bibr21-10943420231201153","doi-asserted-by":"crossref","unstructured":"Prabhakaran S, Neumann M, Rinke S, et al. (2015) A batch system with efficient adaptive scheduling for malleable and evolving applications. In: 2015 IEEE International Parallel and Distributed Processing Symposium. Hyderabad, India. 25-29 May 2015: 429\u2013438.","DOI":"10.1109\/IPDPS.2015.34"},{"key":"bibr22-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-95953-1_4"},{"key":"bibr23-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2015.12.007"},{"key":"bibr24-10943420231201153","doi-asserted-by":"publisher","DOI":"10.1016\/j.proci.2014.05.052"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420231201153","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/10943420231201153","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/10943420231201153","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:17:32Z","timestamp":1777450652000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/10943420231201153"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,21]]},"references-count":24,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["10.1177\/10943420231201153"],"URL":"https:\/\/doi.org\/10.1177\/10943420231201153","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"value":"1094-3420","type":"print"},{"value":"1741-2846","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,21]]}}}