{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,18]],"date-time":"2026-01-18T01:55:14Z","timestamp":1768701314573,"version":"3.49.0"},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"5s","license":[{"start":{"date-parts":[[2017,9,27]],"date-time":"2017-09-27T00:00:00Z","timestamp":1506470400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/K034448\/1, EP\/L000563\/1"],"award-info":[{"award-number":["EP\/K034448\/1, EP\/L000563\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2017,10,31]]},"abstract":"<jats:p>Heterogeneous Multi-Processor Systems-on-Chips (MPSoCs) containing CPU and GPU cores are typically required to execute applications concurrently. However, as will be shown in this paper, existing approaches are not well suited for concurrent applications, as they either consider only a single application or do not exploit both CPU and GPU cores at the same time. In this paper, we propose an energy-efficient run-time mapping and thread partitioning approach for executing concurrent OpenCL applications on both CPU and GPU cores while satisfying performance requirements. Depending upon the performance requirements, for each concurrently executing application, the mapping process finds the appropriate number of CPU cores and operating frequencies of CPU and GPU cores, and the partitioning process identifies an efficient partitioning of the applications\u2019 threads between CPU and GPU cores. 
We validate the proposed approach experimentally on the Odroid-XU3 hardware platform with various mixes of applications from the Polybench benchmark suite. Additionally, a case-study is performed with a real-world application SLAMBench. Results show an average energy saving of 32% compared to existing approaches while still satisfying the performance requirements.<\/jats:p>","DOI":"10.1145\/3126548","type":"journal-article","created":{"date-parts":[[2017,9,27]],"date-time":"2017-09-27T12:33:53Z","timestamp":1506515633000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":36,"title":["Energy-Efficient Run-Time Mapping and Thread Partitioning of Concurrent OpenCL Applications on CPU-GPU MPSoCs"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2056-0569","authenticated-orcid":false,"given":"Amit Kumar","family":"Singh","sequence":"first","affiliation":[{"name":"University of Southampton, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alok","family":"Prakash","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Karunakar Reddy","family":"Basireddy","sequence":"additional","affiliation":[{"name":"University of Southampton, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Geoff V.","family":"Merrett","sequence":"additional","affiliation":[{"name":"University of Southampton, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bashir M.","family":"Al-Hashimi","sequence":"additional","affiliation":[{"name":"University of Southampton, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,9,27]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2013. ARM Mali T628. http:\/\/www.arm.com\/. (2013)."},
{"key":"e_1_2_1_2_1","unstructured":"2014. ARM big.LITTLE Technology. http:\/\/www.arm.com\/. (2014)."},{"key":"e_1_2_1_3_1","unstructured":"2015. Qualcomm Adreno 530 and 540. https:\/\/www.qualcomm.com\/. (2015)."},{"key":"e_1_2_1_4_1","unstructured":"2016. ARM Mali 71. http:\/\/www.arm.com\/. (2016)."},{"key":"e_1_2_1_5_1","unstructured":"2016. Exynos 5 Octa (5422). www.samsung.com\/exynos\/. (2016)."},{"key":"e_1_2_1_6_1","unstructured":"2016. Odroid-XU3. http:\/\/www.hardkernel.com\/main\/products\/prdt_info.php?g_code=g140448267127. (2016)."},{"key":"e_1_2_1_7_1","unstructured":"2016. The open standard for parallel programming of heterogeneous systems. https:\/\/goo.gl\/A9wXRJ. (2016)."},{"key":"e_1_2_1_8_1","unstructured":"2017. FreeOCL: Multi-platform implementation of OpenCL 1.2 targeting CPUs. (2017). https:\/\/github.com\/zuzuf\/freeocl"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2934583.2934612"},{"key":"e_1_2_1_10_1","volume-title":"Conference on Design, Automation and Test in Europe (DATE)","author":"Basireddy Karunakar Reddy"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2666357.2597822"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"E. Del Sozzo G. C. Durelli E. M. G. Trainiti A. Miele M. D. Santambrogio and C. Bolchini. 2016. 
Workload-aware power optimization strategy for asymmetric multiprocessors. In 2016 Design Automation & Test in Europe Conference & Exhibition (DATE). IEEE 531--534.","DOI":"10.3850\/9783981537079_0253"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2968456.2968459"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"L. Bagn\u00e8res et al. Switchable scheduling for runtime adaptation of optimization. In Euro-Par\u201914. 222--233.","DOI":"10.1007\/978-3-319-09873-9_19"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.24"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/InPar.2012.6339595"},{"key":"e_1_2_1_17_1","volume-title":"little processing with arm cortex-a15 & cortex-a7. ARM White paper","author":"Greenhalgh Peter","year":"2011"},{"key":"e_1_2_1_18_1","volume-title":"O\u2019Boyle","author":"Grewe Dominik","year":"2011"},{"key":"e_1_2_1_19_1","volume-title":"O\u2019Boyle","author":"Grewe Dominik","year":"2013"},{"key":"e_1_2_1_20_1","unstructured":"Timo H\u00f6nig Heiko Janker Christopher Eibel Oliver Mihelic R\u00fcdiger Kapitza and Wolfgang Schr\u00f6der-Preikschat. 2014. Proactive Energy-Aware Programming with PEEK. In TRIOS."},
{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2568058.2568064"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-014-1338-z"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CPSNA.2015.23"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669121"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2015.2419655"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2015.7140009"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2544137.2544163"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1155\/2014\/210762"},{"key":"e_1_2_1_29_1","volume-title":"Automation & Test in Europe Conference & Exhibition (DATE)","author":"Pourmohseni Behnaz"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2015.7357105"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3057267"},{"key":"e_1_2_1_32_1","volume-title":"Bashir M. Al-Hashimi, and Geoff V. Merrett.","author":"Singh Amit Kumar","year":"2017"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488734"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366231.2337184"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370873"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2628071.2628096"},{"key":"e_1_2_1_37_1","volume-title":"High Performance Computing (HiPC), 2014 21st International Conference on. 
IEEE, 1--10","author":"Wen Yuan"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2688500.2688505"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3126548","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3126548","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:05:02Z","timestamp":1750273502000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3126548"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,9,27]]},"references-count":38,"journal-issue":{"issue":"5s","published-print":{"date-parts":[[2017,10,31]]}},"alternative-id":["10.1145\/3126548"],"URL":"https:\/\/doi.org\/10.1145\/3126548","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,9,27]]},"assertion":[{"value":"2017-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-09-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}