{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:46:13Z","timestamp":1761896773457,"version":"3.37.3"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"10","license":[{"start":{"date-parts":[[2018,5,23]],"date-time":"2018-05-23T00:00:00Z","timestamp":1527033600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2018,10]]},"DOI":"10.1007\/s11227-018-2435-1","type":"journal-article","created":{"date-parts":[[2018,5,23]],"date-time":"2018-05-23T04:16:43Z","timestamp":1527049003000},"page":"5399-5431","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":26,"title":["E-OSched: a load balancing scheduler for heterogeneous multicores"],"prefix":"10.1007","volume":"74","author":[{"given":"Yasir Noman","family":"Khalid","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8342-5757","authenticated-orcid":false,"given":"Muhammad","family":"Aleem","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Radu","family":"Prodan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Muhammad Azhar","family":"Iqbal","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Muhammad Arshad","family":"Islam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2018,5,23]]},"reference":[{"key":"2435_CR1","doi-asserted-by":"publisher","unstructured":"Albayrak OE, Akturk I, Ozturk O (2012) Effective kernel 
mapping for OpenCL applications in heterogeneous platforms. In: Proceedings of International Conference on Parallel Processing Work, pp 81\u201388. \n                    https:\/\/doi.org\/10.1109\/ICPPW.2012.14","DOI":"10.1109\/ICPPW.2012.14"},{"key":"2435_CR2","doi-asserted-by":"crossref","unstructured":"Aleem M, Prodan R, Fahringer T (2011) Scheduling javasymphony applications on many-core parallel computers. In: Euro-Par 2011 Parallel Processing. Springer, pp 167\u2013179","DOI":"10.1007\/978-3-642-23400-2_17"},{"key":"2435_CR3","unstructured":"APP SDK [WWW Document], n.d. \n                    http:\/\/developer.amd.com\/tools-and-sdks\/opencl-zone\/amd-accelerated-parallel-processing-app-sdk\/\n                    \n                  . Accessed 1 May 2017"},{"key":"2435_CR4","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1002\/cpe.1631","volume":"23","author":"C Augonnet","year":"2011","unstructured":"Augonnet C, Thibault S, Namyst R, Wacrenier P-A (2011) StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. Concurr Comput Pract Exp 23:187\u2013198","journal-title":"Concurr Comput Pract Exp"},{"key":"2435_CR5","doi-asserted-by":"publisher","unstructured":"Becchi M, Byna S, Cadambi S, Chakradhar S (2010) Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory. In: Proceedings of 22nd ACM Symposium Parallelism algorithms Architecture, pp 82\u201391. \n                    https:\/\/doi.org\/10.1145\/1810479.1810498","DOI":"10.1145\/1810479.1810498"},{"key":"2435_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2400682.2400716","volume":"9","author":"ME Belviranli","year":"2013","unstructured":"Belviranli ME, Bhuyan LN, Gupta R (2013) A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures. ACM Trans Archit Code Optim 9:1\u201320. 
\n                    https:\/\/doi.org\/10.1145\/2400682.2400716","journal-title":"ACM Trans Archit Code Optim"},{"key":"2435_CR7","doi-asserted-by":"crossref","unstructured":"Binotto APD, Pereira CE, Kuijper A, Stork A, Fellner DW (2011) An effective dynamic scheduling runtime and tuning system for heterogeneous multi and many-core desktop platforms. In: 2011 IEEE 13th International Conference on High Performance Computing and Communications (HPCC). IEEE, pp 78\u201385","DOI":"10.1109\/HPCC.2011.20"},{"key":"2435_CR8","doi-asserted-by":"crossref","unstructured":"Boyer M, Skadron K, Che S, Jayasena N (2013) Load balancing in a changing world: dealing with heterogeneity and performance variability. In: Proceedings of the ACM International Conference on Computing Frontiers. ACM, p 21","DOI":"10.1145\/2482767.2482794"},{"key":"2435_CR9","doi-asserted-by":"crossref","unstructured":"Che S, Boyer M, Meng J, Tarjan D, Sheaffer JW, Lee S-H, Skadron K (2009) Rodinia: a benchmark suite for heterogeneous computing. In: IISWC 2009. IEEE International Symposium on Workload Characterization, 2009. IEEE, pp 44\u201354","DOI":"10.1109\/IISWC.2009.5306797"},{"key":"2435_CR10","unstructured":"Chen Z, Marculescu D (2017) Task scheduling for heterogeneous multicore systems. arXiv Prepr. arXiv1712.03209"},{"key":"2435_CR11","doi-asserted-by":"publisher","first-page":"886","DOI":"10.1007\/s11227-013-0870-6","volume":"65","author":"HJ Choi","year":"2013","unstructured":"Choi HJ, Son DO, Kang SG, Kim JM, Lee H-H, Kim CH (2013) An efficient scheduling scheme using estimated execution time for heterogeneous computing systems. J. Supercomput 65:886\u2013902. \n                    https:\/\/doi.org\/10.1007\/s11227-013-0870-6","journal-title":"J. 
Supercomput"},{"key":"2435_CR12","doi-asserted-by":"publisher","first-page":"1341","DOI":"10.1007\/s11227-017-2177-5","volume":"74","author":"R Dolbeau","year":"2018","unstructured":"Dolbeau R (2018) Theoretical peak FLOPS per instruction set: a tutorial. J Supercomput 74:1341\u20131377. \n                    https:\/\/doi.org\/10.1007\/s11227-017-2177-5","journal-title":"J Supercomput"},{"key":"2435_CR13","doi-asserted-by":"publisher","unstructured":"Ghose A, Dey S, Mitra P, Chaudhuri M (2016) Divergence aware automated partitioning of OpenCL workloads. In: Proceedings of the 9th India Software Engineering Conference. ACM, pp 131\u2013135. \n                    https:\/\/doi.org\/10.1145\/2856636.2856639","DOI":"10.1145\/2856636.2856639"},{"key":"2435_CR14","doi-asserted-by":"crossref","unstructured":"Grauer-Gray S, Xu L, Searles R, Ayalasomayajula S, Cavazos J (2012) Auto-tuning a high-level language targeted to GPU codes. In: Innovative Parallel Computing (InPar). IEEE, pp 1\u201310","DOI":"10.1109\/InPar.2012.6339595"},{"key":"2435_CR15","unstructured":"Gregg C, Boyer M, Hazelwood K, Skadron K (2011) Dynamic heterogeneous scheduling decisions using historical runtime data. In: Proceedings of the 2nd Workshop on Applications for Multi-and Many-Core Processors. San Jose, CA"},{"key":"2435_CR16","unstructured":"Gregg C, Brantley JS, Hazelwood K (2010) Contention-aware scheduling of parallel code for heterogeneous systems. In: 2nd USENIX Workshop on Hot Topics Parallelism"},{"key":"2435_CR17","doi-asserted-by":"crossref","unstructured":"Grewe D, O\u2019Boyle MF (2011) A static task partitioning approach for heterogeneous systems using OpenCL. In: International Conference on Compiler Construction. Springer, pp 286\u2013305","DOI":"10.1007\/978-3-642-19861-8_16"},{"key":"2435_CR18","unstructured":"IMPACT Research Group and others (2007) IMPACT: parboil benchmarks [WWW Document]. 
\n                    http:\/\/impact.crhc.illinois.edu\/parboil\/parboil.aspx\n                    \n                  . Accessed 1 May 2017"},{"key":"2435_CR19","unstructured":"Insieme Compiler Project [WWW Document], n.d. \n                    http:\/\/www.insieme-compiler.org\/\n                    \n                  . Accessed 9 July 2017"},{"key":"2435_CR20","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1007\/978-3-540-92990-1_4","volume-title":"High Performance Embedded Architectures and Compilers","author":"V\u00edctor J. Jim\u00e9nez","year":"2009","unstructured":"Jim\u00e9nez VJ, Vilanova L, Gelado I, Gil M, Fursin G, Navarro N (2009) Predictive runtime code scheduling for heterogeneous architectures. In: International Conference on High-Performance Embedded Architectures and Compilers. Springer Berlin Heidelberg, pp 19\u201333"},{"key":"2435_CR21","doi-asserted-by":"crossref","unstructured":"Kaleem R, Barik R, Shpeisman T, Lewis BT, Hu C, Pingali K (2014) Adaptive heterogeneous scheduling for integrated GPUs. In: Proceedings of the 23rd International Conference on Parallel Architectures and Compilation. ACM, pp 151\u2013162","DOI":"10.1145\/2628071.2628088"},{"key":"2435_CR22","doi-asserted-by":"publisher","unstructured":"Kofler K, Grasso I, Cosenza B, Fahringer T (2013) An automatic input-sensitive approach for heterogeneous task partitioning. In: Proceedings of the 27th International ACM Conference on International Conference on Supercomputing\u2014ICS\u201913. pp 149\u2013160. \n                    https:\/\/doi.org\/10.1145\/2464996.2465007","DOI":"10.1145\/2464996.2465007"},{"key":"2435_CR23","doi-asserted-by":"crossref","unstructured":"Lee J, Samadi M, Mahlke S (2015a) Orchestrating multiple data-parallel kernels on multiple devices. In: 2015 International Conference on Parallel Architecture and Compilation (PACT). 
IEEE, pp 355\u2013366","DOI":"10.1109\/PACT.2015.14"},{"key":"2435_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2798725","volume":"33","author":"J Lee","year":"2015","unstructured":"Lee J, Samadi M, Park Y, Mahlke S (2015) Skmd: single kernel on multiple devices for transparent cpu-gpu collaboration. ACM Trans Comput Syst 33:1\u201327. \n                    https:\/\/doi.org\/10.1145\/2798725","journal-title":"ACM Trans Comput Syst"},{"key":"2435_CR25","unstructured":"Lee J, Samadi M, Park Y, Mahlke S (2013) Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems. In: Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. IEEE Press, pp 245\u2013256"},{"key":"2435_CR26","doi-asserted-by":"crossref","unstructured":"L\u00f6sch A, Beisel T, Kenter T, Plessl C, Platzner M (2016) Performance-centric scheduling with task migration for a heterogeneous compute node in the data center. In: Proceedings of the 2016 Conference on Design, Automation and Test in Europe. EDA Consortium, pp 912\u2013917","DOI":"10.3850\/9783981537079_0987"},{"key":"2435_CR27","unstructured":"Luk C-K, Hong S, Kim H (2009) Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping. In: 2009 42nd Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO). IEEE, pp 45\u201355"},{"key":"2435_CR28","doi-asserted-by":"publisher","unstructured":"Munshi A (2009) The OpenCL specification. In: 2009 IEEE Hot Chips 21 Symposium (HCS). IEEE, pp 1\u2013314. \n                    https:\/\/doi.org\/10.1109\/HOTCHIPS.2009.7478342","DOI":"10.1109\/HOTCHIPS.2009.7478342"},{"key":"2435_CR29","unstructured":"OpenCL\u2014The open standard for parallel programming of heterogeneous systems [WWW Document], n.d. \n                    https:\/\/www.khronos.org\/opencl\/\n                    \n                  . 
Accessed 1 Mar 17"},{"key":"2435_CR30","doi-asserted-by":"publisher","first-page":"879","DOI":"10.1109\/JPROC.2008.917757","volume":"96","author":"JD Owens","year":"2008","unstructured":"Owens JD, Houston M, Luebke D, Green S, Stone JE, Phillips JC (2008) GPU computing. Proc IEEE 96:879\u2013899. \n                    https:\/\/doi.org\/10.1109\/JPROC.2008.917757","journal-title":"Proc IEEE"},{"key":"2435_CR31","doi-asserted-by":"publisher","unstructured":"Pandit P, Govindarajan R (2014) Fluidic kernels: Cooperative execution of opencl programs on multiple heterogeneous devices. In: Proceedings of Annual IEEE\/ACM International Symposium on Code Generation and Optimization. ACM, p 273. \n                    https:\/\/doi.org\/10.1145\/2544137.2544163","DOI":"10.1145\/2544137.2544163"},{"key":"2435_CR32","doi-asserted-by":"publisher","unstructured":"Ravi VT, Agrawal G (2011) A dynamic scheduling framework for emerging heterogeneous systems. In: 18th International Conference on High Performance Computing, HiPC 2011. IEEE, pp 1\u201310. \n                    https:\/\/doi.org\/10.1109\/HiPC.2011.6152724","DOI":"10.1109\/HiPC.2011.6152724"},{"key":"2435_CR33","doi-asserted-by":"publisher","unstructured":"Rohr D, Kalcher S, Bach M, Alaqeeliy AA, Alzaidy HM, Eschweiler D, Lindenstruth V, Alkhereyfy SB, Alharthiy A, Almubaraky A, Alqwaizy I, Suliman RB (2014) An energy-efficient multi-GPU supercomputer. In: 2014 IEEE International Conference on High Performance Computing and Communications, 2014 IEEE 6th International Symposium on Cyberspace Safety and Security, 2014 IEEE 11th International Conference on Embedded Software and Systems (HPCC, CSS, ICESS). IEEE, Paris, pp 42\u201345. \n                    https:\/\/doi.org\/10.1109\/HPCC.2014.14","DOI":"10.1109\/HPCC.2014.14"},{"key":"2435_CR34","unstructured":"Rul S, Vandierendonck H, D\u2019haene J, De Bosschere K (2010) An experimental study on performance portability of OpenCL kernels. 
Papers presented at the 2010 Symposium on Application Accelerators in High Performance Computing (SAAHPC \u201910)"},{"key":"2435_CR35","unstructured":"Samsung Galaxy S8+\u2014Full phone specifications [WWW Document], n.d. \n                    http:\/\/www.gsmarena.com\/samsung_galaxy_s8+-8523.php\n                    \n                  . Accessed 7 Oct 2017"},{"key":"2435_CR36","doi-asserted-by":"crossref","unstructured":"Sun E, Schaa D, Bagley R, Rubin N, Kaeli D (2012) Enabling task-level scheduling on heterogeneous platforms. In: Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units. ACM, pp 84\u201393","DOI":"10.1145\/2159430.2159440"},{"key":"2435_CR37","doi-asserted-by":"publisher","unstructured":"Wang Z, Zheng L, Chen Q, Guo M (2013) CAP: co-scheduling based on asymptotic profiling in CPU\u2009+\u2009GPU hybrid systems. In: Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores\u2014PMAM\u201913. ACM, pp 107\u2013114. \n                    https:\/\/doi.org\/10.1145\/2442992.2443004","DOI":"10.1145\/2442992.2443004"},{"key":"2435_CR38","doi-asserted-by":"publisher","unstructured":"Wen Y, O\u2019Boyle MF (2017) Merge or separate? Multi-job scheduling for OpenCL kernels on CPU\/GPU platforms. In: Proceedings of the General Purpose GPUs. ACM, pp 22\u201331. \n                    https:\/\/doi.org\/10.1145\/3038228.3038235","DOI":"10.1145\/3038228.3038235"},{"key":"2435_CR39","doi-asserted-by":"crossref","unstructured":"Wen Y, Wang Z, O\u2019Boyle MFP (2014) Smart multi-task scheduling for OpenCL programs on CPU\/GPU heterogeneous platforms. In: 2014 21st International Conference on High Performance Computing (HiPC). 
IEEE, pp 1\u201310","DOI":"10.1109\/HiPC.2014.7116910"},{"key":"2435_CR40","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1007\/s11227-014-1112-2","volume":"69","author":"X Yan","year":"2014","unstructured":"Yan X, Shi X, Wang L, Yang H (2014) An OpenCL micro-benchmark suite for GPUs and CPUs. J Supercomput 69:693\u2013713. \n                    https:\/\/doi.org\/10.1007\/s11227-014-1112-2","journal-title":"J Supercomput"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11227-018-2435-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-018-2435-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-018-2435-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,27]],"date-time":"2019-08-27T11:45:20Z","timestamp":1566906320000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11227-018-2435-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,5,23]]},"references-count":40,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2018,10]]}},"alternative-id":["2435"],"URL":"https:\/\/doi.org\/10.1007\/s11227-018-2435-1","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"type":"print","value":"0920-8542"},{"type":"electronic","value":"1573-0484"}],"subject":[],"published":{"date-parts":[[2018,5,23]]},"assertion":[{"value":"23 May 2018","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}