{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:16:19Z","timestamp":1763468179325,"version":"3.41.0"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2014,2,1]],"date-time":"2014-02-01T00:00:00Z","timestamp":1391212800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Shanghai Excellent Academic Leaders Plan"},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["60725208, 61003012"],"award-info":[{"award-number":["60725208, 61003012"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Program for Changjiang Scholars and Innovative Research Team in University (IRT1158, PCSIRT) China"},{"DOI":"10.13039\/501100002855","name":"Ministry of Science and Technology of the People's Republic of China","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002855","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2014,2]]},"abstract":"<jats:p>Single-ISA Asymmetric Multicore (AMC) architectures have shown high performance as well as power efficiency. However, current parallel programming environments do not perform well on AMC because they are designed for symmetric multicore architectures in which all cores provide equal performance. Their random task scheduling policies can result in unbalanced workloads in AMC and severely degrade the performance of parallel applications. 
To balance the workloads of parallel applications in AMC, this article proposes an adaptive Workload-Aware Task Scheduler (WATS) that consists of a history-based task allocator and a preference-based task scheduler. The history-based task allocator is based on a near-optimal, static task allocation using the historical statistics collected during the execution of a parallel application. The preference-based task scheduler, which schedules tasks based on a preference list, can dynamically adjust the workloads in AMC if the task allocation is less optimal due to approximation in the history-based task allocator. Experimental results show that WATS can improve both the performance and energy efficiency of task-based applications, with the performance gain up to 66.1% compared with traditional task schedulers.<\/jats:p>","DOI":"10.1145\/2579674","type":"journal-article","created":{"date-parts":[[2014,3,18]],"date-time":"2014-03-18T12:09:07Z","timestamp":1395144547000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":23,"title":["Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures"],"prefix":"10.1145","volume":"11","author":[{"given":"Quan","family":"Chen","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Minyi","family":"Guo","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2014,2]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2008.105"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2005.51"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/341800.341803"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1810085.1810113"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1996.0107"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2012.32"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-011-0682-5"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2304576.2304599"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2012.322"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2011.32"},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium. IEEE.","author":"De Vuyst M.","key":"e_1_2_1_12_1","unstructured":"M. De Vuyst, R. Kumar, and D. M. Tullsen. 2006. Exploiting unbalanced thread scheduling for energy and performance on a CMP of SMT processors. In Proceedings of the International Parallel and Distributed Processing Symposium. IEEE."},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium. IEEE.","author":"El-Moursy A.","key":"e_1_2_1_13_1","unstructured":"A. El-Moursy, R. Garg, D. H. Albonesi, and S. Dwarkadas. 2006. Compatible phase co-scheduling on a CMP of multi-threaded processors. In Proceedings of the International Parallel and Distributed Processing Symposium. IEEE."},
{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/277650.277725"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1062261.1062295"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5161079"},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium. IEEE.","author":"Guo Y.","key":"e_1_2_1_17_1","unstructured":"Y. Guo, J. Zhao, V. Cave, and V. Sarkar. 2010. SLAW: A scalable locality-aware adaptive work-stealing scheduler. In Proceedings of the International Parallel and Distributed Processing Symposium. IEEE."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2008.209"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1137\/0217033"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1693453.1693475"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2150976.2151001"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1755913.1755928"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2005.379"},{"volume-title":"Proceedings of the 31st Annual International Symposium on Computer Architecture. IEEE.","author":"Kumar R.","key":"e_1_2_1_24_1","unstructured":"R. Kumar, D. M. Tullsen, P. Ranganathan, N. P. Jouppi, and K. I. Farkas. 2004. Single-ISA heterogeneous multi-core architectures for multithreaded workload performance. In Proceedings of the 31st Annual International Symposium on Computer Architecture. IEEE."},
{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654085"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/1693453.1693459"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1362622.1362694"},{"key":"e_1_2_1_28_1","unstructured":"J. W. S. Liu and C. L. Liu. 1974. Bounds on Scheduling Algorithms for Heterogeneous Computing Systems. Department of Computer Science University of Illinois at Urbana--Champaign."},{"key":"e_1_2_1_29_1","unstructured":"M. Mahoney. 2013. Data Compression Programs. http:\/\/mattmahoney.net\/dc\/."},{"volume-title":"Proceedings of the 8th International Symposium on Industrial Embedded Systems. IEEE.","author":"Maia C.","key":"e_1_2_1_30_1","unstructured":"C. Maia, L. Nogueira, and L. M. Pinho. 2013. Scheduling parallel real-time tasks using a fixed-priority work-stealing algorithm on multiprocessors. In Proceedings of the 8th International Symposium on Industrial Embedded Systems. IEEE."},
{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-28293-5_15"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/645561.659355"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1542275.1542358"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024724.2024954"},{"key":"e_1_2_1_35_1","unstructured":"J. Reinders. 2007. Intel Threading Building Blocks. O'Reilly."},{"volume-title":"Proceedings of the International Parallel and Distributed Processing Symposium. IEEE, 1--10","author":"Rosenberg A. L.","key":"e_1_2_1_36_1","unstructured":"A. L. Rosenberg and R. C. Chiang. 2010. Toward understanding heterogeneity in computing. In Proceedings of the International Parallel and Distributed Processing Symposium. IEEE, 1--10."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1787275.1787281"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1755913.1755929"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1531793.1531804"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1508244.1508274"},{"volume-title":"Proceedings of the 39th International Symposium on Computer Architecture. IEEE, 213--224","author":"Van Craeynest K.","key":"e_1_2_1_41_1","unstructured":"K. Van Craeynest, A. Jaleel, L. Eeckhout, P. Narvaez, and J. Emer. 2012. Scheduling heterogeneous multi-cores through performance impact estimation (PIE). In Proceedings of the 39th International Symposium on Computer Architecture. IEEE, 213--224."},
{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSE.2011.65"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2579674","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2579674","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T23:43:50Z","timestamp":1750290230000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2579674"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,2]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2014,2]]}},"alternative-id":["10.1145\/2579674"],"URL":"https:\/\/doi.org\/10.1145\/2579674","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2014,2]]},"assertion":[{"value":"2013-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2013-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-02-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}