{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T02:04:38Z","timestamp":1776996278907,"version":"3.51.4"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2017,12,19]],"date-time":"2017-12-19T00:00:00Z","timestamp":1513641600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["NSF-CMMI-1538204"],"award-info":[{"award-number":["NSF-CMMI-1538204"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NWO","award":["024.002.003"],"award-info":[{"award-number":["024.002.003"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Meas. Anal. Comput. Syst."],"published-print":{"date-parts":[[2017,12,19]]},"abstract":"<jats:p>To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip rather than single core performance. In turn, modern jobs are often designed to run on any number of cores. However, to effectively leverage these multi-core chips, one must address the question of how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is an obvious tradeoff: allocating more cores to an individual job reduces the job's runtime, but in turn decreases the efficiency of the overall system. We ask how the system should schedule jobs across cores so as to minimize the mean response time over a stream of incoming jobs.<\/jats:p>\n          <jats:p>To answer this question, we develop an analytical model of jobs running on a multi-core machine. 
We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. EQUI requires jobs to change their level of parallelization while they run. Since this is not possible for all workloads, we consider a class of \"fixed-width\" policies, which choose a single level of parallelization, k, to use for all jobs. We prove that, surprisingly, it is possible to achieve EQUI's performance without requiring jobs to change their levels of parallelization by using the optimal fixed level of parallelization, k*. We also show how to analytically derive the optimal k* as a function of the system load, the speedup curve, and the job size distribution.<\/jats:p>\n          <jats:p>In the case where jobs may follow different speedup curves, finding a good scheduling policy is even more challenging. In particular, we find that policies like EQUI which performed well in the case of a single speedup function now perform poorly. 
We propose a very simple policy, GREEDY*, which performs near-optimally when compared to the numerically-derived optimal policy.<\/jats:p>","DOI":"10.1145\/3154499","type":"journal-article","created":{"date-parts":[[2018,3,23]],"date-time":"2018-03-23T18:28:08Z","timestamp":1521829688000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["Towards Optimality in Parallel Scheduling"],"prefix":"10.1145","volume":"1","author":[{"given":"Benjamin","family":"Berg","sequence":"first","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, PA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jan-Pieter","family":"Dorsman","sequence":"additional","affiliation":[{"name":"University of Amsterdam, Amsterdam, Netherlands"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mor","family":"Harchol-Balter","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, PA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,12,19]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02024665"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2935764.2935782"},{"key":"e_1_2_1_3_1","volume-title":"Grass: Trimming stragglers in approximation analytics.","author":"Ananthanarayanan G.","year":"2014","unstructured":"G. Ananthanarayanan, M. C. Hung, X. Ren, I. Stoica, A. Wierman, and M. Yu. 2014. Grass: Trimming stragglers in approximation analytics. (2014)."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1997.1335"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/321879.321887"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1454115.1454128"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0166-5316(02)00110-4"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1287\/moor.1110.0533"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/1384529.1375490"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.2002.1869"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(99)00186-3"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '09)","author":"Edmonds J.","unstructured":"J. Edmonds and K. Pruhs. 2009. Scalably scheduling processes with arbitrary speedup curves. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '09). ACM, New York, NY, USA, 685--692."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/646378.689517"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2492101.1555368"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.peva.2007.06.012"},{"key":"e_1_2_1_16_1","volume-title":"Performance Modeling and Design of Computer Systems: Queueing Theory in Action","author":"Harchol-Balter M.","unstructured":"M. Harchol-Balter. 2013. Performance Modeling and Design of Computer Systems: Queueing Theory in Action. Cambridge University Press."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2492101.1555383"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2008.209"},{"key":"e_1_2_1_19_1","volume-title":"Advances in Intelligent Systems and Applications -","author":"Huang K.-C.","unstructured":"K.-C. Huang, T.-C. Huang, Y.-H. Tung, and P.-Z. Shih. 2013. Effective Processor Allocation for Moldable Jobs with Application Speedup Model. In Advances in Intelligent Systems and Applications - Volume 2. Springer, 563--572."},{"key":"e_1_2_1_20_1","volume-title":"Queueing Systems, Volume II: Computer Applications","author":"Kleinrock L.","unstructured":"L. Kleinrock. 1976. Queueing Systems, Volume II: Computer Applications. Wiley, New York."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1239\/aap\/1093962238"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1561\/0900000002"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1287\/mnsc.19.7.717"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.peva.2011.07.015"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"J. McCool M. Robison and A. Reinders. 2012. Structured Parallel Programming: Patterns for Efficient Computation. Elsevier.","DOI":"10.1016\/B978-0-12-415993-8.00003-7"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/0166-5316(93)90004-E"},{"key":"e_1_2_1_27_1","volume-title":"Markov Decision Processes: Discrete Stochastic Dynamic Programming","author":"Puterman M. L.","unstructured":"M. L. Puterman. 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Chichester."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2829988.2787481"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the ACM Workshop on Mathematical Performance Modeling and Analysis.","author":"Scully Z.","unstructured":"Z. Scully, G. Blelloch, M. Harchol-Balter, and A. Scheller-Wolf. 2017. Optimally Scheduling Jobs with Multiple Tasks. In Proceedings of the ACM Workshop on Mathematical Performance Modeling and Analysis."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER '03)","author":"Srinivasan S.","unstructured":"S. Srinivasan, S. Krishnamoorthy, and P. Sadayappan. 2003. A Robust Scheduling Strategy for Moldable Scheduling of Parallel Jobs. In Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER '03). 92--99."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2007116.2007131"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/3202838.3202854"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3053277.3053279"}],"container-title":["Proceedings of the ACM on Measurement and Analysis of Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3154499","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3154499","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3154499","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:11:27Z","timestamp":1750212687000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3154499"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,12,19]]},"references-count":33,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2017,12,19]]}},"alternative-id":["10.1145\/3154499"],"URL":"https:\/\/doi.org\/10.1145\/3154499","relation":{},"ISSN":["2476-1249"],"issn-type":[{"value":"2476-1249","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,12,19]]},"assertion":[{"value":"2017-12-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}