{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T15:45:30Z","timestamp":1781279130089,"version":"3.54.1"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2017,11,14]],"date-time":"2017-11-14T00:00:00Z","timestamp":1510617600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2017,12,31]]},"abstract":"<jats:p>Modern multi-core systems provide huge computational capabilities, which can be used to run multiple processes concurrently. To achieve the best possible performance within limited power budgets, the various system resources need to be allocated effectively. Any mismatch between runtime resource requirement and allocation leads to a sub-optimal energy-delay product (EDP). Different optimization techniques exist for addressing the problem of mismatch between the dynamic requirement and runtime allocation of the system resources. Choosing between multiple optimizations at runtime is complex due to the non-additive effects, making the scenario suitable for the application of machine learning techniques. We present a novel method, Machine Learned Machines (MLM), by using online reinforcement learning (RL) to perform dynamic partitioning of the last level cache (LLC), along with dynamic voltage and frequency scaling (DVFS) of the core and uncore (interconnection network and LLC). We have proposed and evaluated three different MLM co-optimization techniques based on independent and cooperative multi-agent learners. We show that the co-optimization results in a much lower system EDP than any of the techniques applied individually. We explore various RL models targeted toward optimization of different system metrics and study their effects on a system EDP, system throughput (STP), and Fairness. The various proposed techniques have been extensively evaluated with a mix of 20 workloads on a 4-core system using Spec2006 benchmarks. We have further evaluated our cooperative MLM techniques on a 16-core system. The results show an average of 20.5% and 19.1% system EDP improvement on a 4-core and 16-core system, respectively, with limited degradation of STP and Fairness.<\/jats:p>","DOI":"10.1145\/3132170","type":"journal-article","created":{"date-parts":[[2017,11,14]],"date-time":"2017-11-14T14:02:44Z","timestamp":1510668164000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["Cooperative Multi-Agent Reinforcement Learning-Based Co-optimization of Cores, Caches, and On-chip Network"],"prefix":"10.1145","volume":"14","author":[{"given":"Rahul","family":"Jain","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Preeti Ranjan","family":"Panda","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Delhi, New Delhi, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sreenivas","family":"Subramoney","sequence":"additional","affiliation":[{"name":"Microarchitecture Research Lab, Intel India, Bangalore, Karnataka"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2017,11,14]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1375527.1375575"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/JETCAS.2011.2164973"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2008.4771801"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1995896.1995927"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993744.1993746"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488874"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/2755753.2757163"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1283780.1283825"},{"key":"e_1_2_1_9_1","unstructured":"AGBTG Dietterich. 2004. Reinforcement learning and its relationship to supervised learning. Handbook of Learning and Approximate Dynamic Programming IEEE (2004) 47--60.  AGBTG Dietterich. 2004. Reinforcement learning and its relationship to supervised learning. Handbook of Learning and Approximate Dynamic Programming IEEE (2004) 47--60."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2008.44"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1278480.1278537"},{"key":"e_1_2_1_12_1","volume-title":"8th International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES-2012)","author":"Heirman Wim","year":"2012"},{"key":"e_1_2_1_13_1","volume-title":"2013 International Conference on. IEEE, 1--8.","author":"Henkel J\u00f6rg","year":"2013"},{"key":"e_1_2_1_14_1","first-page":"136","article-title":"Intel Xeon Processor 5500 Series","volume":"1","year":"2011","journal-title":"Datasheet"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2008.21"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/2971808.2971865"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1013235.1013261"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2333660.2333686"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2010.2059270"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2013.18"},{"key":"e_1_2_1_21_1","volume-title":"Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA).","author":"Khan Samira"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1669112.1669172"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1064978.1065034"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 2001 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). IEEE, 164--171","author":"Luo Kun","year":"2001"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366231.2337208"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2012.23"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/1531793.1531806"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-77560-7_23"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2012.6168945"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2004.28"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/885651.781076"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2006.49"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024723.2000073"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/1756006.1756030"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/173682.165152"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228482"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/635506.605403"},{"key":"e_1_2_1_38_1","unstructured":"Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. Vol. 1. MIT Press Cambridge.   Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. Vol. 1. MIT Press Cambridge."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.5555\/3091529.3091572"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1403375.1403402"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2015.7056026"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024724.2024735"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00992698"},{"key":"e_1_2_1_44_1","volume-title":"Proceedings of the Conference on Design, Automation 8 Test in Europe. European Design and Automation Association, 221","author":"Wildermann Stefan","year":"2014"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/1837274.1837309"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/1347375.1347381"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3132170","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3132170","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:13:56Z","timestamp":1750212836000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3132170"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,11,14]]},"references-count":46,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2017,12,31]]}},"alternative-id":["10.1145\/3132170"],"URL":"https:\/\/doi.org\/10.1145\/3132170","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,11,14]]},"assertion":[{"value":"2016-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-11-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}