{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:10:41Z","timestamp":1750306241276,"version":"3.41.0"},"reference-count":20,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2017,4,27]],"date-time":"2017-04-27T00:00:00Z","timestamp":1493251200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Qualcomm Fellow-Mentor-Advisor (FMA) award"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2017,7,31]]},"abstract":"<jats:p>Heterogeneous MPSoCs typically integrate diverse cores, including application CPUs, GPUs, and HD coders. These cores commonly share an off-chip memory to save cost and energy, but their memory accesses often interfere with each other, leading to undesirable consequences like a slowdown of application performance or a failure to sustain real-time performance. The memory controller plays a central role in meeting the QoS needs of real-time cores while maximizing CPU performance. Previous QoS-aware memory controllers are based on a classic two-tier queuing architecture that buffers memory transactions at the first tier, followed by a second tier that buffers translated DRAM commands. In these designs, QoS-aware policies are used to schedule competing transactions at the first stage, but the translated DRAM commands are served in FIFO order at the second stage. Unfortunately, once the scheduled transactions have been forwarded to the command stage, newly arriving transactions that may be more critical cannot be served ahead of those translated commands that are already queued at the second stage. 
To address this, we propose a scalable memory controller architecture based on single-tier virtual queuing (STVQ) that maintains a single tier of request queues and employs an efficacious scheduler that considers both QoS requirements and DRAM bank states. In comparison with previous QoS-aware memory controllers, the proposed STVQ memory controller reduces CPU slowdown by up to 13.9% while satisfying all frame rate requirements. We propose further optimizations that can significantly increase row-buffer hits by up to 66.2% and reduce memory latency by up to 19.8%.<\/jats:p>","DOI":"10.1145\/3035481","type":"journal-article","created":{"date-parts":[[2017,4,28]],"date-time":"2017-04-28T12:38:23Z","timestamp":1493383103000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["A Single-Tier Virtual Queuing Memory Controller Architecture for Heterogeneous MPSoCs"],"prefix":"10.1145","volume":"22","author":[{"given":"Yang","family":"Song","sequence":"first","affiliation":[{"name":"University of California San Diego, La Jolla, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kambiz","family":"Samadi","sequence":"additional","affiliation":[{"name":"Qualcomm Research, San Diego, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bill","family":"Lin","sequence":"additional","affiliation":[{"name":"University of California San Diego, La Jolla, CA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2017,4,27]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT\u201913)","author":"Arnau Jose-Maria","year":"2013","unstructured":"Jose-Maria Arnau , Joan-Manuel Parcerisa , and Polychronis Xekalakis . 2013 a. Parallel frame rendering: Trading responsiveness for energy on a mobile GPU . 
In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT\u201913) . IEEE, Los Alamitos, CA, 83--92. Jose-Maria Arnau, Joan-Manuel Parcerisa, and Polychronis Xekalakis. 2013a. Parallel frame rendering: Trading responsiveness for energy on a mobile GPU. In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT\u201913). IEEE, Los Alamitos, CA, 83--92."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2464996.2464999"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366231.2337207"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1654059.1654112"},{"key":"e_1_2_1_5_1","volume-title":"Memory Systems: Cache, DRAM, Disk. Morgan Kaufmann","author":"Jacob Bruce","year":"2007","unstructured":"Bruce Jacob , Spencer Ng , and David Wang . 2007 . Memory Systems: Cache, DRAM, Disk. Morgan Kaufmann , San Francisco, CA . Bruce Jacob, Spencer Ng, and David Wang. 2007. Memory Systems: Cache, DRAM, Disk. Morgan Kaufmann, San Francisco, CA."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/RTSS.2014.23"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228513"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2155620.2155624"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 16th International Symposium on High Performance Computer Architecture (HPCA\u201910)","author":"Kim Yoongu","year":"2010","unstructured":"Yoongu Kim , Dongsu Han , Onur Mutlu , and Mor Harchol-Balter . 2010 a. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers . In Proceedings of the 16th International Symposium on High Performance Computer Architecture (HPCA\u201910) . IEEE, Los Alamitos, CA, 1--12. Yoongu Kim, Dongsu Han, Onur Mutlu, and Mor Harchol-Balter. 2010a. 
ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers. In Proceedings of the 16th International Symposium on High Performance Computer Architecture (HPCA\u201910). IEEE, Los Alamitos, CA, 1--12."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.51"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2013.07.014"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488779"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1394608.1382128"},{"key":"e_1_2_1_14_1","unstructured":"NVIDIA. 2015. Tegra X1. Retrieved February 4 2017 from http:\/\/www.nvidia.com\/object\/tegra-x1-processor.html."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540747"},{"key":"e_1_2_1_16_1","unstructured":"Qualcomm. 2015. Snapdragon 820. Retrieved February 4 2017 from https:\/\/www.qualcomm.com\/products\/snapdragon\/processors\/820."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/339647.339668"},{"key":"e_1_2_1_18_1","unstructured":"Tungsten Graphics. 2010. Gallium3D. Retrieved February 4 2017 from http:\/\/en.wikipedia.org\/wiki\/Gallium3D\/.  Tungsten Graphics. 2010. Gallium3D. 
Retrieved February 4 2017 from http:\/\/en.wikipedia.org\/wiki\/Gallium3D\/."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105748"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/2591513.2591529"}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3035481","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3035481","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:23:34Z","timestamp":1750220614000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3035481"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,4,27]]},"references-count":20,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,7,31]]}},"alternative-id":["10.1145\/3035481"],"URL":"https:\/\/doi.org\/10.1145\/3035481","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"type":"print","value":"1084-4309"},{"type":"electronic","value":"1557-7309"}],"subject":[],"published":{"date-parts":[[2017,4,27]]},"assertion":[{"value":"2016-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-04-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}