{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T19:43:57Z","timestamp":1776109437906,"version":"3.50.1"},"reference-count":54,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,9,11]],"date-time":"2024-09-11T00:00:00Z","timestamp":1726012800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2024,11,30]]},"abstract":"<jats:p>\n            Deep neural networks (DNNs) have been widely adopted, owing to break-through performance and high accuracy. DNNs exhibit varying memory behavior involving specific and recognizable memory access patterns and access intensity, depending on the selected data reuse in different layers. Such applications have high memory bandwidth demands due to aggressive computations, performing several billion-floating-point-operations-per-second (BFLOPs). 3D DRAMs, providing very high memory access bandwidth, are extensively employed to break the\n            <jats:italic>memory wall<\/jats:italic>\n            , bridging the gap between compute and memory while running DNNs. However, the vertical integration in 3D DRAM introduces serious thermal issues, resulting from high power density and close proximity of memory cells, and requires dynamic thermal management (DTM). To unleash the true potential of 3D DRAM and exploit the enormous bandwidth under thermal constraints, there is a need to intelligently map the DNN application\u2019s data across memory channels, pseudo-channels, and banks, minimizing the effective memory latency and reducing the thermal-induced application slowdown. 
The specific memory access patterns exhibited by a DNN layer execution are crucial to determine a favorable data mapping method for 3D DRAM dies that potentially causes minimal thermal impact and also maximizes DRAM bandwidth utilization. In this work, we propose an application-aware and thermal-sensitive data mapping that intelligently assigns portions of the 3D DRAM to DNN layers, leveraging the knowledge about layer\u2019s memory access patterns and minimizing DTM-induced performance overheads. Additionally, we also deploy a DRAM low-power states based DTM mechanism to keep the 3D DRAM within safe thermal limits. Using our proposal, we observe a performance improvement of 1% to 61%, and memory energy savings of 1% to 55% for popular DNNs over state-of-the-art DTM strategies while running DNN inference.\n          <\/jats:p>","DOI":"10.1145\/3677178","type":"journal-article","created":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T11:29:22Z","timestamp":1722943762000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["NeuroTAP: Thermal and Memory Access Pattern-Aware Data Mapping on 3D DRAM for Maximizing DNN Performance"],"prefix":"10.1145","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3013-5128","authenticated-orcid":false,"given":"Shailja","family":"Pandey","sequence":"first","affiliation":[{"name":"Computer Science & Engineering, Indian Institute of Technology Delhi, New Delhi, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2508-7531","authenticated-orcid":false,"given":"Preeti Ranjan","family":"Panda","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Indian Institute of Technology Delhi, New Delhi, 
India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2024,9,11]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Shashank Adavally and Krishna Kavi. 2021. Towards Application-Specific Address Mapping for Emerging Memory Devices. ACM.","DOI":"10.1145\/3422575.3422785"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCAD51958.2021.9643473"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/1594233.1594256"},{"key":"e_1_3_2_5_2","unstructured":"CADENCE. 2022. PHY IP for HBM2 for Samsung 10LPP. Retrieved from https:\/\/www.cadence.com\/content\/dam\/cadence-www\/global\/en_US\/documents\/tools\/ip\/design-ip\/hbm2-for-samsung-10lpp-br.pdf"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357526.3357569"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.5555\/1874620.1874960"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/ESTIMedia.2013.6704498"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/N19-1423"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2599174"},{"key":"e_1_3_2_11_2","unstructured":"Mohsen Ghasempour Aamer Jaleel Jim D. Garside and Mikel Luj\u00e1n. 2016. DReAM: Dynamic Re-arrangement of Address Mapping to Improve the Performance of DRAMs. ACM."},{"key":"e_1_3_2_12_2","first-page":"1","volume-title":"Proceedings of the Conference on Design, Automation and Test in Europe","author":"Hameed Fazal","year":"2011","unstructured":"Fazal Hameed, Mohammad Abdullah Al Faruque, and J\u00f6rg Henkel. 2011. Dynamic thermal management in 3D multi-core architecture through run-time adaptation. In Proceedings of the Conference on Design, Automation and Test in Europe. 
1\u20136."},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/2512457"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2021.3075765"},{"key":"e_1_3_2_15_2","unstructured":"INTEL. 2022. High Bandwidth Memory Can Make CPUs the Desired Platform for AI and HPC. Retrieved from https:\/\/community.intel.com\/t5\/Blogs\/Products-and-Solutions\/HPC\/High-Bandwidth-Memory-Can-Make-CPUs-the-Desired-Platform-for-AI\/post\/1434192"},{"key":"e_1_3_2_16_2","unstructured":"Intel. 2022. Intel Max Series Brings Breakthrough Memory Bandwidth and Performance to HPC and AI. Retrieved from https:\/\/www.intel.com\/content\/www\/us\/en\/newsroom\/news\/introducing-intel-max-series-product-family.html"},{"key":"e_1_3_2_17_2","unstructured":"JEDEC. 2022. JEDEC Standard High Bandwidth Memory DRAM (HBM3) JESD238. Retrieved from https:\/\/www.jedec.org\/standards-documents\/docs\/jesd238"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001178"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3065386"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE54114.2022.9774608"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3223046"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.23919\/DATE.2018.8342033"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.5555\/2755753.2755833"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/2508148.2485928"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.5555\/3358807.3358895"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.5555\/2971808.2972061"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2015.2409847"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-15-3383-9_54"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/2228360.2228477"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher
","DOI":"10.1145\/2155620.2155664"},{"key":"e_1_3_2_31_2","first-page":"217","volume-title":"Proceedings of the Great Lakes Symposium on VLSI 2022","author":"NS Aswathy","year":"2022","unstructured":"Aswathy NS, Sreesiddesh Bhavanasi, Arnab Sarkar, and Hemangee K. Kapoor. 2022. SRS-Mig: Selection and run-time scheduling of page migration for improved response time in hybrid PCM-DRAM memories. In Proceedings of the Great Lakes Symposium on VLSI 2022. 217\u2013222."},{"key":"e_1_3_2_32_2","first-page":"330","volume-title":"ISSCC","author":"Oh Chi-Sung","year":"2020","unstructured":"Chi-Sung Oh, Ki Chul Chun, Young-Yong Byun, Yong-Ki Kim, So-Young Kim, Yesin Ryu, Jaewon Park, Sinho Kim, Sanguhn Cha, Donghak Shin, Jungyu Lee, Jong-Pil Son, Byung-Kyu Ho, Seong-Jin Cho, Beomyong Kil, Sungoh Ahn, Baekmin Lim, Yongsik Park, Kijun Lee, Myung-Kyu Lee, Seungduk Baek, Junyong Noh, Jae-Wook Lee, Seungseob Lee, Sooyoung Kim, Botak Lim, Seouk-Kyu Choi, Jin-Guk Kim, Hye-In Choi, Hyuk-Jun Kwon, Jun Jin Kong, Kyomin Sohn, Nam Sung Kim, Kwang-Il Park, and Jung-Bae Lee. 2020. 22.1 A 1.1 V 16GB 640GB\/s HBM2E DRAM with a data-bus window-extension technique and a synergetic on-die ECC scheme. In ISSCC. 
IEEE, 330\u2013332."},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2016.2564969"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2022.3197698"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2024.3367235"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3630012"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC55918.2022.00033"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2021.3060509"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2011.2164540"},{"key":"e_1_3_2_41_2","first-page":"1921","volume-title":"Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence","author":"Sharma Vishal","year":"2023","unstructured":"Vishal Sharma, Daman Arora, Mausam, and Parag Singla. 2023. SymNet 3.0: Exploiting long-range influences in learning generalized neural policies for relational MDPs. In Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence. Robin J. Evans and Ilya Shpitser (Eds.), Proceedings of Machine Learning Research, Vol. 216, PMLR, 1921\u20131931."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3624581"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3419468"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3532185"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3358208"},{"key":"e_1_3_2_46_2","volume-title":"International Conference on Learning Representations (ICLR)","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. 
In International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/782814.782831"},{"key":"e_1_3_2_48_2","volume-title":"Deep Learning and Unsupervised Feature Learning Workshop, (NIPS\u201911)","author":"Vanhoucke Vincent","year":"2011","unstructured":"Vincent Vanhoucke, Andrew Senior, and Mark Z. Mao. 2011. Improving the speed of neural networks on CPUs. In Deep Learning and Unsupervised Feature Learning Workshop, (NIPS\u201911)."},{"key":"e_1_3_2_49_2","article-title":"Attention is all you need","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems.","journal-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems"},{"key":"e_1_3_2_50_2","first-page":"20929","volume-title":"NIPS","author":"Wickramanayake Sandareka","year":"2021","unstructured":"Sandareka Wickramanayake, Wynne Hsu, and Mong Li Lee. 2021. Explanation-based data augmentation for image classification. In NIPS. M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 20929\u201320940. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2021\/file\/af3b6a54e9e9338abc54258e3406e485-Paper.pdf"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/1391469.1391658"},{"issue":"1","key":"e_1_3_2_52_2","first-page":"146","article-title":"Parana: A parallel neural architecture considering thermal problem of 3d stacked memory","volume":"30","author":"Yin Shouyi","year":"2018","unstructured":"Shouyi Yin, Shibin Tang, Xinhan Lin, Peng Ouyang, Fengbin Tu, Leibo Liu, Jishen Zhao, Cong Xu, Shuangcheng Li, Yuan Xie, and ShaoJun Wei. 2018. 
Parana: A parallel neural architecture considering thermal problem of 3d stacked memory. IEEE TPDS 30, 1 (2018), 146\u2013160.","journal-title":"IEEE TPDS"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCD.2012.6378661"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503222.3507774"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2009.27"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3677178","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3677178","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:06:17Z","timestamp":1750291577000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3677178"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,11]]},"references-count":54,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,11,30]]}},"alternative-id":["10.1145\/3677178"],"URL":"https:\/\/doi.org\/10.1145\/3677178","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"value":"1539-9087","type":"print"},{"value":"1558-3465","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,9,11]]},"assertion":[{"value":"2023-10-06","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-07","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}