{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T16:10:45Z","timestamp":1778256645605,"version":"3.51.4"},"reference-count":85,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2025,3,20]],"date-time":"2025-03-20T00:00:00Z","timestamp":1742428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2022YFB4500303"],"award-info":[{"award-number":["2022YFB4500303"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["No. 62202184 and No. 61825202"],"award-info":[{"award-number":["No. 62202184 and No. 61825202"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2025,3,31]]},"abstract":"<jats:p>Queries on linked data structures, such as trees and graphs, often suffer from frequent cache misses and significant performance loss due to dependent and random pointer-chasing memory accesses. In this article, we propose a software-hardware co-designed solution for accelerating linked data structures implemented in strongly typed languages. The solution incorporates a compiler extension and a hardware prefetcher. The compiler extension extracts type information from the code, annotates each load instruction, and forwards the type information to the hardware prefetcher. The prefetcher leverages the type information to fetch the referred objects and identify the associated pointers in advance. By doing so, the program can find these objects in the cache when it follows the prefetched pointers, thus minimizing cache misses. In the evaluation, the proposed solution achieves an average speedup of 1.37\u00d7 over a set of memory-intensive benchmarks.<\/jats:p>","DOI":"10.1145\/3701994","type":"journal-article","created":{"date-parts":[[2024,10,29]],"date-time":"2024-10-29T10:11:59Z","timestamp":1730196719000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["DTAP: Accelerating Strongly-Typed Programs with Data Type-Aware Hardware Prefetching"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-1776-8506","authenticated-orcid":false,"given":"Yingshuai","family":"Dong","sequence":"first","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3432-855X","authenticated-orcid":false,"given":"Chencheng","family":"Ye","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4290-1408","authenticated-orcid":false,"given":"Haikun","family":"Liu","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-6708-8296","authenticated-orcid":false,"given":"Liting","family":"Tang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2712-421X","authenticated-orcid":false,"given":"Xiaofei","family":"Liao","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3934-7605","authenticated-orcid":false,"given":"Hai","family":"Jin","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-2718-1628","authenticated-orcid":false,"given":"Cheng","family":"Chen","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2085-7887","authenticated-orcid":false,"given":"Yanjiang","family":"Li","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-4850-3196","authenticated-orcid":false,"given":"Yi","family":"Wang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,3,20]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"1","volume-title":"Proceedings of the International Conference on Supercomputing","author":"Ainsworth Sam","year":"2016","unstructured":"Sam Ainsworth and Timothy M. Jones. 2016. Graph prefetching using data structure knowledge. In Proceedings of the International Conference on Supercomputing. 1\u201311."},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3296957.3173189"},{"key":"e_1_3_1_4_2","first-page":"291","volume-title":"Proceedings of the Shortest Path Problem","author":"Ajwani Deepak","year":"2006","unstructured":"Deepak Ajwani, Ulrich Meyer, and Vitaly Osipov. 2006. Breadth first search on massive graphs. In Proceedings of the Shortest Path Problem. 291\u2013307."},{"key":"e_1_3_1_5_2","first-page":"91","volume-title":"Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques","author":"Al-Sukhni Hassan","year":"2003","unstructured":"Hassan Al-Sukhni, Ian Bratt, and Daniel A. Connors. 2003. Compiler-directed content-aware prefetching for dynamic data structures. In Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques. 91\u2013100."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/2254756.2254766"},{"key":"e_1_3_1_7_2","first-page":"513","volume-title":"Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems","author":"Ayers Grant","year":"2020","unstructured":"Grant Ayers, Heiner Litz, Christos Kozyrakis, and Parthasarathy Ranganathan. 2020. Classifying memory access patterns for prefetching. In Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems. 513\u2013526."},{"key":"e_1_3_1_8_2","first-page":"131","volume-title":"Proceedings of the IEEE International Symposium on High Performance Computer Architecture","author":"Bakhshalipour Mohammad","year":"2018","unstructured":"Mohammad Bakhshalipour, Pejman Lotfi-Kamran, and Hamid Sarbazi-Azad. 2018. Domino temporal data prefetcher. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture. 131\u2013142."},{"key":"e_1_3_1_9_2","first-page":"362","volume-title":"Proceedings of the IEEE 29th International Conference on Data Engineering","author":"Balkesen Cagri","year":"2013","unstructured":"Cagri Balkesen, Jens Teubner, Gustavo Alonso, and M. Tamer \u00d6zsu. 2013. Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware. In Proceedings of the IEEE 29th International Conference on Data Engineering. 362\u2013373."},{"key":"e_1_3_1_10_2","first-page":"80","volume-title":"Proceedings of the 12th European Conference on Computer Systems","author":"Balmau Oana","year":"2017","unstructured":"Oana Balmau, Rachid Guerraoui, Vasileios Trigonakis, and Igor Zablotchi. 2017. FloDB: Unlocking memory in persistent key-value stores. In Proceedings of the 12th European Conference on Computer Systems. 80\u201394."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3452099"},{"key":"e_1_3_1_13_2","first-page":"276","volume-title":"Proceedings of the 10th International Symposium on High Performance Computer Architecture","author":"Chen Chi F.","year":"2004","unstructured":"Chi F. Chen, S.-H. Yang, Babak Falsafi, and Andreas Moshovos. 2004. Accurate and complexity-effective spatial pattern prediction. In Proceedings of the 10th International Symposium on High Performance Computer Architecture. 276\u2013287."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/1272743.1272747"},{"issue":"2","key":"e_1_3_1_15_2","first-page":"1","article-title":"A hybrid memory architecture supporting fine-grained data migration","volume":"18","author":"Chi Ye","year":"2024","unstructured":"Ye Chi, Jianhui Yue, Xiaofei Liao, Haikun Liu, and Hai Jin. 2024. A hybrid memory architecture supporting fine-grained data migration. Front. Comput. Sci. 18, 2 (2024), 1\u201310.","journal-title":"Front. Comput. Sci."},{"key":"e_1_3_1_16_2","first-page":"1","volume-title":"Proceedings of the ACM SIGPLAN conference on Programming Language Design and Implementation","author":"Chilimbi Trishul M.","year":"1999","unstructured":"Trishul M. Chilimbi, Mark D. Hill, and James R. Larus. 1999. Cache-conscious structure layout. In Proceedings of the ACM SIGPLAN conference on Programming Language Design and Implementation. 1\u201312."},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/158511.158639"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/986533.986536"},{"key":"e_1_3_1_19_2","first-page":"62","volume-title":"Proceedings of the 35th Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Collins Jamison","year":"2002","unstructured":"Jamison Collins, Suleyman Sair, Brad Calder, and Dean M. Tullsen. 2002. Pointer cache assisted prefetching. In Proceedings of the 35th Annual IEEE\/ACM International Symposium on Microarchitecture. 62\u201373."},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/605432.605427"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_3_1_22_2","first-page":"7","volume-title":"Proceedings of the IEEE International Symposium on High Performance Computer Architecture","author":"Ebrahimi Eiman","year":"2009","unstructured":"Eiman Ebrahimi, Onur Mutlu, and Yale N. Patt. 2009. Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture. 7\u201317."},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/144965.145006"},{"key":"e_1_3_1_24_2","first-page":"1","volume-title":"Proceedings of the 49th Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Hashemi Milad","year":"2016","unstructured":"Milad Hashemi, Onur Mutlu, and Yale N. Patt. 2016. Continuous runahead: Transparent hardware acceleration for memory intensive workloads. In Proceedings of the 49th Annual IEEE\/ACM International Symposium on Microarchitecture. 1\u201312."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/1186736.1186737"},{"key":"e_1_3_1_26_2","first-page":"25","volume-title":"Proceedings of the IEEE International Conference on Computer Design","author":"Hsieh Kevin","year":"2016","unstructured":"Kevin Hsieh, Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, and Onur Mutlu. 2016. Accelerating pointer chasing in 3D-stacked memory: Challenges, mechanisms, evaluation. In Proceedings of the IEEE International Conference on Computer Design. 25\u201332."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2004.11.004"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2012.73"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/1542275.1542349"},{"issue":"2011","key":"e_1_3_1_30_2","first-page":"1","article-title":"Access map pattern matching for high performance data cache prefetch","volume":"13","author":"Ishii Yasuo","year":"2011","unstructured":"Yasuo Ishii, Mary Inaba, and Kei Hiraki. 2011. Access map pattern matching for high performance data cache prefetch. J. Instruct.-Level Parallel. 13, 2011 (2011), 1\u201324.","journal-title":"J. Instruct.-Level Parallel."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540730"},{"issue":"6","key":"e_1_3_1_32_2","doi-asserted-by":"crossref","first-page":"1778","DOI":"10.1109\/TC.2022.3224372","article-title":"PMLiteDB: Streamlining access paths for high-performance persistent memory document database systems","volume":"72","author":"Jin Hai","year":"2022","unstructured":"Hai Jin, Shuo Wei, Yan Sha, Chencheng Ye, Haikun Liu, and Xiaofei Liao. 2022. PMLiteDB: Streamlining access paths for high-performance persistent memory document database systems. IEEE Trans. Comput. 72, 6 (2022), 1778\u20131791.","journal-title":"IEEE Trans. Comput."},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/264107.264207"},{"key":"e_1_3_1_34_2","first-page":"1","volume-title":"Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium","author":"Jung Changhee","year":"2006","unstructured":"Changhee Jung, Daeseob Lim, Jaejin Lee, and Yan Solihin. 2006. Helper thread prefetching for loosely coupled multiprocessor systems. In Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium. 1\u201310."},{"issue":"2","key":"e_1_3_1_35_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3439803","article-title":"Gretch: A hardware prefetcher for graph analytics","volume":"18","author":"Kaushik Anirudh Mohan","year":"2021","unstructured":"Anirudh Mohan Kaushik, Gennady Pekhimenko, and Hiren Patel. 2021. Gretch: A hardware prefetcher for graph analytics. ACM Trans. Arch. Code Optimiz. 18, 2 (2021), 1\u201325.","journal-title":"ACM Trans. Arch. Code Optimiz."},{"key":"e_1_3_1_36_2","first-page":"1","volume-title":"Proceedings of the 11th ACM Conference on Computing Frontiers","author":"Kim Taesu","year":"2014","unstructured":"Taesu Kim, Dali Zhao, and Alexander V. Veidenbaum. 2014. Multiple stream tracker: A new hardware stride prefetcher. In Proceedings of the 11th ACM Conference on Computing Frontiers. 1\u201310."},{"key":"e_1_3_1_37_2","first-page":"252","volume-title":"Proceedings of the VLDB Endowment","volume":"9","author":"Kocberber Onur","year":"2015","unstructured":"Onur Kocberber, Babak Falsafi, and Boris Grot. 2015. Asynchronous memory access chaining. In Proceedings of the VLDB Endowment, Vol. 9. 252\u2013263."},{"key":"e_1_3_1_38_2","first-page":"268","volume-title":"Proceedings of the International Conference on Parallel Architectures and Compilation Techniques","author":"Kohout Nicholas","year":"2001","unstructured":"Nicholas Kohout, Seungryul Choi, Dongkeun Kim, and Donald Yeung. 2001. Multi-chain prefetching: Effective exploitation of inter-chain memory parallelism for pointer-chasing codes. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques. 268\u2013279."},{"key":"e_1_3_1_39_2","first-page":"83","volume-title":"Proceedings of the ACM\/IEEE 45th Annual International Symposium on Computer Architecture","author":"Kondguli Sushant","year":"2018","unstructured":"Sushant Kondguli and Michael Huang. 2018. Division of labor: A more effective approach to prefetching. In Proceedings of the ACM\/IEEE 45th Annual International Symposium on Computer Architecture. 83\u201395."},{"key":"e_1_3_1_40_2","first-page":"1","volume-title":"Proceedings of the 13th EuroSys Conference","author":"Kroes Taddeus","year":"2018","unstructured":"Taddeus Kroes, Koen Koning, Erik van der Kouwe, Herbert Bos, and Cristiano Giuffrida. 2018. Delta pointers: Buffer overflow checks without the checks. In Proceedings of the 13th EuroSys Conference. 1\u201314."},{"key":"e_1_3_1_41_2","first-page":"205","volume-title":"Proceedings of the 12th European Conference on Computer Systems","author":"Kuvaiskii Dmitrii","year":"2017","unstructured":"Dmitrii Kuvaiskii, Oleksii Oleksenko, Sergei Arnautov, Bohdan Trach, Pramod Bhatotia, Pascal Felber, and Christof Fetzer. 2017. SGXBOUNDS: Memory safety for shielded execution. In Proceedings of the 12th European Conference on Computer Systems. 205\u2013221."},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.5555\/977395.977673"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3605731.3605906"},{"issue":"1","key":"e_1_3_1_44_2","first-page":"1","article-title":"LPW: An efficient data-aware cache replacement strategy for apache spark","volume":"66","author":"Li Hui","year":"2023","unstructured":"Hui Li, Shuping Ji, Hua Zhong, Wei Wang, Lijie Xu, Zhen Tang, Jun Wei, and Tao Huang. 2023. LPW: An efficient data-aware cache replacement strategy for apache spark. Sci. China Inf. Sci. 66, 1 (2023), 1\u201320.","journal-title":"Sci. China Inf. Sci."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2019.2908175"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/237090.237190"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2016.7446087"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2023.102924"},{"key":"e_1_3_1_49_2","first-page":"28","article-title":"CACTI 6.0: A tool to model large caches","volume":"27","author":"Muralimanohar Naveen","year":"2009","unstructured":"Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P. Jouppi. 2009. CACTI 6.0: A tool to model large caches. HP Lab. 27 (2009), 28.","journal-title":"HP Lab."},{"key":"e_1_3_1_50_2","first-page":"45","article-title":"Introducing the graph 500","volume":"19","author":"Murphy Richard C.","year":"2010","unstructured":"Richard C. Murphy, Kyle B. Wheeler, Brian W. Barrett, and James A. Ang. 2010. Introducing the graph 500. Cray Users Group 19 (2010), 45\u201374.","journal-title":"Cray Users Group"},{"key":"e_1_3_1_51_2","first-page":"397","volume-title":"Proceedings of the IEEE International Symposium on High Performance Computer Architecture","author":"Naithani Ajeya","year":"2020","unstructured":"Ajeya Naithani, Josu\u00e9 Feliu, Almutaz Adileh, and Lieven Eeckhout. 2020. Precise runahead execution. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture. 397\u2013410."},{"key":"e_1_3_1_52_2","first-page":"612","volume-title":"Proceedings of the 35th Annual Computer Security Applications Conference","author":"Nam Myoung Jin","year":"2019","unstructured":"Myoung Jin Nam, Periklis Akritidis, and David J. Greaves. 2019. FRAMER: A tagged-pointer capability system with memory safety applications. In Proceedings of the 35th Annual Computer Security Applications Conference. 612\u2013626."},{"key":"e_1_3_1_53_2","first-page":"96","volume-title":"Proceedings of the 10th International Symposium on High Performance Computer Architecture","author":"Nesbit Kyle J.","year":"2004","unstructured":"Kyle J. Nesbit and James E. Smith. 2004. Data cache prefetching using a global history buffer. In Proceedings of the 10th International Symposium on High Performance Computer Architecture. 96\u201396."},{"key":"e_1_3_1_54_2","first-page":"118","volume-title":"Proceedings of the ACM\/IEEE 47th Annual International Symposium on Computer Architecture","author":"Pakalapati Samuel","year":"2020","unstructured":"Samuel Pakalapati and Biswabandan Panda. 2020. Bouquet of instruction pointers: Instruction pointer classifier-based spatial hardware prefetching. In Proceedings of the ACM\/IEEE 47th Annual International Symposium on Computer Architecture. 118\u2013131."},{"key":"e_1_3_1_55_2","first-page":"1","volume-title":"Proceedings of the International Conference on Supercomputing","author":"Panda Reena","year":"2016","unstructured":"Reena Panda, Yasuko Eckert, Nuwan Jayasena, Onur Kayiran, Michael Boyer, and Lizy Kurian John. 2016. Prefetching techniques for near-memory throughput processors. In Proceedings of the International Conference on Supercomputing. 1\u201314."},{"key":"e_1_3_1_56_2","first-page":"1526","volume-title":"Proceedings of the VLDB Endowment","volume":"10","author":"Pilman Markus","year":"2017","unstructured":"Markus Pilman, Kevin Bocksrocker, Lucas Braun, Renato Marroquin, and Donald Kossmann. 2017. Fast scans on key-value stores. In Proceedings of the VLDB Endowment, Vol. 10. 1526\u20131537."},{"key":"e_1_3_1_57_2","first-page":"626","volume-title":"Proceedings of the IEEE International Symposium on High Performance Computer Architecture","author":"Pugsley Seth H.","year":"2014","unstructured":"Seth H. Pugsley, Zeshan Chishti, Chris Wilkerson, Peng-fei Chuang, Robert L. Scott, Aamer Jaleel, Shih-Lien Lu, Kingsum Chow, and Rajeev Balasubramonian. 2014. Sandbox prefetching: Safe run-time evaluation of aggressive prefetchers. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture. 626\u2013637."},{"key":"e_1_3_1_58_2","volume-title":"Using the STL: The C++ Standard Template Library","author":"Robson Robert","year":"2012","unstructured":"Robert Robson. 2012. Using the STL: The C++ Standard Template Library. Springer Science & Business Media."},{"key":"e_1_3_1_59_2","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1145\/3470496.3527390","volume-title":"Proceedings of the 49th Annual International Symposium on Computer Architecture","author":"Schall David","year":"2022","unstructured":"David Schall, Artemiy Margaritov, Dmitrii Ustiugov, Andreas Sandberg, and Boris Grot. 2022. Lukewarm serverless functions: Characterization and optimization. In Proceedings of the 49th Annual International Symposium on Computer Architecture. 757\u2013770."},{"key":"e_1_3_1_60_2","first-page":"1","volume-title":"Proceedings of the 3rd Data Prefetching Championship","author":"Shakerinava Mehran","year":"2019","unstructured":"Mehran Shakerinava, Mohammad Bakhshalipour, Pejman Lotfi-Kamran, and Hamid Sarbazi-Azad. 2019. Multi-lookahead offset prefetching. In Proceedings of the 3rd Data Prefetching Championship. 1\u20134."},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830793"},{"key":"e_1_3_1_62_2","first-page":"608","volume-title":"Proceedings of the 21st Euromicro Conference on Digital System Design","author":"Singh Gagandeep","year":"2018","unstructured":"Gagandeep Singh, Lorenzo Chelini, Stefano Corda, Ahsan Javed Awan, Sander Stuijk, Roel Jordans, Henk Corporaal, and Albert-Jan Boonstra. 2018. A review of near-memory computing architectures: Opportunities and challenges. In Proceedings of the 21st Euromicro Conference on Digital System Design. 608\u2013617."},{"key":"e_1_3_1_63_2","first-page":"285","volume-title":"Proceedings of the ACM\/SPEC International Conference on Performance Engineering","author":"Singh Sarabjeet","year":"2019","unstructured":"Sarabjeet Singh and Manu Awasthi. 2019. Memory centric characterization and analysis of spec CPU2017 suite. In Proceedings of the ACM\/SPEC International Conference on Performance Engineering. 285\u2013292."},{"key":"e_1_3_1_64_2","first-page":"7","volume-title":"Computer","author":"Smith Alan Jay","year":"1978","unstructured":"Alan Jay Smith. 1978. Sequential program prefetching in memory hierarchies. In Computer, Vol. 11. 7\u201321."},{"key":"e_1_3_1_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/1555815.1555766"},{"key":"e_1_3_1_66_2","doi-asserted-by":"publisher","DOI":"10.1145\/1150019.1136508"},{"key":"e_1_3_1_67_2","doi-asserted-by":"publisher","DOI":"10.1145\/1504176.1504208"},{"key":"e_1_3_1_68_2","first-page":"654","volume-title":"Proceedings of the IEEE International Symposium on High-Performance Computer Architecture","author":"Talati Nishil","year":"2021","unstructured":"Nishil Talati, Kyle May, Armand Behroozi, Yichen Yang, Kuba Kaszyk, Christos Vasiladiotis, Tarunesh Verma, Lu Li, Brandon Nguyen, Jiawen Sun, et\u00a0al. 2021. Prodigy: Improving the memory latency of data-indirect irregular workloads using hardware-software co-design. In Proceedings of the IEEE International Symposium on High-Performance Computer Architecture. 654\u2013667."},{"key":"e_1_3_1_69_2","first-page":"48","volume-title":"Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications and IEEE International Conference on Ubiquitous Computing and Communications","author":"Tian Teng","year":"2017","unstructured":"Teng Tian, Tianqi Wang, and Xi Jin. 2017. An efficient hardware prefetcher exploiting the prefetch potential of long-stride access pattern on virtual address. In Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications and IEEE International Conference on Ubiquitous Computing and Communications. 48\u201357."},{"key":"e_1_3_1_70_2","doi-asserted-by":"publisher","DOI":"10.1145\/3470644"},{"key":"e_1_3_1_71_2","first-page":"496","volume-title":"Proceedings of the 53rd Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Wang Zixuan","year":"2020","unstructured":"Zixuan Wang, Xiao Liu, Jian Yang, Theodore Michailidis, Steven Swanson, and Jishen Zhao. 2020. Characterizing and modeling non-volatile memory systems. In Proceedings of the 53rd Annual IEEE\/ACM International Symposium on Microarchitecture. 496\u2013508."},{"issue":"6","key":"e_1_3_1_72_2","first-page":"1","article-title":"Large sequence models for sequential decision-making: A survey","volume":"17","author":"Wen Muning","year":"2023","unstructured":"Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, and Weinan Zhang. 2023. Large sequence models for sequential decision-making: A survey. Front. Comput. Sci. 17, 6 (2023), 1\u201330.","journal-title":"Front. Comput. Sci."},{"key":"e_1_3_1_73_2","first-page":"79","volume-title":"Proceedings of the IEEE International Symposium on High Performance Computer Architecture","author":"Wenisch Thomas F.","year":"2009","unstructured":"Thomas F. Wenisch, Michael Ferdman, Anastasia Ailamaki, Babak Falsafi, and Andreas Moshovos. 2009. Practical off-chip meta-data for temporal memory streaming. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture. 79\u201390."},{"key":"e_1_3_1_74_2","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358300"},{"key":"e_1_3_1_75_2","doi-asserted-by":"publisher","DOI":"10.1145\/3307650.3322225"},{"key":"e_1_3_1_76_2","doi-asserted-by":"publisher","DOI":"10.1360\/SSI-2021-0155"},{"key":"e_1_3_1_77_2","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1007\/3-540-47847-7_15","volume-title":"Proceedings of the International Symposium on High Performance Computing","author":"Yang Chia-Lin","year":"2002","unstructured":"Chia-Lin Yang and Alvin Lebeck. 2002. A programmable memory hierarchy for prefetching linked data structures. In Proceedings of the International Symposium on High Performance Computing. 160\u2013174."},{"key":"e_1_3_1_78_2","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1145\/335231.335248","volume-title":"Proceedings of the 14th International Conference on Supercomputing","author":"Yang Chia-Lin","year":"2000","unstructured":"Chia-Lin Yang and Alvin R. Lebeck. 2000. Push vs. Pull: Data movement for linked data structures. In Proceedings of the 14th International Conference on Supercomputing. 176\u2013186."},{"key":"e_1_3_1_79_2","first-page":"601","volume-title":"Proceedings of the USENIX Annual Technical Conference","author":"Yang Shao-Peng","year":"2023","unstructured":"Shao-Peng Yang, Minjae Kim, Sanghyun Nam, Juhyung Park, Jin-Yong Choi, Eyee Hyun Nam, Eunji Lee, Sungjin Lee, and Bryan S. Kim. 2023. Overcoming the memory wall with CXL-enabled SSDs. In Proceedings of the USENIX Annual Technical Conference. 601\u2013617."},{"key":"e_1_3_1_80_2","doi-asserted-by":"publisher","DOI":"10.1145\/3511706"},{"key":"e_1_3_1_81_2","first-page":"736","volume-title":"Proceedings of the IEEE International Symposium on High-Performance Computer Architecture","author":"Ye Chencheng","year":"2021","unstructured":"Chencheng Ye, Yuanchao Xu, Xipeng Shen, Xiaofei Liao, Hai Jin, and Yan Solihin. 2021. Hardware-based address-centric acceleration of key-value store. In Proceedings of the IEEE International Symposium on High-Performance Computer Architecture. 736\u2013748."},{"key":"e_1_3_1_82_2","first-page":"664","volume-title":"Proceedings of the IEEE International Symposium on High-Performance Computer Architecture","author":"Ye Chencheng","year":"2023","unstructured":"Chencheng Ye, Yuanchao Xu, Xipeng Shen, Yan Sha, Xiaofei Liao, Hai Jin, and Yan Solihin. 2023. Reconciling selective logging and hardware persistent memory transaction. In Proceedings of the IEEE International Symposium on High-Performance Computer Architecture. 664\u2013676."},{"key":"e_1_3_1_83_2","first-page":"762","volume-title":"Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems","author":"Ye Chencheng","year":"2023","unstructured":"Chencheng Ye, Yuanchao Xu, Xipeng Shen, Yan Sha, Xiaofei Liao, Hai Jin, and Yan Solihin. 2023. SpecPMT: Speculative logging for resolving crash consistency overhead of persistent memory. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. 762\u2013777."},{"key":"e_1_3_1_84_2","doi-asserted-by":"publisher","DOI":"10.1145\/2830772.2830807"},{"key":"e_1_3_1_85_2","first-page":"609","volume-title":"Proceedings of the 53rd Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Zhang Chao","year":"2020","unstructured":"Chao Zhang, Yuan Zeng, John Shalf, and Xiaochen Guo. 2020. RnR: A software-assisted record-and-replay hardware prefetcher. In Proceedings of the 53rd Annual IEEE\/ACM International Symposium on Microarchitecture. 609\u2013621."},{"key":"e_1_3_1_86_2","first-page":"19","volume-title":"Proceedings of the International Symposium on Advanced Parallel Processing Technologies","author":"Zhou Zhe","year":"2023","unstructured":"Zhe Zhou, Shuotao Xu, Yiqi Chen, Tao Zhang, Ran Shu, Lei Qu, Peng Cheng, Yongqiang Xiong, and Guangyu Sun. 2023. Polaris: Enhancing CXL-based memory expanders with memory-side prefetching. In Proceedings of the International Symposium on Advanced Parallel Processing Technologies. 19\u201339."}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3701994","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3701994","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:57:16Z","timestamp":1750298236000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3701994"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,20]]},"references-count":85,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,3,31]]}},"alternative-id":["10.1145\/3701994"],"URL":"https:\/\/doi.org\/10.1145\/3701994","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,20]]},"assertion":[{"value":"2024-02-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-09-03","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}