{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T20:58:53Z","timestamp":1757624333142,"version":"3.44.0"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","funder":[{"name":"Semiconductor Research Corporation (SRC) JUMP 2.0 Program","award":["2023-JU-3134"],"award-info":[{"award-number":["2023-JU-3134"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,7,20]]},"DOI":"10.1145\/3731545.3731579","type":"proceedings-article","created":{"date-parts":[[2025,9,9]],"date-time":"2025-09-09T12:46:16Z","timestamp":1757421976000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["AutoSSD: CXL-Enhanced Autonomous SSDs for Low Tail Latency"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-2061-9013","authenticated-orcid":false,"given":"Mingyao","family":"Shen","sequence":"first","affiliation":[{"name":"University of California San Diego, La Jolla, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5287-5317","authenticated-orcid":false,"given":"Suyash","family":"Mahar","sequence":"additional","affiliation":[{"name":"University of California San Diego, La Jolla, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6748-2890","authenticated-orcid":false,"given":"Heewoo","family":"Kim","sequence":"additional","affiliation":[{"name":"University of Colorado, Boulder, Boulder, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-1267-5024","authenticated-orcid":false,"given":"Joseph","family":"Izraelevitz","sequence":"additional","affiliation":[{"name":"University of Colorado, Boulder, Boulder, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5896-1037","authenticated-orcid":false,"given":"Steven","family":"Swanson","sequence":"additional","affiliation":[{"name":"University of California San Diego, La Jolla, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,9,9]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Compute Express Link\u2122: The breakthrough cpu-to-device interconnect CXL\u2122. https:\/\/www.computeexpresslink.org\/ Accessed: 2023-12-22."},{"key":"e_1_3_2_1_2_1","unstructured":"Inside language models (from GPT to Olympus). https:\/\/lifearchitect.ai\/models\/ Accessed: 2023-12-22."},{"key":"e_1_3_2_1_3_1","volume-title":"captured, copied, and consumed worldwide from 2010 to","author":"Volume","year":"2020","unstructured":"Volume of data\/information created, captured, copied, and consumed worldwide from 2010 to 2020, with forecasts from 2021 to 2025. https:\/\/www.statista.com\/statistics\/871513\/worldwide-data-created\/, Accessed: 2023-12-22."},{"key":"e_1_3_2_1_4_1","volume-title":"https:\/\/github.com\/brianfrankcooper\/YCSB\/tree\/master\/rocksdb","author":"YCSB","year":"2020","unstructured":"YCSB on RocksDB. https:\/\/github.com\/brianfrankcooper\/YCSB\/tree\/master\/rocksdb, 2020. [Online; accessed 12-October-2024]."},{"key":"e_1_3_2_1_5_1","volume-title":"https:\/\/www.marvell.com\/content\/dam\/marvell\/en\/public-collateral\/storage\/marvell-ssd-mv-ss1331-1333-product-brief.pdf","author":"Controllers Marvell\u00ae Bravera\u2122","year":"2021","unstructured":"Marvell\u00ae Bravera\u2122 SC5 SSD Controllers. https:\/\/www.marvell.com\/content\/dam\/marvell\/en\/public-collateral\/storage\/marvell-ssd-mv-ss1331-1333-product-brief.pdf, 2021. [Online; accessed 28-Jan-2025]."},{"key":"e_1_3_2_1_6_1","volume-title":"https:\/\/www.extremetech.com\/computing\/325146-why-latency-impacts-ssd-performance-more-than-bandwidth-does","author":"Why","year":"2021","unstructured":"Why latency impacts ssd performance more than bandwidth does. https:\/\/www.extremetech.com\/computing\/325146-why-latency-impacts-ssd-performance-more-than-bandwidth-does, 2021. [Online; accessed 15-September-2024]."},{"key":"e_1_3_2_1_7_1","volume-title":"https:\/\/iotta.snia.org\/traces\/block-io","author":"Traces Block","year":"2022","unstructured":"Block I\/O Traces. https:\/\/iotta.snia.org\/traces\/block-io, 2022. [Online; accessed 12-October-2024]."},{"key":"e_1_3_2_1_8_1","volume-title":"https:\/\/github.com\/axboe\/fio","author":"Tester Flexible","year":"2022","unstructured":"Flexible I\/O Tester. https:\/\/github.com\/axboe\/fio, 2022. [Online; accessed 12-October-2024]."},{"key":"e_1_3_2_1_9_1","volume-title":"https:\/\/iotta.snia.org\/traces\/parallel","author":"Traces Parallel","year":"2022","unstructured":"Parallel Traces. https:\/\/iotta.snia.org\/traces\/parallel, 2022. [Online; accessed 12-October-2024]."},{"key":"e_1_3_2_1_10_1","volume-title":"https:\/\/spdk.io","author":"Development Storage Performance","year":"2023","unstructured":"Storage Performance Development Kit (SPDK). https:\/\/spdk.io, 2023. [Online; accessed 08-Jan-2022]."},{"key":"e_1_3_2_1_11_1","volume-title":"https:\/\/docs.kernel.org\/block\/ublk.html","author":"Userspace","year":"2023","unstructured":"Userspace block device driver (ublk driver). https:\/\/docs.kernel.org\/block\/ublk.html, 2023. [Online; accessed 08-Jan-2022]."},{"key":"e_1_3_2_1_12_1","volume-title":"Choosing the Best Storage for Your Needs. https:\/\/www.hp.com\/us-en\/shop\/tech-takes\/ssd-vs-hdd","author":"SSD","year":"2024","unstructured":"SSD vs HDD: Choosing the Best Storage for Your Needs. https:\/\/www.hp.com\/us-en\/shop\/tech-takes\/ssd-vs-hdd, 2024. [Online; accessed 15-September-2024]."},{"key":"e_1_3_2_1_13_1","first-page":"11","volume-title":"Proceedings of the 51st International Conference on Parallel Processing","author":"Arif Moiz","year":"2022","unstructured":"Moiz Arif, Kevin Assogba, M Mustafa Rafique, and Sudharshan Vazhkudai. Exploiting cxl-based memory for distributed deep learning. In Proceedings of the 51st International Conference on Parallel Processing, pages 1\u201311, 2022."},{"key":"e_1_3_2_1_14_1","unstructured":"Jens Axboe. Flexible I\/O Tester 2017. https:\/\/github.com\/axboe\/fio."},{"key":"e_1_3_2_1_15_1","first-page":"600","volume-title":"2021 58th ACM\/IEEE Design Automation Conference (DAC)","author":"Chen Yun-Chih","unstructured":"Yun-Chih Chen, Chun-Feng Wu, Yuan-Hao Chang, and Tei-Wei Kuo. Reptail: Cutting storage tail latency with inherent redundancy. In 2021 58th ACM\/IEEE Design Automation Conference (DAC), pages 595\u2013600. IEEE, 2021."},{"key":"e_1_3_2_1_16_1","first-page":"1694","volume-title":"Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data","author":"Colgrove John","year":"2015","unstructured":"John Colgrove, John D Davis, John Hayes, Ethan L Miller, Cary Sandvig, Russell Sears, Ari Tamches, Neil Vachharajani, and Feng Wang. Purity: Building fast, highly-available enterprise flash storage from commodity components. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1683\u20131694, 2015."},{"key":"e_1_3_2_1_17_1","first-page":"154","volume-title":"Russell Sears. Benchmarking Cloud Serving Systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC '10","author":"Cooper Brian F.","unstructured":"Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking Cloud Serving Systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC '10, pages 143\u2013154. ACM, 2010."},{"key":"e_1_3_2_1_18_1","volume-title":"emucxl: an emulation framework for cxl-based disaggregated memory applications. arXiv preprint arXiv:2404.08311","author":"Gond Raja","year":"2024","unstructured":"Raja Gond and Purushottam Kulkarni. emucxl: an emulation framework for cxl-based disaggregated memory applications. arXiv preprint arXiv:2404.08311, 2024."},{"key":"e_1_3_2_1_19_1","first-page":"276","volume-title":"14th USENIX Conference on File and Storage Technologies (FAST 16)","author":"Hao Mingzhe","year":"2016","unstructured":"Mingzhe Hao, Gokul Soundararajan, Deepak Kenchammana-Hosekote, Andrew A Chien, and Haryadi S Gunawi. The tail at store: A revelation from millions of hours of disk and SSD deployments. In 14th USENIX Conference on File and Storage Technologies (FAST 16), pages 263\u2013276, 2016."},{"key":"e_1_3_2_1_20_1","first-page":"195","volume-title":"14th USENIX Conference on File and Storage Technologies (FAST 16)","author":"Harter Tyler","year":"2016","unstructured":"Tyler Harter, Brandon Salmon, Rose Liu, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. Slacker: Fast distribution with lazy docker containers. In 14th USENIX Conference on File and Storage Technologies (FAST 16), pages 181\u2013195, 2016."},{"issue":"1","key":"e_1_3_2_1_21_1","first-page":"80","article-title":"Flash-aware RAID techniques for dependable and high-performance flash memory SSD","volume":"60","author":"Im Soojun","year":"2010","unstructured":"Soojun Im and Dongkun Shin. Flash-aware RAID techniques for dependable and high-performance flash memory SSD. IEEE Transactions on Computers, 60(1):80\u201392, 2010.","journal-title":"IEEE Transactions on Computers"},{"key":"e_1_3_2_1_22_1","first-page":"370","volume-title":"19th USENIX Conference on File and Storage Technologies (FAST 21)","author":"Jiang Tianyang","year":"2021","unstructured":"Tianyang Jiang, Guangyan Zhang, Zican Huang, Xiaosong Ma, Junyu Wei, Zhiyue Li, and Weimin Zheng. FusionRAID: Achieving consistent low latency for commodity SSD arrays. In 19th USENIX Conference on File and Storage Technologies (FAST 21), pages 355\u2013370, 2021."},{"key":"e_1_3_2_1_23_1","volume-title":"Reinforcement learning-assisted garbage collection to mitigate long-tail latency in SSD. ACM Transactions on Embedded Computing Systems (TECS), 16(5s):1\u201320","author":"Kang Wonkyung","year":"2017","unstructured":"Wonkyung Kang, Dongkun Shin, and Sungjoo Yoo. Reinforcement learning-assisted garbage collection to mitigate long-tail latency in SSD. ACM Transactions on Embedded Computing Systems (TECS), 16(5s):1\u201320, 2017."},{"key":"e_1_3_2_1_24_1","first-page":"812","volume-title":"2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Kim Jaeho","year":"2019","unstructured":"Jaeho Kim, Kwanghyun Lim, Youngdon Jung, Sungjin Lee, Changwoo Min, and Sam H Noh. Alleviating garbage collection interference through spatial separation in all flash arrays. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 799\u2013812, 2019."},{"key":"e_1_3_2_1_25_1","first-page":"820","volume-title":"2019 USENIX Annual Technical Conference (USENIX ATC 19)","author":"Kim Shine","year":"2019","unstructured":"Shine Kim, Jonghyun Bae, Hakbeom Jang, Wenjing Jin, Jeonghun Gong, Seungyeon Lee, Tae Jun Ham, and Jae W Lee. Practical erase suspension for modern low-latency SSDs. In 2019 USENIX Annual Technical Conference (USENIX ATC 19), pages 813\u2013820, 2019."},{"key":"e_1_3_2_1_26_1","first-page":"30","volume-title":"Proceedings of the 15th ACM Workshop on Hot Topics in Storage and File Systems","author":"Kwon Miryeong","year":"2023","unstructured":"Miryeong Kwon, Sangwon Lee, and Myoungsoo Jung. Cache in hand: Expander-driven CXL prefetcher for next generation CXL-SSD. In Proceedings of the 15th ACM Workshop on Hot Topics in Storage and File Systems, pages 24\u201330, 2023."},{"key":"e_1_3_2_1_27_1","first-page":"11","volume-title":"Proceedings of the 10th ACM International Systems and Storage Conference","author":"Lee Chunghan","year":"2017","unstructured":"Chunghan Lee, Tatsuo Kumano, Tatsuma Matsuki, Hiroshi Endo, Naoto Fukumoto, and Mariko Sugawara. Understanding storage traffic characteristics on enterprise virtual desktop infrastructure. In Proceedings of the 10th ACM International Systems and Storage Conference, pages 1\u201311, 2017."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3575693.3578835"},{"key":"e_1_3_2_1_29_1","volume-title":"Hpda: A hybrid parity-based disk array for enhanced performance and reliability. ACM Transactions on Storage (TOS), 8(1):1\u201320","author":"Mao Bo","year":"2012","unstructured":"Bo Mao, Hong Jiang, Suzhen Wu, Lei Tian, Dan Feng, Jianxi Chen, and Lingfang Zeng. Hpda: A hybrid parity-based disk array for enhanced performance and reliability. ACM Transactions on Storage (TOS), 8(1):1\u201320, 2012."},{"key":"e_1_3_2_1_30_1","first-page":"755","volume-title":"Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems","volume":"3","author":"Maruf Hasan Al","year":"2023","unstructured":"Hasan Al Maruf, Hao Wang, Abhishek Dhanotia, Johannes Weiner, Niket Agarwal, Pallab Bhattacharya, Chris Petersen, Mosharaf Chowdhury, Shobhit Kanaujia, and Prakash Chauhan. Tpp: Transparent page placement for cxl-enabled tieredmemory. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pages 742\u2013755, 2023."},{"key":"e_1_3_2_1_31_1","volume-title":"Blocks and Files","author":"Mellor Chris","year":"2024","unstructured":"Chris Mellor. Fadu's ssd controller edge: Multiple parallel ppus. Blocks and Files, December 3 2024. Accessed: 2025-01-28."},{"issue":"165","key":"e_1_3_2_1_32_1","first-page":"7","article-title":"Analysis of commercial cloud workload and study on how to apply cache methods. IEICE Technical Report;","volume":"118","author":"Oe Kazuichi","year":"2018","unstructured":"Kazuichi Oe, Kazutaka Ogihara, and Takeo Honda. Analysis of commercial cloud workload and study on how to apply cache methods. IEICE Technical Report; IEICE Tech. Rep., 118(165):7\u201312, 2018.","journal-title":"IEICE Tech. Rep."},{"key":"e_1_3_2_1_33_1","first-page":"145","volume-title":"2009 9th International Symposium on Communications and Information Technology","author":"Park Kwanghee","unstructured":"Kwanghee Park, Dong-Hwan Lee, Youngjoo Woo, Geunhyung Lee, Ju-Hong Lee, and Deok-Hwan Kim. Reliability and performance enhancement technique for SSD array storage system using RAID mechanism. In 2009 9th International Symposium on Communications and Information Technology, pages 140\u2013145. IEEE, 2009."},{"key":"e_1_3_2_1_34_1","volume-title":"ServeTheHome","author":"Robinson Cliff","year":"2023","unstructured":"Cliff Robinson. Kioxia cxl and bics flash ssd shown at fms 2023. ServeTheHome, August 10 2023."},{"key":"e_1_3_2_1_35_1","volume-title":"March 14","year":"2024","unstructured":"Samsung. Samsung cxl solutions - cmm-h, March 14 2024. Accessed: 2025-01-28."},{"key":"e_1_3_2_1_36_1","volume-title":"ACM Transactions on Architecture and Code Optimization (TACO), 18(4):1\u201325","author":"Sha Zhibing","year":"2021","unstructured":"Zhibing Sha, Jun Li, Lihao Song, Jiewen Tang, Min Huang, Zhigang Cai, Lianju Qian, Jianwei Liao, and Zhiming Liu. Low I\/O intensity-aware partial GC scheduling to reduce long-tail latency in SSDs. ACM Transactions on Architecture and Code Optimization (TACO), 18(4):1\u201325, 2021."},{"key":"e_1_3_2_1_37_1","first-page":"12","volume-title":"2022 IEEE Symposium on High-Performance Interconnects (HOTI)","author":"Sharma Debendra Das","unstructured":"Debendra Das Sharma. Compute express link\u00ae: An open industry-standard interconnect enabling heterogeneous data-centric computing. In 2022 IEEE Symposium on High-Performance Interconnects (HOTI), pages 5\u201312. IEEE, 2022."},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3613424.3614256"},{"key":"e_1_3_2_1_39_1","first-page":"305","volume-title":"2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","author":"Wu Suzhen","unstructured":"Suzhen Wu, Weidong Zhu, Guixin Liu, Hong Jiang, and Bo Mao. Gc-aware request steering with improved performance and reliability for ssd-based raids. In 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 296\u2013305. IEEE, 2018."},{"key":"e_1_3_2_1_40_1","first-page":"103","volume-title":"2014 IEEE Fourth International Conference on Big Data and Cloud Computing","author":"Wu Xiaoquan","unstructured":"Xiaoquan Wu, Nong Xiao, Fang Liu, Zhiguang Chen, Yimo Du, and Yuxuan Xing. RAID-aware SSD: improving the write performance and lifespan of SSD in SSD-based RAID-5 system. In 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, pages 99\u2013103. IEEE, 2014."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"crossref","unstructured":"Shiqin Yan Huaicheng Li Mingzhe Hao Michael Hao Tong Swaminathan Sundararaman Andrew A Chien and Haryadi S Gunawi. Tiny-tail flash: Near-perfect elimination of garbage collection tail latencies in NAND SSDs. ACM Transactions on Storage (TOS) 13(3):1\u201326 2017.","DOI":"10.1145\/3121133"},{"key":"e_1_3_2_1_42_1","first-page":"617","volume-title":"2023 USENIX Annual Technical Conference (USENIX ATC 23)","author":"Yang Shao-Peng","year":"2023","unstructured":"Shao-Peng Yang, Minjae Kim, Sanghyun Nam, Juhyung Park, Jin-Yong Choi, Eyee Hyun Nam, Eunji Lee, Sungjin Lee, and Bryan S Kim. Overcoming the memory wall with CXL-enabled SSDs. In 2023 USENIX Annual Technical Conference (USENIX ATC 23), pages 601\u2013617, 2023."},{"key":"e_1_3_2_1_43_1","first-page":"294","volume-title":"16th USENIX Conference on File and Storage Technologies (FAST 18)","author":"Zhang Guangyan","year":"2018","unstructured":"Guangyan Zhang, Zican Huang, Xiaosong Ma, Songlin Yang, Zhufan Wang, and Weimin Zheng. RAID+: Deterministic and balanced data distribution for large disk enclosures. In 16th USENIX Conference on File and Storage Technologies (FAST 18), pages 279\u2013294, Oakland, CA, February 2018. USENIX Association."},{"key":"e_1_3_2_1_44_1","first-page":"798","volume-title":"2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Zhang Yu","year":"2020","unstructured":"Yu Zhang, Ping Huang, Ke Zhou, Hua Wang, Jianying Hu, Yongguang Ji, and Bin Cheng. {OSCA}: An {Online-Model} based cache allocation scheme in cloud block storage systems. In 2020 USENIX Annual Technical Conference (USENIX ATC 20), pages 785\u2013798, 2020."}],"event":{"name":"HPDC '25: 34th International Symposium on High-Performance Parallel and Distributed Computing","location":"University of Notre Dame Conference Facilities Notre Dame IN USA","acronym":"HPDC '25","sponsor":["SIGHPC ACM Special Interest Group on High Performance Computing, Special Interest Group on High Performance Computing","SIGARCH ACM Special Interest Group on Computer Architecture"]},"container-title":["Proceedings of the 34th International Symposium on High-Performance Parallel and Distributed Computing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3731545.3731579","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,9]],"date-time":"2025-09-09T12:50:21Z","timestamp":1757422221000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3731545.3731579"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,20]]},"references-count":44,"alternative-id":["10.1145\/3731545.3731579","10.1145\/3731545"],"URL":"https:\/\/doi.org\/10.1145\/3731545.3731579","relation":{},"subject":[],"published":{"date-parts":[[2025,7,20]]},"assertion":[{"value":"2025-09-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}