{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T00:19:12Z","timestamp":1777421952487,"version":"3.51.4"},"reference-count":35,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2023,8]]},"abstract":"<jats:p>In this work, we focus on the performance benchmarking problem of storage services in cloud-native database systems, which are widely used in various cloud applications. The core idea of these systems is to separate computation and storage in traditional monolithic OLTP databases. Specifically, we first present the characteristics of two representative real I\/O workloads at the storage tier of ByteDance's cloud-native database veDB. We then elaborate the limitations of using standard benchmarks such as TPC-C and YCSB to resemble these workloads. To overcome these limitations, we devise a learning-based I\/O workload benchmark called CDSBen. We demonstrate the superiority of CDSBen by deploying it at ByteDance and showing that its generated I\/O traces accurately resemble the real I\/O traces in production. 
Additionally, we verify the accuracy and flexibility of CDSBen by generating a wide range of I\/O workloads with different I\/O characteristics.<\/jats:p>","DOI":"10.14778\/3611540.3611549","type":"journal-article","created":{"date-parts":[[2023,9,15]],"date-time":"2023-09-15T11:32:37Z","timestamp":1694777557000},"page":"3584-3596","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["CDSBen: Benchmarking the Performance of Storage Services in Cloud-Native Database System at ByteDance"],"prefix":"10.14778","volume":"16","author":[{"given":"Jiashu","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Southern University of Science and Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wen","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Southern University of Science and Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bo","family":"Tang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Southern University of Science and 
Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haoxiang","family":"Ma","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lixun","family":"Cao","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhongbin","family":"Jiang","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuanyuan","family":"Nie","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fan","family":"Wang","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lei","family":"Zhang","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yuming","family":"Liang","sequence":"additional","affiliation":[{"name":"ByteDance"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,8]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.peva.2013.08.006"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.heliyon.2018.e00938"},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Ibrahim Umit Akgun Geoff Kuenning and Erez Zadok. 2020. Re-Animator: Versatile High-Fidelity Storage-System Tracing and Replaying. In SYSTOR.61--74.","DOI":"10.1145\/3383669.3398276"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Ahmad Al-Shishtawy and Vladimir Vlassov. 2013. ElastMan: Elasticity Manager for Elastic Key-Value Stores in the Cloud. In CAC. 
1--10.","DOI":"10.1145\/2494621.2494630"},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Panagiotis Antonopoulos Alex Budovski Cristian Diaconu Alejandro Hernandez Saenz Jack Hu Hanuma Kodavalla Donald Kossmann Sandeep Lingam Umar Farooq Minhas Naveen Prakash Vijendra Purohit Hugh Qu Chaitanya Sreenivas Ravella Krystyna Reisteter Sheetal Shrotri Dixin Tang and Vikram Wakade. 2019. Socrates: The New SQL Server in the Cloud. In SIGMOD. 1743--1756.","DOI":"10.1145\/3299869.3314047"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Esmail Asyabi Yuanli Wang John Liagouris Vasiliki Kalavri and Azer Bestavros. 2022. A New Benchmark Harness for Systematic and Robust Evaluation of Streaming State Stores. In EuroSys. 559--574.","DOI":"10.1145\/3492321.3519592"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"Dina Bitton David J. DeWitt and Carolyn Turbyfill. 1983. Benchmarking Database Systems A Systematic Approach. In VLDB. 8--19.","DOI":"10.1145\/319983.319987"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3229872"},{"key":"e_1_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Wei Cao Yingqiang Zhang Xinjun Yang Feifei Li Sheng Wang Qingda Hu Xuntao Cheng Zongzhi Chen Zhenjun Liu Jing Fang Bo Wang Yuhui Wang Haiqing Sun Ze Yang Zhushi Cheng Sen Chen Jian Wu Wei Hu Jianwei Zhao Yusong Gao Songlu Cai Yunyang Zhang and Jiawang Tong. 2021. PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers. In SIGMOD. 2477--2489.","DOI":"10.1145\/3448016.3457560"},{"key":"e_1_2_1_10_1","volume-title":"Du","author":"Cao Zhichao","year":"2020","unstructured":"Zhichao Cao, Siying Dong, Sagar Vemuri, and David H. C. Du. 2020. Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook. In FAST. 209--223."},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Brian F. Cooper Adam Silberstein Erwin Tam Raghu Ramakrishnan and Russell Sears. 2010. 
Benchmarking Cloud Serving Systems with YCSB. In SoCC. 143--154.","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_12_1","volume-title":"TPC Benchmark C Standard Specification. Retrieved","author":"Transaction Processing Performance Council","year":"2023","unstructured":"Transaction Processing Performance Council. 2010. TPC Benchmark C Standard Specification. Retrieved July 3, 2023 from https:\/\/www.tpc.org\/tpc_documents_current_versions\/pdf\/tpc-c_v5.11.0.pdf"},{"key":"e_1_2_1_13_1","volume-title":"Taurus Database: How to Be Fast, Available, and Frugal in the Cloud. In SIGMOD. 1463--1478.","author":"Depoutovitch Alex","year":"2020","unstructured":"Alex Depoutovitch, Chong Chen, Jin Chen, Paul Larson, Shu Lin, Jack Ng, Wenlin Cui, Qiang Liu, Wei Huang, Yong Xiao, and Yongjun He. 2020. Taurus Database: How to Be Fast, Available, and Frugal in the Cloud. In SIGMOD. 1463--1478."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732240.2732246"},{"key":"e_1_2_1_15_1","unstructured":"Ian J. Goodfellow Jean Pouget-Abadie Mehdi Mirza Bing Xu David Warde-Farley Sherjil Ozair Aaron Courville and Yoshua Bengio. 2014. Generative Adversarial Nets. In NeurIPS. 2672--2680."},{"key":"e_1_2_1_16_1","volume-title":"Benchmark Handbook: For Database and Transaction Processing Systems","author":"Gray Jim","year":"1992","unstructured":"Jim Gray. 1992. Benchmark Handbook: For Database and Transaction Processing Systems. Morgan Kaufmann Publishers Inc."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In CVPR. 770--778.","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_18_1","volume-title":"Sturdivant","author":"Hosmer David W.","year":"2013","unstructured":"David W. Hosmer Jr, Stanley Lemeshow, and Rodney X. Sturdivant. 2013. Applied Logistic Regression. 
John Wiley & Sons."},{"key":"e_1_2_1_19_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_2_1_20_1","volume-title":"Kingma and Max Welling","author":"Diederik","year":"2013","unstructured":"Diederik P. Kingma and Max Welling. 2013. Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114 (2013)."},{"key":"e_1_2_1_21_1","volume-title":"Leutenegger and Daniel Dias","author":"Scott","year":"1993","unstructured":"Scott T. Leutenegger and Daniel Dias. 1993. A Modeling Study of the TPC-C Benchmark. In SIGMOD. 22--31."},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Miodrag Lovric et al. 2011. International Encyclopedia of Statistical Science. Springer Berlin Heidelberg.","DOI":"10.1007\/978-3-642-04898-2"},{"key":"e_1_2_1_23_1","volume-title":"Arpaci-Dusseau","author":"Lu Lanyue","year":"2016","unstructured":"Lanyue Lu, Thanumalayan S. Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. WiscKey: Separating Keys from Values in SSD-conscious Storage. In FAST. 133--148."},{"key":"e_1_2_1_24_1","unstructured":"Michael P. Mesnier. 2007. \/\/TRACE: Parallel Trace Replay with Approximate Causal Events. In FAST. 153--167."},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Zhu Pang Qingda Lu Shuo Chen Rui Wang Yikang Xu and Jiesheng Wu. 2021. ArkDB: A Key-Value Engine for Scalable Cloud Storage Services. In SIGMOD. 2570--2583.","DOI":"10.1145\/3448016.3457553"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078195"},{"key":"e_1_2_1_27_1","volume-title":"Miller","author":"Pitchumani Rekha","year":"2015","unstructured":"Rekha Pitchumani, Shayna Frank, and Ethan L. Miller. 2015. Realistic Request Arrival Generation in Storage Benchmarks. In MSST. 
1--10."},{"key":"e_1_2_1_28_1","first-page":"3272","article-title":"Realistic and Scalable Benchmarking Cloud File Systems: Practices and Lessons from AliCloud","volume":"28","author":"Ren Zujie","year":"2017","unstructured":"Zujie Ren, Weisong Shi, Jian Wan, Feng Cao, and Jiangbin Lin. 2017. Realistic and Scalable Benchmarking Cloud File Systems: Practices and Lessons from AliCloud. TPDS 28, 11 (2017), 3272--3285.","journal-title":"TPDS"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Zujie Ren Biao Xu Weisong Shi Yongjian Ren Feng Cao Jiangbin Lin and Zheng Ye. 2016. iGen: A Realistic Request Generator for Cloud File Systems Benchmarking. In CLOUD. 343--350.","DOI":"10.1109\/CLOUD.2016.0053"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1018628609742"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1021\/ci034160g"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TGE.1977.6498972"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Rebecca Taft Irfan Sharif Andrei Matei Nathan VanBenschoten Jordan Lewis Tobias Grieger Kai Niemi Andy Woods Anne Birzin Raphael Poss Paul Bardea Amruta Ranade Ben Darnell Bram Gruneir Justin Jaffray Lucy Zhang and Peter Mattis. 2020. CockroachDB: The Resilient Geo-Distributed SQL Database. In SIGMOD. 
1493--1509.","DOI":"10.1145\/3318464.3386134"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1367829.1367831"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056101"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3611540.3611549","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,10]],"date-time":"2025-09-10T22:37:35Z","timestamp":1757543855000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3611540.3611549"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8]]},"references-count":35,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2023,8]]}},"alternative-id":["10.14778\/3611540.3611549"],"URL":"https:\/\/doi.org\/10.14778\/3611540.3611549","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2023,8]]},"assertion":[{"value":"2023-08-01","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}