{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T13:58:14Z","timestamp":1762005494635,"version":"3.44.0"},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2023,12,8]],"date-time":"2023-12-08T00:00:00Z","timestamp":1701993600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Direct Grant for Research, The Chinese University of Hong Kong","award":["Project No. 4055151"],"award-info":[{"award-number":["Project No. 4055151"]}]},{"name":"Research Grants Council of the Hong Kong Special Administrative Region, China","award":["GRF 14219422 and GRF 14202123"],"award-info":[{"award-number":["GRF 14219422 and GRF 14202123"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,12,8]]},"abstract":"<jats:p>LSM-based key-value stores have been leveraged in many state-of-the-art data-intensive applications as storage engines. As data volume scales up, a cost-efficient approach is to deploy these applications on hybrid cloud storage with hot\/cold separation, which splits the LSM-tree into two parts and thus brings new challenges on how to split and how to close the significant performance gap between these two parts. Existing LSM-tree key-value stores mainly focus on the optimizations of local storage, which incurs sub-optimal performance when directly applied to hybrid storage.<\/jats:p>\n          <jats:p>In this paper, we present MirrorKV for efficient compaction and querying on hybrid cloud storage. First, based on the capacities of fast and slow cloud storage, MirrorKV vertically separates hot\/cold data of different levels stored in different cloud storage with different compaction mechanisms. 
To avoid compaction in slow storage being the bottleneck of the write path, MirrorKV proposes a novel virtual split to only compact the metadata during the compaction, which postpones the actual compaction until it reaches deep enough levels. Second, to reduce accessing slow storage during querying, MirrorKV horizontally separates keys and values into two mirrored LSM-trees to differentiate caching priorities; the maintained tree structures preserve the data locality for efficient sequential reading without incurring the overhead of the traditional key-value separation solutions. Finally, MirrorKV leverages cached data to guide the compaction where the hot data is retained in the fast storage while the cold data is compacted to deeper levels in slow storage. Compared with RocksDB-cloud, MirrorKV achieves 2.4\u00d7 higher random insertion throughput, 29% higher random read throughput, and 99% less compaction time.<\/jats:p>","DOI":"10.1145\/3626736","type":"journal-article","created":{"date-parts":[[2023,12,12]],"date-time":"2023-12-12T14:01:21Z","timestamp":1702389681000},"page":"1-27","source":"Crossref","is-referenced-by-count":3,"title":["MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4536-6252","authenticated-orcid":false,"given":"Zhiqi","family":"Wang","sequence":"first","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2173-2847","authenticated-orcid":false,"given":"Zili","family":"Shao","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, Hong Kong"}]}],"member":"320","published-online":{"date-parts":[[2023,12,12]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376645"},{"key":"e_1_2_1_2_1","volume-title":"2018 USENIX Annual Technical Conference 
(USENIX ATC 18)","author":"Chan Helen H. W.","year":"2018","unstructured":"Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, and Yinlong Xu. 2018. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 1007--1019. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/chan"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.14778\/3485450.3485461"},{"key":"e_1_2_1_4_1","volume-title":"Cost-Effective LSM-tree Based KV Store on Hybrid Storage. In 19th USENIX Conference on File and Storage Technologies (FAST 21)","author":"Chen Hao","year":"2021","unstructured":"Hao Chen, Chaoyi Ruan, Cheng Li, Xiaosong Ma, and Yinlong Xu. 2021. SpanDB: A Fast, Cost-Effective LSM-tree Based KV Store on Hybrid Storage. In 19th USENIX Conference on File and Storage Technologies (FAST 21). USENIX Association, 17--32. https:\/\/www.usenix.org\/conference\/fast21\/presentation\/chen-hao"},{"key":"e_1_2_1_5_1","unstructured":"cockroachdb 2022. CockroachDB. https:\/\/www.cockroachlabs.com\/product\/."},{"key":"e_1_2_1_6_1","volume-title":"SplinterDB: Closing the Bandwidth Gap for NVMe Key-Value Stores. In 2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Conway Alexander","year":"2020","unstructured":"Alexander Conway, Abhishek Gupta, Vijay Chidambaram, Martin Farach-Colton, Richard Spillane, Amy Tai, and Rob Johnson. 2020. SplinterDB: Closing the Bandwidth Gap for NVMe Key-Value Stores. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). USENIX Association, 49--63. https:\/\/www.usenix.org\/conference\/atc20\/presentation\/conway"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_8_1","volume-title":"ElasTraS: An Elastic Transactional Data Store in the Cloud. 
In Workshop on Hot Topics in Cloud Computing (HotCloud 09)","author":"Das Sudipto","year":"2009","unstructured":"Sudipto Das, Amr El Abbadi, and Divyakant Agrawal. 2009. ElasTraS: An Elastic Transactional Data Store in the Cloud. In Workshop on Hot Topics in Cloud Computing (HotCloud 09). USENIX Association, San Diego, CA. https:\/\/www.usenix.org\/conference\/hotcloud-09\/elastras-elastic-transactional-data-store-cloud"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064054"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196927"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3319903"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457273"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.14778\/3551793.3551853"},{"key":"e_1_2_1_14_1","unstructured":"ebs 2022. Amazon Elastic Block Store. https:\/\/aws.amazon.com\/ebs\/."},{"key":"e_1_2_1_15_1","unstructured":"ebs_features 2022. Amazon EBS features. https:\/\/aws.amazon.com\/ebs\/features\/?nc1=h_ls."},{"key":"e_1_2_1_16_1","unstructured":"efs 2022. Amazon Elastic File System. https:\/\/aws.amazon.com\/efs\/."},{"key":"e_1_2_1_17_1","unstructured":"flink 2022. Apache Flink - Stateful Computations over Data Streams. https:\/\/flink.apache.org\/."},{"key":"e_1_2_1_18_1","unstructured":"gcs 2022. Google Cloud Storage. https:\/\/cloud.google.com\/storage\/."},{"key":"e_1_2_1_19_1","unstructured":"gfs 2022. Cloud Filestore. https:\/\/cloud.google.com\/filestore."},{"key":"e_1_2_1_20_1","unstructured":"gpd 2022. Cloud Persistent Disk. https:\/\/cloud.google.com\/persistent-disk."},{"key":"e_1_2_1_21_1","volume-title":"SplitKV: Splitting IO Paths for Different Sized Key-Value Items with Advanced Storage Devices. In 12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 20)","author":"Han Shukai","year":"2020","unstructured":"Shukai Han, Dejun Jiang, and Jin Xiong. 2020. 
SplitKV: Splitting IO Paths for Different Sized Key-Value Items with Advanced Storage Devices. In 12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 20). USENIX Association. https:\/\/www.usenix.org\/conference\/hotstorage20\/presentation\/han"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/645923.671013"},{"key":"e_1_2_1_23_1","volume-title":"SLM-DB: Single-Level Key-Value Store with Persistent Memory. In 17th USENIX Conference on File and Storage Technologies (FAST 19)","author":"Kaiyrakhmet Olzhas","year":"2019","unstructured":"Olzhas Kaiyrakhmet, Songyi Lee, Beomseok Nam, Sam H. Noh, and Young ri Choi. 2019. SLM-DB: Single-Level Key-Value Store with Persistent Memory. In 17th USENIX Conference on File and Storage Technologies (FAST 19). USENIX Association, Boston, MA, 191--205. https:\/\/www.usenix.org\/conference\/fast19\/presentation\/kaiyrakhmet"},{"key":"e_1_2_1_24_1","volume-title":"Redesigning LSMs for Nonvolatile Memory with NoveLSM. In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Kannan Sudarsun","year":"2018","unstructured":"Sudarsun Kannan, Nitish Bhat, Ada Gavrilovska, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. 2018. Redesigning LSMs for Nonvolatile Memory with NoveLSM. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 993--1005. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/kannan"},{"key":"e_1_2_1_25_1","volume-title":"Selecta: Heterogeneous Cloud Storage Configuration for Data Analytics. In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Klimovic Ana","year":"2018","unstructured":"Ana Klimovic, Heiner Litz, and Christos Kozyrakis. 2018. Selecta: Heterogeneous Cloud Storage Configuration for Data Analytics. In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 759--773. 
https:\/\/www.usenix.org\/conference\/atc18\/presentation\/klimovic-selecta"},{"key":"e_1_2_1_26_1","volume-title":"Pocket: Elastic Ephemeral Storage for Serverless Analytics. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Klimovic Ana","year":"2018","unstructured":"Ana Klimovic, Yawen Wang, Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, and Christos Kozyrakis. 2018. Pocket: Elastic Ephemeral Storage for Serverless Analytics. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, Carlsbad, CA, 427--444. https:\/\/www.usenix.org\/conference\/osdi18\/presentation\/klimovic"},{"key":"e_1_2_1_27_1","unstructured":"leveldbbench 2022. LevelDB bench tool. https:\/\/github.com\/google\/leveldb\/blob\/master\/benchmarks\/db_bench.cc."},{"key":"e_1_2_1_28_1","volume-title":"Alluxio: A Virtual Distributed File System.","author":"Li Haoyuan","year":"2018","unstructured":"Haoyuan Li. 2018. Alluxio: A Virtual Distributed File System."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3387902.3392621"},{"key":"e_1_2_1_30_1","volume-title":"2021 USENIX Annual Technical Conference (USENIX ATC 21)","author":"Li Yongkun","year":"2021","unstructured":"Yongkun Li, Zhen Liu, Patrick P. C. Lee, Jiayu Wu, Yinlong Xu, Yi Wu, Liu Tang, Qi Liu, and Qiu Cui. 2021. Differentiated Key-Value Storage Management for Balanced I\/O Performance. In 2021 USENIX Annual Technical Conference (USENIX ATC 21). USENIX Association, 673--687. https:\/\/www.usenix.org\/conference\/atc21\/presentation\/li-yongkun"},{"key":"e_1_2_1_31_1","volume-title":"Metis: Robustly Tuning Tail Latencies of Cloud Systems. In 2018 USENIX Annual Technical Conference (USENIX ATC 18)","author":"Li Zhao Lucis","year":"2018","unstructured":"Zhao Lucis Li, Chieh-Jan Mike Liang, Wenjia He, Lianjie Zhu, Wenjun Dai, Jin Jiang, and Guangzhong Sun. 2018. Metis: Robustly Tuning Tail Latencies of Cloud Systems. 
In 2018 USENIX Annual Technical Conference (USENIX ATC 18). USENIX Association, Boston, MA, 981--992. https:\/\/www.usenix.org\/conference\/atc18\/presentation\/li-zhao"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3033273"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-019-00555-y"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3372716.3372719"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389731"},{"key":"e_1_2_1_36_1","unstructured":"mongodb 2022. MongoDB. https:\/\/www.mongodb.com\/."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.48786\/edbt.2023.11"},{"key":"e_1_2_1_38_1","unstructured":"myrocks 2022. A RocksDB storage engine with MySQL. http:\/\/myrocks.io\/."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1007\/s002360050048"},{"key":"e_1_2_1_40_1","unstructured":"presto 2022. Presto - Distributed SQL Query Engine for Big Data. https:\/\/prestodb.io\/."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132747.3132765"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/3151106.3151108"},{"key":"e_1_2_1_43_1","unstructured":"rocksdbcache 2022. RocksDB Block Cache. https:\/\/github.com\/facebook\/rocksdb\/wiki\/Block-Cache."},{"key":"e_1_2_1_44_1","unstructured":"rocksdbcloud 2022. RocksDB-Cloud: A Key-Value Store for Cloud Applications. https:\/\/github.com\/rockset\/rocksdb-cloud."},{"key":"e_1_2_1_45_1","unstructured":"s3 2022. Amazon S3. https:\/\/aws.amazon.com\/s3\/."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476274"},{"key":"e_1_2_1_47_1","unstructured":"spark 2022. Apache Spark - Lightning-fast unified analytics engine. https:\/\/spark.apache.org\/."},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid.2015.150"},{"key":"e_1_2_1_49_1","unstructured":"tidb 2022. PingCAP. 
https:\/\/pingcap.com\/products\/tidb."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529347"},{"key":"e_1_2_1_51_1","unstructured":"Ziwei Wang Zheng Zhong Jiarui Guo Yuhan Wu Haoyu Li Tong Yang Yaofeng Tu Huanchen Zhang and Bin Cui. [n. d.]. REncoder: A Space-Time Efficient Range Filter with Local Encoder. ([n. d.])."},{"key":"e_1_2_1_52_1","unstructured":"wisckeygithub 2022. Implementation of WiscKey. https:\/\/github.com\/cld378632668\/Wisckey_SeparateKVStorage."},{"key":"e_1_2_1_53_1","volume-title":"LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data Items. In 2015 USENIX Annual Technical Conference (USENIX ATC 15)","author":"Wu Xingbo","year":"2015","unstructured":"Xingbo Wu, Yuehai Xu, Zili Shao, and Song Jiang. 2015. LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data Items. In 2015 USENIX Annual Technical Conference (USENIX ATC 15). USENIX Association, Santa Clara, CA, 71--82. https:\/\/www.usenix.org\/conference\/atc15\/technical-session\/presentation\/wu"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/Cluster48925.2021.00032"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2019.2918219"},{"key":"e_1_2_1_56_1","volume-title":"GearDB: A GC-free Key-Value Store on HM-SMR Drives with Gear Compaction. In 17th USENIX Conference on File and Storage Technologies (FAST 19)","author":"Yao Ting","year":"2019","unstructured":"Ting Yao, Jiguang Wan, Ping Huang, Yiwen Zhang, Zhiwen Liu, Changsheng Xie, and Xubin He. 2019. GearDB: A GC-free Key-Value Store on HM-SMR Drives with Gear Compaction. In 17th USENIX Conference on File and Storage Technologies (FAST 19) (Boston, MA, USA) (FAST'19). USENIX Association, Boston, MA, 159--171. https:\/\/www.usenix.org\/conference\/fast19\/presentation\/yao"},{"key":"e_1_2_1_57_1","volume-title":"MatrixKV: Reducing Write Stalls and Write Amplification in LSM-tree Based KV Stores with Matrix Container in NVM. 
In 2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Yao Ting","year":"2020","unstructured":"Ting Yao, Yiwen Zhang, Jiguang Wan, Qiu Cui, Liu Tang, Hong Jiang, Changsheng Xie, and Xubin He. 2020. MatrixKV: Reducing Write Stalls and Write Amplification in LSM-tree Based KV Stores with Matrix Container in NVM. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). USENIX Association, 17--31. https:\/\/www.usenix.org\/conference\/atc20\/presentation\/yao"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3267809.3267846"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196931"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE48307.2020.00034"},{"volume-title":"18th USENIX Conference on File and Storage Technologies (FAST 20)","author":"Dong Siying","key":"e_1_2_1_61_1","unstructured":"Zhichao Cao, Siying Dong, Sagar Vemuri, and David H.C. Du. 2020. Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook. In 18th USENIX Conference on File and Storage Technologies (FAST 20). USENIX Association, Santa Clara, CA, 209--223. 
https:\/\/www.usenix.org\/conference\/fast20\/presentation\/cao-zhichao"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.14778\/3311880.3311885"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626736","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3626736","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,22]],"date-time":"2025-08-22T12:59:46Z","timestamp":1755867586000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626736"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,8]]},"references-count":62,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12,8]]}},"alternative-id":["10.1145\/3626736"],"URL":"https:\/\/doi.org\/10.1145\/3626736","relation":{},"ISSN":["2836-6573"],"issn-type":[{"type":"electronic","value":"2836-6573"}],"subject":[],"published":{"date-parts":[[2023,12,8]]}}}