{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T08:44:17Z","timestamp":1777106657872,"version":"3.51.4"},"reference-count":54,"publisher":"Association for Computing Machinery (ACM)","issue":"7","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,3]]},"abstract":"<jats:p>Modern applications span multiple clouds to reduce costs, avoid vendor lock-in, and leverage low-availability resources in another cloud. However, standard object stores operate within a single cloud, forcing users to manually manage data placement across clouds, i.e., navigate their diverse APIs and handle heterogeneous costs for network and storage. This is often a complex choice: users must either pay to store objects in a remote cloud, or pay to transfer them over the network based on application access patterns and cloud provider cost offerings. To address this, we present SkyStore, a unified object store that addresses cost-optimal data management across regions and clouds. SkyStore introduces a virtual object and bucket API to hide the complexity of interacting with multiple clouds. At its core, SkyStore has a novel TTL-based data placement policy that dynamically replicates and evicts objects according to application access patterns while optimizing for lower cost. Our evaluation shows that across various workloads, SkyStore reduces the overall cost by up to 6X over academic baselines and commercial alternatives like AWS multi-region buckets. 
SkyStore also has comparable latency, and its availability and fault tolerance are on par with standard cloud offerings.<\/jats:p>","DOI":"10.14778\/3734839.3734846","type":"journal-article","created":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T16:01:06Z","timestamp":1756483266000},"page":"2084-2096","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["SkyStore: Cost-Optimized Object Storage Across Regions and Clouds"],"prefix":"10.14778","volume":"18","author":[{"given":"Shu","family":"Liu","sequence":"first","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangxi","family":"Mo","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Moshik","family":"Hershcovitch","sequence":"additional","affiliation":[{"name":"IBM Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Henric","family":"Zhang","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Audrey","family":"Cheng","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guy","family":"Girmonsky","sequence":"additional","affiliation":[{"name":"IBM Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gil","family":"Vernik","sequence":"additional","affiliation":[{"name":"IBM Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Factor","sequence":"additional","affiliation":[{"name":"IBM Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tiemo","family":"Bang","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Soujanya","family":"Ponnapalli","sequence":"additional","affiliation":[{"name":"UC 
Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Natacha","family":"Crooks","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Joseph E.","family":"Gonzalez","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Danny","family":"Harnik","sequence":"additional","affiliation":[{"name":"IBM Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ion","family":"Stoica","sequence":"additional","affiliation":[{"name":"UC Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,8,29]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Actions - Amazon Simple Storage Service \u2014 docs.aws.amazon.com. https:\/\/docs.aws.amazon.com\/AmazonS3\/latest\/API\/API_Operations.html. [Accessed 18-04-2024]."},{"key":"e_1_2_1_2_1","unstructured":"All networking pricing. Virtual Private Cloud. Google Cloud \u2014 cloud.google.com. https:\/\/cloud.google.com\/vpc\/network-pricing#standard-pricing. [Accessed 14-04-2024]."},{"key":"e_1_2_1_3_1","unstructured":"AWS\u2019s Egregious Egress \u2014 blog.cloudflare.com. https:\/\/blog.cloudflare.com\/aws-egregious-egress. [Accessed 14-04-2024]."},{"key":"e_1_2_1_4_1","unstructured":"Ceph Object Gateway Ceph Documentation \u2014 docs.ceph.com. https:\/\/docs.ceph.com\/en\/quincy\/radosgw\/. [Accessed 20-04-2024]."},{"key":"e_1_2_1_5_1","unstructured":"Chapter 27. High Availability Load Balancing and Replication \u2014 postgresql.org. https:\/\/www.postgresql.org\/docs\/current\/high-availability.html. [Accessed 18-04-2024]."},{"key":"e_1_2_1_6_1","unstructured":"Cloudflare R2 | Zero Egress Fee Distributed Object Storage | Cloudflare \u2014 cloudflare.com. https:\/\/www.cloudflare.com\/developer-platform\/r2\/. 
[Accessed 14-04-2024]."},{"key":"e_1_2_1_7_1","unstructured":"Landsat data | Cloud Storage | Google Cloud \u2014 cloud.google.com. https:\/\/cloud.google.com\/storage\/docs\/public-datasets\/landsat. [Accessed 14-04-2024]."},{"key":"e_1_2_1_8_1","unstructured":"Pricing - Bandwidth | Microsoft Azure \u2014 azure.microsoft.com. https:\/\/azure.microsoft.com\/en-us\/pricing\/details\/bandwidth\/. [Accessed 14-04-2024]."},{"key":"e_1_2_1_9_1","volume-title":"GCP \u2014 digitalocean.com. https:\/\/www.digitalocean.com\/resources\/article\/comparing-aws-azure-gcp","author":"Azure Comparing AWS","year":"2023","unstructured":"Comparing AWS, Azure, GCP \u2014 digitalocean.com. https:\/\/www.digitalocean.com\/resources\/article\/comparing-aws-azure-gcp, 2023. [Accessed 20-04-2024]."},{"key":"e_1_2_1_10_1","unstructured":"SNIA: IOTTA repository. http:\/\/iotta.snia.org\/traces\/key-value\/36305 2023."},{"key":"e_1_2_1_11_1","volume-title":"https:\/\/azure.microsoft.com\/en-us\/products\/storage\/blobs","author":"Microsoft Azure Blob","year":"2024","unstructured":"Azure Blob Storage | Microsoft Azure \u2014 azure.microsoft.com. https:\/\/azure.microsoft.com\/en-us\/products\/storage\/blobs, 2024. [Accessed 20-04-2024]."},{"key":"e_1_2_1_12_1","volume-title":"https:\/\/aws.amazon.com\/s3\/","author":"Cloud Object","year":"2024","unstructured":"Cloud Object Storage - Amazon S3 - AWS \u2014 aws.amazon.com. https:\/\/aws.amazon.com\/s3\/, 2024. [Accessed 20-04-2024]."},{"key":"e_1_2_1_13_1","volume-title":"https:\/\/cloud.google.com\/storage?hl=en","author":"Cloud Storage","year":"2024","unstructured":"Cloud Storage \u2014 cloud.google.com. https:\/\/cloud.google.com\/storage?hl=en, 2024. [Accessed 20-04-2024]."},{"key":"e_1_2_1_14_1","volume-title":"https:\/\/www.ibm.com\/cloud\/storage","author":"Cloud Storage","year":"2024","unstructured":"Cloud Storage Services | IBM \u2014 ibm.com. https:\/\/www.ibm.com\/cloud\/storage, 2024. 
[Accessed 20-04-2024]."},{"key":"e_1_2_1_15_1","first-page":"2","volume-title":"Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, NSDI'10","author":"Agarwal Sharad","year":"2010","unstructured":"Sharad Agarwal, John Dunagan, Navendu Jain, Stefan Saroiu, Alec Wolman, and Harbinder Bhogan. Volley: Automated data placement for geo-distributed cloud services. In Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation, NSDI'10, page 2, USA, 2010. USENIX Association."},{"key":"e_1_2_1_16_1","volume-title":"Deepref: Deep reinforcement learning for video prefetching in content delivery networks","author":"Alkassab Nawras","year":"2023","unstructured":"Nawras Alkassab, Chin-Tser Huang, and Tania Lorido Botran. Deepref: Deep reinforcement learning for video prefetching in content delivery networks, 2023."},{"key":"e_1_2_1_17_1","unstructured":"Amazon s3 multi-region access points. https:\/\/aws.amazon.com\/s3\/features\/multi-region-access-points\/. Accessed on 12\/15\/2022."},{"key":"e_1_2_1_18_1","unstructured":"Amazon s3 pricing. https:\/\/aws.amazon.com\/s3\/pricing\/. Accessed on 09\/29\/2024."},{"key":"e_1_2_1_19_1","unstructured":"Aws cross-region replication. https:\/\/docs.aws.amazon.com\/AmazonS3\/latest\/userguide\/replication.html. Accessed on 12\/15\/2022."},{"key":"e_1_2_1_20_1","unstructured":"aws-lifecycle-policy. https:\/\/docs.aws.amazon.com\/AmazonS3\/latest\/userguide\/intro-lifecycle-rules.html."},{"key":"e_1_2_1_21_1","unstructured":"Microsoft Azure. Managing concurrency in Blob storage - Azure Storage \u2014 learn.microsoft.com. https:\/\/learn.microsoft.com\/en-us\/azure\/storage\/blobs\/concurrency-manage. 
[Accessed 19-04-2024]."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2018.2818468"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1147\/sj.52.0078"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2019.8737546"},{"key":"e_1_2_1_25_1","unstructured":"Cataloging and analyzing your data with s3 inventory. https:\/\/docs.aws.amazon.com\/AmazonS3\/latest\/userguide\/storage-inventory.html. Accessed on 12\/15\/2022."},{"key":"e_1_2_1_26_1","first-page":"50","volume-title":"19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)","author":"Chen Jun Lin","year":"2022","unstructured":"Jun Lin Chen, Daniyal Liaqat, Moshe Gabel, and Eyal de Lara. Starlight: Fast container provisioning on the edge and over the WAN. In 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22), pages 35\u201350, Renton, WA, April 2022. USENIX Association."},{"key":"e_1_2_1_27_1","unstructured":"Cloud bursting. https:\/\/aws.amazon.com\/what-is\/cloud-bursting\/. Accessed on 09\/29\/2024."},{"key":"e_1_2_1_28_1","unstructured":"Databricks lakehouse use cases. https:\/\/www.databricks.com\/blog\/2020\/01\/30\/what-is-a-data-lakehouse.html. Accessed on 09\/29\/2024."},{"key":"e_1_2_1_29_1","unstructured":"Disaster recovery workloads. https:\/\/docs.aws.amazon.com\/whitepapers\/latest\/disaster-recovery-workloads-on-aws\/disaster-recovery-options-in-the-cloud.html. Accessed on 09\/29\/2024."},{"key":"e_1_2_1_30_1","volume-title":"IBM object store traces (SNIA IOTTA trace set 36305)","author":"Eytan Ohad","year":"2019","unstructured":"Ohad Eytan, Danny Harnik, Effi Ofer, Roy Friedman, and Ronen Kat. IBM object store traces (SNIA IOTTA trace set 36305). In Geoff Kuenning, editor, SNIA IOTTA Trace Repository. 
Storage Networking Industry Association, July 2019."},{"key":"e_1_2_1_31_1","first-page":"2020","article-title":"It's Time to Revisit LRU vs","author":"Eytan Ohad","year":"2020","unstructured":"Ohad Eytan, Danny Harnik, Effi Ofer, Roy Friedman, and Ronen I. Kat. It's Time to Revisit LRU vs. FIFO. In HotStorage 2020, 2020.","journal-title":"FIFO. In HotStorage"},{"key":"e_1_2_1_32_1","unstructured":"Gcp multi-region bucket. https:\/\/cloud.google.com\/storage\/docs\/locations#location-mr. Accessed on 12\/15\/2022."},{"issue":"1","key":"e_1_2_1_33_1","first-page":"16","article-title":"Evaluation of business-oriented performance metrics in ecommerce using web-based simulation. Journal of Emerging research and solutions","volume":"1","author":"Hristoski Ilija","year":"2016","unstructured":"Ilija Hristoski and Pece Mitrevski. Evaluation of business-oriented performance metrics in ecommerce using web-based simulation. Journal of Emerging research and solutions in ICT, 1(1):1\u201316, April 2016.","journal-title":"ICT"},{"key":"e_1_2_1_34_1","volume-title":"Skyplane: Optimizing transfer cost and throughput using cloud-aware overlays. arXiv preprint arXiv:2210.07259","author":"Jain Paras","year":"2022","unstructured":"Paras Jain, Sam Kumar, Sarah Wooders, Shishir G Patil, Joseph E Gonzalez, and Ion Stoica. Skyplane: Optimizing transfer cost and throughput using cloud-aware overlays. arXiv preprint arXiv:2210.07259, 2022."},{"key":"e_1_2_1_35_1","unstructured":"Juicefs data synchronization. https:\/\/juicefs.com\/docs\/community\/guide\/sync#distributed-sync. Accessed on 09\/29\/2024."},{"key":"e_1_2_1_36_1","unstructured":"llama3 details. https:\/\/ai.meta.com\/blog\/meta-llama-3\/."},{"key":"e_1_2_1_37_1","unstructured":"Inc. MinIO. MinIO | S3 & Kubernetes Native Object Storage for AI \u2014 min.io. https:\/\/min.io\/. 
[Accessed 14-04-2024]."},{"key":"e_1_2_1_38_1","volume-title":"An end-to-end pipeline perspective on video streaming in best-effort networks: A survey and tutorial","author":"Peroni Leonardo","year":"2024","unstructured":"Leonardo Peroni and Sergey Gorinsky. An end-to-end pipeline perspective on video streaming in best-effort networks: A survey and tutorial, 2024."},{"key":"e_1_2_1_39_1","first-page":"06","author":"Perry Marcus","year":"2010","unstructured":"Marcus Perry. The Exponentially Weighted Moving Average. 06 2010.","journal-title":"The Exponentially Weighted Moving Average."},{"key":"e_1_2_1_40_1","unstructured":"Google Cloud Platform. Consistency | Cloud Storage | Google Cloud \u2014 cloud.google.com. https:\/\/cloud.google.com\/storage\/docs\/consistency. [Accessed 19-04-2024]."},{"key":"e_1_2_1_41_1","volume-title":"Separating storage and compute for transaction and analytics\". https:\/\/www.singlestore.com\/blog\/separating-storage-and-compute-for-transaction-and-analytics\/","author":"Prout Adam","year":"2021","unstructured":"Adam Prout. \"learnings from snowflake and aurora: Separating storage and compute for transaction and analytics\". https:\/\/www.singlestore.com\/blog\/separating-storage-and-compute-for-transaction-and-analytics\/, 2021. [Accessed 26-08-2024]."},{"key":"e_1_2_1_42_1","volume-title":"Analysis of status update in wireless networks with successive interference cancellation","author":"Abdul Razzaque Asmad Bin","year":"2024","unstructured":"Asmad Bin Abdul Razzaque and Andrea Baiocchi. Analysis of status update in wireless networks with successive interference cancellation, 2024."},{"key":"e_1_2_1_43_1","unstructured":"s3-consistency-model. https:\/\/aws.amazon.com\/s3\/consistency\/. Accessed on 10\/01\/2024."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2014.11"},{"key":"e_1_2_1_45_1","unstructured":"Scylladb eventual consistency. https:\/\/www.scylladb.com\/glossary\/eventual-consistency\/. 
Accessed on 10\/01\/2024."},{"key":"e_1_2_1_46_1","unstructured":"Amazon Web Services. Amazon S3 | Strong Consistency | Amazon Web Services \u2014 aws.amazon.com. https:\/\/aws.amazon.com\/s3\/consistency\/. [Accessed 19-04-2024]."},{"key":"e_1_2_1_47_1","unstructured":"sqlite-usecases. https:\/\/www.sqlite.org\/features.html. Accessed 10-01-2024."},{"key":"e_1_2_1_48_1","unstructured":"stevenmatthew. Data redundancy - Azure Storage \u2014 learn.microsoft.com. https:\/\/learn.microsoft.com\/en-us\/azure\/storage\/common\/storage-redundancy. [Accessed 14-04-2024]."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522731"},{"key":"e_1_2_1_50_1","volume-title":"2011 USENIX Annual Technical Conference (USENIX ATC 11)","author":"Tran Nguyen","year":"2011","unstructured":"Nguyen Tran, Marcos K. Aguilera, and Mahesh Balakrishnan. Online migration for geo-distributed storage systems. In 2011 USENIX Annual Technical Conference (USENIX ATC 11), Portland, OR, June 2011. USENIX Association."},{"key":"e_1_2_1_51_1","first-page":"296","volume-title":"21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)","author":"Wooders Sarah","year":"2024","unstructured":"Sarah Wooders, Shu Liu, Paras Jain, Xiangxi Mo, Joseph E. Gonzalez, Vincent Liu, and Ion Stoica. Cloudcast: High-Throughput, Cost-Aware overlay multicast in the cloud. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24), pages 281\u2013296, Santa Clara, CA, April 2024. USENIX Association."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2517349.2522730"},{"key":"e_1_2_1_53_1","unstructured":"Tian Xia Zhanghao Wu Ziming Mao and Zongheng Yang. Introducing SkyServe: 50 https:\/\/blog.skypilot.co\/introducing-sky-serve\/. 
[Accessed 14-04-2024]."},{"key":"e_1_2_1_54_1","first-page":"455","volume-title":"20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)","author":"Yang Zongheng","year":"2023","unstructured":"Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, and Ion Stoica. SkyPilot: An intercloud broker for sky computing. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23), pages 437\u2013455, Boston, MA, April 2023. USENIX Association."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3734839.3734846","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T16:02:22Z","timestamp":1756483342000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3734839.3734846"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3]]},"references-count":54,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["10.14778\/3734839.3734846"],"URL":"https:\/\/doi.org\/10.14778\/3734839.3734846","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,3]]},"assertion":[{"value":"2025-08-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}