{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T12:52:10Z","timestamp":1765889530504,"version":"3.44.0"},"reference-count":61,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:p>In the era of big data, the landscape of data management and analytics has significantly transformed, presenting diverse challenges for cloud platforms. Modern data warehouses face increasing challenges in handling hybrid transactional and analytical processing (HTAP) workloads efficiently in cloud environments. Traditional shared-nothing architectures provide high-performance query execution but suffer from high storage costs and limited elasticity, while shared-storage approaches improve scalability but often struggle with query efficiency due to increased data movement and indexing overhead. Furthermore, existing execution engines lack optimized support for vectorized processing and real-time analytics, limiting their ability to handle large-scale workloads efficiently.<\/jats:p>\n          <jats:p>To address these limitations, we introduce AnalyticDB-PG (ADB-PG), a cloud-native, high-performance data warehouse designed for modern analytical workloads. It integrates a unified architecture supporting both Shared-Nothing and Shared-Storage modes, allowing flexible deployment and seamless elasticity. In ADB-PG, we introduce Beam, a hybrid storage engine that efficiently balances row-based and columnar storage for real-time analytics, and Laser, an optimized execution engine leveraging vectorized execution and Just-In-Time compilation to accelerate query processing. The system further incorporates advanced indexing mechanisms, adaptive runtime filtering, and dictionary encoding to enhance performance. Extensive evaluations on TPC-H and TPC-DS benchmarks demonstrate that ADB-PG achieves significant performance improvements while reducing storage and operational costs, making it a compelling solution for modern cloud-based data analytics.<\/jats:p>","DOI":"10.14778\/3750601.3750633","type":"journal-article","created":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:38:05Z","timestamp":1758029885000},"page":"5139-5152","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["AnalyticDB-PG: A Cloud-Native High-Performance Data Warehouse in Alibaba Cloud"],"prefix":"10.14778","volume":"18","author":[{"given":"Fangyuan","family":"Zhang","sequence":"first","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Caihua","family":"Yin","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Hua","family":"Fan","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Fenghua","family":"Fang","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Yineng","family":"Chen","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Xuqi","family":"Wang","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Mengqi","family":"Wu","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Bing","family":"Chen","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Tianbo","family":"Jin","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Sibo","family":"Wang","sequence":"additional","affiliation":[{"name":"The Chinese University of Hong Kong, Hong Kong, China"}]},{"given":"Wenchao","family":"Zhou","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]},{"given":"Feifei","family":"Li","sequence":"additional","affiliation":[{"name":"Alibaba Cloud Computing, Hangzhou, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,16]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Data, data everywhere. The Economist","year":"2012","unstructured":"2012. Data, data everywhere. The Economist (2012)."},{"key":"e_1_2_1_2_1","unstructured":"2022. MongoDB Atlas. https:\/\/www.mongodb.com\/cloud\/atlas\/."},{"key":"e_1_2_1_3_1","unstructured":"2024. AnalyticDB for PostgreSQL. https:\/\/www.alibabacloud.com\/en\/product\/hybriddb-postgresql."},{"key":"e_1_2_1_4_1","unstructured":"2024. Kubernetes. https:\/\/kubernetes.io\/."},{"key":"e_1_2_1_5_1","unstructured":"2024. THE TRANSACTION PROCESSING COUNCIL. http:\/\/www.tpc.org\/tpch\/."},{"key":"e_1_2_1_6_1","unstructured":"2025. AlloyDB for PostgreSQL. https:\/\/cloud.google.com\/alloydb."},{"key":"e_1_2_1_7_1","unstructured":"2025. PostgreSQL. https:\/\/www.postgresql.org\/."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3524284"},{"key":"e_1_2_1_9_1","volume-title":"Stichnoth","author":"Adl-Tabatabai Ali-Reza","year":"1998","unstructured":"Ali-Reza Adl-Tabatabai, Michal Cierniak, Guei-Yuan Lueh, Vishesh M. Parikh, and James M. Stichnoth. 1998. Fast, Effective Code Generation in a Just-In-Time Java Compiler. In PLDI. 280\u2013290."},{"key":"e_1_2_1_10_1","first-page":"3665","article-title":"Photon: A Fast Query Engine for Lakehouse Systems","volume":"15","author":"Agarwal Sameer","year":"2022","unstructured":"Sameer Agarwal, Paul Barham, and Ion Stoica. 2022. Photon: A Fast Query Engine for Lakehouse Systems. PVLDB 15, 12 (2022), 3665\u20133678.","journal-title":"PVLDB"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-002-0074-9"},{"key":"e_1_2_1_12_1","unstructured":"Anastassia Ailamaki David J. DeWitt Mark D. Hill and Marios Skounakis. 2001. Weaving Relations for Cache Performance. In VLDB. 169\u2013180."},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Panagiotis Antonopoulos Alex Budovski Cristian Diaconu Alejandro Hernandez Saenz Jack Hu Hanuma Kodavalla Donald Kossmann Sandeep Lingam Umar Farooq Minhas Naveen Prakash Vijendra Purohit Hugh Qu Chaitanya Sreenivas Ravella Krystyna Reisteter Sheetal Shrotri Dixin Tang and Vikram Wakade. 2019. Socrates: The New SQL Server in the Cloud. In SIGMOD. 1743\u20131756.","DOI":"10.1145\/3299869.3314047"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Nikos Armenatzoglou Sanuj Basu Naga Bhanoori Mengchu Cai Naresh Chainani Kiran Chinta Venkatraman Govindaraju Todd J Green Monish Gupta Sebastian Hillig et al. 2022. Amazon Redshift re-invented. In SIGMOD. 2205\u20132217.","DOI":"10.1145\/3514221.3526045"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tele.2015.12.005"},{"key":"e_1_2_1_16_1","volume-title":"Paul Leventis, Ala Luszczak, Prashanth Menon, Mostafa Mokhtar, Gene Pang, Sameer Paranjpye, Greg Rahn, Bart Samwel, Tom van Bussel, Herman Van Hovell, Maryann Xue, Reynold Xin, and Matei Zaharia.","author":"Behm Alexander","year":"2022","unstructured":"Alexander Behm, Shoumik Palkar, Utkarsh Agarwal, Timothy Armstrong, David Cashman, Ankur Dave, Todd Greenstein, Shant Hovsepian, Ryan Johnson, Arvind Sai Krishnan, Paul Leventis, Ala Luszczak, Prashanth Menon, Mostafa Mokhtar, Gene Pang, Sameer Paranjpye, Greg Rahn, Bart Samwel, Tom van Bussel, Herman Van Hovell, Maryann Xue, Reynold Xin, and Matei Zaharia. 2022. Photon: A Fast Query Engine for Lakehouse Systems. In SIGMOD. ACM, 2326\u20132339."},{"key":"e_1_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Wei Cao Feifei Li Gui Huang Jianghang Lou Jianwei Zhao Dengcheng He Mengshi Sun Yingqiang Zhang Sheng Wang Xueqiang Wu Han Liao Zilin Chen Xiaojian Fang Mo Chen Chenghui Liang Yanxin Luo Huanming Wang Songlei Wang Zhanfeng Ma Xinjun Yang Xiang Peng Yubin Ruan Yuhui Wang Jie Zhou Jianying Wang Qingda Hu and Junbin Kang. 2022. PolarDB-X: An Elastic Distributed Relational Database for Cloud-Native Applications. In ICDE. 2859\u20132872.","DOI":"10.1109\/ICDE53745.2022.00259"},{"key":"e_1_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Wei Cao Yingqiang Zhang Xinjun Yang Feifei Li Sheng Wang Qingda Hu Xuntao Cheng Zongzhi Chen Zhenjun Liu Jing Fang Bo Wang Yuhui Wang Haiqing Sun Ze Yang Zhushi Cheng Sen Chen Jian Wu Wei Hu Jianwei Zhao Yusong Gao Songlu Cai Yunyang Zhang and Jiawang Tong. 2021. PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers. In SIGMOD. 2477\u20132489.","DOI":"10.1145\/3448016.3457560"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Sashikanth Chandrasekaran and Roger Bamford. 2003. Shared Cache - The Future of Parallel Databases. In ICDE. 840\u2013850.","DOI":"10.1109\/ICDE.2003.1260883"},{"key":"e_1_2_1_20_1","unstructured":"Alibaba Cloud. 2022. Elastic Compute Service Block Storage. https:\/\/www.alibabacloud.com\/blog\/what-is-elastic-block-storage_597401."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1988842.1988850"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2491245"},{"key":"e_1_2_1_23_1","doi-asserted-by":"crossref","unstructured":"Benoit Dageville Thierry Cruanes Marcin Zukowski Vadim Antonov Artin Avanes Jon Bock Jonathan Claybaugh Daniel Engovatov Martin Hentschel Jiansheng Huang et al. 2016. The snowflake elastic data warehouse. In SIGMOD. 215\u2013226.","DOI":"10.1145\/2882903.2903741"},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","unstructured":"Beno\u00eet Dageville Thierry Cruanes Marcin Zukowski Vadim Antonov Artin Avanes Jon Bock Jonathan Claybaugh Daniel Engovatov Martin Hentschel Jiansheng Huang Allison W. Lee Ashish Motivala Abdul Q. Munir Steven Pelley Peter Povinec Greg Rahn Spyridon Triantafyllis and Philipp Unterbrunner. 2016. The Snowflake Elastic Data Warehouse. In SIGMOD. 215\u2013226.","DOI":"10.1145\/2882903.2903741"},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Nedim Dedic and Clare Stanier. 2016. An Evaluation of the Challenges of Multilingualism in Data Warehouse Development. In ICEIS. 196\u2013206.","DOI":"10.5220\/0005858401960206"},{"key":"e_1_2_1_26_1","volume-title":"Taurus Database: How to be Fast, Available, and Frugal in the Cloud. In SIGMOD. 1463\u20131478.","author":"Depoutovitch Alex","year":"2020","unstructured":"Alex Depoutovitch, Chong Chen, Jin Chen, Paul Larson, Shu Lin, Jack Ng, Wenlin Cui, Qiang Liu, Wei Huang, Yong Xiao, and Yongjun He. 2020. Taurus Database: How to be Fast, Available, and Frugal in the Cloud. In SIGMOD. 1463\u20131478."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/3611540.3611542"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/3685800.3685822"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1972.5009071"},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Anurag Gupta Deepak Agarwal Derek Tan Jakub Kulesza Rahul Pathak Stefano Stefani and Vidhya Srinivasan. 2015. Amazon Redshift and the Case for Simpler Data Warehouses. In SIGMOD. 1917\u20131923.","DOI":"10.1145\/2723372.2742795"},{"key":"e_1_2_1_31_1","volume-title":"The world's technological capacity to store, communicate, and compute information. science 332, 6025","author":"Hilbert Martin","year":"2011","unstructured":"Martin Hilbert and Priscila L\u00f3pez. 2011. The world's technological capacity to store, communicate, and compute information. science 332, 6025 (2011), 60\u201365."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415535"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415535"},{"key":"e_1_2_1_34_1","unstructured":"John F Hughes. 2014. Computer graphics: principles and practice."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1536616.1536632"},{"key":"e_1_2_1_36_1","volume-title":"Big data challenges and opportunities in the hype of Industry 4.0","author":"Khan Maqbool","unstructured":"Maqbool Khan, Xiaotong Wu, Xiaolong Xu, and Wanchun Dou. 2017. Big data challenges and opportunities in the hype of Industry 4.0. In ICC. IEEE, 1\u20136."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367572"},{"key":"e_1_2_1_38_1","volume-title":"Deep learning. nature 521, 7553","author":"LeCun Yann","year":"2015","unstructured":"Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. nature 521, 7553 (2015), 436\u2013444."},{"key":"e_1_2_1_39_1","volume-title":"Richard Black, Andrew Douglas, Nathanael Cheriere, Daniel Fryer, Kai Mast, Angela Demke Brown, Ana Klimovic, Andy Slowey, and Antony I. T. Rowstron.","author":"Legtchenko Sergey","year":"2017","unstructured":"Sergey Legtchenko, Hugh Williams, Kaveh Razavi, Austin Donnelly, Richard Black, Andrew Douglas, Nathanael Cheriere, Daniel Fryer, Kai Mast, Angela Demke Brown, Ana Klimovic, Andy Slowey, and Antony I. T. Rowstron. 2017. Understanding Rack-Scale Disaggregated Storage. In USENIX HotStorage."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554893"},{"key":"e_1_2_1_41_1","unstructured":"Zijun Li Jiagan Cheng Quan Chen Eryu Guan Zizheng Bian Yi Tao Bin Zha Qiang Wang Weidong Han and Minyi Guo. 2022. RunD: A Lightweight Secure Container Runtime for High-density Deployment and High-concurrency Startup in Serverless Computing. In USENIX ATC. 53\u201368."},{"key":"e_1_2_1_42_1","volume-title":"Mell and Timothy Grance","author":"Peter","year":"2011","unstructured":"Peter M. Mell and Timothy Grance. 2011. The NIST Definition of Cloud Computing. Special Publication (NIST SP) 800-145 (2011)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2003.1222722"},{"key":"e_1_2_1_44_1","first-page":"3235","article-title":"ClickHouse: An Analytic DBMS for Interactive Applications","volume":"14","author":"Milovidov Alexey","year":"2021","unstructured":"Alexey Milovidov, Yakov Olkhovskiy, and Ivan Zhukov. 2021. ClickHouse: An Analytic DBMS for Interactive Applications. PVLDB 14, 12 (2021), 3235\u20133247.","journal-title":"PVLDB"},{"key":"e_1_2_1_45_1","volume-title":"Molesky and Krithi Ramamritham","author":"Lory","year":"1995","unstructured":"Lory D. Molesky and Krithi Ramamritham. 1995. Recovery Protocols for Shared Memory Database Systems. In SIGMOD. 11\u201322."},{"key":"e_1_2_1_46_1","first-page":"29","article-title":"Umbra: A Disk-Based System with In-Memory Performance","volume":"20","author":"Neumann Thomas","year":"2020","unstructured":"Thomas Neumann and Michael J Freitag. 2020. Umbra: A Disk-Based System with In-Memory Performance.. In CIDR, Vol. 20. 29.","journal-title":"CIDR"},{"key":"e_1_2_1_47_1","volume-title":"Velox: Meta's Unified Execution Engine. In SIGMOD. 2221\u20132234.","author":"Pedreira Pedro","year":"2023","unstructured":"Pedro Pedreira, Krishna Puttaswamy, Xiaoxuan Meng, Marios Kokkodis, Orri Erling, Nikhil Benesch, and Brian Nixon. 2023. Velox: Meta's Unified Execution Engine. In SIGMOD. 2221\u20132234."},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Adam Prout Szu-Po Wang Joseph Victor Zhou Sun Yongzhu Li Jack Chen Evan Bergeron Eric Hanson Robert Walzer Rodrigo Gomes et al. 2022. Cloud-native transactions and analytics in singlestore. In SIGMOD. 2340\u20132352.","DOI":"10.1145\/3514221.3526055"},{"key":"e_1_2_1_49_1","doi-asserted-by":"crossref","unstructured":"Mark Raasveldt and Hannes M\u00fchleisen. 2019. DuckDB: an Embeddable Analytical Database. In SIGMOD. 1981\u20131984.","DOI":"10.1145\/3299869.3320212"},{"key":"e_1_2_1_50_1","volume-title":"21st twente student conference on IT","author":"Scheepers Mathijs Jeroen","unstructured":"Mathijs Jeroen Scheepers. 2014. Virtualization and containerization of application infrastructure: A comparison. In 21st twente student conference on IT, Vol. 21. 1\u20137."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.14778\/3685800.3685802"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824065"},{"key":"e_1_2_1_53_1","first-page":"4","article-title":"The Case for Shared Nothing","volume":"9","author":"Stonebraker Michael","year":"1986","unstructured":"Michael Stonebraker. 1986. The Case for Shared Nothing. IEEE Database Eng. Bull. 9, 1 (1986), 4\u20139.","journal-title":"IEEE Database Eng. Bull."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3386134"},{"key":"e_1_2_1_55_1","doi-asserted-by":"crossref","unstructured":"Rebecca Taft Irfan Sharif Andrei Matei Nathan VanBenschoten Jordan Lewis Tobias Grieger Kai Niemi Andy Woods Anne Birzin Raphael Poss Paul Bardea Amruta Ranade Ben Darnell Bram Gruneir Justin Jaffray Lucy Zhang and Peter Mattis. 2020. CockroachDB: The Resilient Geo-Distributed SQL Database. In SIGMOD. 1493\u20131509.","DOI":"10.1145\/3318464.3386134"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056101"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196937"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3003665.3003669"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415541"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.14778\/3554821.3554830"},{"key":"e_1_2_1_61_1","unstructured":"Matei Zaharia Mosharaf Chowdhury Tathagata Das Ankur Dave Justin Ma Murphy McCauley Michael J. Franklin Scott Shenker and Ion Stoica. 2012. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. In NSDI. 15\u201328."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3750601.3750633","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:38:32Z","timestamp":1758029912000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3750601.3750633"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8]]},"references-count":61,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["10.14778\/3750601.3750633"],"URL":"https:\/\/doi.org\/10.14778\/3750601.3750633","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,8]]},"assertion":[{"value":"2025-09-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}