{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T19:49:06Z","timestamp":1774986546584,"version":"3.50.1"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,6,17]]},"abstract":"<jats:p>Recent studies have made it possible to integrate learning techniques into database systems for practical utilization. In particular, the state-of-the-art studies hook the conventional query optimizer to explore multiple execution plan candidates, then choose the optimal one with a learned model. This framework simplifies the integration of learning techniques into the database system. However, these methods still have room for improvement due to their limited plan exploration space and ineffective learning from execution plans. In this work, we propose Athena, an effective learning-based framework of query optimizer enhancer. It consists of three key components: (i) an order-centric plan explorer, (ii) a Tree-Mamba plan comparator and (iii) a time-weighted loss function. We implement Athena on top of the open-source database PostgreSQL and demonstrate its superiority via extensive experiments. Specifically, we achieve 1.75x, 1.95x, 5.69x, and 2.74x speedups over the vanilla PostgreSQL on the JOB, STATS-CEB, TPC-DS, and DSB benchmarks, respectively. Athena is 1.74x, 1.87x, 1.66x, and 2.28x faster than the state-of-the-art competitor Lero on these benchmarks. 
Additionally, Athena is open-sourced and it can be easily adapted to other relational database systems as all these proposed techniques in Athena are generic.<\/jats:p>","DOI":"10.1145\/3725395","type":"journal-article","created":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T21:23:29Z","timestamp":1750281809000},"page":"1-24","source":"Crossref","is-referenced-by-count":0,"title":["Athena: An Effective Learning-based Framework for Query Optimizer Performance Improvement"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7245-6873","authenticated-orcid":false,"given":"Runzhong","family":"Li","sequence":"first","affiliation":[{"name":"Southern University of Science and Technology, Shenzhen, Guangdong, China and The Hong Kong Polytechnic University, Hong Kong SAR, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-0560-1549","authenticated-orcid":false,"given":"Qilong","family":"Li","sequence":"additional","affiliation":[{"name":"Southern University of Science and Technology, Shenzhen, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8784-8711","authenticated-orcid":false,"given":"Haotian","family":"Liu","sequence":"additional","affiliation":[{"name":"Southern University of Science and Technology, Shenzhen, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3645-5520","authenticated-orcid":false,"given":"Rui","family":"Mao","sequence":"additional","affiliation":[{"name":"Shenzhen University, Shenzhen, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3370-471X","authenticated-orcid":false,"given":"Qing","family":"Li","sequence":"additional","affiliation":[{"name":"The Hong Kong Polytechnic University, Hong Kong SAR, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8424-0092","authenticated-orcid":false,"given":"Bo","family":"Tang","sequence":"additional","affiliation":[{"name":"Southern University of Science and Technology, Shenzhen, 
China"}]}],"member":"320","published-online":{"date-parts":[[2025,6,18]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"2024. MySQL. https:\/\/www.mysql.com."},{"key":"e_1_2_2_2_1","unstructured":"2025. Athena Implementation. https:\/\/github.com\/DBGroup-SUSTech\/Athena"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/320455.320457"},{"key":"e_1_2_2_4_1","volume-title":"LPLM: A Neural Language Model for Cardinality Estimation of LIKE-Queries. SIGMOD. 2, 1, 54:1--54:25.","author":"Aytimur Mehmet","year":"2024","unstructured":"Mehmet Aytimur, Silvan Reiner, Leonard W\u00f6rteler, Theodoros Chondrogiannis, and Michael Grossniklaus. 2024. LPLM: A Neural Language Model for Cardinality Estimation of LIKE-Queries. SIGMOD. 2, 1, 54:1--54:25."},{"key":"e_1_2_2_5_1","doi-asserted-by":"crossref","unstructured":"Chris Burges Tal Shaked Erin Renshaw Ari Lazier Matt Deeds Nicole Hamilton and Greg Hullender. 2005. Learning to rank using gradient descent. In ICML. 89--96.","DOI":"10.1145\/1102351.1102363"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.14778\/3587136.3587150"},{"key":"e_1_2_2_7_1","first-page":"1","article-title":"SafeBound","volume":"1","author":"Deeds Kyle B.","year":"2023","unstructured":"Kyle B. Deeds, Dan Suciu, and Magdalena Balazinska. 2023. SafeBound: A Practical System for Generating Cardinality Bounds. SIGMOD. 1, 1, 1--26.","journal-title":"A Practical System for Generating Cardinality Bounds. SIGMOD."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3484224.3484234"},{"key":"e_1_2_2_9_1","first-page":"1","article-title":"Kepler","volume":"1","author":"Doshi Lyric","year":"2023","unstructured":"Lyric Doshi, Vincent Zhuang, Gaurav Jain, Ryan Marcus, Haoyu Huang, Deniz Altinb\u00fcken, Eugene Brevdo, and Campbell Fraser. 2023. Kepler: Robust Learning for Parametric Query Optimization. SIGMOD. 1, 1, 1--25.","journal-title":"Robust Learning for Parametric Query Optimization. 
SIGMOD."},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.14778\/3329772.3329780"},{"key":"e_1_2_2_11_1","volume-title":"Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv.","author":"Gu Albert","year":"2023","unstructured":"Albert Gu and Tri Dao. 2023. Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/3503585.3503586"},{"key":"e_1_2_2_13_1","doi-asserted-by":"crossref","unstructured":"Shohedul Hasan Saravanan Thirumuruganathan Jees Augustine Nick Koudas and Gautam Das. 2020. Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries. In SIGMOD. 1035--1050.","DOI":"10.1145\/3318464.3389741"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654932"},{"key":"e_1_2_2_15_1","unstructured":"Dan Hendrycks and Kevin Gimpel. 2023. Gaussian Error Linear Units (GELUs). arXiv:1606.08415"},{"key":"e_1_2_2_16_1","doi-asserted-by":"crossref","unstructured":"Benjamin Hilprecht Andreas Schmidt Moritz Kulessa Alejandro Molina Kristian Kersting and Carsten Binnig. 2020. DeepDB: Learn from Data not from Queries! PVLDB. 13 7 992--1005.","DOI":"10.14778\/3384345.3384349"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415535"},{"key":"e_1_2_2_18_1","unstructured":"Haozhe Ji Pei Ke Zhipeng Hu Rongsheng Zhang and Minlie Huang. 2023. Tailoring Language Generation Models under Total Variation Distance. In ICLR."},{"key":"e_1_2_2_19_1","volume-title":"ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation. SIGMOD. 2, 1, 45:1--45:27.","author":"Kim Kyoungmin","year":"2024","unstructured":"Kyoungmin Kim, Sangoh Lee, Injung Kim, and Wook-Shin Han. 2024. ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation. SIGMOD. 
2, 1, 45:1--45:27."},{"key":"e_1_2_2_20_1","volume-title":"Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR.","author":"Kipf Andreas","year":"2019","unstructured":"Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter A. Boncz, and Alfons Kemper. 2019. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR."},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850583.2850594"},{"key":"e_1_2_2_22_1","first-page":"197","article-title":"ALECE","volume":"17","author":"Li Pengfei","year":"2023","unstructured":"Pengfei Li, Wenqing Wei, Rong Zhu, Bolin Ding, Jingren Zhou, and Hua Lu. 2023. ALECE: An Attention-based Learned Cardinality Estimator for SPJ Queries on Dynamic Workloads. PVLDB. 17, 2, 197--210.","journal-title":"PVLDB."},{"key":"e_1_2_2_23_1","doi-asserted-by":"crossref","unstructured":"Yingze Li Hongzhi Wang and Xianglong Liu. 2024. One Seed Two Birds: A Unified Learned Structure for Exact and Approximate Counting. SIGMOD. 2 1 15:1--15:26.","DOI":"10.1145\/3639270"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476254"},{"key":"e_1_2_2_25_1","doi-asserted-by":"crossref","unstructured":"Zhenghua Lyu Huan Hubert Zhang Gang Xiong Gang Guo Haozhou Wang Jinbao Chen Asim Praveen Yu Yang Xiaoming Gao Alexandra Wang Wen Lin Ashwin Agrawal Junfeng Yang Hao Wu Xiaoliang Li Feng Guo Jiang Wu Jesse Zhang and Venkatesh Raghavan. 2021. Greenplum: A Hybrid Database for Transactional and Analytical Workloads. In SIGMOD. 2530--2542.","DOI":"10.1145\/3448016.3457562"},{"key":"e_1_2_2_26_1","volume-title":"Bao: Making Learned Query Optimization Practical. In SIGMOD. 1275--1288.","author":"Marcus Ryan","year":"2021","unstructured":"Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and Tim Kraska. 2021. Bao: Making Learned Query Optimization Practical. In SIGMOD. 
1275--1288."},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342644"},{"key":"e_1_2_2_28_1","doi-asserted-by":"crossref","unstructured":"Ryan Marcus and Olga Papaemmanouil. 2018. Deep Reinforcement Learning for Join Order Enumeration. In aiDM@SIGMOD. 3:1--3:4.","DOI":"10.1145\/3211954.3211957"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342646"},{"key":"e_1_2_2_30_1","volume-title":"Lemo: A Cache-Enhanced Learned Optimizer for Concurrent Queries. SIGMOD. 1, 4, Article 247, 247:1--247:26 pages.","author":"Mo Songsong","year":"2023","unstructured":"Songsong Mo, Yile Chen, Hao Wang, Gao Cong, and Zhifeng Bao. 2023. Lemo: A Cache-Enhanced Learned Optimizer for Concurrent Queries. SIGMOD. 1, 4, Article 247, 247:1--247:26 pages."},{"key":"e_1_2_2_31_1","doi-asserted-by":"crossref","unstructured":"Lili Mou Ge Li Lu Zhang Tao Wang and Zhi Jin. 2016. Convolutional Neural Networks over Tree Structures for Programming Language Processing. In AAAI. 1287--1293.","DOI":"10.1609\/aaai.v30i1.10139"},{"key":"e_1_2_2_32_1","first-page":"2019","article-title":"Flow-Loss","volume":"14","author":"Negi Parimarjan","year":"2021","unstructured":"Parimarjan Negi, Ryan Marcus, Andreas Kipf, Hongzi Mao, Nesime Tatbul, Tim Kraska, and Mohammad Alizadeh. 2021. Flow-Loss: Learning Cardinality Estimates That Matter. PVLDB. 14, 11, 2019--2032.","journal-title":"Learning Cardinality Estimates That Matter. PVLDB."},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583164"},{"key":"e_1_2_2_34_1","unstructured":"PGTune 2024. PGTune - calculate configuration for PostgreSQL based on the maximum performance for a given hardware configuration. https:\/\/pgtune.leopard.in.ua"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3636218.3636229"},{"key":"e_1_2_2_36_1","volume-title":"Price","author":"Selinger Patricia G.","year":"1979","unstructured":"Patricia G. Selinger, Morton M. 
Astrahan, Donald D. Chamberlin, Raymond A. Lorie, and Thomas G. Price. 1979. Access Path Selection in a Relational Database Management System. In SIGMOD. 23--34."},{"key":"e_1_2_2_37_1","volume-title":"GLU Variants Improve Transformer. CoRR abs\/2002.05202","author":"Shazeer Noam","year":"2020","unstructured":"Noam Shazeer. 2020. GLU Variants Improve Transformer. CoRR abs\/2002.05202 (2020)."},{"key":"e_1_2_2_38_1","unstructured":"Michael Stillger Guy M. Lohman Volker Markl and Mokhtar Kandil. 2001. LEO - DB2's LEarning Optimizer. In PVLDB. 19--28."},{"key":"e_1_2_2_39_1","volume-title":"Rowe","author":"Stonebraker Michael","year":"1986","unstructured":"Michael Stonebraker and Lawrence A. Rowe. 1986. The Design of Postgres. In SIGMOD. 340--355."},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/3368289.3368296"},{"key":"e_1_2_2_41_1","unstructured":"Transaction Processing Performance Council (TPC). 2021. TPC-DS Version 2 and Version 3. http:\/\/www.tpc.org\/tpcds\/"},{"key":"e_1_2_2_42_1","volume-title":"Shuai Li, Zunyao Mao, and Bo Tang.","author":"Wang Fang","year":"2023","unstructured":"Fang Wang, Xiao Yan, Man Lung Yiu, Shuai Li, Zunyao Mao, and Bo Tang. 2023. Speeding Up End-to-end Query Execution via Learning-based Progressive Cardinality Estimation. SIGMOD. 1, 1, 28:1--28:25."},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.14778\/3485450.3485458"},{"key":"e_1_2_2_44_1","first-page":"3934","article-title":"CEDA","volume":"16","author":"Wang Zilong","year":"2023","unstructured":"Zilong Wang, Qixiong Zeng, Ning Wang, Haowen Lu, and Yue Zhang. 2023. CEDA: Learned Cardinality Estimation with Domain Adaptation. PVLDB. 16, 12, 3934--3937.","journal-title":"Learned Cardinality Estimation with Domain Adaptation. PVLDB."},{"key":"e_1_2_2_45_1","doi-asserted-by":"crossref","unstructured":"Peizhi Wu and Gao Cong. 2021. A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation. In SIGMOD. 
2009--2022.","DOI":"10.1145\/3448016.3452830"},{"key":"e_1_2_2_46_1","first-page":"272","article-title":"Learning to be a Statistician","volume":"15","author":"Wu Renzhi","year":"2021","unstructured":"Renzhi Wu, Bolin Ding, Xu Chu, Zhewei Wei, Xiening Dai, Tao Guan, and Jingren Zhou. 2021. Learning to be a Statistician: Learned Estimator for Number of Distinct Values. PVLDB. 15, 2, 272--284.","journal-title":"Learned Estimator for Number of Distinct Values. PVLDB."},{"key":"e_1_2_2_47_1","doi-asserted-by":"crossref","unstructured":"Ziniu Wu Parimarjan Negi Mohammad Alizadeh Tim Kraska and Samuel Madden. 2023. FactorJoin: A New Cardinality Estimation Framework for Join Queries. SIGMOD. 1 1 41:1--41:27.","DOI":"10.1145\/3588721"},{"key":"e_1_2_2_48_1","unstructured":"Ziniu Wu and Amir Shaikhha. 2020. BayesCard: A Unified Bayesian Framework for Cardinality Estimation. CoRR. abs\/2012.14743."},{"key":"e_1_2_2_49_1","unstructured":"Ruibin Xiong Yunchang Yang Di He Kai Zheng Shuxin Zheng Chen Xing Huishuai Zhang Yanyan Lan Liwei Wang and Tie-Yan Liu. 2020. On layer normalization in the transformer architecture. In ICML. 10524--10533."},{"key":"e_1_2_2_50_1","doi-asserted-by":"crossref","unstructured":"Zongheng Yang Wei-Lin Chiang Sifei Luan Gautam Mittal Michael Luo and Ion Stoica. [n. d.]. Balsa: Learning a Query Optimizer Without Expert Demonstrations. In SIGMOD. 931--944.","DOI":"10.1145\/3514221.3517885"},{"key":"e_1_2_2_51_1","first-page":"61","article-title":"NeuroCard","volume":"14","author":"Yang Zongheng","year":"2020","unstructured":"Zongheng Yang, Amog Kamsetty, Sifei Luan, Eric Liang, Yan Duan, Xi Chen, and Ion Stoica. 2020. NeuroCard: One Cardinality Estimator for All Tables. PVLDB. 14, 1, 61--73.","journal-title":"One Cardinality Estimator for All Tables. 
PVLDB."},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.14778\/3368289.3368294"},{"key":"e_1_2_2_53_1","doi-asserted-by":"crossref","unstructured":"Xiang Yu Guoliang Li Chengliang Chai and Nan Tang. 2020. Reinforcement Learning with Tree-LSTM for Join Order Selection. In ICDE. 1297--1308.","DOI":"10.1109\/ICDE48307.2020.00116"},{"key":"e_1_2_2_54_1","first-page":"1466","article-title":"Lero","volume":"16","author":"Zhu Rong","year":"2023","unstructured":"Rong Zhu, Wei Chen, Bolin Ding, Xingguang Chen, Andreas Pfadler, Ziniu Wu, and Jingren Zhou. 2023. Lero: A Learning-to-Rank Query Optimizer. PVLDB. 16, 6, 1466--1479.","journal-title":"A Learning-to-Rank Query Optimizer. PVLDB."},{"key":"e_1_2_2_55_1","first-page":"1489","article-title":"FLAT","volume":"14","author":"Zhu Rong","year":"2021","unstructured":"Rong Zhu, Ziniu Wu, Yuxing Han, Kai Zeng, Andreas Pfadler, Zhengping Qian, Jingren Zhou, and Bin Cui. 2021. FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation. PVLDB. 14, 9, 1489--1502.","journal-title":"Fast, Lightweight and Accurate Method for Cardinality Estimation. PVLDB."}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3725395","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:55:10Z","timestamp":1774983310000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3725395"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,17]]},"references-count":55,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,6,17]]}},"alternative-id":["10.1145\/3725395"],"URL":"https:\/\/doi.org\/10.1145\/3725395","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,17]]}}}