{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:24:59Z","timestamp":1760149499095,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"17","license":[{"start":{"date-parts":[[2023,8,23]],"date-time":"2023-08-23T00:00:00Z","timestamp":1692748800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["62266043","62262064","61966035","XJEDU2016S106","2022D01C56"],"award-info":[{"award-number":["62266043","62262064","61966035","XJEDU2016S106","2022D01C56"]}]},{"name":"Key R&D projects in Xinjiang Uygur Autonomous Region","award":["62266043","62262064","61966035","XJEDU2016S106","2022D01C56"],"award-info":[{"award-number":["62266043","62262064","61966035","XJEDU2016S106","2022D01C56"]}]},{"name":"Natural Science Foundation of Xinjiang Uygur Autonomous Region of China","award":["62266043","62262064","61966035","XJEDU2016S106","2022D01C56"],"award-info":[{"award-number":["62266043","62262064","61966035","XJEDU2016S106","2022D01C56"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Cardinality estimation is critical for database management systems (DBMSs) to execute query optimization tasks, which can guide the query optimizer in choosing the best execution plan. However, traditional cardinality estimation methods cannot provide accurate estimates because they cannot accurately capture the correlation between multiple tables. Several recent studies have revealed that learning-based cardinality estimation methods can address the shortcomings of traditional methods and provide more accurate estimates. However, the learning-based cardinality estimation methods still have large errors when an SQL query involves multiple tables or is very complex. 
To address this problem, we propose a sampling-based tree long short-term memory (TreeLSTM) neural network to model queries. The proposed model addresses the weakness of traditional methods when no sampled tuples match the predicates and considers the join relationship between multiple tables and the conjunction and disjunction operations between predicates. We construct subexpressions as trees using operator types between predicates and improve the performance and accuracy of cardinality estimation by capturing the join-crossing correlations between tables and the order dependencies between predicates. In addition, we construct a new loss function to overcome the drawback that Q-error cannot distinguish between large and small cardinalities. Extensive experimental results from real-world datasets show that our proposed model improves the estimation quality and outperforms traditional cardinality estimation methods and the other compared deep learning methods in three evaluation metrics: Q-error, MAE, and SMAPE.<\/jats:p>","DOI":"10.3390\/s23177364","type":"journal-article","created":{"date-parts":[[2023,8,24]],"date-time":"2023-08-24T10:47:08Z","timestamp":1692874028000},"page":"7364","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["A Cardinality Estimator in Complex Database Systems Based on TreeLSTM"],"prefix":"10.3390","volume":"23","author":[{"given":"Kaiyang","family":"Qi","sequence":"first","affiliation":[{"name":"School of Software, Xinjiang University, Urumqi 830091, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiong","family":"Yu","sequence":"additional","affiliation":[{"name":"School of Software, Xinjiang University, Urumqi 830091, China"},{"name":"College of Information Science and Engineering, Xinjiang University, Urumqi 830046, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhenzhen","family":"He","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,23]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Negi, P., Marcus, R., Mao, H., Tatbul, N., Kraska, T., and Alizadeh, M. (2020, January 20\u201324). Cost-Guided Cardinality Estimation: Focus Where it Matters. Proceedings of the 2020 IEEE 36th International Conference on Data Engineering Workshops (ICDEW), Dallas, TX, USA.","DOI":"10.1109\/ICDEW49219.2020.00034"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1145\/3186728.3164145","article-title":"Cardinality Estimation: An Experimental Survey","volume":"11","author":"Harmouch","year":"2017","journal-title":"Proc. VLDB Endow."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"204","DOI":"10.14778\/2850583.2850594","article-title":"How Good Are Query Optimizers, Really?","volume":"9","author":"Leis","year":"2015","journal-title":"Proc. VLDB Endow."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1007\/s00778-017-0480-7","article-title":"Query optimization through the looking glass, and what we found running the Join Order Benchmark","volume":"27","author":"Leis","year":"2018","journal-title":"VLDB J."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Perron, M., Shang, Z., Kraska, T., and Stonebraker, M. (2019, January 8\u201311). How I Learned to Stop Worrying and Love Re-optimization. Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China.","DOI":"10.1109\/ICDE.2019.00191"},{"key":"ref_6","unstructured":"Liu, H., Xu, M., Yu, Z., Corvinelli, V., and Zuzarte, C. (2015, January 1\u20134). 
Cardinality estimation using neural networks. Proceedings of the Conference of the Centre for Advanced Studies on Collaborative Research, Toronto, ON, Canada."},{"key":"ref_7","unstructured":"Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P.A., and Kemper, A. (2018). Learned Cardinalities: Estimating Correlated Joins with Deep Learning. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Marcus, R., and Papaemmanouil, O. (2018, January 10). Deep Reinforcement Learning for Join Order Enumeration. Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, Houston, TX, USA.","DOI":"10.1145\/3211954.3211957"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sun, J., and Li, G. (2019). An End-to-End Learning-based Cost Estimator. arXiv.","DOI":"10.14778\/3368289.3368296"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"He, Z., Yu, J., Gu, T., Li, Z., Du, X., and Li, P. (2023). Query cost estimation in graph databases via emphasizing query dependencies by using a neural reasoning network. Concurr. Comput. Pract. Exp., e7817.","DOI":"10.1002\/cpe.7817"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wang, T., and Chan, C.Y. (2020, January 20\u201324). Improved Correlated Sampling for Join Size Estimation. Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA.","DOI":"10.1109\/ICDE48307.2020.00035"},{"key":"ref_12","unstructured":"(2022, March 26). State of New York, Available online: https:\/\/catalog.data.gov\/dataset\/vehicle-snowmobile-and-boat-registrations."},{"key":"ref_13","unstructured":"To, H., Chiang, K., and Shahabi, C. (November, January 27). Entropy-based histograms for selectivity estimation. 
Proceedings of the 22nd ACM international conference on Information & Knowledge Management, San Francisco, CA, USA."},{"key":"ref_14","first-page":"201","article-title":"Cardinality Estimation Based on Cluster Analysis","volume":"35","author":"Zeng","year":"2017","journal-title":"J. Inf. Sci. Eng."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1007\/s10619-018-07255-6","article-title":"IHP: Improving the utility in differential private histogram publication","volume":"37","author":"Li","year":"2019","journal-title":"Distrib. Parallel Databases"},{"key":"ref_16","first-page":"966","article-title":"Remove Minimum (RM): An Error-Tolerant Scheme for Cardinality Estimate by HyperLogLog","volume":"19","author":"Reviriego","year":"2022","journal-title":"IEEE Trans. Dependable Secur. Comput."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Kipf, A., Vorona, D., M\u00fcller, J., Kipf, T., Radke, B., Leis, V., Boncz, P.A., Neumann, T., and Kemper, A. (July, January 30). Estimating Cardinalities with Deep Sketches. Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands.","DOI":"10.1145\/3299869.3320218"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2775","DOI":"10.1016\/j.comnet.2013.05.011","article-title":"An algorithm for privacy-preserving distributed user statistics","volume":"57","author":"Tschorsch","year":"2013","journal-title":"Comput. Netw."},{"key":"ref_19","unstructured":"Izenov, Y., Datta, A., Rusu, F., and Shin, J.H. (2021). Online Sketch-based Query Optimization. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wu, W., Naughton, J.F., and Singh, H. (July, January 26). Sampling-Based Query Re-Optimization. Proceedings of the 2016 International Conference on Management of Data, San Francisco, CA, USA.","DOI":"10.1145\/2882903.2882914"},{"key":"ref_21","unstructured":"Leis, V., Radke, B., Gubichev, A., Kemper, A., and Neumann, T. 
(2017, January 8\u201311). Cardinality Estimation Done Right: Index-Based Join Sampling. Proceedings of the Conference on Innovative Data Systems Research, Chaminade, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1007\/s41019-020-00149-7","article-title":"A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration","volume":"6","author":"Lan","year":"2021","journal-title":"Data Sci. Eng."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"85","DOI":"10.14778\/3485450.3485459","article-title":"Learned Cardinality Estimation: A Design Space Exploration and A Comparative Evaluation","volume":"15","author":"Sun","year":"2021","journal-title":"Proc. VLDB Endow."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Heimel, M., Kiefer, M., and Markl, V. (June, January 31). Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, VIC, Australia.","DOI":"10.1145\/2723372.2749438"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2085","DOI":"10.14778\/3151106.3151112","article-title":"Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models","volume":"10","author":"Kiefer","year":"2017","journal-title":"Proc. VLDB Endow."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hilprecht, B., Schmidt, A., Kulessa, M., Molina, A., Kersting, K., and Binnig, C. (2019). DeepDB: Learn from Data, not from Queries!. arXiv.","DOI":"10.14778\/3384345.3384349"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Marcus, R., Negi, P., Mao, H., Zhang, C., Alizadeh, M., Kraska, T., Papaemmanouil, O., and Tatbul, N. (2019). Neo: A Learned Query Optimizer. 
arXiv.","DOI":"10.14778\/3342263.3342644"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yang, Z., Liang, E., Kamsetty, A., Wu, C., Duan, Y., Chen, P., Abbeel, P., Hellerstein, J.M., Krishnan, S., and Stoica, I. (2019). Deep Unsupervised Cardinality Estimation. arXiv.","DOI":"10.14778\/3368289.3368294"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"61","DOI":"10.14778\/3421424.3421432","article-title":"NeuroCard: One Cardinality Estimator for All Tables","volume":"14","author":"Yang","year":"2020","journal-title":"Proc. VLDB Endow."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Hasan, S., Thirumuruganathan, S., Augustine, J., Koudas, N., and Das, G. (2020, January 14\u201319). Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries. Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, Portland, OR, USA.","DOI":"10.1145\/3318464.3389741"},{"key":"ref_31","unstructured":"Malik, T., Burns, R.C., and Chawla, N. (2007, January 12\u201315). A Black-Box Approach to Query Cardinality Estimation. Proceedings of the Conference on Innovative Data Systems Research, Amsterdam, The Netherlands."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wu, P., and Cong, G. (2021, January 20\u201325). A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation. Proceedings of the 2021 International Conference on Management of Data, Online.","DOI":"10.1145\/3448016.3452830"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1044","DOI":"10.14778\/3329772.3329780","article-title":"Selectivity Estimation for Range Predicates using Lightweight Models","volume":"12","author":"Dutt","year":"2019","journal-title":"Proc. VLDB Endow."},{"key":"ref_34","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. 
arXiv."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"41","DOI":"10.51408\/1963-0058","article-title":"Cardinality Estimation of an SQL Query Using Recursive Neural Networks","volume":"54","author":"Karamyan","year":"2020","journal-title":"Math. Probl. Comput. Sci."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Woltmann, L., Hartmann, C., Thiele, M., Habich, D., and Lehner, W. (2019, January 5). Cardinality estimation with local deep learning models. Proceedings of the Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, Amsterdam, The Netherlands.","DOI":"10.1145\/3329859.3329875"},{"key":"ref_37","unstructured":"Tolstikhin, I.O., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., Yung, J., Keysers, D., Uszkoreit, J., and Lucic, M. (2021, January 6\u201314). MLP-Mixer: An all-MLP Architecture for Vision. Proceedings of the Neural Information Processing Systems, Online."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Tai, K.S., Socher, R., and Manning, C.D. (2015). Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks. arXiv.","DOI":"10.3115\/v1\/P15-1150"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"982","DOI":"10.14778\/1687627.1687738","article-title":"Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors","volume":"2","author":"Moerkotte","year":"2009","journal-title":"Proc. VLDB Endow."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhao, K., Yu, J.X., He, Z., and Zhang, H. (2021). Uncertainty-aware Cardinality Estimation by Neural Network Gaussian Process. arXiv.","DOI":"10.1145\/3514221.3526156"},{"key":"ref_41","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A Method for Stochastic Optimization. 
arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/17\/7364\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:37:47Z","timestamp":1760128667000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/17\/7364"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,23]]},"references-count":41,"journal-issue":{"issue":"17","published-online":{"date-parts":[[2023,9]]}},"alternative-id":["s23177364"],"URL":"https:\/\/doi.org\/10.3390\/s23177364","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,8,23]]}}}