{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,27]],"date-time":"2025-09-27T13:46:39Z","timestamp":1758980799086},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"2","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2023,10]]},"abstract":"<jats:p>We study the problem of synthesizing a core fragment of relational queries called select-project-join (SPJ) queries from input-output examples. Search-based synthesis techniques are suited to synthesizing projections and joins by navigating the network of relational tables but require additional supervision for synthesizing comparison predicates. On the other hand, decision tree learning techniques are suited to synthesizing comparison predicates when the input database can be summarized as a single labelled relational table. In this paper, we adapt and interleave methods from the domains of relational query synthesis and decision tree learning, and present an end-to-end framework for synthesizing relational queries with categorical and numerical comparison predicates. Our technique guarantees the completeness of the synthesis procedure and strongly encourages minimality of the synthesized program. We present Libra, an implementation of this technique and evaluate it on a benchmark suite of 1,475 instances of queries over 159 databases with multiple tables. Libra solves 1,361 of these instances in an average of 59 seconds per instance. It outperforms state-of-the-art program synthesis tools Scythe and PatSQL in terms of both the running time and the quality of the synthesized programs.<\/jats:p>","DOI":"10.14778\/3626292.3626306","type":"journal-article","created":{"date-parts":[[2023,12,11]],"date-time":"2023-12-11T23:24:55Z","timestamp":1702337095000},"page":"250-263","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Relational Query Synthesis \u22c8 Decision Tree Learning"],"prefix":"10.14778","volume":"17","author":[{"given":"Aaditya","family":"Naik","sequence":"first","affiliation":[{"name":"University of Pennsylvania, Philadelphia, US"}]},{"given":"Aalok","family":"Thakkar","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, US"}]},{"given":"Adam","family":"Stein","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, US"}]},{"given":"Rajeev","family":"Alur","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, US"}]},{"given":"Mayur","family":"Naik","sequence":"additional","affiliation":[{"name":"University of Pennsylvania, Philadelphia, US"}]}],"member":"320","published-online":{"date-parts":[[2023,10]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Foundations of Databases","author":"Abiteboul Serge","unstructured":"Serge Abiteboul , Richard Hull , and Victor Vianu . 1995. Foundations of Databases . Addison-Wesley , Boston, MA . Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley, Boston, MA."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-54577-5_18"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/0890-5401(87)90052-6"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389776"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/359094.359101"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.14778\/3397230.3397235"},{"key":"e_1_2_1_8_1","volume-title":"Inductive logic programming at 30: a new introduction. arXiv preprint arXiv:2008.07912 (08","author":"Cropper Andrew","year":"2020","unstructured":"Andrew Cropper and Sebastijan Dumancic . 2020. Inductive logic programming at 30: a new introduction. arXiv preprint arXiv:2008.07912 (08 2020 ). Andrew Cropper and Sebastijan Dumancic. 2020. Inductive logic programming at 30: a new introduction. arXiv preprint arXiv:2008.07912 (08 2020)."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-020-05934-z"},{"key":"e_1_2_1_10_1","volume-title":"SQL and Relational Theory: How to Write Accurate SQL Code","author":"Date C","unstructured":"C Date . 2009. SQL and Relational Theory: How to Write Accurate SQL Code . O'Reilly Media, Inc. C Date. 2009. SQL and Relational Theory: How to Write Accurate SQL Code. O'Reilly Media, Inc."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1033"},{"key":"e_1_2_1_13_1","volume-title":"Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning. arXiv preprint arXiv:1905.06088","author":"Avila Garcez Artur","year":"2019","unstructured":"Artur d' Avila Garcez , Marco Gori , Luis C Lamb , Luciano Serafini , Michael Spranger , and Son N Tran . 2019. Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning. arXiv preprint arXiv:1905.06088 ( 2019 ). Artur d'Avila Garcez, Marco Gori, Luis C Lamb, Luciano Serafini, Michael Spranger, and Son N Tran. 2019. Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning. arXiv preprint arXiv:1905.06088 (2019)."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2914770.2837664"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Aritra Ghosh Naresh Manwani and P. S. Sastry. 2017. On the Robustness of Decision Tree Learning Under Label Noise. In Advances in Knowledge Discovery and Data Mining Jinho Kim Kyuseok Shim Longbing Cao Jae-Gil Lee Xuemin Lin and Yang-Sae Moon (Eds.). Springer International Publishing Cham 685--697.  Aritra Ghosh Naresh Manwani and P. S. Sastry. 2017. On the Robustness of Decision Tree Learning Under Label Noise. In Advances in Knowledge Discovery and Data Mining Jinho Kim Kyuseok Shim Longbing Cao Jae-Gil Lee Xuemin Lin and Yang-Sae Moon (Eds.). Springer International Publishing Cham 685--697.","DOI":"10.1007\/978-3-319-57454-7_53"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1926385.1926423"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-35488-8"},{"key":"e_1_2_1_18_1","volume-title":"MRDTL: A multi-relational decision tree learning algorithm. Master's thesis","author":"Hector Leiva","year":"2002","unstructured":"Leiva Hector . 2002 . MRDTL: A multi-relational decision tree learning algorithm. Master's thesis . Iowa State University . Leiva Hector. 2002. MRDTL: A multi-relational decision tree learning algorithm. Master's thesis. Iowa State University."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3385412.3386027"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(84)90077-1"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1089"},{"key":"e_1_2_1_22_1","unstructured":"M. Ammar Ben Khadra. 2017. E3Solver: decision tree unification by enumeration. arXiv:arXiv:1710.07021  M. Ammar Ben Khadra. 2017. E3Solver: decision tree unification by enumeration. arXiv:arXiv:1710.07021"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/3401960.3401970"},{"key":"e_1_2_1_24_1","volume-title":"Learning invariants using decision trees. arXiv preprint arXiv:1501.04725","author":"Krishna Siddharth","year":"2015","unstructured":"Siddharth Krishna , Christian Puhrsch , and Thomas Wies . 2015. Learning invariants using decision trees. arXiv preprint arXiv:1501.04725 ( 2015 ). Siddharth Krishna, Christian Puhrsch, and Thomas Wies. 2015. Learning invariants using decision trees. arXiv preprint arXiv:1501.04725 (2015)."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882952"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882952"},{"key":"e_1_2_1_27_1","volume-title":"The ILASP system for Inductive Learning of Answer Set Programs. CoRR abs\/2005.00904","author":"Law Mark","year":"2020","unstructured":"Mark Law , Alessandra Russo , and Krysia Broda . 2020. The ILASP system for Inductive Learning of Answer Set Programs. CoRR abs\/2005.00904 ( 2020 ). Mark Law, Alessandra Russo, and Krysia Broda. 2020. The ILASP system for Inductive Learning of Answer Set Programs. CoRR abs\/2005.00904 (2020)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i7.16799"},{"key":"e_1_2_1_29_1","volume-title":"Machine Learning","author":"Mitchell Tom M.","unstructured":"Tom M. Mitchell . 1997. Machine Learning . McGraw-Hill , New York . Tom M. Mitchell. 1997. Machine Learning. McGraw-Hill, New York."},{"key":"e_1_2_1_30_1","volume-title":"Induction of decision trees. Machine Learning 1, 1","author":"Quinlan J. R.","year":"1986","unstructured":"J. R. Quinlan . 1986. Induction of decision trees. Machine Learning 1, 1 ( 1986 ). J. R. Quinlan. 1986. Induction of decision trees. Machine Learning 1, 1 (1986)."},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00116251"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/bf00117105"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3371130"},{"key":"e_1_2_1_34_1","volume-title":"Top-Down Induction of Decision Trees Classifiers-A survey. Systems, Man, and Cybernetics, Part C: Applications and Reviews","author":"Rokach Lior","year":"2005","unstructured":"Lior Rokach and Oded Maimon . 2005. Top-Down Induction of Decision Trees Classifiers-A survey. Systems, Man, and Cybernetics, Part C: Applications and Reviews , IEEE Transactions on ( 2005 ), 487. Lior Rokach and Oded Maimon. 2005. Top-Down Induction of Decision Trees Classifiers-A survey. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on (2005), 487."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1142\/9097"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2007.06.014"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 21st National Conference on Artificial Intelligence -","volume":"1","author":"Su Jiang","year":"2006","unstructured":"Jiang Su and Harry Zhang . 2006 . A Fast Decision Tree Learning Algorithm . In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1 (Boston, Massachusetts) (AAAI'06). AAAI Press, 500--505. Jiang Su and Harry Zhang. 2006. A Fast Decision Tree Learning Algorithm. In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1 (Boston, Massachusetts) (AAAI'06). AAAI Press, 500--505."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4899-7641-3_10"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476249.3476253"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3453483.3454098"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.175"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3058738"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3140587.3062365"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-007-0114-2"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133887"},{"key":"e_1_2_1_46_1","volume-title":"Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. arXiv preprint arXiv:1809.08887","author":"Yu Tao","year":"2018","unstructured":"Tao Yu , Rui Zhang , Kai Yang , Michihiro Yasunaga , Dongxu Wang , Zifan Li , James Ma , Irene Li , Qingning Yao , Shanelle Roman , 2018 . Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. arXiv preprint arXiv:1809.08887 (2018). Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, et al. 2018. Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. arXiv preprint arXiv:1809.08887 (2018)."},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the Thirteenth National Conference on Artificial Intelligence -","volume":"2","author":"John","year":"1864","unstructured":"John M. Zelle and Raymond J. Mooney. 1996. Learning to Parse Database Queries Using Inductive Logic Programming . In Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2 (Portland, Oregon). 1050--1055. http:\/\/dl.acm.org\/citation.cfm?id= 1864 519.1864543 John M. Zelle and Raymond J. Mooney. 1996. Learning to Parse Database Queries Using Inductive Logic Programming. In Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 2 (Portland, Oregon). 1050--1055. http:\/\/dl.acm.org\/citation.cfm?id=1864519.1864543"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.14778\/2735508.2735510"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3519939.3523712"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3626292.3626306","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,8]],"date-time":"2024-01-08T23:07:19Z","timestamp":1704755239000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3626292.3626306"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10]]},"references-count":47,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["10.14778\/3626292.3626306"],"URL":"https:\/\/doi.org\/10.14778\/3626292.3626306","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2023,10]]},"assertion":[{"value":"2023-10-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}