{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,28]],"date-time":"2025-09-28T00:06:13Z","timestamp":1759017973742,"version":"3.44.0"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Form. Asp. Comput."],"abstract":"<jats:p>In recent years, more people have seen their work depend on data manipulation tasks. However, many of these users do not have the background in programming required to write SQL queries, particularly complex ones. One way of helping these users is automatically synthesizing the SQL query given a small set of examples. Several program synthesizers for SQL have been recently proposed, but they do not leverage multicore architectures to improve synthesis performance.<\/jats:p>\n          <jats:p>\n            This paper proposes\n            <jats:sc>Cubes<\/jats:sc>\n            , a parallel program synthesizer for the domain of SQL queries using input-output examples. Since input-output examples are an under-specification of the desired SQL query, sometimes, the synthesized query does not match the user\u2019s intent.\n            <jats:sc>Cubes<\/jats:sc>\n            incorporates a new disambiguation procedure based on fuzzing techniques that interacts with the user and increases the confidence that the returned query matches the user intent.\n          <\/jats:p>\n          <jats:p>We perform an extensive evaluation on around 4000 SQL queries from different domains. Experimental results show that our sequential version can solve more instances than other state-of-the-art SQL synthesizers. Moreover, the parallel approach can scale up to 16 processes with super-linear speedups for many hard instances. Our disambiguation approach is critical to achieving an accuracy of around 60%, significantly larger than other SQL synthesizers.<\/jats:p>","DOI":"10.1145\/3768578","type":"journal-article","created":{"date-parts":[[2025,9,27]],"date-time":"2025-09-27T11:14:52Z","timestamp":1758971692000},"update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["CUBES: A Parallel Synthesizer for SQL Using Examples"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7006-9829","authenticated-orcid":false,"given":"Ricardo","family":"Brancas","sequence":"first","affiliation":[{"name":"Universidade de Lisboa, Instituto Superior T\u00e9cnico","place":["Lisbon, Portugal"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4089-7206","authenticated-orcid":false,"given":"Miguel","family":"Terra-Neves","sequence":"additional","affiliation":[{"name":"OutSystems","place":["Lisbon, Portugal"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4233-1348","authenticated-orcid":false,"given":"Miguel","family":"Ventura","sequence":"additional","affiliation":[{"name":"OutSystems","place":["Lisbon, Portugal"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4205-2189","authenticated-orcid":false,"given":"Vasco","family":"Manquinho","sequence":"additional","affiliation":[{"name":"INESC-ID \/ Instituto Superior T\u00e9cnico","place":["Lisboa, Portugal"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1525-1382","authenticated-orcid":false,"given":"Ruben","family":"Martins","sequence":"additional","affiliation":[{"name":"Computer Science Department, Carnegie Mellon University","place":["Pittsburgh, United States"]}]}],"member":"320","published-online":{"date-parts":[[2025,9,27]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Fourth Pragmatics of SAT workshop, a workshop of the SAT 2013 conference","author":"Aigner Martin","year":"2013","unstructured":"Martin Aigner, Armin Biere, Christoph\u00a0M. Kirsch, Aina Niemetz, and Mathias Preiner. 2013. Analysis of Portfolio-Style Parallel SAT Solving on Current Multi-Core Architectures. In POS-13. Fourth Pragmatics of SAT workshop, a workshop of the SAT 2013 conference, July 7, 2013, Helsinki, Finland(EPiC Series in Computing, Vol.\u00a0 29), Daniel\u00a0Le Berre (Ed.). EasyChair, 28\u201340. https:\/\/easychair.org\/publications\/paper\/nHs"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3208071"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-24318-4_12"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-57259-3_11"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/3236187.3236200"},{"key":"e_1_2_1_6_1","volume-title":"Cosette: An Automated Prover for SQL. In 8th Biennial Conference on Innovative Data Systems Research, CIDR","author":"Chu Shumo","year":"2017","unstructured":"Shumo Chu, Chenglong Wang, Konstantin Weitz, and Alvin Cheung. 2017. Cosette: An Automated Prover for SQL. In 8th Biennial Conference on Innovative Data Systems Research, CIDR 2017, Chaminade, CA, USA, January 8-11, 2017, Online Proceedings. www.cidrdb.org. http:\/\/cidrdb.org\/cidr2017\/papers\/p51-chu-cidr17.pdf"},{"key":"e_1_2_1_7_1","volume-title":"SQUARES: A SQL Synthesizer Using Query Reverse Engineering. Master\u2019s thesis. Instituto Superior T\u00e9cnico, Universidade de Lisboa.","author":"Marques da Silva Pedro Miguel","year":"2019","unstructured":"Pedro Miguel Orvalho\u00a0Marques da Silva. 2019. SQUARES: A SQL Synthesizer Using Query Reverse Engineering. Master\u2019s thesis. Instituto Superior T\u00e9cnico, Universidade de Lisboa."},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS\u201908\/ETAPS\u201908)","author":"De\u00a0Moura Leonardo","year":"2008","unstructured":"Leonardo De\u00a0Moura and Nikolaj Bj\u00f8rner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS\u201908\/ETAPS\u201908). Springer-Verlag, Berlin, Heidelberg, 337\u2013340."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3062341.3062351"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-72016-2_9"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/3641204.3641221"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1017\/S1471068418000340"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024","author":"Gu Yu","year":"2024","unstructured":"Yu Gu, Yiheng Shu, Hao Yu, Xiao Liu, Yuxiao Dong, Jie Tang, Jayanth Srinivasa, Hugo Latapie, and Yu Su. 2024. Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (Eds.). Association for Computational Linguistics, 7646\u20137663. https:\/\/aclanthology.org\/2024.emnlp-main.436"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1926385.1926423"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"S. Gulwani O. Polozov and R. Singh. 2017. Program Synthesis. now.","DOI":"10.1561\/9781680832938"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","unstructured":"Youssef Hamadi and Lakhdar Sais (Eds.). 2018. Handbook of Parallel Constraint Reasoning. Springer. doi: 10.1007\/978-3-319-63516-3","DOI":"10.1007\/978-3-319-63516-3"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3409732"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-63516-3_2"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","unstructured":"Zijin Hong Zheng Yuan Qinggang Zhang Hao Chen Junnan Dong Feiran Huang and Xiao Huang. 2024. Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL. CoRR abs\/2406.08426(2024). arXiv:2406.08426 doi: 10.48550\/ARXIV.2406.08426","DOI":"10.48550\/ARXIV.2406.08426"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3385412.3386025"},{"key":"e_1_2_1_21_1","volume-title":"Speech and Language Processing","author":"Jurafsky Daniel","unstructured":"Daniel Jurafsky and James\u00a0H Martin. 2009. Speech and Language Processing(2nd ed.). Prentice Hall.","edition":"2"},{"key":"e_1_2_1_22_1","volume-title":"Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition","author":"Jurafsky Dan","unstructured":"Dan Jurafsky and James\u00a0H. Martin. 2009. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition. Prentice Hall, Pearson Education International.","edition":"2"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183727"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.14778\/3681954.3682003"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/2831360.2831369"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.IS.2019.03.002"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352098"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807442.2807459"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3626292.3626306"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397481.3450680"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.2478\/amcs-2019-0019"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-30048-7_34"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/3415478.3415492"},{"key":"e_1_2_1_34_1","volume-title":"Database Management Systems","author":"Ramakrishnan Raghu","unstructured":"Raghu Ramakrishnan and Johannes Gehrke. 2002. Database Management Systems(third ed.). McGraw-Hill, Inc., USA."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3324884.3416613"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.3233\/SAT190083"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.COLA.2021.101022"},{"key":"e_1_2_1_38_1","volume-title":"PaMira - A Parallel SAT Solver with Knowledge Sharing. In International Workshop on Microprocessor Test and Verification. 29\u201336","author":"Schubert Tobias","year":"2005","unstructured":"Tobias Schubert, Matthew Lewis, and Bernd Becker. 2005. PaMira - A Parallel SAT Solver with Knowledge Sharing. In International Workshop on Microprocessor Test and Verification. 29\u201336."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1609\/AAAI.V35I15.17627"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1287\/ijoc.2017.0762"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSE-NIER.2017.7"},{"key":"e_1_2_1_42_1","volume-title":"Introduction to the Theory of Computation","author":"Sipser Michael","unstructured":"Michael Sipser. 2012. Introduction to the Theory of Computation (3rd edition ed.). Course Technology Cengage Learning, Boston, MA.","edition":"3"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.COLA.2023.101252"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-10672-9_3"},{"key":"e_1_2_1_45_1","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.14778\/3476249.3476253","article-title":"PATSQL: Efficient Synthesis of SQL Queries from Example Tables\u00a0with Quick Inference of Projected Columns","volume":"14","author":"Takenouchi Keita","year":"2021","unstructured":"Keita Takenouchi, Takashi Ishio, Joji Okada, and Yuji Sakata. 2021. PATSQL: Efficient Synthesis of SQL Queries from Example Tables\u00a0with Quick Inference of Projected Columns. Proc. VLDB Endow. 14, 11 (2021), 1937\u20131949.","journal-title":"Proc. VLDB Endow."},{"key":"e_1_2_1_46_1","volume-title":"Proc. International Conference on Management of Data. ACM, 535\u2013548","author":"Tran Quoc\u00a0Trung","year":"2009","unstructured":"Quoc\u00a0Trung Tran, Chee-Yong Chan, and Srinivasan Parthasarathy. 2009. Query by output. In Proc. International Conference on Management of Data. ACM, 535\u2013548."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559902"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-013-0349-3"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.677"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3058738"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3062341.3062365"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133887"},{"key":"e_1_2_1_53_1","volume-title":"GraPPa: Grammar-Augmented Pre-Training for Table\u00a0Semantic Parsing. In 9th International Conference on Learning Representations, ICLR 2021","author":"Yu Tao","year":"2021","unstructured":"Tao Yu, Chien-Sheng Wu, Xi\u00a0Victoria Lin, Bailin Wang, Yi\u00a0Chern Tan, Xinyi Yang, Dragomir\u00a0R. Radev, Richard Socher, and Caiming Xiong. 2021. GraPPa: Grammar-Augmented Pre-Training for Table\u00a0Semantic Parsing. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https:\/\/openreview.net\/forum?id=kyaIeYj4zZ"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/d18-1425"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465320"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2013.6693082"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASE.2013.6693082"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.29"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342267"}],"container-title":["Formal Aspects of Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3768578","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,27]],"date-time":"2025-09-27T11:14:56Z","timestamp":1758971696000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3768578"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,27]]},"references-count":59,"alternative-id":["10.1145\/3768578"],"URL":"https:\/\/doi.org\/10.1145\/3768578","relation":{},"ISSN":["0934-5043","1433-299X"],"issn-type":[{"value":"0934-5043","type":"print"},{"value":"1433-299X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,27]]},"assertion":[{"value":"2024-08-31","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-15","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"3768578"}}