{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T20:33:06Z","timestamp":1776889986204,"version":"3.51.2"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T00:00:00Z","timestamp":1686614400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2023,6,13]]},"abstract":"<jats:p>A common problem with adopting Text-to-SQL translation in database systems is poor generalization. Specifically, when there is limited training data on new datasets, existing few-shot Text-to-SQL techniques, even with carefully designed textual prompts on pre-trained language models (PLMs), tend to be ineffective. In this paper, we present a divide-and-conquer framework to better support few-shot Text-to-SQL translation, which divides Text-to-SQL translation into two stages (or sub-tasks), such that each sub-task is simpler to be tackled. The first stage, called the structure stage, steers a PLM to generate an SQL structure (including SQL commands such as SELECT, FROM, WHERE and SQL operators such as &lt;\", ?&gt;\") with placeholders for missing identifiers. The second stage, called the content stage, guides a PLM to populate the placeholders in the generated SQL structure with concrete values (including SQL identifies such as table names, column names, and constant values). We propose a hybrid prompt strategy that combines learnable vectors and fixed vectors (i.e., word embeddings of textual prompts), such that the hybrid prompt can learn contextual information to better guide PLMs for prediction in both stages. 
In addition, we design keyword-constrained decoding to ensure the validity of generated SQL structures, and structure-guided decoding to guarantee that the model fills in correct content. Extensive experiments comparing SC-Prompt with ten state-of-the-art Text-to-SQL solutions show that it significantly outperforms them in the few-shot scenario. In particular, on the widely adopted Spider dataset, given fewer than 500 labeled training examples (5% of the official training set), SC-Prompt outperforms the previous SOTA methods by around 5% in accuracy.<\/jats:p>","DOI":"10.1145\/3589292","type":"journal-article","created":{"date-parts":[[2023,6,20]],"date-time":"2023-06-20T20:26:45Z","timestamp":1687292805000},"page":"1-28","source":"Crossref","is-referenced-by-count":48,"title":["Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning"],"prefix":"10.1145","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9413-2068","authenticated-orcid":false,"given":"Zihui","family":"Gu","sequence":"first","affiliation":[{"name":"Renmin University of China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4729-9903","authenticated-orcid":false,"given":"Ju","family":"Fan","sequence":"additional","affiliation":[{"name":"Renmin University of China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2832-0295","authenticated-orcid":false,"given":"Nan","family":"Tang","sequence":"additional","affiliation":[{"name":"QCRI &amp; HKUST (GZ), Doha, Qatar"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9909-8607","authenticated-orcid":false,"given":"Lei","family":"Cao","sequence":"additional","affiliation":[{"name":"MIT CSAIL &amp; University of Arizona, Boston, MA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3905-2521","authenticated-orcid":false,"given":"Bowen","family":"Jia","sequence":"additional","affiliation":[{"name":"Renmin University of China, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7470-3265","authenticated-orcid":false,"given":"Sam","family":"Madden","sequence":"additional","affiliation":[{"name":"MIT CSAIL, Boston, MA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5757-9135","authenticated-orcid":false,"given":"Xiaoyong","family":"Du","sequence":"additional","affiliation":[{"name":"Renmin University of China, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2023,6,20]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_2_1","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020","author":"Brown Tom B.","year":"2020","unstructured":"Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6--12, 2020, virtual, Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). 
https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i14.17550"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00324"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_14_1","volume-title":"A Systematic Survey of Prompting Methods in Natural Language Processing. CoRR","author":"Liu Pengfei","year":"2021","unstructured":"Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2021. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. CoRR, Vol. abs\/2107.13586 (2021). showeprint[arXiv]2107.13586 https:\/\/arxiv.org\/abs\/2107.13586"},{"key":"e_1_2_2_15_1","unstructured":"Oracle. 2019. https:\/\/docs.oracle.com\/en\/cloud\/saas\/service\/18b\/favau\/natural-language-processing.html."},{"key":"e_1_2_2_16_1","volume-title":"True Few-Shot Learning with Language Models. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021","author":"Perez Ethan","year":"2021","unstructured":"Ethan Perez, Douwe Kiela, and Kyunghyun Cho. 2021. True Few-Shot Learning with Language Models. 
In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6--14, 2021, virtual, Marc'Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.). 11054--11070. https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/5c04925674920eb58467fb52ce4ef728-Abstract.html"},{"key":"e_1_2_2_17_1","unstructured":"Alec Radford Jeffrey Wu Rewon Child David Luan Dario Amodei Ilya Sutskever et al. 2019. Language models are unsupervised multitask learners. OpenAI blog Vol. 1 8 (2019) 9."},{"key":"e_1_2_2_18_1","first-page":"1","article-title":"2020a. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020a. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res., Vol. 21 (2020), 140:1--140:67. http:\/\/jmlr.org\/papers\/v21\/20-074.html","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_2_19_1","first-page":"1","article-title":"2020b. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020b. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res., Vol. 21 (2020), 140:1--140:67. http:\/\/jmlr.org\/papers\/v21\/20-074.html","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_21_1","unstructured":"Salesforce. 2020. 
https:\/\/blog.salesforceairesearch.com\/talk-to-your-data-one-model-any-database\/."},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_28_1","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4--9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5998--6008. https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html"},{"key":"e_1_2_2_29_1","volume-title":"Bowman","author":"Wang Alex","year":"2019","unstructured":"Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019b. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8--14, 2019, Vancouver, BC, Canada, Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alch\u00e9 -Buc, Emily B. Fox, and Roman Garnett (Eds.). 3261--3275. 
https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/4496bf24afe7fab6f046bf4923da8de6-Abstract.html"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_34_1","unstructured":"Tianbao Xie Chen Henry Wu Peng Shi Ruiqi Zhong Torsten Scholak Michihiro Yasunaga Chien-Sheng Wu Ming Zhong Pengcheng Yin Sida I. Wang Victor Zhong Bailin Wang Chengzu Li Connor Boyle Ansong Ni Ziyu Yao Dragomir R. Radev Caiming Xiong Lingpeng Kong Rui Zhang Noah A. Smith Luke Zettlemoyer and Tao Yu. 2022. UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. CoRR Vol. abs\/2201.05966 (2022). showeprint[arXiv]2201.05966 https:\/\/arxiv.org\/abs\/2201.05966"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_37_1","volume-title":"GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing. In 9th International Conference on Learning Representations, ICLR 2021","author":"Yu Tao","year":"2021","unstructured":"Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir R. Radev, Richard Socher, and Caiming Xiong. 2021. GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3--7, 2021. OpenReview.net. https:\/\/openreview.net\/forum?id=kyaIeYj4zZ"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_2_2_40_1","volume-title":"Mooney","author":"Zelle John M.","year":"1996","unstructured":"John M. Zelle and Raymond J. Mooney. 1996. 
Learning to Parse Database Queries Using Inductive Logic Programming. In Proceedings of the Thirteenth National Conference on Artificial Intelligence and Eighth Innovative Applications of Artificial Intelligence Conference, AAAI 96, IAAI 96, Portland, Oregon, USA, August 4--8, 1996, Volume 2, William J. Clancey and Daniel S. Weld (Eds.). AAAI Press \/ The MIT Press, 1050--1055. http:\/\/www.aaai.org\/Library\/AAAI\/1996\/aaai96--156.php"},{"key":"e_1_2_2_41_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13--18","volume":"11339","author":"Zhang Jingqing","year":"2020","unstructured":"Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter J. Liu. 2020. PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13--18 July 2020, Virtual Event (Proceedings of Machine Learning Research, Vol. 119). PMLR, 11328--11339. http:\/\/proceedings.mlr.press\/v119\/zhang20ae.html"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"}],"container-title":["Proceedings of the ACM on Management of 
Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589292","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3589292","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:46:13Z","timestamp":1750178773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3589292"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,13]]},"references-count":42,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6,13]]}},"alternative-id":["10.1145\/3589292"],"URL":"https:\/\/doi.org\/10.1145\/3589292","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,13]]}}}