{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T17:24:05Z","timestamp":1764782645422,"version":"3.46.0"},"publisher-location":"New York, NY, USA","reference-count":30,"publisher":"ACM","funder":[{"name":"National Research Foundation of Korea","award":["RS-2023-00222663, RS-2023-00262885"],"award-info":[{"award-number":["RS-2023-00222663, RS-2023-00262885"]}]},{"name":"BK21 FOUR (Fostering Outstanding Universities for Research)","award":[""],"award-info":[{"award-number":[""]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,12,7]]},"DOI":"10.1145\/3767695.3769489","type":"proceedings-article","created":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T17:14:58Z","timestamp":1764782098000},"page":"123-132","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Retrieval-Augmented NL2SQL Generation with Data-Centric Query Capsules for Enterprise Applications"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4449-2381","authenticated-orcid":false,"given":"Jisoo","family":"Jang","sequence":"first","affiliation":[{"name":"Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-0496-3479","authenticated-orcid":false,"given":"Wen-Syan","family":"Li","sequence":"additional","affiliation":[{"name":"Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea"}]}],"member":"320","published-online":{"date-parts":[[2025,12,6]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"I. Androutsopoulos G. D. Ritchie and P. Thanisch. 1995. Natural Language Interfaces to Databases - An Introduction. 
arXiv:cmp-lg\/9503016 [cmp-lg]","DOI":"10.1017\/S135132490000005X"},{"key":"e_1_3_2_1_2_1","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et al. 2020. Language models are few-shot learners. Advances in neural information processing systems Vol. 33 (2020) 1877-1901."},{"key":"e_1_3_2_1_3_1","unstructured":"dcai.csail.mit.edu. 2024. Data-Centric AI vs. Model-Centric AI. https:\/\/dcai.csail.mit.edu\/2024\/data-centric-model-centric\/"},{"key":"e_1_3_2_1_4_1","volume-title":"Semantic Decomposition of Question and SQL for Text-to-SQL Parsing. In Findings of the Association for Computational Linguistics: EMNLP","author":"Eyal Ben","year":"2023","unstructured":"Ben Eyal, Moran Mahabi, Ophir Haroche, Amir Bachar, and Michael Elhadad. 2023. Semantic Decomposition of Question and SQL for Text-to-SQL Parsing. In Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics."},{"key":"e_1_3_2_1_5_1","volume-title":"A case-based reasoning framework for adaptive prompting in cross-domain text-to-sql. CoRR","author":"Guo Chunxi","year":"2023","unstructured":"Chunxi Guo, Zhiliang Tian, Jintao Tang, Pancheng Wang, Zhihua Wen, Kang Yang, and Ting Wang. 2023. A case-based reasoning framework for adaptive prompting in cross-domain text-to-sql. CoRR (2023)."},{"key":"e_1_3_2_1_6_1","volume-title":"Towards complex text-to-sql in cross-domain database with intermediate representation. arXiv preprint arXiv:1905.08205","author":"Guo Jiaqi","year":"2019","unstructured":"Jiaqi Guo, Zecheng Zhan, Yan Gao, Yan Xiao, Jian-Guang Lou, Ting Liu, and Dongmei Zhang. 2019. Towards complex text-to-sql in cross-domain database with intermediate representation. arXiv preprint arXiv:1905.08205 (2019)."},{"key":"e_1_3_2_1_7_1","unstructured":"TPC Benchmark H(TPC-H). 2022. TPC-H Version 3(28 April 2022). 
https:\/\/www.tpc.org\/tpch\/"},{"key":"e_1_3_2_1_8_1","unstructured":"Patrick Lewis Ethan Perez Aleksandra Piktus Fabio Petroni Vladimir Karpukhin Naman Goyal Heinrich K\u00fcttler Mike Lewis Wen-tau Yih Tim Rockt\u00e4schel et al. 2020. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in neural information processing systems Vol. 33 (2020) 9459-9474."},{"key":"e_1_3_2_1_9_1","volume-title":"CodeS: Towards Building Open-source Language Models for Text-to-SQL. arXiv preprint arXiv:2402.16347","author":"Li Haoyang","year":"2024","unstructured":"Haoyang Li, Jing Zhang, Hanbing Liu, Ju Fan, Xiaokang Zhang, Jun Zhu, Renjie Wei, Hongyan Pan, Cuiping Li, and Hong Chen. 2024b. CodeS: Towards Building Open-source Language Models for Text-to-SQL. arXiv preprint arXiv:2402.16347 (2024)."},{"key":"e_1_3_2_1_10_1","volume-title":"Advances in Neural Information Processing Systems","volume":"36","author":"Li Jinyang","year":"2024","unstructured":"Jinyang Li, Binyuan Hui, Ge Qu, Jiaxi Yang, Binhua Li, Bowen Li, Bailin Wang, Bowen Qin, Ruiying Geng, Nan Huo, et al., 2024a. Can llm already serve as a database interface? a big bench for large-scale database grounded text-to-sqls. Advances in Neural Information Processing Systems, Vol. 36 (2024)."},{"key":"e_1_3_2_1_11_1","volume-title":"What Makes Good In-Context Examples for GPT-3? arXiv preprint arXiv:2101.06804","author":"Liu Jiachang","year":"2021","unstructured":"Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. 2021. What Makes Good In-Context Examples for GPT-3? arXiv preprint arXiv:2101.06804 (2021)."},{"key":"e_1_3_2_1_12_1","volume-title":"The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models. arXiv preprint arXiv:2408.07702","author":"Maamari Karime","year":"2024","unstructured":"Karime Maamari, Fadhil Abubaker, Daniel Jaroslawicz, and Amine Mhedhbi. 2024. The Death of Schema Linking? 
Text-to-SQL in the Age of Well-Reasoned Language Models. arXiv preprint arXiv:2408.07702 (2024)."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02295996"},{"key":"e_1_3_2_1_14_1","unstructured":"OpenAI. 2024a. OpenAI GPT-4o. https:\/\/openai.com\/index\/hello-gpt-4o\/"},{"key":"e_1_3_2_1_15_1","unstructured":"OpenAI. 2024b. OpenAI GPT-4o-mini. https:\/\/openai.com\/index\/gpt-4o-mini-advancing-cost-efficient-intelligence\/"},{"key":"e_1_3_2_1_16_1","volume-title":"Advances in Neural Information Processing Systems","volume":"36","author":"Papicchio Simone","year":"2024","unstructured":"Simone Papicchio, Paolo Papotti, and Luca Cagliero. 2024. Qatch: Benchmarking sql-centric tasks with table representation learning models on your data. Advances in Neural Information Processing Systems, Vol. 36 (2024)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699485"},{"key":"e_1_3_2_1_18_1","volume-title":"Advances in Neural Information Processing Systems","volume":"36","author":"Pourreza Mohammadreza","year":"2024","unstructured":"Mohammadreza Pourreza and Davood Rafiei. 2024. Din-sql: Decomposed in-context learning of text-to-sql with self-correction. Advances in Neural Information Processing Systems, Vol. 36 (2024)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE60146.2024.00009"},{"key":"e_1_3_2_1_20_1","volume-title":"Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance. arXiv preprint arXiv:2404.08817","author":"Song Yewei","year":"2024","unstructured":"Yewei Song, Cedric Lothritz, Daniel Tang, Tegawend\u00e9 F Bissyand\u00e9, and Jacques Klein. 2024. Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance. arXiv preprint arXiv:2404.08817 (2024)."},{"key":"e_1_3_2_1_21_1","volume-title":"Chess: Contextual harnessing for efficient sql synthesis. 
arXiv preprint arXiv:2405.16755","author":"Talaei Shayan","year":"2024","unstructured":"Shayan Talaei, Mohammadreza Pourreza, Yu-Chen Chang, Azalia Mirhoseini, and Amin Saberi. 2024. Chess: Contextual harnessing for efficient sql synthesis. arXiv preprint arXiv:2405.16755 (2024)."},{"key":"e_1_3_2_1_22_1","volume-title":"Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, et al.","author":"Team Gemini","year":"2024","unstructured":"Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, et al., 2024. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530 (2024)."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-acl.352"},{"key":"e_1_3_2_1_24_1","volume-title":"Rat-sql: Relation-aware schema encoding and linking for text-to-sql parsers. arXiv preprint arXiv:1911.04942","author":"Wang Bailin","year":"2019","unstructured":"Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, and Matthew Richardson. 2019. Rat-sql: Relation-aware schema encoding and linking for text-to-sql parsers. arXiv preprint arXiv:1911.04942 (2019)."},{"key":"e_1_3_2_1_25_1","first-page":"17","volume-title":"DSL","volume":"97","author":"Wang Daniel C","year":"1997","unstructured":"Daniel C Wang, Andrew W Appel, Jeffrey L Korn, and Christopher S Serra. 1997. The zephyr abstract syntax description language. In DSL, Vol. 97. 17-17."},{"key":"e_1_3_2_1_26_1","volume-title":"Minilm: Deep self-attention distillation for task-agnostic compression of pre-trained transformers. Advances in neural information processing systems","author":"Wang Wenhui","year":"2020","unstructured":"Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, and Ming Zhou. 2020. Minilm: Deep self-attention distillation for task-agnostic compression of pre-trained transformers. 
Advances in neural information processing systems, Vol. 33 (2020), 5776-5788."},{"key":"e_1_3_2_1_27_1","volume-title":"Denny Zhou, et al.","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al., 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, Vol. 35 (2022), 24824-24837."},{"key":"e_1_3_2_1_28_1","volume-title":"Breakthroughs in statistics: Methodology and distribution","author":"Wilcoxon Frank","unstructured":"Frank Wilcoxon. 1992. Individual comparisons by ranking methods. In Breakthroughs in statistics: Methodology and distribution. Springer, 196-202."},{"key":"e_1_3_2_1_29_1","volume-title":"Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. arXiv preprint arXiv:1809.08887","author":"Yu Tao","year":"2018","unstructured":"Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, et al., 2018. Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. arXiv preprint arXiv:1809.08887 (2018)."},{"key":"e_1_3_2_1_30_1","volume-title":"Bertscore: Evaluating text generation with bert. arXiv preprint arXiv:1904.09675","author":"Zhang Tianyi","year":"2019","unstructured":"Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q Weinberger, and Yoav Artzi. 2019. Bertscore: Evaluating text generation with bert. 
arXiv preprint arXiv:1904.09675 (2019)."}],"event":{"name":"SIGIR-AP 2025:Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region","location":"Xi'an China","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 2025 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region"],"original-title":[],"deposited":{"date-parts":[[2025,12,3]],"date-time":"2025-12-03T17:17:42Z","timestamp":1764782262000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3767695.3769489"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,6]]},"references-count":30,"alternative-id":["10.1145\/3767695.3769489","10.1145\/3767695"],"URL":"https:\/\/doi.org\/10.1145\/3767695.3769489","relation":{},"subject":[],"published":{"date-parts":[[2025,12,6]]},"assertion":[{"value":"2025-12-06","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}