{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T01:47:10Z","timestamp":1780451230771,"version":"3.54.1"},"reference-count":117,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2025,1,12]],"date-time":"2025-01-12T00:00:00Z","timestamp":1736640000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,1,12]],"date-time":"2025-01-12T00:00:00Z","timestamp":1736640000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Front. Comput. Sci."],"published-print":{"date-parts":[[2025,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Tables, typically two-dimensional and structured to store large amounts of data, are essential in daily activities like database queries, spreadsheet manipulations, Web table question answering, and image table information extraction. Automating these table-centric tasks with Large Language Models (LLMs) or Visual Language Models (VLMs) offers significant public benefits, garnering interest from academia and industry. This survey provides a comprehensive overview of table-related tasks, examining both user scenarios and technical aspects. It covers traditional tasks like table question answering as well as emerging fields such as spreadsheet manipulation and table data analysis. We summarize the training techniques for LLMs and VLMs tailored for table processing. Additionally, we discuss prompt engineering, particularly the use of LLM-powered agents, for various table-related tasks. 
Finally, we highlight several challenges, including diverse user input when serving and slow thinking using chain-of-thought.<\/jats:p>","DOI":"10.1007\/s11704-024-40763-6","type":"journal-article","created":{"date-parts":[[2025,1,12]],"date-time":"2025-01-12T18:02:00Z","timestamp":1736704920000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":32,"title":["Large language model for table processing: a survey"],"prefix":"10.1007","volume":"19","author":[{"given":"Weizheng","family":"Lu","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jing","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ju","family":"Fan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zihao","family":"Fu","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yueguo","family":"Chen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaoyong","family":"Du","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2025,1,12]]},"reference":[{"key":"40763_CR1","first-page":"5426","volume-title":"Proceedings of the 31st International Joint Conference on Artificial Intelligence","author":"H Dong","year":"2022","unstructured":"Dong H, Cheng Z, He X, Zhou M, Zhou A, Zhou F, Liu A, Han S, Zhang D. Table pre-training: a survey on model architectures, pre-training objectives, and downstream tasks. In: Proceedings of the 31st International Joint Conference on Artificial Intelligence. 
2022, 5426\u20135435"},{"key":"40763_CR2","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1162\/tacl_a_00544","volume":"11","author":"G Badaro","year":"2023","unstructured":"Badaro G, Saeed M, Papotti P. Transformers for tabular data representation: a survey of models and applications. Transactions of the Association for Computational Linguistics, 2023, 11: 227\u2013249","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"40763_CR3","unstructured":"Fang X, Xu W, Tan F A, Zhang J, Hu Z, Qi Y, Nickleach S, Socolinsky D, Sengamedu S, Faloutsos C. Large language models(LLMs) on tabular data: prediction, generation, and understanding \u2014 a survey. 2024, arXiv preprint arXiv: 2402.17944"},{"key":"40763_CR4","unstructured":"Zhang X, Wang D, Dou L, Zhu Q, Che W. A survey of table reasoning with large language models. 2024, arXiv preprint arXiv: 2402.08259"},{"key":"40763_CR5","unstructured":"Zhao W X, Zhou K, Li J, Tang T, Wang X, Hou Y, Min Y, Zhang B, Zhang J, Dong Z, Du Y, Yang C, Chen Y, Chen Z, Jiang J, Ren R, Li Y, Tang X, Liu Z, Liu P, Nie J Y, Wen J R. A survey of large language models. 2023, arXiv preprint arXiv: 2303.18223"},{"issue":"1","key":"40763_CR6","first-page":"140","volume":"21","author":"C Raffel","year":"2020","unstructured":"Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu P J. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 2020, 21(1): 140","journal-title":"Journal of Machine Learning Research"},{"key":"40763_CR7","doi-asserted-by":"publisher","first-page":"8413","DOI":"10.18653\/v1\/2020.acl-main.745","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"P Yin","year":"2020","unstructured":"Yin P, Neubig G, Yih W, Riedel S. TaBERT: pretraining for joint understanding of textual and tabular data. 
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020, 8413\u20138426"},{"key":"40763_CR8","doi-asserted-by":"publisher","first-page":"4320","DOI":"10.18653\/v1\/2020.acl-main.398","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"J Herzig","year":"2020","unstructured":"Herzig J, Nowak P K, M\u00fcller T, Piccinno F, Eisenschlos J. TaPas: weakly supervised table parsing via pre-training. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020, 4320\u20134333"},{"issue":"3","key":"40763_CR9","doi-asserted-by":"publisher","first-page":"307","DOI":"10.14778\/3430915.3430921","volume":"14","author":"X Deng","year":"2020","unstructured":"Deng X, Sun H, Lees A, Wu Y, Yu C. TURL: table understanding through representation learning. Proceedings of the VLDB Endowment, 2020, 14(3): 307\u2013319","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR10","volume-title":"Proceedings of the 10th International Conference on Learning Representations","author":"Q Liu","year":"2022","unstructured":"Liu Q, Chen B, Guo J, Ziyadi M, Lin Z, Chen W, Lou J. TAPEX: table pre-training via learning a neural SQL executor. In: Proceedings of the 10th International Conference on Learning Representations. 2022"},{"key":"40763_CR11","first-page":"6024","volume-title":"Proceedings of 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"T Zhang","year":"2024","unstructured":"Zhang T, Yue X, Li Y, Sun H. TableLlama: towards open large generalist models for tables. In: Proceedings of 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 
2024, 6024\u20136044"},{"issue":"3","key":"40763_CR12","doi-asserted-by":"publisher","first-page":"176","DOI":"10.1145\/3654979","volume":"2","author":"P Li","year":"2024","unstructured":"Li P, He Y, Yashar D, Cui W, Ge S, Zhang H, Fainman D R, Zhang D, Chaudhuri S. Table-GPT: table fine-tuned GPT for diverse table tasks. Proceedings of the ACM on Management of Data, 2024, 2(3): 176","journal-title":"Proceedings of the ACM on Management of Data"},{"key":"40763_CR13","first-page":"9102","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics","author":"M Zheng","year":"2024","unstructured":"Zheng M, Feng X, Si Q, She Q, Lin Z, Jiang W, Wang W. Multimodal table understanding. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. 2024, 9102\u20139124"},{"key":"40763_CR14","doi-asserted-by":"publisher","first-page":"645","DOI":"10.1145\/3616855.3635752","volume-title":"Proceedings of the 17th ACM International Conference on Web Search and Data Mining","author":"Y Sui","year":"2024","unstructured":"Sui Y, Zhou M, Zhou M, Han S, Zhang D. Table meets LLM: can large language models understand structured table data? A benchmark and empirical study. In: Proceedings of the 17th ACM International Conference on Web Search and Data Mining. 2024, 645\u2013654"},{"issue":"8","key":"40763_CR15","doi-asserted-by":"publisher","first-page":"1981","DOI":"10.14778\/3659437.3659452","volume":"17","author":"Y Zhang","year":"2024","unstructured":"Zhang Y, Henkel J, Floratou A, Cahoon J, Deep S, Patel J M. ReAcTable: enhancing ReAct for table question answering. Proceedings of the VLDB Endowment, 2024, 17(8): 1981\u20131994","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR16","first-page":"4952","volume-title":"Proceedings of the 37th International Conference on Neural Information Processing Systems","author":"H Li","year":"2023","unstructured":"Li H, Su J, Chen Y, Li Q, Zhang Z. 
SheetCopilot: bringing software productivity to the next level through large language models. In: Proceedings of the 37th International Conference on Neural Information Processing Systems. 2023, 4952\u20134984"},{"key":"40763_CR17","first-page":"19544","volume-title":"Proceedings of the 41st International Conference on Machine Learning","author":"X Hu","year":"2024","unstructured":"Hu X, Zhao Z, Wei S, Chai Z, Ma Q, Wang G, Wang X, Su J, Xu J, Zhu M, Cheng Y, Yuan J, Li J, Kuang K, Yang Y, Yang H, Wu F. InfiAgent-DABench: evaluating agents on data analysis tasks. In: Proceedings of the 41st International Conference on Machine Learning. 2024, 19544\u201319572"},{"key":"40763_CR18","volume-title":"Proceedings of the 10th International Conference on Learning Representations","author":"J Wei","year":"2022","unstructured":"Wei J, Bosma M, Zhao V, Guu K, Yu A W, Lester B, Du N, Dai A M, Le Q V. Finetuned language models are zero-shot learners. In: Proceedings of the 10th International Conference on Learning Representations. 2022"},{"key":"40763_CR19","first-page":"159","volume-title":"Proceedings of the 34th International Conference on Neural Information Processing Systems","author":"T B Brown","year":"2020","unstructured":"Brown T B, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler D M, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D. Language models are few-shot learners. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 159"},{"key":"40763_CR20","first-page":"24824","volume-title":"Proceedings of the 36th International Conference on Neural Information Processing Systems","author":"J Wei","year":"2022","unstructured":"Wei J, Wang X, Schuurmans D, Bosma M, Ichter B, Xia F, Chi E H, Le Q V, Zhou D. 
Chain-of-thought prompting elicits reasoning in large language models. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 24824\u201324837"},{"issue":"6","key":"40763_CR21","doi-asserted-by":"publisher","first-page":"186345","DOI":"10.1007\/s11704-024-40231-1","volume":"18","author":"L Wang","year":"2024","unstructured":"Wang L, Ma C, Feng X, Zhang Z, Yang H, Zhang J, Chen Z, Tang J, Chen X, Lin Y, Zhao W X, Wei Z, Wen J. A survey on large language model based autonomous agents. Frontiers of Computer Science, 2024, 18(6): 186345","journal-title":"Frontiers of Computer Science"},{"key":"40763_CR22","volume-title":"Survey resources","author":"W Lu","year":"2024","unstructured":"Lu W. Survey resources. See Github.com\/godaai\/llm-table-survey website, 2024"},{"key":"40763_CR23","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1007\/978-981-19-7596-7_14","volume-title":"Proceedings of the 7th China Conference on Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy","author":"N Jin","year":"2022","unstructured":"Jin N, Siebert J, Li D, Chen Q. A survey on table question answering: recent advances. In: Proceedings of the 7th China Conference on Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy. 2022, 174\u2013186"},{"key":"40763_CR24","unstructured":"Qin B, Hui B, Wang L, Yang M, Li J, Li B, Geng R, Cao R, Sun J, Si L, Huang F, Li Y. A survey on text-to-SQL parsing: concepts, methods, and future directions. 2022, arXiv preprint arXiv: 2208.13629"},{"key":"40763_CR25","unstructured":"Hong Z, Yuan Z, Zhang Q, Chen H, Dong J, Huang F, Huang X. Next-Generation database interfaces: a survey of LLM-based text-to-SQL. 2024, arXiv preprint arXiv: 2406.08426"},{"issue":"2","key":"40763_CR26","first-page":"13","volume":"11","author":"S Zhang","year":"2020","unstructured":"Zhang S, Balog K. Web table extraction, retrieval, and augmentation: a survey. 
ACM Transactions on Intelligent Systems and Technology (TIST), 2020, 11(2): 13","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"40763_CR27","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1145\/3318464.3389782","volume-title":"Proceedings of 2020 ACM SIGMOD International Conference on Management of Data","author":"S Rahman","year":"2020","unstructured":"Rahman S, Mack K, Bendre M, Zhang R, Karahalios K, Parameswaran A. Benchmarking spreadsheet systems. In: Proceedings of 2020 ACM SIGMOD International Conference on Management of Data. 2020, 1589\u20131599"},{"key":"40763_CR28","first-page":"210","volume-title":"Proceedings of the 20th International Conference on Extending Database Technology","author":"D Ritze","year":"2017","unstructured":"Ritze D, Bizer C. Matching Web tables to DBpedia - a feature utility study. In: Proceedings of the 20th International Conference on Extending Database Technology. 2017, 210\u2013221"},{"key":"40763_CR29","doi-asserted-by":"publisher","first-page":"425","DOI":"10.1007\/978-3-319-25007-6_25","volume-title":"Proceedings of the 14th International Semantic Web Conference on The Semantic Web - ISWC 2015","author":"C S Bhagavatula","year":"2015","unstructured":"Bhagavatula C S, Noraset T, Downey D. TabEL: entity linking in Web tables. In: Proceedings of the 14th International Semantic Web Conference on The Semantic Web - ISWC 2015. 2015, 425\u2013441"},{"key":"40763_CR30","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"Z Cheng","year":"2023","unstructured":"Cheng Z, Xie T, Shi P, Li C, Nadkarni R, Hu Y, Xiong C, Radev D, Ostendorf M, Zettlemoyer L, Smith N A, Yu T. Binding language models in symbolic languages. In: Proceedings of the 11th International Conference on Learning Representations. 
2023"},{"key":"40763_CR31","volume-title":"Proceedings of the 12th International Conference on Learning Representations","author":"Z Wang","year":"2024","unstructured":"Wang Z, Zhang H, Li C L, Eisenschlos J M, Perot V, Wang Z, Miculicich L, Fujii Y, Shang J, Lee C Y, Pfister T. Chain-of-table: evolving tables in the reasoning chain for table understanding. In: Proceedings of the 12th International Conference on Learning Representations. 2024"},{"key":"40763_CR32","first-page":"1470","volume-title":"Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing","author":"P Pasupat","year":"2015","unstructured":"Pasupat P, Liang P. Compositional semantic parsing on semi-structured tables. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. 2015, 1470\u20131480"},{"key":"40763_CR33","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1145\/3539618.3591708","volume-title":"Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Y Ye","year":"2023","unstructured":"Ye Y, Hui B, Yang M, Li B, Huang F, Li Y. Large language models are versatile decomposers: decomposing evidence and questions for table-based reasoning. In: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2023, 174\u2013184"},{"key":"40763_CR34","volume-title":"Proceedings of the 33rd Neural Information Processing Systems","author":"W Chen","year":"2019","unstructured":"Chen W, Wang H, Chen J, Zhang Y, Wang H, Li S, Zhou X, Wang W Y. TabFact: a large-scale dataset for table-based fact Verification. In: Proceedings of the 33rd Neural Information Processing Systems. 
2019"},{"key":"40763_CR35","first-page":"1173","volume-title":"Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing","author":"A Parikh","year":"2020","unstructured":"Parikh A, Wang X, Gehrmann S, Faruqui M, Dhingra B, Yang D, Das D. ToTTo: a controlled table-to-text generation dataset. In: Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing. 2020, 1173\u20131186"},{"key":"40763_CR36","volume-title":"Proceedings of Machine Learning and Systems 6 (MLSys 2024) Conference","author":"Y Qian","year":"2024","unstructured":"Qian Y, He Y, Zhu R, Huang J, Ma Z, Wang H, Wang Y, Sun X, Lian D, Ding B, Zhou J. UniDM: a Unified framework for data manipulation with large language models. In: Proceedings of Machine Learning and Systems 6 (MLSys 2024) Conference. 2024"},{"key":"40763_CR37","unstructured":"Ahmad M S, Naeem Z A, Eltabakh M, Ouzzani M, Tang N. RetClean: retrieval-based data cleaning using foundation models and data lakes. 2023, arXiv preprint arXiv: 2303.16909"},{"key":"40763_CR38","doi-asserted-by":"crossref","unstructured":"Chen Y, Yuan Y, Zhang Z, Zheng Y, Liu J, Ni F, Hao J. SheetAgent: towards a generalist agent for spreadsheet reasoning and manipulation via large language models. 2024, arXiv preprint arXiv: 2403.03636","DOI":"10.1145\/3696410.3714962"},{"key":"40763_CR39","unstructured":"Ma Z, Zhang B, Zhang J, Yu J, Zhang X, Zhang X, Luo S, Wang X, Tang J. SpreadsheetBench: towards challenging real world spreadsheet manipulation. 2024, arXiv preprint arXiv: 2406.14991"},{"issue":"3","key":"40763_CR40","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1145\/3654930","volume":"2","author":"H Li","year":"2024","unstructured":"Li H, Zhang J, Liu H, Fan J, Zhang X, Zhu J, Wei R, Pan H, Li C, Chen H. CodeS: towards building open-source language models for text-to-SQL. 
Proceedings of the ACM on Management of Data, 2024, 2(3): 127","journal-title":"Proceedings of the ACM on Management of Data"},{"issue":"5","key":"40763_CR41","first-page":"1132","volume":"17","author":"D Gao","year":"2024","unstructured":"Gao D, Wang H, Li Y, Sun X, Qian Y, Ding B, Zhou J. Text-to-SQL empowered by large language models: a benchmark evaluation. Proceedings of the VLDB Endowment, 2024, 17(5): 1132\u20131145","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR42","doi-asserted-by":"publisher","first-page":"3911","DOI":"10.18653\/v1\/D18-1425","volume-title":"Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing","author":"T Yu","year":"2018","unstructured":"Yu T, Zhang R, Yang K, Yasunaga M, Wang D, Li Z, Ma J, Li I, Yao Q, Roman S, Zhang Z, Radev D. Spider: a large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task. In: Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing. 2018, 3911\u20133921"},{"key":"40763_CR43","unstructured":"Zhang W, Shen Y, Lu W, Zhuang Y T. Data-Copilot: bridging billions of data and humans with autonomous workflow. 2023, arXiv preprint arXiv: 2306.07209"},{"key":"40763_CR44","volume-title":"Proceedings of the 12th International Conference on Learning Representations","author":"Y Xu","year":"2024","unstructured":"Xu Y, Su H, Xing C, Mi B, Liu Q, Shi W, Hui B, Zhou F, Liu Y, Xie T, Cheng Z, Zhao S, Kong L, Wang B, Xiong C, Yu T. Lemur: harmonizing natural language and code for language agents. In: Proceedings of the 12th International Conference on Learning Representations. 2024"},{"key":"40763_CR45","first-page":"18319","volume-title":"Proceedings of the 40th International Conference on Machine Learning","author":"Y Lai","year":"2023","unstructured":"Lai Y, Li C, Wang Y, Zhang T, Zhong R, Zettlemoyer L, Yih W T, Fried D, Wang S, Yu T. 
DS-1000: a natural and reliable benchmark for data science code generation. In: Proceedings of the 40th International Conference on Machine Learning. 2023, 18319\u201318345"},{"key":"40763_CR46","first-page":"2437","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics","author":"L Chen","year":"2023","unstructured":"Chen L, Huang C, Zheng X, Lin J, Huang X. TableVLM: multi-modal pre-training for table structure recognition. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 2023, 2437\u20132449"},{"key":"40763_CR47","first-page":"1918","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference","author":"M Li","year":"2020","unstructured":"Li M, Cui L, Huang S, Wei F, Zhou M, Li Z. TableBank: table benchmark for Image-based table detection and recognition. In: Proceedings of the 12th Language Resources and Evaluation Conference. 2020, 1918\u20131925"},{"key":"40763_CR48","doi-asserted-by":"crossref","unstructured":"Zhao W, Feng H, Liu Q, Tang J, Wei S, Wu B, Liao L, Ye Y, Liu H, Zhou W, Li H, Huang C. TabPedia: towards comprehensive visual table understanding with concept synergy. 2024, arXiv preprint arXiv: 2406.01326","DOI":"10.52202\/079017-0230"},{"key":"40763_CR49","doi-asserted-by":"publisher","first-page":"564","DOI":"10.1007\/978-3-030-58589-1_34","volume-title":"Proceedings of the 16th European Conference on Computer Vision - ECCV 2020","author":"X Zhong","year":"2020","unstructured":"Zhong X, ShafieiBavani E, Jimeno Yepes A. Image-based table recognition: data, model, and evaluation. In: Proceedings of the 16th European Conference on Computer Vision - ECCV 2020. 
2020, 564\u2013580"},{"issue":"12","key":"40763_CR50","doi-asserted-by":"publisher","first-page":"993","DOI":"10.14778\/2994509.2994518","volume":"9","author":"Z Abedjan","year":"2016","unstructured":"Abedjan Z, Chu X, Deng D, Fernandez R C, Ilyas I F, Ouzzani M, Papotti P, Stonebraker M, Tang N. Detecting data errors: where are we and what needs to be done? Proceedings of the VLDB Endowment, 2016, 9(12): 993\u20131004","journal-title":"Proceedings of the VLDB Endowment"},{"issue":"OOPSLA1","key":"40763_CR51","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1145\/3586030","volume":"7","author":"S Barke","year":"2023","unstructured":"Barke S, James M B, Polikarpova N. Grounded copilot: how programmers interact with code-generating models. Proceedings of the ACM on Programming Languages, 2023, 7(OOPSLA1): 78","journal-title":"Proceedings of the ACM on Programming Languages"},{"key":"40763_CR52","volume-title":"Proceedings of the Table Representation Learning Workshop at NeurIPS 2023","author":"A Singha","year":"2023","unstructured":"Singha A, Cambronero J, Gulwani S, Le V, Parnin C. Tabular representation, noisy operators, and impacts on table structure understanding tasks in LLMs. In: Proceedings of the Table Representation Learning Workshop at NeurIPS 2023. 2023"},{"key":"40763_CR53","unstructured":"Tian Y, Zhao J, Dong H, Xiong J, Xia S, Zhou M, Lin Y, Cambronero J, He Y, Han S, Zhang D. SpreadsheetLLM: encoding spreadsheets for large language models. 2024, arXiv preprint arXiv: 2407.09025"},{"key":"40763_CR54","doi-asserted-by":"publisher","first-page":"14935","DOI":"10.18653\/v1\/2023.findings-emnlp.996","volume-title":"Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023","author":"L Nan","year":"2023","unstructured":"Nan L, Zhao Y, Zou W, Ri N, Tae J, Zhang E, Cohan A, Radev D. Enhancing text-to-SQL capabilities of large language models: a study on prompt design strategies. 
In: Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023. 2023, 14935\u201314956"},{"key":"40763_CR55","doi-asserted-by":"publisher","first-page":"407","DOI":"10.18653\/v1\/2024.findings-acl.23","volume-title":"Proceedings of the Findings of the Association for Computational Linguistics: ACL 2024","author":"N Deng","year":"2024","unstructured":"Deng N, Sun Z, He R, Sikka A, Chen Y, Ma L, Zhang Y, Mihalcea R. Tables as texts or images: evaluating the table reasoning ability of LLMs and MLLMs. In: Proceedings of the Findings of the Association for Computational Linguistics: ACL 2024. 2024, 407\u2013426"},{"key":"40763_CR56","doi-asserted-by":"publisher","first-page":"1192","DOI":"10.1145\/3394486.3403172","volume-title":"Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Y Xu","year":"2020","unstructured":"Xu Y, Li M, Cui L, Huang S, Wei F, Zhou M. LayoutLM: pre-training of text and layout for document image understanding. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020, 1192\u20131200"},{"key":"40763_CR57","first-page":"2579","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing","author":"Y Xu","year":"2021","unstructured":"Xu Y, Xu Y, Lv T, Cui L, Wei F, Wang G, Lu Y, Florencio D, Zhang C, Che W, Zhang M, Zhou L. LayoutLMv2: multi-modal pre-training for visually-rich document understanding. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 
2021, 2579\u20132591"},{"key":"40763_CR58","doi-asserted-by":"publisher","first-page":"116","DOI":"10.18653\/v1\/2024.alvr-1.10","volume-title":"Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR)","author":"S Xia","year":"2024","unstructured":"Xia S, Xiong J, Dong H, Zhao J, Tian Y, Zhou M, He Y, Han S, Zhang D. Vision language models for spreadsheet understanding: challenges and opportunities. In: Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR). 2024, 116\u2013128"},{"key":"40763_CR59","first-page":"770","volume-title":"Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition","author":"K He","year":"2016","unstructured":"He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016, 770\u2013778"},{"key":"40763_CR60","volume-title":"Proceedings of International Conference on Learning Representations","author":"A Dosovitskiy","year":"2021","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N. An image is worth 16\u00d716 words: transformers for image recognition at scale. In: Proceedings of International Conference on Learning Representations. 2021"},{"key":"40763_CR61","first-page":"4171","volume-title":"Proceedings of 2019 Conference of the North American Chapter of the Association for Computational Linguistics","author":"J Devlin","year":"2019","unstructured":"Devlin J, Chang M W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of 2019 Conference of the North American Chapter of the Association for Computational Linguistics. 
2019, 4171\u20134186"},{"key":"40763_CR62","first-page":"3446","volume-title":"Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"H Iida","year":"2021","unstructured":"Iida H, Thai D, Manjunatha V, Iyyer M. TABBIE: pretrained representations of tabular data. In: Proceedings of 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021, 3446\u20133456"},{"key":"40763_CR63","first-page":"13067","volume-title":"Proceedings of the 37th AAAI Conference on Artificial Intelligence","author":"H Li","year":"2023","unstructured":"Li H, Zhang J, Li C, Chen H. RESDSQL: decoupling schema linking and skeleton parsing for text-to-SQL. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence. 2023, 13067\u201313075"},{"key":"40763_CR64","volume-title":"Proceedings of the 41st International Conference on Machine Learning","author":"Y Wei","year":"2024","unstructured":"Wei Y, Wang Z, Liu J, Ding Y, Zhang L. Magicoder: empowering code generation with OSS-instruct. In: Proceedings of the 41st International Conference on Machine Learning. 2024"},{"key":"40763_CR65","first-page":"7864","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics","author":"J Yang","year":"2024","unstructured":"Yang J, Hui B, Yang M, Yang J, Lin J, Zhou C. Synthesizing text-to-SQL data from weak and strong LLMs. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. 2024, 7864\u20137875"},{"key":"40763_CR66","first-page":"93","volume-title":"Proceedings of Companion of 2024 International Conference on Management of Data","author":"C Zhang","year":"2024","unstructured":"Zhang C, Mao Y, Fan Y, Mi Y, Gao Y, Chen L, Lou D, Lin J. FinSQL: model-agnostic LLMs-based text-to-SQL framework for financial analysis. 
In: Proceedings of Companion of 2024 International Conference on Management of Data. 2024, 93\u2013105"},{"key":"40763_CR67","doi-asserted-by":"crossref","unstructured":"Zhang X, Zhang J, Ma Z, Li Y, Zhang B, Li G, Yao Z, Xu K, Zhou J, Zhang-Li D, Yu J, Zhao S, Li J, Tang J. TableLLM: enabling tabular data manipulation by LLMs in real office usage scenarios. 2024, arXiv preprint arXiv: 2403.19318","DOI":"10.18653\/v1\/2025.findings-acl.538"},{"key":"40763_CR68","unstructured":"Zhuang A, Zhang G, Zheng T, Du X, Wang J, Ren W, Huang S W, Fu J, Yue X, Chen W. StructLM: towards building generalist models for structured knowledge grounding. 2024, arXiv preprint arXiv: 2402.16671"},{"issue":"11","key":"40763_CR69","doi-asserted-by":"publisher","first-page":"2750","DOI":"10.14778\/3681954.3681960","volume":"17","author":"J Fan","year":"2024","unstructured":"Fan J, Gu Z, Zhang S, Zhang Y, Chen Z, Cao L, Li G, Madden S, Du X, Tang N. Combining small language models and large language models for zero-shot NL2SQL. Proceedings of the VLDB Endowment, 2024, 17(11): 2750\u20132763","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR70","first-page":"6721","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics","author":"I Alonso","year":"2024","unstructured":"Alonso I, Agirre E, Lapata M. PixT3: pixel-based table-to-text generation. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. 2024, 6721\u20136736"},{"key":"40763_CR71","unstructured":"Parmar J, Satheesh S, Patwary M, Shoeybi M, Catanzaro B. Reuse, don\u2019t retrain: a recipe for continued pretraining of language models. 2024, arXiv preprint arXiv: 2407.07263"},{"key":"40763_CR72","first-page":"2735","volume-title":"Proceedings of the 37th International Conference on Neural Information Processing Systems","author":"Z Li","year":"2023","unstructured":"Li Z, Peng B, He P, Galley M, Gao J, Yan X. 
Guiding large language models via directional stimulus prompting. In: Proceedings of the 37th International Conference on Neural Information Processing System. 2023, 2735"},{"key":"40763_CR73","volume-title":"Proceedings of the 12th International Conference on Learning Representations","author":"Z Luo","year":"2024","unstructured":"Luo Z, Xu C, Zhao P, Sun Q, Geng X, Hu W, Tao C, Ma J, Lin Q, Jiang D. WizardCoder: empowering code large language models with Evol-Instruct. In: Proceedings of the 12th International Conference on Learning Representations. 2024"},{"key":"40763_CR74","first-page":"2338","volume-title":"Proceedings of the 37th International Conference on Neural Information Processing System","author":"R Rafailov","year":"2023","unstructured":"Rafailov R, Sharma A, Mitchell E, Ermon S, Manning C D, Finn C. Direct preference optimization: your language model is secretly a reward model. In: Proceedings of the 37th International Conference on Neural Information Processing System. 2023, 2338"},{"key":"40763_CR75","doi-asserted-by":"publisher","first-page":"3045","DOI":"10.18653\/v1\/2021.emnlp-main.243","volume-title":"Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing","author":"B Lester","year":"2021","unstructured":"Lester B, Al-Rfou R, Constant N. The power of scale for parameter-efficient prompt tuning. In: Proceedings of 2021 Conference on Empirical Methods in Natural Language Processing. 2021, 3045\u20133059"},{"key":"40763_CR76","volume-title":"Proceedings of the 10th International Conference on Learning Representations","author":"E Hu","year":"2022","unstructured":"Hu E, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, Wang L, Chen W. LoRA: lowrank adaptation of large language models. In: Proceedings of the 10th International Conference on Learning Representations. 
2022"},{"issue":"11","key":"40763_CR77","doi-asserted-by":"publisher","first-page":"3318","DOI":"10.14778\/3681954.3682003","volume":"17","author":"B Li","year":"2024","unstructured":"Li B, Luo Y, Chai C, Li G, Tang N. The dawn of natural language to SQL: are we fully ready? Proceedings of the VLDB Endowment, 2024, 17(11): 3318\u20133331","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR78","volume-title":"Transactions on Machine Learning Research","author":"R Li","year":"2023","unstructured":"Li R, Allal L B, Zi Y, Muennighoff N, Kocetkov D, Mou C, Marone M, Akiki C, Li J, Chim J, Liu Q, Zheltonozhskii E, Zhuo T Y, Wang T, Dehaene O, Davaadorj M, Lamy-Poirier J, Monteiro J, Shliazhko O, Gontier N, Meade N, Zebaze A, Yee M H, Umapathi L K, Zhu J, Lipkin B, Oblokulov M, Wang Z R, Murthy R, Stillerman J, Patel S S, Abulkhanov D, Zocca M, Dey M, Zhang Z, Fahmy N, Bhattacharyya U, Yu W, Singh S, Luccioni S, Villegas P, Kunakov M, Zhdanov F, Romero M, Lee T, Timor N, Ding J, Schlesinger C, Schoelkopf H, Ebert J, Dao T, Mishra M, Gu A, Robinson J, Anderson C J, Dolan-Gavitt B, Contractor D, Reddy S, Fried D, Bahdanau D, Jernite Y, Ferrandis C M, Hughes S, Wolf T, Guha A, von Werra L, de Vries H. StarCoder: may the source be with you! Transactions on Machine Learning Research, 2023"},{"issue":"12","key":"40763_CR79","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1145\/3571730","volume":"55","author":"Z Ji","year":"2023","unstructured":"Ji Z, Lee N, Frieske R, Yu T, Su D, Xu Y, Ishii E, Bang Y J, Madotto A, Fung P. Survey of hallucination in natural language generation. ACM Computing Surveys, 2023, 55(12): 248","journal-title":"ACM Computing Surveys"},{"key":"40763_CR80","first-page":"4604","volume-title":"Proceedings of 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"A Nassar","year":"2022","unstructured":"Nassar A, Livathinos N, Lysak M, Staar P. TableFormer: table structure understanding with transformers. 
In: Proceedings of 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2022, 4604\u20134613"},{"key":"40763_CR81","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"S Yao","year":"2023","unstructured":"Yao S, Zhao J, Yu D, Du N, Shafran I, Narasimhan K R, Cao Y. ReAct: synergizing reasoning and acting in language models. In: Proceedings of the 11th International Conference on Learning Representations. 2023"},{"key":"40763_CR82","first-page":"12824","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics","author":"Y Zhao","year":"2024","unstructured":"Zhao Y, Chen L, Cohan A, Zhao C. TaPERA: enhancing faithfulness and interpretability in long-form table QA by content planning and execution-based reasoning. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. 2024, 12824\u201312840"},{"key":"40763_CR83","first-page":"1244","volume-title":"Proceedings of 2024 Conference of the North American Chapter of the Association for Computational Linguistics","author":"Z Zhang","year":"2024","unstructured":"Zhang Z, Gao Y, Lou J G. E5: zero-shot hierarchical table analysis using augmented LLMs via explain, extract, execute, exhibit and extrapolate. In: Proceedings of 2024 Conference of the North American Chapter of the Association for Computational Linguistics. 2024, 1244\u20131258"},{"key":"40763_CR84","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"D Zhou","year":"2023","unstructured":"Zhou D, Schaerli N, Hou L, Wei J, Scales N, Wang X, Schuurmans D, Cui C, Bousquet O, Le Q V, Chi E H. Least-to-most prompting enables complex reasoning in large language models. In: Proceedings of the 11th International Conference on Learning Representations. 
2023"},{"key":"40763_CR85","volume-title":"Proceedings of the 37th Conference on Neural Information Processing Systems","author":"M Pourreza","year":"2023","unstructured":"Pourreza M, Rafiei D. DIN-SQL: decomposed in-context learning of text-to-SQL with self-correction. In: Proceedings of the 37th Conference on Neural Information Processing Systems. 2023"},{"key":"40763_CR86","doi-asserted-by":"publisher","first-page":"10796","DOI":"10.18653\/v1\/2024.findings-acl.641","volume-title":"Proceedings of Findings of the Association for Computational Linguistics: ACL 2024","author":"Y Xie","year":"2024","unstructured":"Xie Y, Jin X, Xie T, Matrixmxlin M, Chen L, Yu C, Lei C, Zhuo C, Hu B, Li Z. Decomposition for enhancing attention: improving LLM-based text-to-SQL through workflow paradigm. In: Proceedings of Findings of the Association for Computational Linguistics: ACL 2024. 2024, 10796\u201310816"},{"key":"40763_CR87","first-page":"5725","volume-title":"Proceedings of 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"M Nahid","year":"2024","unstructured":"Nahid M, Rafiei D. TabSQLify: enhancing reasoning capabilities of LLMs through table decomposition. In: Proceedings of 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2024, 5725\u20135737"},{"key":"40763_CR88","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"X Wang","year":"2023","unstructured":"Wang X, Wei J, Schuurmans D, Le Q V, Chi E H, Narang S, Chowdhery A, Zhou D. Self-consistency improves chain of thought reasoning in language models. In: Proceedings of the 11th International Conference on Learning Representations. 2023"},{"key":"40763_CR89","unstructured":"Lee D, Park C, Kim J, Park H. MCS-SQL: leveraging multiple prompts and multiple-choice selection for text-to-SQL generation. 
2024, arXiv preprint arXiv: 2405.07467"},{"key":"40763_CR90","unstructured":"Jiang S, Wang Y, Wang Y. SelfEvolve: a code evolution framework via large language models. 2023, arXiv preprint arXiv: 2306.02907"},{"key":"40763_CR91","first-page":"6769","volume-title":"Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing","author":"V Karpukhin","year":"2020","unstructured":"Karpukhin V, Oguz B, Min S, Lewis P, Wu L, Edunov S, Chen D, Yih W T. Dense passage retrieval for open-domain question answering. In: Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing. 2020, 6769\u20136781"},{"key":"40763_CR92","doi-asserted-by":"publisher","first-page":"9237","DOI":"10.18653\/v1\/2023.emnlp-main.574","volume-title":"Proceedings of 2023 Conference on Empirical Methods in Natural Language Processing","author":"J Jiang","year":"2023","unstructured":"Jiang J, Zhou K, Dong Z, Ye K, Zhao X, Wen J R. StructGPT: a general framework for large language model to reason over structured data. In: Proceedings of 2023 Conference on Empirical Methods in Natural Language Processing. 2023, 9237\u20139251"},{"key":"40763_CR93","doi-asserted-by":"crossref","unstructured":"Sui Y, Zou J, Zhou M, He X, Du L, Han S, Zhang D M. TAP4LLM: table provider on sampling, augmenting, and packing semi-structured data for large language model reasoning. 2023, arXiv preprint arXiv: 2312.09039","DOI":"10.18653\/v1\/2024.findings-emnlp.603"},{"key":"40763_CR94","first-page":"484","volume-title":"Proceedings of Companion of 2024 International Conference on Management of Data","author":"S Chen","year":"2024","unstructured":"Chen S, Liu H, Jin W, Sun X, Feng X, Fan J, Du X, Tang N. ChatPipe: orchestrating data preparation pipelines by optimizing human-ChatGPT interactions. In: Proceedings of Companion of 2024 International Conference on Management of Data. 
2024, 484\u2013487"},{"key":"40763_CR95","first-page":"486","volume-title":"Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control","author":"J Fan","year":"2020","unstructured":"Fan J, Wang Z, Xie Y, Yang Z. A theoretical analysis of deep Q-learning. In: Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control. 2020, 486\u2013489"},{"key":"40763_CR96","doi-asserted-by":"publisher","first-page":"14786","DOI":"10.18653\/v1\/2023.emnlp-main.914","volume-title":"Proceedings of 2023 Conference on Empirical Methods in Natural Language Processing","author":"B Zhao","year":"2023","unstructured":"Zhao B, Ji C, Zhang Y, He W, Wang Y, Wang Q, Feng R, Zhang X. Large language models are complex table parsers. In: Proceedings of 2023 Conference on Empirical Methods in Natural Language Processing. 2023, 14786\u201314802"},{"key":"40763_CR97","unstructured":"Li J, Huo N, Gao Y, Shi J, Zhao Y, Qu G, Wu Y, Ma C, Lou J G, Cheng R. Tapilot-crossing: benchmarking and evolving LLMs towards interactive data analysis agents. 2024, arXiv preprint arXiv: 2403.05307v1"},{"key":"40763_CR98","unstructured":"Zhong V, Xiong C, Socher R. Seq2SQL: generating structured queries from natural language using reinforcement learning. 2017, arXiv preprint arXiv: 1709.00103"},{"key":"40763_CR99","first-page":"6064","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics","author":"Y Zhao","year":"2023","unstructured":"Zhao Y, Zhao C, Nan L, Qi Z, Zhang W, Tang X, Mi B, Radev D. RobuT: a systematic study of table QA robustness against human-annotated adversarial perturbations. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 2023, 6064\u20136081"},{"key":"40763_CR100","first-page":"1821","volume-title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics","author":"M Iyyer","year":"2017","unstructured":"Iyyer M, Yih W T, Chang M W. 
Search-based neural structured learning for sequential question answering. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017, 1821\u20131831"},{"key":"40763_CR101","first-page":"1835","volume-title":"Proceedings of the 37th International Conference on Neural Information Processing Systems","author":"J Li","year":"2023","unstructured":"Li J, Hui B, Qu G, Yang J, Li B, Li B, Wang B, Qin B, Geng R, Huo N, Zhou X, Ma C, Li G, Chang K C C, Huang F, Cheng R, Li Y. Can LLM already serve as a database interface? A big bench for large-scale database grounded text-to-SQLs. In: Proceedings of the 37th International Conference on Neural Information Processing Systems. 2023, 1835"},{"key":"40763_CR102","unstructured":"Motl J, Schulte O. The CTU Prague relational learning repository. 2024, arXiv preprint arXiv: 1511.03086"},{"key":"40763_CR103","volume-title":"Proceedings of the 11th International Conference on Learning Representations","author":"S Chang","year":"2023","unstructured":"Chang S, Wang J, Dong M, Pan L, Zhu H, Li A, Lan W, Zhang S, Jiang J, Lilien J, Ash S, Wang W, Wang Z, Castelli V, Ng P, Xiang B. Dr.Spider: a diagnostic evaluation benchmark towards text-to-SQL robustness. In: Proceedings of the 11th International Conference on Learning Representations. 2023"},{"issue":"4","key":"40763_CR104","doi-asserted-by":"publisher","first-page":"685","DOI":"10.14778\/3636218.3636225","volume":"17","author":"Y Zhang","year":"2024","unstructured":"Zhang Y, Deriu J, Katsogiannis-Meimarakis G, Kosten C, Koutrika G, Stockinger K. ScienceBenchmark: a complex real-world benchmark for evaluating natural language to SQL systems. 
Proceedings of the VLDB Endowment, 2024, 17(4): 685\u2013698","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR105","doi-asserted-by":"publisher","first-page":"9471","DOI":"10.18653\/v1\/2023.findings-acl.604","volume-title":"Proceedings of Findings of the Association for Computational Linguistics: ACL 2023","author":"X He","year":"2023","unstructured":"He X, Zhou M, Zhou M, Xu J, Lv X, Li T, Shao Y, Han S, Yuan Z, Zhang D. AnaMeta: a table understanding dataset of field metadata knowledge shared by multi-dimensional data analysis tasks. In: Proceedings of Findings of the Association for Computational Linguistics: ACL 2023. 2023, 9471\u20139492"},{"key":"40763_CR106","first-page":"514","volume-title":"Proceedings of the 17th International Conference on the Semantic Web","author":"E Jim\u00e9nez-Ruiz","year":"2020","unstructured":"Jim\u00e9nez-Ruiz E, Hassanzadeh O, Efthymiou V, Chen J, Srinivas K. SemTab 2019: resources to benchmark tabular data to knowledge graph matching systems. In: Proceedings of the 17th International Conference on the Semantic Web. 2020, 514\u2013530"},{"issue":"1","key":"40763_CR107","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1145\/3588710","volume":"1","author":"M Hulsebos","year":"2023","unstructured":"Hulsebos M, Demiralp \u00c7, Groth P. GitTables: a large-scale corpus of relational tables. Proceedings of the ACM on Management of Data, 2023, 1(1): 30","journal-title":"Proceedings of the ACM on Management of Data"},{"issue":"3","key":"40763_CR108","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1145\/3654975","volume":"2","author":"T D\u00f6hmen","year":"2024","unstructured":"D\u00f6hmen T, Geacu R, Hulsebos M, Schelter S. SchemaPile: a large collection of relational database schemas. 
Proceedings of the ACM on Management of Data, 2024, 2(3): 172","journal-title":"Proceedings of the ACM on Management of Data"},{"key":"40763_CR109","first-page":"356","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics","author":"N Wretblad","year":"2024","unstructured":"Wretblad N, Riseby F, Biswas R, Ahmadi A, Holmstr\u00f6m O. Understanding the effects of noise in text-to-SQL: an examination of the BIRD-bench benchmark. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. 2024, 356\u2013369"},{"key":"40763_CR110","volume-title":"LlamaIndex","author":"J Liu","year":"2022","unstructured":"Liu J. LlamaIndex. See Docs.llamaindex.ai\/en\/stable\/ website, 2022"},{"issue":"12","key":"40763_CR111","doi-asserted-by":"publisher","first-page":"4365","DOI":"10.14778\/3685800.3685876","volume":"17","author":"S Xue","year":"2024","unstructured":"Xue S, Qi D, Jiang C, Cheng F, Chen K, Zhang Z, Zhang H, Wei G, Zhao W, Zhou F, Yi H, Liu S, Yang H, Chen F. Demonstration of DB-GPT: next generation data interaction system empowered by large language models. Proceedings of the VLDB Endowment, 2024, 17(12): 4365\u20134368","journal-title":"Proceedings of the VLDB Endowment"},{"key":"40763_CR112","volume-title":"AI. Vanna","author":"Vanna","year":"2023","unstructured":"Vanna. AI. Vanna. See Github.com\/vanna-ai\/vanna website, 2023"},{"key":"40763_CR113","volume-title":"Pandas-ai","author":"G Venturi","year":"2023","unstructured":"Venturi G. Pandas-ai. See Github.com\/Sinaptik-AI\/pandas-ai website, 2023"},{"key":"40763_CR114","doi-asserted-by":"publisher","first-page":"1388","DOI":"10.18653\/v1\/2024.findings-acl.82","volume-title":"Proceedings of Findings of the Association for Computational Linguistics: ACL 2024","author":"C Pang","year":"2024","unstructured":"Pang C, Cao Y, Yang C, Luo P. Uncovering limitations of large language models in information seeking from tables. 
In: Proceedings of Findings of the Association for Computational Linguistics: ACL 2024. 2024, 1388\u20131409"},{"key":"40763_CR115","unstructured":"Touvron H, Lavril T, Izacard G, Martinet X, Lachaux M A, Lacroix T, Rozi\u00e8re B, Goyal N, Hambro E, Azhar F, Rodriguez A, Joulin A, Grave E, Lample G. LLaMA: open and efficient foundation language models. 2023, arXiv preprint arXiv: 2302.13971"},{"key":"40763_CR116","doi-asserted-by":"publisher","first-page":"611","DOI":"10.1145\/3600006.3613165","volume-title":"Proceedings of the 29th Symposium on Operating Systems Principles","author":"W Kwon","year":"2023","unstructured":"Kwon W, Li Z, Zhuang S, Sheng Y, Zheng L, Yu C H, Gonzalez J, Zhang H, Stoica I. Efficient memory management for large language model serving with PagedAttention. In: Proceedings of the 29th Symposium on Operating Systems Principles. 2023, 611\u2013626"},{"key":"40763_CR117","volume-title":"Thinking, Fast and Slow","author":"D Kahneman","year":"2011","unstructured":"Kahneman D. Thinking, Fast and Slow. 
London: Farrar, Straus and Giroux, 2011"}],"container-title":["Frontiers of Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11704-024-40763-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11704-024-40763-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11704-024-40763-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,22]],"date-time":"2026-03-22T22:02:30Z","timestamp":1774216950000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11704-024-40763-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,12]]},"references-count":117,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,2]]}},"alternative-id":["40763"],"URL":"https:\/\/doi.org\/10.1007\/s11704-024-40763-6","relation":{},"ISSN":["2095-2228","2095-2236"],"issn-type":[{"value":"2095-2228","type":"print"},{"value":"2095-2236","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,12]]},"assertion":[{"value":"26 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 November 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 January 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Competing interests\nThe authors declare that they have no competing interests or financial conflicts to disclose.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics"}}],"article-number":"192350"}}