{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T20:36:33Z","timestamp":1780346193037,"version":"3.54.1"},"reference-count":13,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:p>The rise of LLM has enabled natural language-based table assistants, but existing systems assume users already have a well-formed table, neglecting the challenge of table discovery in large-scale table pools. To address this, we introduce TableCopilot, an LLM-powered assistant for interactive, precise, and personalized table discovery and analysis. We define a novel scenario, nlcTD, where users provide both a natural language condition and a query table, enabling intuitive and flexible table discovery for users of all expertise levels. To handle this, we propose Crofuma, a cross-fusion-based approach that learns and aggregates single-modal and cross-modal matching scores. Experimental results show Crofuma outperforms SOTA single-input methods by at least 12% on NDCG@5. We also release an instructional video, codebase, datasets, and other resources on GitHub to encourage community contributions. TableCopilot sets a new standard for interactive table assistants, making advanced table discovery accessible and integrated.<\/jats:p>","DOI":"10.14778\/3750601.3750681","type":"journal-article","created":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:38:05Z","timestamp":1758029885000},"page":"5399-5402","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["TableCopilot: A Table Assistant Empowered by Natural Language Conditional Table Discovery"],"prefix":"10.14778","volume":"18","author":[{"given":"Lingxi","family":"Cui","sequence":"first","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Guanyu","family":"Jiang","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Huan","family":"Li","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ke","family":"Chen","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lidan","family":"Shou","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou High-Tech Zone (Binjiang), Institute of Blockchain and Data Security, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Gang","family":"Chen","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,9,16]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2025. nlcTables. https:\/\/github.com\/SuDIS-ZJU\/nlcTables"},{"key":"e_1_2_1_2_1","unstructured":"2025. TableCopilot Project. https:\/\/sudis-zju.github.io\/table-copilot\/"},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"C Chai Y Deng Y Zhan Z Cao Y Zhang L Cao Z Wang Yand Zhang Y Yuan G Wang and N Tang. 2024. LakeCompass: An End-to-End System for Data Maintenance Search and Analysis in Data Lakes. PVLDB 4381\u20134384.","DOI":"10.14778\/3685800.3685880"},{"key":"e_1_2_1_4_1","first-page":"08283","article-title":"TableCopilot: A Table Assistant Empowered by Natural Language Conditional Table Discovery","volume":"2507","author":"Cui L","year":"2025","unstructured":"L Cui, G Jiang, H Li, K Chen, L Shou, and G Chen. 2025. TableCopilot: A Table Assistant Empowered by Natural Language Conditional Table Discovery. ArXiv:2507.08283.","journal-title":"ArXiv"},{"key":"e_1_2_1_5_1","first-page":"21523","article-title":"Tabular Data Augmentation for Machine Learning","volume":"2407","author":"Cui L","year":"2024","unstructured":"L Cui, H Li, K Chen, L Shou, and G Chen. 2024. Tabular Data Augmentation for Machine Learning: Progress and Prospects of Embracing Generative AI. ArXiv:2407.21523.","journal-title":"Progress and Prospects of Embracing Generative AI. ArXiv"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Y Dong C Xiao T Nozawa M Enomoto and M Oyamada. 2023. DeepJoin:Joinable Table Discovery with Pre-Trained Language Models. PVLDB 2458\u20132470.","DOI":"10.14778\/3603581.3603587"},{"key":"e_1_2_1_7_1","doi-asserted-by":"crossref","unstructured":"G Fan J Wang Y Li D Zhang and R J. Miller. 2023. Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning. PVLDB 1726\u20131739.","DOI":"10.14778\/3587136.3587146"},{"key":"e_1_2_1_8_1","volume-title":"Proc. ACM Manag. Data, 1\u201325","author":"Khatiwada A","year":"2023","unstructured":"A Khatiwada, G Fan, R Shraga, Z Chen, W Gatterbauer, R. Miller, and M Riedewald. 2023. SANTOS: Relationship-based Semantic Table Union Search. Proc. ACM Manag. Data, 1\u201325."},{"key":"e_1_2_1_9_1","volume-title":"Proc. ACM Manag. Data, 176","author":"Li P","year":"2024","unstructured":"P Li, Y He, D Yashar, W Cui, S Ge, H Zhang, D R Fainman, D Zhang, and S Chaudhuri. 2024. Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks. Proc. ACM Manag. Data, 176."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","unstructured":"M Trabelsi Z Chen S Zhang B D. Davison and J Heflin. 2022. StruBERT: Structure-aware BERT for Table Search and Matching. In WWW. 442\u2013451.","DOI":"10.1145\/3485447.3511972"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"F Wang K Sun M Chen J Pujara and P Szekely. 2021. Retrieving Complex Tables with Multi-Granular Graph Representation Learning. SIGIR 1472\u20131482.","DOI":"10.1145\/3404835.3462909"},{"key":"e_1_2_1_12_1","first-page":"19318","article-title":"TABLELLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios","volume":"2403","author":"Zhang X","year":"2024","unstructured":"X Zhang, S Luo, B Zhang, Z Ma, J Zhang, G Li Y Li, Z Yao, K Xu, J Zhou, D Zhang-Li, J Yu, S Zhao, J Li, and J Tang. 2024. TABLELLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios. ArXiv:2403.19318.","journal-title":"ArXiv"},{"key":"e_1_2_1_13_1","volume-title":"Proc. ACM SIGMOD. 847\u2013864","author":"Zhu E","unstructured":"E Zhu, D Deng, F Nargesian, and R J. Miller. 2019. JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes. In Proc. ACM SIGMOD. 847\u2013864."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3750601.3750681","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,16]],"date-time":"2025-09-16T13:43:17Z","timestamp":1758030197000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3750601.3750681"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8]]},"references-count":13,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["10.14778\/3750601.3750681"],"URL":"https:\/\/doi.org\/10.14778\/3750601.3750681","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,8]]},"assertion":[{"value":"2025-09-16","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}