{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,26]],"date-time":"2025-12-26T08:49:20Z","timestamp":1766738960044},"reference-count":38,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2022,8]]},"abstract":"<jats:p>In the last few years, the natural language processing community witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in relational tables, recent research efforts extend LMs by developing neural representations for tabular data. In this tutorial, we present these proposals with two main goals. First, we introduce to a database audience the potentials and the limitations of current models. Second, we demonstrate the large variety of data applications that benefit from the transformer architecture. 
The tutorial aims at encouraging database researchers to engage and contribute to this new direction, and at empowering practitioners with a new set of tools for applications involving text and tabular data.<\/jats:p>","DOI":"10.14778\/3554821.3554890","type":"journal-article","created":{"date-parts":[[2022,9,29]],"date-time":"2022-09-29T22:28:39Z","timestamp":1664490519000},"page":"3746-3749","source":"Crossref","is-referenced-by-count":10,"title":["Transformers for tabular data representation"],"prefix":"10.14778","volume":"15","author":[{"given":"Gilbert","family":"Badaro","sequence":"first","affiliation":[{"name":"EURECOM, Biot, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paolo","family":"Papotti","sequence":"additional","affiliation":[{"name":"EURECOM, Biot, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,9,29]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Oana Cocarascu, and Arpit Mittal.","author":"Aly Rami","year":"2021","unstructured":"Rami Aly, Zhijiang Guo, Michael Sejr Schlichtkrull, James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Oana Cocarascu, and Arpit Mittal. 2021. FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information. 
In NeurIPS Datasets and Benchmarks Track (Round 1)."},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3295662"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404854"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-25007-6_25"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3389742"},{"key":"e_1_2_1_7_1","volume-title":"TabFact: A Large-scale Dataset for Table-based Fact Verification. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkeJRhNYDH","author":"Chen Wenhu","year":"2020","unstructured":"Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, and William Yang Wang. 2020. TabFact: A Large-scale Dataset for Table-based Fact Verification. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=rkeJRhNYDH"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.14778\/3430915.3430921"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_2_1_10_1","volume-title":"International Conference on Learning Representations.","author":"Dosovitskiy Alexey","year":"2020","unstructured":"
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. 2020. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations."},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Lun Du Fei Gao Xu Chen Ran Jia Junshan Wang Jiang Zhang Shi Han and Dongmei Zhang. 2021. TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data. In ACM SIGKDD. 322--331.","DOI":"10.1145\/3447548.3467228"},{"key":"e_1_2_1_12_1","volume-title":"MATE: Multi-view Attention for Table Transformer Efficiency. arXiv preprint arXiv:2109.04312","author":"Eisenschlos Julian Martin","year":"2021","unstructured":"Julian Martin Eisenschlos, Maharshi Gor, Thomas M\u00fcller, and William W Cohen. 2021. MATE: Multi-view Attention for Table Transformer Efficiency. arXiv preprint arXiv:2109.04312 (2021)."},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Michael Glass Mustafa Canim Alfio Gliozzo Saneem Chemmengath Vishwajeet Kumar Rishav Chakravarti Avirup Sil Feifei Pan Samarth Bharadwaj and Nicolas Rodolfo Fauceglia. 2021. 
Capturing Row and Column Semantics in Transformer Based Question Answering over Tables. In NACL: HLT. 1212--1224.","DOI":"10.18653\/v1\/2021.naacl-main.96"},{"key":"e_1_2_1_14_1","volume-title":"AST: Audio Spectrogram Transformer. arXiv preprint arXiv:2104.01778","author":"Gong Yuan","year":"2021","unstructured":"Yuan Gong, Yu-An Chung, and James Glass. 2021. AST: Audio Spectrogram Transformer. arXiv preprint arXiv:2104.01778 (2021)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Jonathan Herzig Thomas Mueller Syrine Krichene and Julian Eisenschlos. 2021. Open Domain Question Answering over Tables via Dense Retrieval. In NACL: HLT. 512--519.","DOI":"10.18653\/v1\/2021.naacl-main.43"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.398"},{"key":"e_1_2_1_17_1","volume-title":"TABBIE: Pretrained Representations of Tabular Data. In NACL: HLT. 3446--3456.","author":"Iida Hiroshi","year":"2021","unstructured":"Hiroshi Iida, Dung Thai, Varun Manjunatha, and Mohit Iyyer. 2021. TABBIE: Pretrained Representations of Tabular Data. In NACL: HLT. 3446--3456."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407841"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3448016.3457543"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.mrqa-1.8"},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Oliver Lehmberg Dominique Ritze Robert Meusel and Christian Bizer. 2016. 
A large public corpus of web tables containing time and context metadata. In WWW Companion. 75--76.","DOI":"10.1145\/2872518.2889386"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/3421424.3421431"},{"key":"e_1_2_1_23_1","volume-title":"TAPEX: Table pre-training via learning a neural SQL executor. arXiv preprint arXiv:2107.07653","author":"Liu Qian","year":"2021","unstructured":"Qian Liu, Bei Chen, Jiaqi Guo, Zeqi Lin, and Jian-guang Lou. 2021. TAPEX: Table pre-training via learning a neural SQL executor. arXiv preprint arXiv:2107.07653 (2021)."},{"key":"e_1_2_1_24_1","volume-title":"RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR abs\/1907.11692","author":"Liu Yinhan","year":"2019","unstructured":"Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR abs\/1907.11692 (2019). arXiv:1907.11692 http:\/\/arxiv.org\/abs\/1907.11692"},{"key":"e_1_2_1_25_1","volume-title":"CLTR: An End-to-End, Transformer-Based System for Cell-Level Table Retrieval and Table Question Answering. In ACL System Demonstrations. 202--209.","author":"Pan Feifei","year":"2021","unstructured":"Feifei Pan, Mustafa Canim, Michael Glass, Alfio Gliozzo, and Peter Fox. 2021. 
CLTR: An End-to-End, Transformer-Based System for Cell-Level Table Retrieval and Table Question Answering. In ACL System Demonstrations. 202--209."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.442"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3381831"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i09.7123"},{"key":"e_1_2_1_29_1","volume-title":"Annotating Columns with Pre-trained Language Models. arXiv preprint arXiv:2104.01785","author":"Suhara Yoshihiko","year":"2021","unstructured":"Yoshihiko Suhara, Jinfeng Li, Yuliang Li, Dan Zhang, \u00c7a\u011fatay Demiralp, Chen Chen, and Wang-Chiew Tan. 2021. Annotating Columns with Pre-trained Language Models. arXiv preprint arXiv:2104.01785 (2021)."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.14778\/3457390.3457391"},{"key":"e_1_2_1_31_1","doi-asserted-by":"crossref","unstructured":"James Thorne Majid Yazdani Marzieh Saeidi Fabrizio Silvestri Sebastian Riedel and Alon Halevy. 2021. Database reasoning over text. In ACL. 3091--3104.","DOI":"10.18653\/v1\/2021.acl-long.241"},{"key":"e_1_2_1_32_1","volume-title":"Pythia: Unsupervised Generation of Ambiguous Textual Claims from Relational Data. In SIGMOD - Demo track. 
ACM.","author":"Veltri Enzo","year":"2022","unstructured":"Enzo Veltri, Donatello Santoro, Gilbert Badaro, Mohammed Saeed, and Paolo Papotti. 2022. Pythia: Unsupervised Generation of Ambiguous Textual Claims from Relational Data. In SIGMOD - Demo track. ACM."},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Fei Wang Kexuan Sun Muhao Chen Jay Pujara and Pedro Szekely. 2021. Retrieving Complex Tables with Multi-Granular Graph Representation Learning. In SIGIR. ACM 1472--1482.","DOI":"10.1145\/3404835.3462909"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467434"},{"key":"e_1_2_1_35_1","first-page":"90","volume-title":"Exploring Decomposition for Table-based Fact Verification. In EMNLP 2021","author":"Yang Xiaoyu","year":"2021","unstructured":"Xiaoyu Yang and Xiaodan Zhu. 2021. Exploring Decomposition for Table-based Fact Verification. In EMNLP 2021. ACL, Punta Cana, Dominican Republic, 1045--1052. https:\/\/aclanthology.org\/2021.findings-emnlp.90"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.745"},{"key":"e_1_2_1_37_1","volume-title":"GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing. In International Conference on Learning Representations. 
https:\/\/openreview.net\/forum?id=kyaIeYj4zZ","author":"Yu Tao","year":"2021","unstructured":"Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, bailin wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, richard socher, and Caiming Xiong. 2021. GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=kyaIeYj4zZ"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1425"},{"key":"e_1_2_1_39_1","volume-title":"Seq2SQL: Generating structured queries from natural language using reinforcement learning. arXiv preprint arXiv:1709.00103","author":"Zhong Victor","year":"2017","unstructured":"Victor Zhong, Caiming Xiong, and Richard Socher. 2017. Seq2SQL: Generating structured queries from natural language using reinforcement learning. 
arXiv preprint arXiv:1709.00103 (2017)."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3554821.3554890","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,4]],"date-time":"2024-10-04T23:39:32Z","timestamp":1728085172000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3554821.3554890"}},"subtitle":["a tutorial on models and applications"],"short-title":[],"issued":{"date-parts":[[2022,8]]},"references-count":38,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2022,8]]}},"alternative-id":["10.14778\/3554821.3554890"],"URL":"https:\/\/doi.org\/10.14778\/3554821.3554890","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2022,8]]}}}