{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T14:57:56Z","timestamp":1776092276111,"version":"3.50.1"},"reference-count":39,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:p>Spreadsheets are widely used for table manipulation and presentation. Stylistic formatting of these tables is an important property for presentation and analysis. As a result, popular spreadsheet software, such as Excel, supports automatically formatting tables based on rules. Unfortunately, writing such formatting rules can be challenging for users as it requires knowledge of the underlying rule language and data logic. We present Cornet, a system that tackles the novel problem of automatically learning such formatting rules from user-provided formatted cells. Cornet takes inspiration from advances in inductive programming and combines symbolic rule enumeration with a neural ranker to learn conditional formatting rules. To motivate and evaluate our approach, we extracted tables with over 450K unique formatting rules from a corpus of over 1.8M real worksheets. Since we are the first to introduce the task of automatically learning conditional formatting rules, we compare Cornet to a wide range of symbolic and neural baselines adapted from related domains. Our results show that Cornet accurately learns rules across varying setups. Additionally, we show that in some cases Cornet can find rules that are shorter than those written by users and can also discover rules in spreadsheets that users have manually formatted. 
Furthermore, we present two case studies investigating the generality of our approach by extending Cornet to related data tasks (e.g., filtering) and generalizing to conditional formatting over multiple columns.<\/jats:p>","DOI":"10.14778\/3603581.3603600","type":"journal-article","created":{"date-parts":[[2023,8,8]],"date-time":"2023-08-08T19:06:48Z","timestamp":1691521608000},"page":"2632-2644","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Cornet: Learning Table Formatting Rules By Example"],"prefix":"10.14778","volume":"16","author":[{"given":"Mukul","family":"Singh","sequence":"first","affiliation":[{"name":"Microsoft, Delhi, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jos\u00e9 Cambronero","family":"S\u00e1nchez","sequence":"additional","affiliation":[{"name":"Microsoft, New Haven, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sumit","family":"Gulwani","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vu","family":"Le","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Carina","family":"Negreanu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Cambridge, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohammad","family":"Raza","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gust","family":"Verbruggen","sequence":"additional","affiliation":[{"name":"Microsoft, Keerbergen, Belgium"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,8,8]]},"reference":[{"key":"e_1_2_1_1_1","first-page":"85105","article-title":"Spreadsheet Conditional Formatting: An Untapped Resource for Mathematics 
Education","volume":"1","author":"Abramovich Sergei","year":"2004","unstructured":"Sergei Abramovich, Stephen Sugden, Sergei Abramovich, and Stephen J Sugden. 2004. Spreadsheet Conditional Formatting: An Untapped Resource for Mathematics Education. Spreadsheets in Education 1 (2004), 85105.","journal-title":"Spreadsheets in Education"},{"key":"e_1_2_1_2_1","volume-title":"2015 IEEE\/ACM 12th Working Conference on Mining Software Repositories. IEEE, IEEE\/ACM","author":"Barik Titus","year":"2015","unstructured":"Titus Barik, Kevin Lubick, Justin Smith, John Slankas, and Emerson Murphy-Hill. 2015. Fuse: a reproducible, extendable, internet-scale corpus of spreadsheets. In 2015 IEEE\/ACM 12th Working Conference on Mining Software Repositories. IEEE, IEEE\/ACM, Florence, Italy, 486--489."},{"key":"e_1_2_1_3_1","volume-title":"Top-down induction of first-order logical decision trees. Artificial intelligence 101, 1--2","author":"Blockeel Hendrik","year":"1998","unstructured":"Hendrik Blockeel and Luc De Raedt. 1998. Top-down induction of first-order logical decision trees. 
Artificial intelligence 101, 1--2 (1998), 285--297."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research), Marina Meila and Tong Zhang (Eds.)","volume":"139","author":"Chen Xinyun","year":"2021","unstructured":"Xinyun Chen, Petros Maniatis, Rishabh Singh, Charles Sutton, Hanjun Dai, Max Lin, and Denny Zhou. 2021. SpreadsheetCoder: Formula Prediction from Semi-structured Context. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research), Marina Meila and Tong Zhang (Eds.), Vol. 139. PMLR, virtual, 1661--1672. https:\/\/proceedings.mlr.press\/v139\/chen21m.html"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-020-05934-z"},{"key":"e_1_2_1_6_1","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019","volume":"1","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2--7, 2019, Volume 1 (Long and Short Papers), Jill Burstein, Christy Doran, and Thamar Solorio (Eds.). 
Association for Computational Linguistics, Minneapolis, USA, 4171--4186. 10.18653\/v1\/n19-1423"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 29th ACM International Conference on Information & Knowledge Management","author":"Dong Haoyu","year":"2020","unstructured":"Haoyu Dong, Jinyu Wang, Zhouyu Fu, Shi Han, and Dongmei Zhang. 2020. Neural Formatting for Spreadsheet Tables. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (Virtual Event, Ireland) (CIKM '20). Association for Computing Machinery, New York, NY, USA, 305--314. 10.1145\/3340531.3411943"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery","author":"Drosos Ian","year":"2020","unstructured":"Ian Drosos, Titus Barik, Philip J. Guo, Robert DeLine, and Sumit Gulwani. 2020. Wrex: A Unified Programming-by-Example Interaction for Synthesizing Readable Code for Data Scientists. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 
Association for Computing Machinery, New York, NY, USA, 1--12. 10.1145\/3313831.3376442"},{"key":"e_1_2_1_9_1","volume-title":"IJCAI 2017 (ijcai 2017 ed.). IJCAI 2017","author":"Ellis Kevin","year":"2017","unstructured":"Kevin Ellis and Sumit Gulwani. 2017. Learning to Learn Programs from Examples: Going Beyond Program Structure. In IJCAI 2017 (ijcai 2017 ed.). IJCAI 2017, Melbourne, Australia, 1638--1645. www.microsoft.com\/research\/publication\/learning-learn-programs-examples-going-beyond-program-structure\/"},{"key":"e_1_2_1_10_1","unstructured":"Microsoft Excel. 2022. Excel Tech Help Forum. https:\/\/techcommunity.microsoft.com\/t5\/forums\/searchpage\/tab\/message?q=conditional%20formatting. Last Accessed: 2022-06-30."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342266"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 2021 International Conference on Management of Data","author":"Fariha Anna","year":"2021","unstructured":"Anna Fariha, Ashish Tiwari, Alexandra Meliou, Arjun Radhakrishna, and Sumit Gulwani. 2021. 
CoCo: Interactive Exploration of Conformance Constraints for Data Understanding and Data Cleaning. In Proceedings of the 2021 International Conference on Management of Data (Virtual Event, China) (SIGMOD '21). Association for Computing Machinery, New York, NY, USA, 2706--2710. 10.1145\/3448016.3452750"},{"key":"e_1_2_1_13_1","first-page":"139","volume-title":"CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In EMNLP 2020","author":"Feng Zhangyin","year":"2020","unstructured":"Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In EMNLP 2020. Association for Computational Linguistics, Online, 1536--1547. 10.18653\/v1\/2020.findings-emnlp.139"},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the first workshop on End-user software engineering. Association for Computing Machinery","author":"Fisher Marc","year":"2005","unstructured":"Marc Fisher and Gregg Rothermel. 2005. The EUSES spreadsheet corpus: a shared resource for supporting experimentation with spreadsheet dependability mechanisms. In Proceedings of the first workshop on End-user software engineering. 
Association for Computing Machinery, New York, NY, USA, 1--5."},{"key":"e_1_2_1_15_1","volume-title":"PoPL'11, January 26--28","author":"Gulwani Sumit","year":"2011","unstructured":"Sumit Gulwani. 2011. Automating String Processing in Spreadsheets using Input-Output Examples. In PoPL'11, January 26--28, 2011, Austin, Texas, USA. Association for Computing Machinery, New York, NY, USA, 317--330. https:\/\/www.microsoft.com\/en-us\/research\/publication\/automating-string-processing-spreadsheets-using-input-output-examples\/"},{"key":"e_1_2_1_16_1","volume-title":"Object-Oriented Programming, Systems, Languages & Applications (OOPSLA). ACM","author":"Gulwani Sumit","unstructured":"Sumit Gulwani, Vu Le, Arjun Radhakrishna, Ivan Radicek, and Mohammad Raza. 2020. Structure interpretation of text formats. In Object-Oriented Programming, Systems, Languages & Applications (OOPSLA). ACM, Association for Computing Machinery, New York, NY, USA, 29. 
https:\/\/www.microsoft.com\/en-us\/research\/publication\/structure-interpretation-of-text-formats\/"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics","author":"Herzig Jonathan","year":"2020","unstructured":"Jonathan Herzig, Pawe\u0142 Krzysztof Nowak, Thomas M\u00fcller, Francesco Piccinno, and Julian Martin Eisenschlos. 2020. Tapas: Weakly Supervised Table Parsing via Pre-training. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Seattle, Washington, United States, 4320--4333. https:\/\/www.aclweb.org\/anthology\/2020.acl-main.398\/"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 2005 ACM symposium on Document engineering. Association for Computing Machinery","author":"Hurst Nathan","year":"2005","unstructured":"Nathan Hurst, Kim Marriott, and Peter Moulder. 2005. Toward tighter tables. In Proceedings of the 2005 ACM symposium on Document engineering. 
Association for Computing Machinery, New York, NY, USA, 74--83."},{"key":"e_1_2_1_19_1","volume-title":"Finding groups in data: an introduction to cluster analysis","author":"Kaufman Leonard","unstructured":"Leonard Kaufman and Peter J Rousseeuw. 2009. Finding groups in data: an introduction to cluster analysis. John Wiley & Sons, online."},{"key":"e_1_2_1_20_1","volume-title":"FlashExtract: a framework for data extraction by examples. In 2014 Programming Language Design and Implementation","author":"Le Vu","unstructured":"Vu Le and Sumit Gulwani. 2014. FlashExtract: a framework for data extraction by examples. In 2014 Programming Language Design and Implementation. ACM, New York, NY, USA, 542--553. https:\/\/www.microsoft.com\/en-us\/research\/publication\/flashextract-framework-data-extraction-examples\/"},{"key":"e_1_2_1_21_1","volume-title":"Computer Vision - ECCV","author":"Lee Kuang-Huei","year":"2018","unstructured":"Kuang-Huei Lee, Xi Chen, Gang Hua, Houdong Hu, and Xiaodong He. 2018. Stacked Cross Attention for Image-Text Matching. In Computer Vision - ECCV 2018, Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, and Yair Weiss (Eds.). 
Springer International Publishing, Cham, 212--228."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/2831360.2831369"},{"key":"e_1_2_1_23_1","first-page":"1","article-title":"Can we generate shellcodes via natural language? An empirical study","volume":"29","author":"Liguori Pietro","year":"2022","unstructured":"Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic, and Samira Shaikh. 2022. Can we generate shellcodes via natural language? An empirical study. Automated Software Engineering 29 (2022), 1--34.","journal-title":"Automated Software Engineering"},{"key":"e_1_2_1_24_1","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/j.cad.2005.11.006","article-title":"Active layout engine: Algorithms and applications in variable data printing","volume":"38","author":"Lin Xiaofan","year":"2006","unstructured":"Xiaofan Lin. 2006. Active layout engine: Algorithms and applications in variable data printing. Computer-Aided Design 38, 5 (2006), 444--456.","journal-title":"Computer-Aided Design"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-016-0429-2"},{"key":"e_1_2_1_26_1","unstructured":"Joseph N. 2022. Number of Google Sheets and Excel Users Worldwide. https:\/\/askwonder.com\/research\/number-google-sheets-users-worldwide-eoskdoxav. 
Last Accessed: 2022-07-30."},{"key":"e_1_2_1_27_1","unstructured":"Nagarajan Natarajan, Danny Simmons, Naren Datha, Prateek Jain, and Sumit Gulwani. 2019. Learning Natural Programs from a Few Examples in Real-Time. In AIStats. PMLR, online, 1714--1722. https:\/\/www.microsoft.com\/en-us\/research\/publication\/learning-natural-programs-from-a-few-examples-in-real-time\/"},{"key":"e_1_2_1_28_1","volume-title":"The Active Modeler: Mathematical Modeling With Microsoft Excel","author":"Neuwirth Erich","unstructured":"Erich Neuwirth and Deane Arganbright. 2003. The Active Modeler: Mathematical Modeling With Microsoft Excel. Duxbury Press, online."},{"key":"e_1_2_1_29_1","volume-title":"BUSTLE: Bottom-up program-Synthesis Through Learning-guided Exploration. ArXiv abs\/2007.14381","author":"Odena Augustus","year":"2021","unstructured":"Augustus Odena, Kensen Shi, David Bieber, Rishabh Singh, and Charles Sutton. 2021. BUSTLE: Bottom-up program-Synthesis Through Learning-guided Exploration. ArXiv abs\/2007.14381 (2021)."},{"key":"e_1_2_1_30_1","volume-title":"Millstein","author":"Padhi Saswat","year":"2017","unstructured":"Saswat Padhi, Prateek Jain, Daniel Perelman, Oleksandr Polozov, Sumit Gulwani, and Todd D. Millstein. 2017. FlashProfile: Interactive Synthesis of Syntactic Profiles. CoRR abs\/1709.05725, Article 150 (2017), 28 pages. 
arXiv:1709.05725 http:\/\/arxiv.org\/abs\/1709.05725"},{"key":"e_1_2_1_31_1","volume-title":"Synchromesh: Reliable code generation from pre-trained language models. CoRR abs\/2201.11227","author":"Poesia Gabriel","year":"2022","unstructured":"Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, and Sumit Gulwani. 2022. Synchromesh: Reliable code generation from pre-trained language models. CoRR abs\/2201.11227 (2022). arXiv:2201.11227 https:\/\/arxiv.org\/abs\/2201.11227"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2858965.2814310"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery","author":"Raza Mohammad","year":"2020","unstructured":"Mohammad Raza and Sumit Gulwani. 2020. Web data extraction using hybrid program synthesis: A combination of top-down and bottom-up inference. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 
Association for Computing Machinery, New York, NY, USA, 1967--1978."},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data","author":"Shen Yanyan","year":"2014","unstructured":"Yanyan Shen, Kaushik Chakrabarti, Surajit Chaudhuri, Bolin Ding, and Lev Novik. 2014. Discovering Queries Based on Example Tuples. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (Snowbird, Utah, USA) (SIGMOD '14). Association for Computing Machinery, New York, NY, USA, 493--504. 10.1145\/2588555.2593664"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence 35","author":"Sun Kexuan","year":"2021","unstructured":"Kexuan Sun, Harsha Rayudu, and Jay Pujara. 2021. A Hybrid Probabilistic Approach for Table Understanding. Proceedings of the AAAI Conference on Artificial Intelligence 35, 5 (May 2021), 4366--4374. https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/16562"},{"key":"e_1_2_1_36_1","volume-title":"Attention is all you need. Advances in neural information processing systems 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. 
Advances in neural information processing systems 30 (2017)."},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01)","author":"Wagstaff Kiri","year":"2001","unstructured":"Kiri Wagstaff, Claire Cardie, Seth Rogers, and Stefan Schr\u00f6dl. 2001. Constrained K-Means Clustering with Background Knowledge. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 577--584."},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21)","author":"Wang Zhiruo","year":"2021","unstructured":"Zhiruo Wang, Haoyu Dong, Ran Jia, Jia Li, Zhiyi Fu, Shi Han, and Dongmei Zhang. 2021. TUTA: Tree-Based Transformers for Generally Structured Table Pre-Training. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21). Association for Computing Machinery, New York, USA, 1780--1790. 
10.1145\/3447548.3467434"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 8413--8426","author":"Yin Pengcheng","year":"2020","unstructured":"Pengcheng Yin, Graham Neubig, Wen-tau Yih, and Sebastian Riedel. 2020. TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 8413--8426. 10.18653\/v1\/2020.acl-main.745"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3603581.3603600","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,8]],"date-time":"2023-08-08T19:16:01Z","timestamp":1691522161000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3603581.3603600"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6]]},"references-count":39,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["10.14778\/3603581.3603600"],"URL":"https:\/\/doi.org\/10.14778\/3603581.3603600","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2023,6]]},"assertion":[{"value":"2023-08-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}