{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T12:42:49Z","timestamp":1756384969418,"version":"3.44.0"},"reference-count":34,"publisher":"Association for Computing Machinery (ACM)","issue":"3","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Web"],"published-print":{"date-parts":[[2025,8,31]]},"abstract":"<jats:p>On the web, huge corpora of tables exist, which can include millions of tables, as in the case of Wikipedia. Maintaining them can be a time-consuming task and, in the case of many authors and editors, also requires a great deal of coordination to ensure high quality, complete, consistent, and readable schemata. In this work, we investigate how to provide automatic suggestions to improve the schema of webtables, namely, how to recommend schema changes. For this purpose, we derive rules from past schema changes via a lattice-based approach and then rank these rules to provide the best-fitting suggestions for each webtable.<\/jats:p>\n          <jats:p>Making use of the entire edit history of Wikipedia tables, we can compare our suggestions with the changes that were actually performed by editors. We show that for 75.13% of the changes in the test set, we make a correct recommendation, namely a change that was also observed subsequently on Wikipedia. In 58.66% of the cases, our recommendation even covers the entire observed change. Finally, we rank the recommendations with a mean reciprocal rank (MRR) of 0.73 and 0.69 for matches and full matches, respectively. 
A validation of our approach on three Fandom wikis confirms its effectiveness and generality.<\/jats:p>\n          <jats:p\/>","DOI":"10.1145\/3742920","type":"journal-article","created":{"date-parts":[[2025,6,30]],"date-time":"2025-06-30T08:20:52Z","timestamp":1751271652000},"page":"1-29","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Schema Change Recommendation for User-Curated Webtables Using Temporal Data"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-9517-7707","authenticated-orcid":false,"given":"Tobias","family":"Bleifu\u00df","sequence":"first","affiliation":[{"name":"Hasso-Plattner Institute for Digital Engineering","place":["Potsdam, Germany"]},{"name":"University of Potsdam","place":["Potsdam, Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9939-4932","authenticated-orcid":false,"given":"Leon","family":"Bornemann","sequence":"additional","affiliation":[{"name":"Hasso-Plattner Institute for Digital Engineering","place":["Potsdam, Germany"]},{"name":"University of Potsdam","place":["Potsdam, Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4483-1389","authenticated-orcid":false,"given":"Felix","family":"Naumann","sequence":"additional","affiliation":[{"name":"Hasso-Plattner Institute for Digital Engineering","place":["Potsdam, Germany"]},{"name":"University of Potsdam","place":["Potsdam, Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7609-9217","authenticated-orcid":false,"given":"Divesh","family":"Srivastava","sequence":"additional","affiliation":[{"name":"AT&T Chief Data Office","place":["Bedminster, United States"]}]}],"member":"320","published-online":{"date-parts":[[2025,8,28]]},"reference":[{"issue":"2","key":"e_1_3_2_2_2","first-page":"85","article-title":"Exploring change: A new dimension of data analytics","volume":"12","author":"Bleifu\u00df Tobias","year":"2018","unstructured":"Tobias Bleifu\u00df, Leon Bornemann, 
Theodore Johnson, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. 2018. Exploring change: A new dimension of data analytics. PVLDB 12, 2 (2018), 85\u201398.","journal-title":"PVLDB"},{"key":"e_1_3_2_3_2","first-page":"20","volume-title":"Proceedings of the SEA-Data@VLDB","author":"Bleifu\u00df Tobias","year":"2021","unstructured":"Tobias Bleifu\u00df, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. 2021. The secret life of Wikipedia tables. In Proceedings of the SEA-Data@VLDB. 20\u201326."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00115"},{"issue":"23","key":"e_1_3_2_5_2","first-page":"81","article-title":"From RankNet to LambdaRank to LambdaMART: An overview","volume":"11","author":"Burges Christopher J. C.","year":"2010","unstructured":"Christopher J. C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An overview. Learning 11, 23-581 (2010), 81.","journal-title":"Learning"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453916"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3184558.3191601"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.5555\/2560129"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00045"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/2872427.2874816"},{"issue":"10","key":"e_1_3_2_11_2","first-page":"1165","article-title":"Transform-data-by-example (TDE) an extensible search engine for data transformations","volume":"11","author":"He Yeye","year":"2018","unstructured":"Yeye He, Xu Chu, Kris Ganjam, Yudian Zheng, Vivek Narasayya, and Surajit Chaudhuri. 2018. Transform-data-by-example (TDE) an extensible search engine for data transformations. 
PVLDB 11, 10 (2018), 1165\u20131177.","journal-title":"PVLDB"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.14778\/3231751.3231766"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330993"},{"issue":"12","key":"e_1_3_2_14_2","first-page":"2368","article-title":"Auto-transform: learning-to-transform by patterns","volume":"13","author":"Jin Zhongjun","year":"2020","unstructured":"Zhongjun Jin, Yeye He, and Surajit Chaudhuri. 2020. Auto-transform: learning-to-transform by patterns. PVLDB 13, 12 (2020), 2368\u20132381.","journal-title":"PVLDB"},{"key":"e_1_3_2_15_2","article-title":"LightGBM: A highly efficient gradient boosting decision tree","volume":"30","author":"Ke Guolin","year":"2017","unstructured":"Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems 30 (2017), 3146\u20133154.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.14778\/3137628.3137657"},{"key":"e_1_3_2_17_2","first-page":"707","volume-title":"Proceedings of the Soviet Physics Doklady","author":"Levenshtein Vladimir I.","year":"1966","unstructured":"Vladimir I. Levenshtein. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Proceedings of the Soviet Physics Doklady. 
Soviet Union, 707\u2013710."},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.14778\/3192965.3192973"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.14778\/3384345.3384346"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313584"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/2491411.2491431"},{"key":"e_1_3_2_22_2","volume-title":"Proceedings of the International Conference on Language Resources and Evaluation (LREC)","author":"Radev Dragomir R.","year":"2002","unstructured":"Dragomir R. Radev, Hong Qi, Harris Wu, and Weiguo Fan. 2002. Evaluating Web-based Question Answering Systems. In Proceedings of the International Conference on Language Resources and Evaluation (LREC)."},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1007\/s007780100057"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/0950-5849(95)91494-K"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.14778\/3583140.3583169"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.datak.2007.09.003"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00008"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.48786\/EDBT.2025.25"},{"key":"e_1_3_2_29_2","first-page":"28","volume-title":"Proceedings of the International Conference on Extending Database Technology (EDBT)","author":"Vassiliadis Panos","year":"2023","unstructured":"Panos Vassiliadis, Fation Shehaj, George Kalampokis, and A. Zarras. 2023. Joint source and schema evolution: Insights from a study of 195 FOSS projects. In Proceedings of the International Conference on Extending Database Technology (EDBT). 
28\u201331."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3442442.3452342"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-demos.4"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13740-016-0066-3"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.14778\/3407790.3407793"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3603109"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.emnlp-main.820"}],"container-title":["ACM Transactions on the Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3742920","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,28]],"date-time":"2025-08-28T12:25:12Z","timestamp":1756383912000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3742920"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,28]]},"references-count":34,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,8,31]]}},"alternative-id":["10.1145\/3742920"],"URL":"https:\/\/doi.org\/10.1145\/3742920","relation":{},"ISSN":["1559-1131","1559-114X"],"issn-type":[{"type":"print","value":"1559-1131"},{"type":"electronic","value":"1559-114X"}],"subject":[],"published":{"date-parts":[[2025,8,28]]},"assertion":[{"value":"2024-01-29","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-06","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-28","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}