{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T18:13:21Z","timestamp":1771697601772,"version":"3.50.1"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2017,6]]},"abstract":"<jats:p>Traditional equi-join relies solely on string equality comparisons to perform joins. However, in scenarios such as ad-hoc data analysis in spreadsheets, users increasingly need to join tables whose join-columns are from the same semantic domain but use different textual representations, for which transformations are needed before equi-join can be performed. We developed Auto-Join, a system that can automatically search over a rich space of operators to compose a transformation program, whose execution makes input tables equi-join-able. We developed an optimal sampling strategy that allows Auto-Join to scale to large datasets efficiently, while ensuring joins succeed with high probability. 
Our evaluation using real test cases collected from both public web tables and proprietary enterprise tables shows that the proposed system performs the desired transformation joins efficiently and with high quality.<\/jats:p>","DOI":"10.14778\/3115404.3115409","type":"journal-article","created":{"date-parts":[[2017,9,7]],"date-time":"2017-09-07T13:35:53Z","timestamp":1504791353000},"page":"1034-1045","source":"Crossref","is-referenced-by-count":55,"title":["Auto-join"],"prefix":"10.14778","volume":"10","author":[{"given":"Erkang","family":"Zhu","sequence":"first","affiliation":[{"name":"University of Toronto"}]},{"given":"Yeye","family":"He","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]},{"given":"Surajit","family":"Chaudhuri","sequence":"additional","affiliation":[{"name":"Microsoft Research"}]}],"member":"320","published-online":{"date-parts":[[2017,6]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"DBLP. http:\/\/dblp.uni-trier.de\/."},{"key":"e_1_2_1_2_1","unstructured":"Google Web Tables. http:\/\/research.google.com\/tables."},{"key":"e_1_2_1_3_1","unstructured":"Informatica Rev. https:\/\/www.informatica.com\/products\/data-quality\/rev.html."},{"key":"e_1_2_1_4_1","unstructured":"Microsoft Excel Power Query. http:\/\/office.microsoft.com\/powerbi."},{"key":"e_1_2_1_5_1","unstructured":"Power query: Merge queries. https:\/\/support.office.com\/en-us\/article\/Merge-queries-Power-Query-fd157620-5470-4c0f-b132-7ca2616d17f9."},
{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.5555\/2655418"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-7552(97)00031-7"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.9"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1214\/aop\/1176994428"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/564691.564719"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007612"},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","unstructured":"N. Ganguly A. Deutsch and A. Mukherjee. Dynamics on and of complex networks: Applications to biology computer science and the social sciences. 2009.","DOI":"10.1007\/978-0-8176-4751-3"},{"key":"e_1_2_1_13_1","volume-title":"Cambridge University Press","author":"Gauch H. G.","year":"2003"},{"key":"e_1_2_1_14_1","volume-title":"Inc.","author":"Hare J.","year":"2016"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1993498.1993536"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536336.2536345"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824036"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064034"},{"key":"e_1_2_1_19_1","volume-title":"Morgan Kaufmann","author":"Lieberman H.","year":"2001"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1137\/0222058"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/0005-1098(78)90005-5"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.14778\/2977797.2977807"},{"key":"e_1_2_1_23_1","volume-title":"PVLDB","author":"Warren R. 
H.","year":"2006"},{"key":"e_1_2_1_24_1","unstructured":"E. Zhu Y. He and S. Chaudhuri. AutoJoin: Joining Tables by Leveraging Transformations (Full Version). https:\/\/www.microsoft.com\/en-us\/research\/publication\/auto-join-joining-tables-leveraging-transformations\/."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3115404.3115409","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:48:37Z","timestamp":1672220917000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3115404.3115409"}},"subtitle":["joining tables by leveraging transformations"],"short-title":[],"issued":{"date-parts":[[2017,6]]},"references-count":24,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2017,6]]}},"alternative-id":["10.14778\/3115404.3115409"],"URL":"https:\/\/doi.org\/10.14778\/3115404.3115409","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2017,6]]}}}