{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T01:25:03Z","timestamp":1781054703164,"version":"3.54.1"},"publisher-location":"New York, NY, USA","reference-count":49,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,5,31]],"date-time":"2020-05-31T00:00:00Z","timestamp":1590883200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSERC Strategic Grant"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,11]]},"DOI":"10.1145\/3318464.3380605","type":"proceedings-article","created":{"date-parts":[[2020,5,29]],"date-time":"2020-05-29T17:12:33Z","timestamp":1590772353000},"page":"1939-1950","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":37,"title":["Organizing Data Lakes for Navigation"],"prefix":"10.1145","author":[{"given":"Fatemeh","family":"Nargesian","sequence":"first","affiliation":[{"name":"University of Rochester, Rochester, NY, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ken Q.","family":"Pu","sequence":"additional","affiliation":[{"name":"University of Ontario Institute of Technology, Oshawa, ON, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Erkang","family":"Zhu","sequence":"additional","affiliation":[{"name":"Microsoft Research, Seattle, WA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bahar","family":"Ghadiri Bashardoost","sequence":"additional","affiliation":[{"name":"University of Toronto, Toronto, ON, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ren\u00e9e J.","family":"Miller","sequence":"additional","affiliation":[{"name":"Northeastern University, Boston, MA, 
USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2020,5,31]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.websem.2015.12.002"},{"key":"e_1_3_2_2_2_1","first-page":"1","article-title":"Skluma: A Statistical Learning Pipeline for Taming Unkempt Data Repositories","volume":"41","author":"Beckman Paul","year":"2017","unstructured":"Paul Beckman, Tyler J. Skluzacek, Kyle Chard, and Ian T. Foster. 2017. Skluma: A Statistical Learning Pipeline for Taming Unkempt Data Repositories. In Scientific and Statistical Database Management. 41:1--41:4.","journal-title":"Scientific and Statistical Database Management."},{"key":"e_1_3_2_2_3_1","volume-title":"Noy","author":"Brickley Dan","year":"2019","unstructured":"Dan Brickley, Matthew Burgess, and Natasha F. Noy. 2019. Google Dataset Search: Building a search engine for datasets in an open Web ecosystem. In WWW. 1365--1375."},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/2817912.2817913"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687750"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Anne Callery. 1996. Yahoo! Cataloging the Web. misc.library.ucsb.edu\/untangle\/callery.html","DOI":"10.1080\/19386389709512369"},
{"key":"e_1_3_2_2_7_1","doi-asserted-by":"crossref","unstructured":"Anish Das Sarma, Lujun Fang, Nitin Gupta, Alon Y. Halevy, Hongrae Lee, Fei Wu, Reynold Xin, and Cong Yu. 2012. Finding related tables. In SIGMOD. 817--828.","DOI":"10.1145\/2213836.2213962"},{"key":"e_1_3_2_2_8_1","volume-title":"Ziawasch Abedjan, Sibo Wang, Michael Stonebraker, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, and Nan Tang.","author":"Deng Dong","year":"2017","unstructured":"Dong Deng, Raul Castro Fernandez, Ziawasch Abedjan, Sibo Wang, Michael Stonebraker, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, and Nan Tang. 2017. The Data Civilizer System. In CIDR."},{"key":"e_1_3_2_2_9_1","first-page":"1786","article-title":"Supporting Keyword Search in Product Database","volume":"6","author":"Duan Huizhong","year":"2013","unstructured":"Huizhong Duan, ChengXiang Zhai, Jinxing Cheng, and Abhishek Gattani. 2013. Supporting Keyword Search in Product Database: A Probabilistic Approach. PVLDB, Vol. 6, 14 (2013), 1786--1797.","journal-title":"A Probabilistic Approach. PVLDB"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.11648\/j.ajtas.20160501.11"},{"key":"e_1_3_2_2_11_1","volume-title":"Aurum: A Data Discovery System","author":"Fernandez Raul Castro","year":"2018","unstructured":"Raul Castro Fernandez, Ziawasch Abedjan, Famien Koko, Gina Yuan, Samuel Madden, and Michael Stonebraker. 2018a. Aurum: A Data Discovery System. In ICDE. IEEE, 1001--1012."},
{"key":"e_1_3_2_2_12_1","volume-title":"Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, and Nan Tang.","author":"Fernandez Raul Castro","year":"2018","unstructured":"Raul Castro Fernandez, Essam Mansour, Abdulhakim Ali Qahtan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, and Nan Tang. 2018b. Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery. In ICDE. IEEE, 989--1000."},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1020249912095"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"crossref","unstructured":"Lisa M Given (Ed.). 2008. The Sage encyclopedia of qualitative research methods. Sage Publications, Los Angeles, Calif.","DOI":"10.4135\/9781412963909"},{"key":"e_1_3_2_2_15_1","volume-title":"Goods: Organizing Google's Datasets. In SIGMOD. 795--806.","author":"Halevy Alon","year":"2016","unstructured":"Alon Halevy, Flip Korn, Natalya F. Noy, Christopher Olston, Neoklis Polyzotis, Sudip Roy, and Steven Euijong Whang. 2016. Goods: Organizing Google's Datasets. In SIGMOD. 795--806."},{"key":"e_1_3_2_2_16_1","volume-title":"Proceedings of the 10th Int. Workshop on Ontology Matching. 
25--34","author":"Hassanzadeh Oktie","year":"2015","unstructured":"Oktie Hassanzadeh, Michael J Ward, Mariano Rodriguez-Muro, and Kavitha Srinivas. 2015. Understanding a large corpus of web tables through matching with knowledge bases: an empirical study. In Proceedings of the 10th Int. Workshop on Ontology Matching. 25--34."},{"key":"e_1_3_2_2_17_1","volume-title":"Ground: A Data Context Service. In CIDR.","author":"Hellerstein Joseph M.","year":"2017","unstructured":"Joseph M. Hellerstein, Vikram Sreekanti, Joseph E. Gonzalez, James Dalton, Akon Dey, Sreyashi Nag, Krishna Ramachandran, Sudhanshu Arora, Arka Bhattacharyya, Shirshanka Das, Mark Donsky, Gabriel Fierro, Chang She, Carl Steinbach, Venkat Subramanian, and Eric Sun. 2017. Ground: A Data Context Service. In CIDR."},{"key":"e_1_3_2_2_18_1","volume-title":"Michiel A. Bakker, Emanuel Zgraggen, Arvind Satyanarayan, Tim Kraska, Cagatay Demiralp, and C\u00e9sar A. Hidalgo.","author":"Hulsebos Madelon","year":"2019","unstructured":"Madelon Hulsebos, Kevin Zeng Hu, Michiel A. Bakker, Emanuel Zgraggen, Arvind Satyanarayan, Tim Kraska, Cagatay Demiralp, and C\u00e9sar A. Hidalgo. 2019. Sherlock: A Deep Learning Approach to Semantic Data Type Detection. In KDD. 1500--1508."},
{"key":"e_1_3_2_2_19_1","doi-asserted-by":"crossref","unstructured":"Yusra Ibrahim, Mirek Riedewald, and Gerhard Weikum. 2016. Making Sense of Entities and Quantities in Web Tables. In CIKM. 1703--1712.","DOI":"10.1145\/2983323.2983772"},{"key":"e_1_3_2_2_20_1","volume-title":"Bridging Quantities in Tables and Text","author":"Ibrahim Yusra","unstructured":"Yusra Ibrahim, Mirek Riedewald, Gerhard Weikum, and Demetrios Zeinalipour-Yazti. 2019. Bridging Quantities in Tables and Text. In ICDE. IEEE, 1010--1021."},{"key":"e_1_3_2_2_21_1","volume-title":"Bag of Tricks for Efficient Text Classification. ACL","author":"Joulin Armand","year":"2017","unstructured":"Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. 2017. Bag of Tricks for Efficient Text Classification. ACL (2017)."},{"key":"e_1_3_2_2_22_1","volume-title":"Miller","author":"Kandogan Eser","year":"2015","unstructured":"Eser Kandogan, Mary Roth, Peter M. Schwarz, Joshua Hui, Ignacio G. Terrizzano, Christina Christodoulakis, and Ren\u00e9e J. Miller. 2015. LabBook: Metadata-driven social collaborative data analysis. In IEEE Big Data. 431--440."},{"key":"e_1_3_2_2_23_1","volume-title":"Clustering by Means of Medoids. 
Data Analysis based on the L1-Norm and Related Methods","author":"Kaufmann Leonard","year":"1987","unstructured":"Leonard Kaufmann and Peter Rousseeuw. 1987. Clustering by Means of Medoids. Data Analysis based on the L1-Norm and Related Methods (1987)."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2012.02.017"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"crossref","unstructured":"Jonathan Koren, Yi Zhang, and Xue Liu. 2008. Personalized interactive faceted search. In WWW. 477--486.","DOI":"10.1145\/1367497.1367562"},{"key":"e_1_3_2_2_26_1","unstructured":"Zornitsa Kozareva and Eduard Hovy. 2010. A Semi-supervised Method to Learn and Construct Taxonomies Using the Web. In EMNLP. 1110--1118."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/1577069.1755874"},{"key":"e_1_3_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1921005"},{"key":"e_1_3_2_2_29_1","volume-title":"Introduction to information retrieval","author":"Manning Christopher D.","unstructured":"Christopher D. Manning, Prabhakar Raghavan, and Hinrich Sch\u00fctze. 2008. Introduction to information retrieval. Cambridge University Press."},{"key":"e_1_3_2_2_30_1","volume-title":"Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. 
Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, and Nan Tang.","author":"Mansour Essam","year":"2018","unstructured":"Essam Mansour, Dong Deng, Raul Castro Fernandez, Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, and Nan Tang. 2018. Building Data Civilizer Pipelines with an Advanced Workflow Engine. In ICDE. IEEE, 1593--1596."},{"key":"e_1_3_2_2_31_1","unstructured":"Tomas Mikolov, Ilya Sutskever, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality. In NIPS. 3111--3119."},{"key":"e_1_3_2_2_32_1","first-page":"59","article-title":"Making Open Data Transparent: Data Discovery on Open Data","volume":"41","author":"Miller Ren\u00e9e J.","year":"2018","unstructured":"Ren\u00e9e J. Miller, Fatemeh Nargesian, Erkang Zhu, Christina Christodoulakis, Ken Q. Pu, and Periklis Andritsos. 2018. Making Open Data Transparent: Data Discovery on Open Data. IEEE Data Eng. Bull., Vol. 41, 2 (2018), 59--70.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_3_2_2_33_1","volume-title":"Erkang Zhu, and Ren\u00e9e J. 
Miller.","author":"Nargesian Fatemeh","year":"2020","unstructured":"Fatemeh Nargesian, Ken Q. Pu, Bahar Ghadiri Bashardoost, Erkang Zhu, and Ren\u00e9e J. Miller. 2020. Data Lake Organization. arXiv:1812.07024."},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3192965.3192973"},{"key":"e_1_3_2_2_35_1","volume-title":"Manning","author":"Pennington Jeffrey","year":"2014","unstructured":"Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In EMNLP. 1532--1543."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.14778\/2336664.2336665"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"crossref","unstructured":"Jeffrey Pound, Stelios Paparizos, and Panayiotis Tsaparas. 2011a. Facet discovery for structured web search: a query-log mining approach. In SIGMOD. 169--180.","DOI":"10.1145\/1989323.1989342"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"crossref","unstructured":"Jeffrey Pound, Stelios Paparizos, and Panayiotis Tsaparas. 2011b. Facet Discovery for Structured Web Search: A Query-log Mining Approach. In SIGMOD. 169--180.","DOI":"10.1145\/1989323.1989342"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"crossref","unstructured":"Dominique Ritze, Oliver Lehmberg, Yaser Oulabi, and Christian Bizer. 2016. Profiling the Potential of Web Tables for Augmenting Cross-domain Knowledge Bases. In WWW. 251--261.","DOI":"10.1145\/2872427.2883017"},
{"key":"e_1_3_2_2_40_1","volume-title":"International Conference on Computational Linguistics. The Association for Computer Linguistics, 801--808","author":"Snow Rion","unstructured":"Rion Snow, Daniel Jurafsky, and Andrew Y. Ng. 2006. Semantic Taxonomy Induction from Heterogenous Evidence. In International Conference on Computational Linguistics. The Association for Computer Linguistics, 801--808."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1162\/COLI_a_00146"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/2002938.2002939"},{"key":"e_1_3_2_2_43_1","volume-title":"d.]. Available online at xapian.org (Last accessed on","year":"2020","unstructured":"Xapian. [n. d.]. Available online at xapian.org (Last accessed on Feb 29, 2020)."},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.3115\/1687878.1687918"},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-018-0505-x"},{"key":"e_1_3_2_2_46_1","first-page":"041","article-title":"A survey of faceted search","volume":"12","author":"Zheng Bweijunl","year":"2013","unstructured":"Bweijunl Zheng, Wei Zhang, and Xiaoyu Fu Boqin Feng. 2013. A survey of faceted search. Journal of Web engineering, Vol. 12, 1&2 (2013), 041--064.","journal-title":"Journal of Web engineering"},
{"key":"e_1_3_2_2_47_1","volume-title":"Miller","author":"Zhu Erkang","year":"2019","unstructured":"Erkang Zhu, Dong Deng, Fatemeh Nargesian, and Ren\u00e9e J. Miller. 2019. JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes. In SIGMOD. 847--864."},{"key":"e_1_3_2_2_48_1","first-page":"1185","article-title":"LSH Ensemble","volume":"9","author":"Zhu Erkang","year":"2016","unstructured":"Erkang Zhu, Fatemeh Nargesian, Ken Q. Pu, and Ren\u00e9e J. Miller. 2016. LSH Ensemble: Internet-Scale Domain Search. PVLDB, Vol. 9, 12 (2016), 1185--1196.","journal-title":"Internet-Scale Domain Search. 
PVLDB"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137765.3137788"}],"event":{"name":"SIGMOD\/PODS '20: International Conference on Management of Data","location":"Portland OR USA","acronym":"SIGMOD\/PODS '20","sponsor":["SIGMOD ACM Special Interest Group on Management of Data"]},"container-title":["Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3318464.3380605","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3318464.3380605","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:23Z","timestamp":1750199903000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3318464.3380605"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,31]]},"references-count":49,"alternative-id":["10.1145\/3318464.3380605","10.1145\/3318464"],"URL":"https:\/\/doi.org\/10.1145\/3318464.3380605","relation":{},"subject":[],"published":{"date-parts":[[2020,5,31]]},"assertion":[{"value":"2020-05-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}