{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T08:55:20Z","timestamp":1775638520755,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":35,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,7,25]],"date-time":"2019-07-25T00:00:00Z","timestamp":1564012800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,7,25]]},"DOI":"10.1145\/3292500.3330993","type":"proceedings-article","created":{"date-parts":[[2019,7,26]],"date-time":"2019-07-26T13:17:26Z","timestamp":1564147046000},"page":"1500-1508","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":145,"title":["Sherlock"],"prefix":"10.1145","author":[{"given":"Madelon","family":"Hulsebos","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]},{"given":"Kevin","family":"Hu","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]},{"given":"Michiel","family":"Bakker","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]},{"given":"Emanuel","family":"Zgraggen","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]},{"given":"Arvind","family":"Satyanarayan","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]},{"given":"Tim","family":"Kraska","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]},{"given":"\u00c7agatay","family":"Demiralp","sequence":"additional","affiliation":[{"name":"Megagon Labs, Mountain View, CA, USA"}]},{"given":"C\u00e9sar","family":"Hidalgo","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, Cambridge, MA, USA"}]}],"member":"320","published-online":{"date-parts":[[2019,7,25]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Mart'in","unstructured":"Mart'in Abadi et almbox. 2016. TensorFlow: A system for large-scale machine learning . In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) . 265--283. Mart'in Abadi et almbox. 2016. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). 265--283."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"S\u00f6ren Auer Christian Bizer Georgi Kobilarov Jens Lehmann Richard Cyganiak and Zachary Ives. 2007. DBpedia: A nucleus for a web of open data. (2007) 722--735.   S\u00f6ren Auer Christian Bizer Georgi Kobilarov Jens Lehmann Richard Cyganiak and Zachary Ives. 2007. DBpedia: A nucleus for a web of open data. (2007) 722--735.","DOI":"10.1007\/978-3-540-76298-0_52"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2016.2515587"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376746"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.14778\/1453856.1453916"},{"key":"e_1_3_2_1_6_1","volume-title":"Aurum: A Data Discovery System. 1001--1012.","author":"Fernandez Raul Castro","year":"2018","unstructured":"Raul Castro Fernandez , Ziawasch Abedjan , Famien Koko , Gina Yuan , Samuel Madden , and Michael Stonebraker . 2018 a. Aurum: A Data Discovery System. 1001--1012. Raul Castro Fernandez, Ziawasch Abedjan, Famien Koko, Gina Yuan, Samuel Madden, and Michael Stonebraker. 2018a. Aurum: A Data Discovery System. 1001--1012."},{"key":"e_1_3_2_1_7_1","volume-title":"Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery.","author":"Fernandez Raul Castro","year":"2018","unstructured":"Raul Castro Fernandez , Essam Mansour , Abdulhakim Qahtan , Ahmed Elmagarmid , Ihab Ilyas , Samuel Madden , Mourad Ouzzani , Michael Stonebraker , and Nan Tang . 2018 b. Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery. Raul Castro Fernandez, Essam Mansour, Abdulhakim Qahtan, Ahmed Elmagarmid, Ihab Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, and Nan Tang. 2018b. Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery."},{"key":"e_1_3_2_1_8_1","volume-title":"Document embedding with paragraph vectors. arXiv preprint arXiv:1507.07998","author":"Dai Andrew M","year":"2015","unstructured":"Andrew M Dai , Christopher Olah , and Quoc V Le. 2015. Document embedding with paragraph vectors. arXiv preprint arXiv:1507.07998 ( 2015 ). Andrew M Dai, Christopher Olah, and Quoc V Le. 2015. Document embedding with paragraph vectors. arXiv preprint arXiv:1507.07998 (2015)."},{"key":"e_1_3_2_1_9_1","volume-title":"Datalib: JavaScript Data Utilities","author":"Lab Interactive Data","year":"2019","unstructured":"Interactive Data Lab . 2019 . Datalib: JavaScript Data Utilities . http:\/\/vega.github.io\/datalib Interactive Data Lab. 2019. Datalib: JavaScript Data Utilities. http:\/\/vega.github.io\/datalib"},{"key":"e_1_3_2_1_10_1","unstructured":"Open Knowledge Foundation. 2019. messytables $cdot$ PyPi. https:\/\/pypi.org\/project\/messytables  Open Knowledge Foundation. 2019. messytables $cdot$ PyPi. https:\/\/pypi.org\/project\/messytables"},{"key":"e_1_3_2_1_11_1","volume-title":"Proceedings on the International Conference on Artificial Intelligence (ICAI) .","author":"Goel Aman","year":"2012","unstructured":"Aman Goel , Craig A Knoblock , and Kristina Lerman . 2012 . Exploiting structure within data for accurate labeling using conditional random fields . In Proceedings on the International Conference on Artificial Intelligence (ICAI) . Aman Goel, Craig A Knoblock, and Kristina Lerman. 2012. Exploiting structure within data for accurate labeling using conditional random fields. In Proceedings on the International Conference on Artificial Intelligence (ICAI) ."},{"key":"e_1_3_2_1_12_1","unstructured":"Google. 2019. Google Data Studio. https:\/\/datastudio.google.com  Google. 2019. Google Data Studio. https:\/\/datastudio.google.com"},{"key":"e_1_3_2_1_13_1","unstructured":"Christopher Groskopf and contributors. 2016. csvkit . https:\/\/csvkit.readthedocs.org  Christopher Groskopf and contributors. 2016. csvkit . https:\/\/csvkit.readthedocs.org"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300892"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1978942.1979444"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806475"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/1357054.1357127"},{"key":"e_1_3_2_1_18_1","volume-title":"International Conference on Machine Learning. 1188--1196","author":"Le Quoc","year":"2014","unstructured":"Quoc Le and Tomas Mikolov . 2014 . Distributed representations of sentences and documents . In International Conference on Machine Learning. 1188--1196 . Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In International Conference on Machine Learning. 1188--1196."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1921005"},{"key":"e_1_3_2_1_20_1","unstructured":"Microsoft. 2019. Power BI | Interactive Data Visualization BI . https:\/\/powerbi.microsoft.com  Microsoft. 2019. Power BI | Interactive Data Visualization BI . https:\/\/powerbi.microsoft.com"},{"key":"e_1_3_2_1_21_1","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Fabian Pedregosa","year":"2011","unstructured":"Fabian Pedregosa et almbox. 2011 . Scikit-learn: Machine Learning in Python . Journal of Machine Learning Research , Vol. 12 (2011), 2825 -- 2830 . Fabian Pedregosa et almbox. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research , Vol. 12 (2011), 2825--2830.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46523-4_27"},{"key":"e_1_3_2_1_24_1","volume-title":"A Specialist Approach for Classification of Column Data . Master's thesis","author":"Puranik Nikhil Waman","unstructured":"Nikhil Waman Puranik . 2012. A Specialist Approach for Classification of Column Data . Master's thesis . University of Maryland , Baltimore County . Nikhil Waman Puranik. 2012. A Specialist Approach for Classification of Column Data . Master's thesis. University of Maryland, Baltimore County."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s007780100057"},{"key":"e_1_3_2_1_26_1","volume-title":"Proceedings of the 27th International Conference on Very Large Data Bases (VLDB '01)","author":"Raman Vijayshankar","unstructured":"Vijayshankar Raman and Joseph M. Hellerstein . 2001. Potter's Wheel: An Interactive Data Cleaning System . In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB '01) . Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 381--390. Vijayshankar Raman and Joseph M. Hellerstein. 2001. Potter's Wheel: An Interactive Data Cleaning System. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB '01). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 381--390."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-18818-8_25"},{"key":"e_1_3_2_1_28_1","volume-title":"Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. ELRA, 45--50","author":"Radim","unstructured":"Radim v Rehr uv rek and Petr Sojka. 2010. Software Framework for Topic Modelling with Large Corpora . In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. ELRA, 45--50 . Radim v Rehr uv rek and Petr Sojka. 2010. Software Framework for Topic Modelling with Large Corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. ELRA, 45--50."},{"key":"e_1_3_2_1_29_1","volume-title":"Matching web tables to DBpedia -- a feature utility study. context","author":"Ritze Dominique","year":"2017","unstructured":"Dominique Ritze and Christian Bizer . 2017. Matching web tables to DBpedia -- a feature utility study. context , Vol. 42 , 41 ( 2017 ), 19. Dominique Ritze and Christian Bizer. 2017. Matching web tables to DBpedia -- a feature utility study. context , Vol. 42, 41 (2017), 19."},{"key":"e_1_3_2_1_30_1","volume-title":"Proceedings of the Second Web Science Conference .","author":"Syed Zareen","year":"2010","unstructured":"Zareen Syed , Tim Finin , Varish Mulwad , Anupam Joshi , 2010 . Exploiting a web of semantic data for interpreting tables . In Proceedings of the Second Web Science Conference . Zareen Syed, Tim Finin, Varish Mulwad, Anupam Joshi, et almbox. 2010. Exploiting a web of semantic data for interpreting tables. In Proceedings of the Second Web Science Conference ."},{"key":"e_1_3_2_1_31_1","unstructured":"Trifacta. 2019. Data Wrangling Tools & Software. https:\/\/www.trifacta.com  Trifacta. 2019. Data Wrangling Tools & Software. https:\/\/www.trifacta.com"},{"key":"e_1_3_2_1_32_1","unstructured":"Princeton University. 2010. About WordNet. https:\/\/wordnet.princeton.edu  Princeton University. 2010. About WordNet. https:\/\/wordnet.princeton.edu"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/2002938.2002939"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196888"},{"key":"e_1_3_2_1_35_1","volume-title":"Utilizing Regular Expressions for Instance-Based Schema Matching. CEUR Workshop Proceedings","volume":"946","author":"Zapilko Benjamin","year":"2012","unstructured":"Benjamin Zapilko , Matth\"aus Zloch, and Johann Schaible . 2012 . Utilizing Regular Expressions for Instance-Based Schema Matching. CEUR Workshop Proceedings , Vol. 946 . Benjamin Zapilko, Matth\"aus Zloch, and Johann Schaible. 2012. Utilizing Regular Expressions for Instance-Based Schema Matching. CEUR Workshop Proceedings , Vol. 946."}],"event":{"name":"KDD '19: The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","location":"Anchorage AK USA","acronym":"KDD '19","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"]},"container-title":["Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3292500.3330993","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3292500.3330993","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:26:04Z","timestamp":1750206364000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3292500.3330993"}},"subtitle":["A Deep Learning Approach to Semantic Data Type Detection"],"short-title":[],"issued":{"date-parts":[[2019,7,25]]},"references-count":35,"alternative-id":["10.1145\/3292500.3330993","10.1145\/3292500"],"URL":"https:\/\/doi.org\/10.1145\/3292500.3330993","relation":{},"subject":[],"published":{"date-parts":[[2019,7,25]]},"assertion":[{"value":"2019-07-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}