{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,10]],"date-time":"2026-05-10T10:20:27Z","timestamp":1778408427472,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":41,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,6,25]],"date-time":"2019-06-25T00:00:00Z","timestamp":1561420800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,6,25]]},"DOI":"10.1145\/3299869.3314036","type":"proceedings-article","created":{"date-parts":[[2019,6,18]],"date-time":"2019-06-18T17:41:43Z","timestamp":1560879703000},"page":"362-375","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":50,"title":["Snorkel DryBell"],"prefix":"10.1145","author":[{"given":"Stephen H.","family":"Bach","sequence":"first","affiliation":[{"name":"Brown University, Providence, RI, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Rodriguez","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yintao","family":"Liu","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chong","family":"Luo","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haidong","family":"Shao","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cassandra","family":"Xia","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Souvik","family":"Sen","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alex","family":"Ratner","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Braden","family":"Hancock","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Houman","family":"Alborzi","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rahul","family":"Kuchhal","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chris","family":"R\u00e9","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rob","family":"Malkin","sequence":"additional","affiliation":[{"name":"Google, Mountain View, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,6,25]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"TensorFlow: A System for Large-scale Machine Learning. In USENIX Conference on Operating Systems Design and Implementation (OSDI) .","author":"Abadi Mart'in","year":"2016","unstructured":"Mart'in Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek G. Murray , Benoit Steiner , Paul Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A System for Large-scale Machine Learning. In USENIX Conference on Operating Systems Design and Implementation (OSDI) . Mart'in Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-scale Machine Learning. In USENIX Conference on Operating Systems Design and Implementation (OSDI) ."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1970.1054472"},{"key":"e_1_3_2_1_3_1","volume-title":"International Conference on Machine Learning (ICML) .","author":"Bach Stephen H.","year":"2017","unstructured":"Stephen H. Bach , Bryan He , Alexander Ratner , and Christopher R\u00e9 . 2017 . Learning the Structure of Generative Models without Labeled Data . In International Conference on Machine Learning (ICML) . Stephen H. Bach, Bryan He, Alexander Ratner, and Christopher R\u00e9. 2017. Learning the Structure of Generative Models without Labeled Data. In International Conference on Machine Learning (ICML) ."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098021"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/279943.279962"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33460-3_15"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"crossref","unstructured":"O. Chapelle B. Sch\u00f6lkopf and A. Zien (Eds.). 2006. Semi-Supervised Learning .MIT Press.  O. Chapelle B. Sch\u00f6lkopf and A. Zien (Eds.). 2006. Semi-Supervised Learning .MIT Press.","DOI":"10.7551\/mitpress\/9780262033589.001.0001"},{"key":"e_1_3_2_1_8_1","volume-title":"AutoAugment: Learning Augmentation Policies from Data. arXiv preprint arXiv:1805.09501","author":"Cubuk Ekin D","year":"2018","unstructured":"Ekin D Cubuk , Barret Zoph , Dandelion Mane , Vijay Vasudevan , and Quoc V Le. 2018. AutoAugment: Learning Augmentation Policies from Data. arXiv preprint arXiv:1805.09501 ( 2018 ). Ekin D Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V Le. 2018. AutoAugment: Learning Augmentation Policies from Data. arXiv preprint arXiv:1805.09501 (2018)."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2488388.2488414"},{"key":"e_1_3_2_1_10_1","volume-title":"Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied statistics","author":"Dawid Alexander Philip","year":"1979","unstructured":"Alexander Philip Dawid and Allan M Skene . 1979. Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied statistics ( 1979 ), 20--28. Alexander Philip Dawid and Allan M Skene. 1979. Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied statistics (1979), 20--28."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"crossref","unstructured":"X. L. Dong and D. Srivastava. 2015. Big Data Integration .Morgan & Claypool Publishers.  X. L. Dong and D. Srivastava. 2015. Big Data Integration .Morgan & Claypool Publishers.","DOI":"10.1007\/978-3-031-01853-4"},{"key":"e_1_3_2_1_12_1","unstructured":"Google. 2019. Cloud AI. https:\/\/cloud.google.com\/products\/ai\/.  Google. 2019. Cloud AI. https:\/\/cloud.google.com\/products\/ai\/."},{"key":"e_1_3_2_1_13_1","unstructured":"Edouard Grave Moustapha M Cisse and Armand Joulin. 2017. Unbounded cache model for online language modeling with open vocabulary. In Advances in Neural Information Processing Systems (NeurIPS .   Edouard Grave Moustapha M Cisse and Armand Joulin. 2017. Unbounded cache model for online language modeling with open vocabulary. In Advances in Neural Information Processing Systems (NeurIPS ."},{"key":"e_1_3_2_1_14_1","volume-title":"Semantic parsing for task oriented dialog using hierarchical representations. arXiv preprint arXiv:1810.07942","author":"Gupta Sonal","year":"2018","unstructured":"Sonal Gupta , Rushin Shah , Mrinal Mohit , Anuj Kumar , and Mike Lewis . 2018. Semantic parsing for task oriented dialog using hierarchical representations. arXiv preprint arXiv:1810.07942 ( 2018 ). Sonal Gupta, Rushin Shah, Mrinal Mohit, Anuj Kumar, and Mike Lewis. 2018. Semantic parsing for task oriented dialog using hierarchical representations. arXiv preprint arXiv:1810.07942 (2018)."},{"key":"e_1_3_2_1_15_1","volume-title":"Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531","author":"Hinton Geoffrey","year":"2015","unstructured":"Geoffrey Hinton , Oriol Vinyals , and Jeff Dean . 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 ( 2015 ). Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_1_17_1","unstructured":"Matthew Honnibal and Ines Montani. 2017. spaCy 2: Natural language understanding with Bloom embeddings convolutional neural networks and incremental parsing. (2017).  Matthew Honnibal and Ines Montani. 2017. spaCy 2: Natural language understanding with Bloom embeddings convolutional neural networks and incremental parsing. (2017)."},{"key":"e_1_3_2_1_18_1","volume-title":"ICML Workshop on Challenges in Representation Learning .","author":"Lee Dong-Hyun","year":"2013","unstructured":"Dong-Hyun Lee . 2013 . Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks . In ICML Workshop on Challenges in Representation Learning . Dong-Hyun Lee. 2013. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In ICML Workshop on Challenges in Representation Learning ."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897350.2897352"},{"key":"e_1_3_2_1_20_1","volume-title":"Exploring the Limits of Weakly Supervised Pretraining. In European Conference on Computer Vision (ECCV) .","author":"Mahajan Dhruv","unstructured":"Dhruv Mahajan , Ross Girshick , Vignesh Ramanathan , Kaiming He , Manohar Paluri , Yixuan Li , Ashwin Bharambe , and Laurens van der Maaten. 2018 . Exploring the Limits of Weakly Supervised Pretraining. In European Conference on Computer Vision (ECCV) . Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, and Laurens van der Maaten. 2018. Exploring the Limits of Weakly Supervised Pretraining. In European Conference on Computer Vision (ECCV) ."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-5010"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2488200"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/1690219.1690287"},{"key":"e_1_3_2_1_24_1","volume-title":"Proceedings of the 29th International conference on machine learning (ICML-12)","author":"Mnih Volodymyr","year":"2012","unstructured":"Volodymyr Mnih and Geoffrey E Hinton . 2012 . Learning to label aerial images from noisy data . In Proceedings of the 29th International conference on machine learning (ICML-12) . 567--574. Volodymyr Mnih and Geoffrey E Hinton. 2012. Learning to label aerial images from noisy data. In Proceedings of the 29th International conference on machine learning (ICML-12). 567--574."},{"key":"e_1_3_2_1_25_1","unstructured":"ONNX. 2017. Open Neural Network Exchange. https:\/\/github.com\/onnx\/onnx .  ONNX. 2017. Open Neural Network Exchange. https:\/\/github.com\/onnx\/onnx ."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2009.191"},{"key":"e_1_3_2_1_28_1","first-page":"3","article-title":"Data cleaning: Problems and current approaches","volume":"23","author":"Rahm Erhard","year":"2000","unstructured":"Erhard Rahm and Hong Hai Do . 2000 . Data cleaning: Problems and current approaches . IEEE Data Eng. Bull. , Vol. 23 , 4 (2000), 3 -- 13 . Erhard Rahm and Hong Hai Do. 2000. Data cleaning: Problems and current approaches. IEEE Data Eng. Bull., Vol. 23, 4 (2000), 3--13.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.14778\/3157794.3157797"},{"key":"e_1_3_2_1_30_1","volume-title":"Sen Wu, Daniel Selsam, and Christopher R\u00e9.","author":"Ratner Alexander J","year":"2016","unstructured":"Alexander J Ratner , Christopher M De Sa , Sen Wu, Daniel Selsam, and Christopher R\u00e9. 2016 . Data programming: Creating large training sets, quickly. In Advances in neural information processing systems. 3567--3575. Alexander J Ratner, Christopher M De Sa, Sen Wu, Daniel Selsam, and Christopher R\u00e9. 2016. Data programming: Creating large training sets, quickly. In Advances in neural information processing systems. 3567--3575."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"crossref","unstructured":"Alexander J Ratner Braden Hancock Jared Dunnmon Frederic Sala Shreyash Pandey and Christopher R\u00e9. 2019. Training Complex Models with Multi-Task Weak Supervision. In AAAI .  Alexander J Ratner Braden Hancock Jared Dunnmon Frederic Sala Shreyash Pandey and Christopher R\u00e9. 2019. Training Complex Models with Multi-Task Weak Supervision. In AAAI .","DOI":"10.1609\/aaai.v33i01.33014763"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137628.3137631"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3035951"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1965.1053799"},{"key":"e_1_3_2_1_35_1","unstructured":"Amazon Web Services. 2019. Amazon Comprehend. https:\/\/aws.amazon.com\/comprehend\/.  Amazon Web Services. 2019. Amazon Comprehend. https:\/\/aws.amazon.com\/comprehend\/."},{"key":"e_1_3_2_1_36_1","volume-title":"Active Learning","author":"Settles B.","unstructured":"B. Settles . 2012. Active Learning . Morgan & Claypool Publishers . B. Settles. 2012. Active Learning .Morgan & Claypool Publishers."},{"key":"e_1_3_2_1_37_1","volume-title":"Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. CoRR","author":"Sun Chen","year":"2017","unstructured":"Chen Sun , Abhinav Shrivastava , Saurabh Singh , and Abhinav Gupta . 2017. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. CoRR , Vol. abs\/ 1707 .02968 ( 2017 ). arxiv: 1707.02968 http:\/\/arxiv.org\/abs\/1707.02968 Chen Sun, Abhinav Shrivastava, Saurabh Singh, and Abhinav Gupta. 2017. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. CoRR, Vol. abs\/1707.02968 (2017). arxiv: 1707.02968 http:\/\/arxiv.org\/abs\/1707.02968"},{"key":"e_1_3_2_1_38_1","volume-title":"Zero-shot learning - A comprehensive evaluation of the good, the bad and the ugly","author":"Xian Yongqin","year":"2018","unstructured":"Yongqin Xian , Christoph H Lampert , Bernt Schiele , and Zeynep Akata . 2018. Zero-shot learning - A comprehensive evaluation of the good, the bad and the ugly . IEEE Transactions on Pattern Analysis and Machine Intelligence ( 2018 ). Yongqin Xian, Christoph H Lampert, Bernt Schiele, and Zeynep Akata. 2018. Zero-shot learning - A comprehensive evaluation of the good, the bad and the ugly. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018)."},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3041021.3054201"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2904442"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.14778\/2168651.2168656"}],"event":{"name":"SIGMOD\/PODS '19: International Conference on Management of Data","location":"Amsterdam Netherlands","acronym":"SIGMOD\/PODS '19","sponsor":["SIGMOD ACM Special Interest Group on Management of Data"]},"container-title":["Proceedings of the 2019 International Conference on Management of Data"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3299869.3314036","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3299869.3314036","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T01:02:16Z","timestamp":1750208536000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3299869.3314036"}},"subtitle":["A Case Study in Deploying Weak Supervision at Industrial Scale"],"short-title":[],"issued":{"date-parts":[[2019,6,25]]},"references-count":41,"alternative-id":["10.1145\/3299869.3314036","10.1145\/3299869"],"URL":"https:\/\/doi.org\/10.1145\/3299869.3314036","relation":{},"subject":[],"published":{"date-parts":[[2019,6,25]]},"assertion":[{"value":"2019-06-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}