{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,13]],"date-time":"2025-11-13T07:19:23Z","timestamp":1763018363240,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":54,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,4,19]],"date-time":"2021-04-19T00:00:00Z","timestamp":1618790400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,4,19]]},"DOI":"10.1145\/3442381.3449955","type":"proceedings-article","created":{"date-parts":[[2021,6,3]],"date-time":"2021-06-03T19:35:17Z","timestamp":1622748917000},"page":"1317-1327","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Consistent Sampling Through Extremal Process"],"prefix":"10.1145","author":[{"given":"Ping","family":"Li","sequence":"first","affiliation":[{"name":"Baidu Research, USA"}]},{"given":"Xiaoyun","family":"Li","sequence":"additional","affiliation":[{"name":"Baidu Research, USA"}]},{"given":"Gennady","family":"Samorodnitsky","sequence":"additional","affiliation":[{"name":"Cornell University, USA"}]},{"given":"Weijie","family":"Zhao","sequence":"additional","affiliation":[{"name":"Baidu Research, USA"}]}],"member":"320","published-online":{"date-parts":[[2021,6,3]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1080\/02664763.2016.1142945"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2019.01.023"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498759.1498835"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2010.172"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"L\u00e9on Bottou Olivier Chapelle Dennis DeCoste and Jason Weston (Eds.). 2007. Large-Scale Kernel Machines. The MIT Press Cambridge MA.  L\u00e9on Bottou Olivier Chapelle Dennis DeCoste and Jason Weston (Eds.). 2007. Large-Scale Kernel Machines. The MIT Press Cambridge MA.","DOI":"10.7551\/mitpress\/7496.001.0001"},{"key":"e_1_3_2_1_6_1","volume-title":"Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing (STOC)","author":"Broder Z.","year":"1998","unstructured":"Andrei\u00a0 Z. Broder , Moses Charikar , Alan\u00a0 M. Frieze , and Michael Mitzenmacher . 1998 . Min-Wise Independent Permutations . In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing (STOC) . Dallas, TX, 327\u2013336. Andrei\u00a0Z. Broder, Moses Charikar, Alan\u00a0M. Frieze, and Michael Mitzenmacher. 1998. Min-Wise Independent Permutations. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing (STOC). Dallas, TX, 327\u2013336."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-7552(97)00031-7"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341547"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/509907.509965"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557137"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557049"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1873601.1873626"},{"key":"e_1_3_2_1_13_1","volume-title":"Proceedings of the 25th International Conference on Computational Linguistics (COLING)","author":"Delgado D.","year":"2014","unstructured":"Agust\u00edn\u00a0 D. Delgado , Raquel Mart\u00ednez-Unanue , V\u00edctor Fresno-Fern\u00e1ndez , and Soto Montalvo . 2014 . A Data Driven Approach for Person Name Disambiguation in Web Search Results . In Proceedings of the 25th International Conference on Computational Linguistics (COLING) . Dublin, Ireland, 301\u2013310. Agust\u00edn\u00a0D. Delgado, Raquel Mart\u00ednez-Unanue, V\u00edctor Fresno-Fern\u00e1ndez, and Soto Montalvo. 2014. A Data Driven Approach for Person Name Disambiguation in Web Search Results. In Proceedings of the 25th International Conference on Computational Linguistics (COLING). Dublin, Ireland, 301\u2013310."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/1513876.1513879"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177700394"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220089"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/775152.775246"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1496909.1496926"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/2750482.2750507"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3412715"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183614.1183683"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526761"},{"key":"e_1_3_2_1_23_1","first-page":"25","article-title":"Weighted Set-Based String Similarity","volume":"33","author":"Hadjieleftheriou Marios","year":"2010","unstructured":"Marios Hadjieleftheriou and Divesh Srivastava . 2010 . Weighted Set-Based String Similarity . IEEE Data Eng. Bull. 33 , 1 (2010), 25 \u2013 36 . Marios Hadjieleftheriou and Divesh Srivastava. 2010. Weighted Set-Based String Similarity. IEEE Data Eng. Bull. 33, 1 (2010), 25\u201336.","journal-title":"IEEE Data Eng. Bull."},{"key":"e_1_3_2_1_25_1","volume-title":"Proceedings of the Third International Workshop on the Web and Databases (WebDB)","author":"Haveliwala H.","year":"2000","unstructured":"Taher\u00a0 H. Haveliwala , Aristides Gionis , and Piotr Indyk . 2000 . Scalable Techniques for Clustering the Web . In Proceedings of the Third International Workshop on the Web and Databases (WebDB) . Dallas, TX, 129\u2013134. Taher\u00a0H. Haveliwala, Aristides Gionis, and Piotr Indyk. 2000. Scalable Techniques for Clustering the Web. In Proceedings of the Third International Workshop on the Web and Databases (WebDB). Dallas, TX, 129\u2013134."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276876"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2010.80"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/1341531.1341560"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/SFFCS.1999.814572"},{"volume-title":"Proceedings of the 2020 International Conference on Management of Data (SIGMOD). Online conference [Portland, OR, USA], 2589\u20132599","author":"Lei Yifan","key":"e_1_3_2_1_30_1","unstructured":"Yifan Lei , Qiang Huang , Mohan\u00a0 S. Kankanhalli , and Anthony K . \u00a0H. Tung. 2020. Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring . In Proceedings of the 2020 International Conference on Management of Data (SIGMOD). Online conference [Portland, OR, USA], 2589\u20132599 . Yifan Lei, Qiang Huang, Mohan\u00a0S. Kankanhalli, and Anthony K.\u00a0H. Tung. 2020. Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring. In Proceedings of the 2020 International Conference on Management of Data (SIGMOD). Online conference [Portland, OR, USA], 2589\u20132599."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783406"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098081"},{"key":"e_1_3_2_1_33_1","unstructured":"Ping Li. 2018. Several Tunable GMM Kernels. arXiv preprint arXiv:1805.02830(2018).  Ping Li. 2018. Several Tunable GMM Kernels. arXiv preprint arXiv:1805.02830(2018)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.3115\/1220575.1220664"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772759"},{"volume-title":"Advances in Neural Information Processing Systems (NeurIPS).","author":"Li Ping","key":"e_1_3_2_1_36_1","unstructured":"Ping Li , Xiaoyun Li , and Cun-Hui Zhang . 2019. Re-randomized Densification for One Permutation Hashing and Bin-wise Consistent Weighted Sampling . In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada . Ping Li, Xiaoyun Li, and Cun-Hui Zhang. 2019. Re-randomized Densification for One Permutation Hashing and Bin-wise Consistent Weighted Sampling. In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052679"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i5.16543"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939783"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1498759.1498832"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526769"},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/SP.2015.49"},{"volume-title":"Advances in Neural Information Processing Systems (NeurIPS).","author":"Pouget-Abadie Jean","key":"e_1_3_2_1_44_1","unstructured":"Jean Pouget-Abadie , Kevin Aydin , Warren Schudy , Kay Brodersen , and Vahab\u00a0 S. Mirrokni . 2019. Variance Reduction in Bipartite Experiments through Correlation Clustering . In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada , 13288\u201313298. Jean Pouget-Abadie, Kevin Aydin, Warren Schudy, Kay Brodersen, and Vahab\u00a0S. Mirrokni. 2019. Variance Reduction in Bipartite Experiments through Correlation Clustering. In Advances in Neural Information Processing Systems (NeurIPS). Vancouver, Canada, 13288\u201313298."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098111"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623740"},{"key":"e_1_3_2_1_47_1","unstructured":"Anshumali Shrivastava. 2016. Simple and Efficient Weighted Minwise Hashing. In Advances in Neural Information Processing Systems (NIPS). Barcelona Spain 1498\u20131506.  Anshumali Shrivastava. 2016. Simple and Efficient Weighted Minwise Hashing. In Advances in Neural Information Processing Systems (NIPS). Barcelona Spain 1498\u20131506."},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33460-3_36"},{"key":"e_1_3_2_1_49_1","volume-title":"Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS)","author":"Shrivastava Anshumali","year":"2014","unstructured":"Anshumali Shrivastava and Ping Li . 2014 . In Defense of Minhash over Simhash . In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS) . Reykjavik, Iceland, 886\u2013894. Anshumali Shrivastava and Ping Li. 2014. In Defense of Minhash over Simhash. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS). Reykjavik, Iceland, 886\u2013894."},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2736277.2741285"},{"key":"e_1_3_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58523-5_19"},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1240"},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1326561.1326564"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.180"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330951"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/3299869.3300065"}],"event":{"name":"WWW '21: The Web Conference 2021","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web"],"location":"Ljubljana Slovenia","acronym":"WWW '21"},"container-title":["Proceedings of the Web Conference 2021"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3442381.3449955","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3442381.3449955","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:32Z","timestamp":1750195472000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3442381.3449955"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,19]]},"references-count":54,"alternative-id":["10.1145\/3442381.3449955","10.1145\/3442381"],"URL":"https:\/\/doi.org\/10.1145\/3442381.3449955","relation":{},"subject":[],"published":{"date-parts":[[2021,4,19]]},"assertion":[{"value":"2021-06-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}