{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,21]],"date-time":"2025-12-21T01:36:54Z","timestamp":1766281014854,"version":"3.41.0"},"reference-count":74,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2019,2,13]],"date-time":"2019-02-13T00:00:00Z","timestamp":1550016000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Royal Society Wolfson Research Merit Award"},{"name":"Alan Turing Institute under EPSRC","award":["EP\/N510129\/1"],"award-info":[{"award-number":["EP\/N510129\/1"]}]},{"name":"European Research Council","award":["ERC-2014-CoG 647557"],"award-info":[{"award-number":["ERC-2014-CoG 647557"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2020,1,31]]},"abstract":"<jats:p>\n            The notion of\n            <jats:italic>L<\/jats:italic>\n            <jats:sub>\n              <jats:italic>p<\/jats:italic>\n            <\/jats:sub>\n            sampling, and corresponding algorithms known as\n            <jats:italic>L<\/jats:italic>\n            <jats:sub>\n              <jats:italic>p<\/jats:italic>\n            <\/jats:sub>\n            samplers, has found a wide range of applications in the design of data stream algorithms and beyond. In this survey, we present some of the core algorithms to achieve this sampling distribution based on ideas from hashing, sampling, and sketching. We give results for the special cases of insertion-only inputs, lower bounds for the sampling problems, and ways to efficiently sample multiple elements. We describe a range of applications of\n            <jats:italic>L<\/jats:italic>\n            <jats:sub>\n              <jats:italic>p<\/jats:italic>\n            <\/jats:sub>\n            sampling, drawing on problems across the domain of computer science, from matrix and graph computations, as well as to geometric and vector streaming problems.\n          <\/jats:p>","DOI":"10.1145\/3297715","type":"journal-article","created":{"date-parts":[[2019,2,14]],"date-time":"2019-02-14T19:36:17Z","timestamp":1550172977000},"page":"1-31","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["<i>L<\/i>\n            <sub>\n              <i>p<\/i>\n            <\/sub>\n            Samplers and Their Applications"],"prefix":"10.1145","volume":"52","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0698-0922","authenticated-orcid":false,"given":"Graham","family":"Cormode","sequence":"first","affiliation":[{"name":"University of Warwick, Coventry, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hossein","family":"Jowhari","sequence":"additional","affiliation":[{"name":"K. N. Toosi University of Technology, Tehran, Iran"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,2,13]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601438"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/2095116.2095156"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2007.914344"},{"volume-title":"Advances in Neural Information Processing Systems 28: Proceedings of the Annual Conference on Neural Information Processing Systems. 775--783","author":"Alaoui Ahmed El","key":"e_1_2_1_4_1","unstructured":"Ahmed El Alaoui and Michael W. Mahoney . 2015. Fast randomized kernel ridge regression with statistical guarantees . In Advances in Neural Information Processing Systems 28: Proceedings of the Annual Conference on Neural Information Processing Systems. 775--783 . Ahmed El Alaoui and Michael W. Mahoney. 2015. Fast randomized kernel ridge regression with statistical guarantees. In Advances in Neural Information Processing Systems 28: Proceedings of the Annual Conference on Neural Information Processing Systems. 775--783."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.1997.1545"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7953381"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2011.82"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2847259"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40164-0_8"},{"key":"e_1_2_1_10_1","unstructured":"Avrim Blum John Hopcroft and Ravindran Kannan. 2018. Foundations of Data Science. Retrieved from http:\/\/www.cs.cornell.edu\/jeh\/book.pdf.  Avrim Blum John Hopcroft and Ravindran Kannan. 2018. Foundations of Data Science. Retrieved from http:\/\/www.cs.cornell.edu\/jeh\/book.pdf."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3034786.3034798"},{"volume-title":"Proceedings of the 34th International Conference on Machine Learning (ICML\u201917)","author":"Braverman Vladimir","key":"e_1_2_1_12_1","unstructured":"Vladimir Braverman , Gereon Frahling , Harry Lang , Christian Sohler , and Lin F. Yang . 2017. Clustering high dimensional dynamic data streams . In Proceedings of the 34th International Conference on Machine Learning (ICML\u201917) . 576--585. Retrieved from http:\/\/proceedings.mlr.press\/v70\/braverman17a.html. Vladimir Braverman, Gereon Frahling, Harry Lang, Christian Sohler, and Lin F. Yang. 2017. Clustering high dimensional dynamic data streams. In Proceedings of the 34th International Conference on Machine Learning (ICML\u201917). 576--585. Retrieved from http:\/\/proceedings.mlr.press\/v70\/braverman17a.html."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the Conference on Compression and Complexity of Sequences (SEQUENCES\u201997)","author":"Broder A.","year":"1997","unstructured":"A. Broder . 1997 . On the resemblance and containment of documents . In Proceedings of the Conference on Compression and Complexity of Sequences (SEQUENCES\u201997) . 21--29. A. Broder. 1997. On the resemblance and containment of documents. In Proceedings of the Conference on Compression and Complexity of Sequences (SEQUENCES\u201997). 21--29."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/276698.276781"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/SFCS.2005.5464411"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-0000(79)90044-8"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0304-3975(03)00400-6"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1137\/S0097539702403244"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1265530.1265566"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/1254882.1254926"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.5555\/3039686.3039801"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms. 151--156","author":"Coppersmith Don","year":"2004","unstructured":"Don Coppersmith and Ravi Kumar . 2004 . An improved data stream algorithm for frequency moments . In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms. 151--156 . Retrieved from http:\/\/dl.acm.org\/citation.cfm?id&equals;982792.982815. Don Coppersmith and Ravi Kumar. 2004. An improved data stream algorithm for frequency moments. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms. 151--156. Retrieved from http:\/\/dl.acm.org\/citation.cfm?id&equals;982792.982815."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10619-013-7131-9"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jalgor.2003.12.001"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/1083592.1083599"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1137\/070696507"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/2503308.2503352"},{"volume-title":"Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 1127--1136","author":"Drineas Petros","key":"e_1_2_1_28_1","unstructured":"Petros Drineas , Michael W. Mahoney , and S. Muthukrishnan . 2006. Sampling algorithms for l<sub>2<\/sub> regression and applications . In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 1127--1136 . Petros Drineas, Michael W. Mahoney, and S. Muthukrishnan. 2006. Sampling algorithms for l<sub>2<\/sub> regression and applications. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. 1127--1136."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2254756.2254800"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2005.846400"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1314690.1314696"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/1138831.1711169"},{"volume-title":"Proceedings of the 10th International Workshop on Algorithms and Data Structures (WADS\u201907)","author":"Eppstein David","key":"e_1_2_1_33_1","unstructured":"David Eppstein and Michael T. Goodrich . 2007. Space-efficient straggler identification in round-trip data streams via Newton\u2019s identities and invertible bloom filters . In Proceedings of the 10th International Workshop on Algorithms and Data Structures (WADS\u201907) . 637--648. David Eppstein and Michael T. Goodrich. 2007. Space-efficient straggler identification in round-trip data streams via Newton\u2019s identities and invertible bloom filters. In Proceedings of the 10th International Workshop on Algorithms and Data Structures (WADS\u201907). 637--648."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/633025.633056"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1962.10480667"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/1064092.1064116"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/1060590.1060622"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/1039488.1039494"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2007.02.031"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1142351.1142392"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276334"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1002\/sim.3613"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02187876"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1006\/jagm.2000.1131"},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the Symposium on Foundations of Computer Science (FOCS\u201918)","author":"Jayaram Rajesh","year":"1808","unstructured":"Rajesh Jayaram and David P. Woodruff . 2018. Perfect L<sub>P<\/sub> sampling in a data stream . In Proceedings of the Symposium on Foundations of Computer Science (FOCS\u201918) . Retrieved from http:\/\/arxiv.org\/abs\/ 1808 .05497. Rajesh Jayaram and David P. Woodruff. 2018. Perfect L<sub>P<\/sub> sampling in a data stream. In Proceedings of the Symposium on Foundations of Computer Science (FOCS\u201918). Retrieved from http:\/\/arxiv.org\/abs\/1808.05497."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2009.82"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2700395"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1989284.1989289"},{"key":"e_1_2_1_50_1","volume-title":"Woodruff","author":"Kane Daniel M.","year":"2010","unstructured":"Daniel M. Kane , Jelani Nelson , and David P . Woodruff . 2010 . On the exact space complexity of sketching and streaming small norms. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA\u201910). 1161--1178. Daniel M. Kane, Jelani Nelson, and David P. Woodruff. 2010. On the exact space complexity of sketching and streaming small norms. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA\u201910). 1161--1178."},{"key":"e_1_2_1_51_1","doi-asserted-by":"crossref","unstructured":"Michael Kapralov Jelani Nelson Jakub Pachocki Zhengyu Wang David P. Woodruff and Mobin Yahyazadeh. 2017. Optimal lower bounds for universal relation and for samplers and finding duplicates in streams. CoRR abs\/1704.00633. Retrieved from http:\/\/arxiv.org\/abs\/1704.00633.  Michael Kapralov Jelani Nelson Jakub Pachocki Zhengyu Wang David P. Woodruff and Mobin Yahyazadeh. 2017. Optimal lower bounds for universal relation and for samplers and finding duplicates in streams. CoRR abs\/1704.00633. Retrieved from http:\/\/arxiv.org\/abs\/1704.00633.","DOI":"10.1109\/FOCS.2017.50"},{"volume-title":"Communication Complexity\u2014A New Approach to Circuit Depth","author":"Karchmer Mauricio","key":"e_1_2_1_52_1","unstructured":"Mauricio Karchmer . 1989. Communication Complexity\u2014A New Approach to Circuit Depth . MIT Press . Mauricio Karchmer. 1989. Communication Complexity\u2014A New Approach to Circuit Depth. MIT Press."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01206317"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/62212.62265"},{"key":"e_1_2_1_55_1","unstructured":"Donald E. Knuth. 1969. The Art of Computer Programming Volume 2: (2nd Ed.) Seminumerical Algorithms. Addison Wesley Longman Publishing Co. Inc.  Donald E. Knuth. 1969. The Art of Computer Programming Volume 2: (2nd Ed.) Seminumerical Algorithms. Addison Wesley Longman Publishing Co. Inc."},{"key":"e_1_2_1_56_1","volume-title":"Vol 3, Sorting and Searching","author":"Knuth D. E.","unstructured":"D. E. Knuth . 1998. The Art of Computer Programming , Vol 3, Sorting and Searching ( 2 nd ed.). Addison-Wesley . D. E. Knuth. 1998. The Art of Computer Programming, Vol 3, Sorting and Searching (2nd ed.). Addison-Wesley.","edition":"2"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/1150402.1150479"},{"volume-title":"The Theory of Error Correcting Codes. North-Holland Pub","author":"MacWilliams Florence Jessie","key":"e_1_2_1_58_1","unstructured":"Florence Jessie MacWilliams and N. J. A. Sloane . 1977. The Theory of Error Correcting Codes. North-Holland Pub . Co. New York, Amsterdam, New York. Retrieved from http:\/\/opac.inria.fr\/record&equals;b1084490 Includes index. Florence Jessie MacWilliams and N. J. A. Sloane. 1977. The Theory of Error Correcting Codes. North-Holland Pub. Co. New York, Amsterdam, New York. Retrieved from http:\/\/opac.inria.fr\/record&equals;b1084490 Includes index."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/2627692.2627694"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/2902251.2902283"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1145\/225058.225093"},{"volume-title":"Probability and Computing: Randomized Algorithms and Probabilistic Analysis","author":"Mitzenmacher Michael","key":"e_1_2_1_62_1","unstructured":"Michael Mitzenmacher and Eli Upfal . 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis . Cambridge University Press . Michael Mitzenmacher and Eli Upfal. 2005. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press."},{"key":"e_1_2_1_63_1","volume-title":"Woodruff","author":"Monemizadeh Morteza","year":"2010","unstructured":"Morteza Monemizadeh and David P . Woodruff . 2010 . 1-pass relative-error L-sampling with applications. In Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms. 1143--1160. Morteza Monemizadeh and David P. Woodruff. 2010. 1-pass relative-error L-sampling with applications. In Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms. 1143--1160."},{"key":"e_1_2_1_64_1","doi-asserted-by":"crossref","unstructured":"R. Motwani and P. Raghavan. 1995. Randomized Algorithms. Cambridge University Press.   R. Motwani and P. Raghavan. 1995. Randomized Algorithms. Cambridge University Press.","DOI":"10.1017\/CBO9780511814075"},{"key":"e_1_2_1_65_1","unstructured":"Jelani Nelson and Huacheng Yu. 2018. Optimal lower bounds for distributed and streaming spanning forest computation. CoRR abs\/1807.05135. Retrieved from http:\/\/arxiv.org\/abs\/1807.05135.  Jelani Nelson and Huacheng Yu. 2018. Optimal lower bounds for distributed and streaming spanning forest computation. CoRR abs\/1807.05135. Retrieved from http:\/\/arxiv.org\/abs\/1807.05135."},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.14778\/2556549.2556569"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/1879141.1879192"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.0081-1750.2004.00152.x"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1137\/S089548019223872X"},{"volume-title":"Proceedings of the Conference on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX\/RANDOM\u201915)","author":"Sun Xiaoming","key":"e_1_2_1_70_1","unstructured":"Xiaoming Sun and David P. Woodruff . 2015. Tight bounds for graph problems in insertion streams . In Proceedings of the Conference on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX\/RANDOM\u201915) . 435--448. Xiaoming Sun and David P. Woodruff. 2015. Tight bounds for graph problems in insertion streams. In Proceedings of the Conference on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX\/RANDOM\u201915). 435--448."},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.5555\/791230.792294"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/2488608.2488655"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1137\/1116025"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1145\/3147.3165"},{"volume-title":"Proceedings of the 32nd IEEE International Conference on Data Engineering. 847--858","author":"David","key":"e_1_2_1_75_1","unstructured":"David P. Woodruff and Peilin Zhong. 2016. Distributed low rank approximation of implicit functions of a matrix . In Proceedings of the 32nd IEEE International Conference on Data Engineering. 847--858 . David P. Woodruff and Peilin Zhong. 2016. Distributed low rank approximation of implicit functions of a matrix. In Proceedings of the 32nd IEEE International Conference on Data Engineering. 847--858."}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3297715","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3297715","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:54:10Z","timestamp":1750204450000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3297715"}},"subtitle":["A Survey"],"short-title":[],"issued":{"date-parts":[[2019,2,13]]},"references-count":74,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,1,31]]}},"alternative-id":["10.1145\/3297715"],"URL":"https:\/\/doi.org\/10.1145\/3297715","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"type":"print","value":"0360-0300"},{"type":"electronic","value":"1557-7341"}],"subject":[],"published":{"date-parts":[[2019,2,13]]},"assertion":[{"value":"2017-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-02-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}