{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,6]],"date-time":"2022-04-06T00:02:44Z","timestamp":1649203364027},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"POPL","license":[{"start":{"date-parts":[[2019,12,20]],"date-time":"2019-12-20T00:00:00Z","timestamp":1576800000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Program. Lang."],"published-print":{"date-parts":[[2020,1]]},"abstract":"\n This paper addresses a fundamental problem in random variate generation: given access to a random source that emits a stream of independent fair bits, what is the most accurate and entropy-efficient algorithm for sampling from a discrete probability distribution (\n p<\/jats:italic>\n 1<\/jats:sub>\n , \u2026,\n p<\/jats:italic>\n \n n<\/jats:italic>\n <\/jats:sub>\n ), where the probabilities of the output distribution (\n p\u0302<\/jats:italic>\n 1<\/jats:sub>\n , \u2026,\n p\u0302<\/jats:italic>\n \n n<\/jats:italic>\n <\/jats:sub>\n ) of the sampling algorithm must be specified using at most\n k<\/jats:italic>\n bits of precision? We present a theoretical framework for formulating this problem and provide new techniques for finding sampling algorithms that are optimal both statistically (in the sense of sampling accuracy) and information-theoretically (in the sense of entropy consumption). We leverage these results to build a system that, for a broad family of measures of statistical accuracy, delivers a sampling algorithm whose expected entropy usage is minimal among those that induce the same distribution (i.e., is \u201centropy-optimal\u201d) and whose output distribution (\n p\u0302<\/jats:italic>\n 1<\/jats:sub>\n , \u2026,\n p\u0302<\/jats:italic>\n \n n<\/jats:italic>\n <\/jats:sub>\n ) is a closest approximation to the target distribution (\n p<\/jats:italic>\n 1<\/jats:sub>\n , \u2026,\n p<\/jats:italic>\n \n n<\/jats:italic>\n <\/jats:sub>\n ) among all entropy-optimal sampling algorithms that operate within the specified\n k<\/jats:italic>\n -bit precision. This optimal approximate sampler is also a closer approximation than any (possibly entropy-suboptimal) sampler that consumes a bounded amount of entropy with the specified precision, a class which includes floating-point implementations of inversion sampling and related methods found in many software libraries. We evaluate the accuracy, entropy consumption, precision requirements, and wall-clock runtime of our optimal approximate sampling algorithms on a broad set of distributions, demonstrating the ways that they are superior to existing approximate samplers and establishing that they often consume significantly fewer resources than are needed by exact samplers.\n <\/jats:p>","DOI":"10.1145\/3371104","type":"journal-article","created":{"date-parts":[[2019,12,20]],"date-time":"2019-12-20T19:45:25Z","timestamp":1576871125000},"page":"1-31","source":"Crossref","is-referenced-by-count":1,"title":["Optimal approximate sampling from discrete probability distributions"],"prefix":"10.1145","volume":"4","author":[{"given":"Feras A.","family":"Saad","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]},{"given":"Cameron E.","family":"Freer","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]},{"given":"Martin C.","family":"Rinard","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]},{"given":"Vikash K.","family":"Mansinghka","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]}],"member":"320","reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.532895"},{"key":"e_1_2_2_2_1","first-page":"1","article-title":"A General Class of Coefficients of Divergence of One Distribution from Another","volume":"28","author":"Ali S. M.","year":"1966","journal-title":"J. R. Stat. Soc. B."},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2003.11.006"},{"key":"e_1_2_2_4_1","volume-title":"Topics in Current Physics","volume":"7","author":"Ed Kurt Binder","year":"1986"},{"key":"e_1_2_2_5_1","volume-title":"Efficient Generation \u03f5-close to G(n, p) and Generalizations. (April","author":"Blanca Antonio","year":"2012"},{"key":"e_1_2_2_6_1","volume-title":"Complexity and Real Computation","author":"Blum Lenore"},{"key":"e_1_2_2_7_1","first-page":"2","article-title":"Independent Unbiased Coin Flips from a Correlated Biased Source","volume":"6","author":"Blum Manuel","year":"1986","journal-title":"A Finite State Markov Chain. Combinatorica"},{"key":"e_1_2_2_8_1","series-title":"Lecture Notes in Computer Science","volume-title":"ICALP 2013: Proceedings of the 40th International Colloquium on Automata, Languages and Programming (Riga, Latvia)","author":"Bringmann Karl"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-016-0205-0"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2006.878151"},{"key":"e_1_2_2_11_1","volume-title":"Thomas","author":"Cover Thomas M.","year":"2006"},{"key":"e_1_2_2_12_1","volume-title":"Gen: A General-purpose Probabilistic Programming System with Programmable Inference. In PLDI 2019: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation","author":"Cusumano-Towner Marco F."},{"key":"e_1_2_2_13_1","volume-title":"Int. J. Reconf. Comput.","author":"de Schryver Christian","year":"2012"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1080\/00949658208810536"},{"key":"e_1_2_2_15_1","doi-asserted-by":"crossref","volume-title":"Non-Uniform Random Variate Generation","author":"Devroye Luc","DOI":"10.1007\/978-1-4613-8643-8"},{"key":"e_1_2_2_16_1","volume-title":"Sampling with Arbitrary Precision. (Feb","author":"Devroye Luc","year":"2015"},{"key":"e_1_2_2_17_1","volume-title":"A Divisive Information-Theoretic Feature Clustering Algorithm for Text Classification. J. Mach. Learn. Res. 3 (March","author":"Dhillon Inderjit S.","year":"2003"},{"key":"e_1_2_2_18_1","volume-title":"Retrieved","author":"Djuric Dragan","year":"2019"},{"key":"e_1_2_2_19_1","volume-title":"Towards Efficient Discrete Gaussian Sampling For Lattice-Based Cryptography. In FPL 2015: Proceedings of the 25th International Conference on Field Programmable Logic and Applications","author":"Du Chaohui","year":"2015"},{"key":"e_1_2_2_20_1","first-page":"3","article-title":"Sampling from Discrete Gaussians for Lattice-Based Cryptography On a Constrained","volume":"25","author":"Dwarakanath Nagarjun C.","year":"2014","journal-title":"Device. Appl. Algebr. Eng. Comm."},{"key":"e_1_2_2_21_1","first-page":"3","article-title":"The Efficient Construction of an Unbiased Random","volume":"43","author":"Elias Peter","year":"1972","journal-title":"Sequence. Ann. Math. Stat."},{"key":"e_1_2_2_22_1","first-page":"1","article-title":"Gaussian Sampling in Lattice Based Cryptography","volume":"60","author":"Foll\u00e1th J\u00e1nos","year":"2014","journal-title":"Tatra Mount. Math. Pub."},{"key":"e_1_2_2_23_1","volume-title":"GNU Scientific Library","author":"Galassi Mark"},{"key":"e_1_2_2_24_1","volume-title":"Monte Carlo Methods in Financial Engineering. Stochastic Modeling and Applied Probability","author":"Glasserman Paul"},{"key":"e_1_2_2_25_1","volume-title":"Probabilistic Programming. In FOSE 2014: Proceedings of the on Future of Software Engineering","author":"Gordon Andrew D."},{"key":"e_1_2_2_26_1","first-page":"2","article-title":"Interval Algorithm for Random Number Generation","volume":"43","author":"Han Te Sun","year":"1997","journal-title":"IEEE Trans. Inf. Theory"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.256486"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1287\/opre.6.3.307"},{"key":"e_1_2_2_30_1","volume-title":"Yao","author":"Knuth Donald E.","year":"1976"},{"key":"e_1_2_2_31_1","series-title":"Lecture Notes in Computer Science","volume-title":"Horizons of the Mind. A Tribute to Prakash Panangaden: Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday","author":"Kozen Dexter"},{"key":"e_1_2_2_32_1","series-title":"Lecture Notes in Computer Science","volume-title":"RAMiCS 2018: Proceedings of the 17th International Conference on Relational and Algebraic Methods in Computer Science (Groningen, The Netherlands)","author":"Kozen Dexter"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177729694"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cpc.2009.06.019"},{"key":"e_1_2_2_35_1","volume-title":"User\u2019s Guide to the GNU C++ Library","author":"Lea Dopug"},{"key":"e_1_2_2_36_1","unstructured":"Josef Leydold and Sougata Chaudhuri. 2014. rvgtest: Tools for Analyzing Non-Uniform Pseudo-Random Variate Generators. https:\/\/CRAN.R- project.org\/package=rvgtest R package version 0.7.4. Josef Leydold and Sougata Chaudhuri. 2014. rvgtest: Tools for Analyzing Non-Uniform Pseudo-Random Variate Generators. https:\/\/CRAN.R- project.org\/package=rvgtest R package version 0.7.4."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2006.881731"},{"key":"e_1_2_2_38_1","volume-title":"Monte Carlo Strategies in Scientific Computing","author":"Liu Jun S."},{"key":"e_1_2_2_39_1","volume-title":"Optimal Discrete Uniform Generation from Coin Flips, and Applications. (April","author":"Lumbroso J\u00e9rmie","year":"2013"},{"key":"e_1_2_2_40_1","volume-title":"Building Fast Bayesian Computing Machines Out of Intentionally Stochastic Digital Parts. (Feb","author":"Mansinghka Vikash","year":"2014"},{"key":"e_1_2_2_41_1","volume-title":"Statistics Toolbox User\u2019s Guide. The MathWorks","author":"MathWorks The"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1090\/S0025-5718-1985-0804945-X"},{"key":"e_1_2_2_43_1","volume-title":"Efficient Synthesis of Probabilistic Programs. In PLDI 2015: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation","author":"Nori Aditya V.","year":"2015"},{"key":"e_1_2_2_44_1","first-page":"11","article-title":"Randomizing Functions: Simulation of a Discrete Probability Distribution Using a Source of Unknown Distribution","volume":"52","author":"Loui Pae","year":"2006","journal-title":"IEEE Trans. Inf. Theory"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1080\/14786440009463897"},{"key":"e_1_2_2_46_1","first-page":"1","article-title":"Iterating von Neumann\u2019s Procedure for Extracting Random","volume":"20","author":"Peres Yuval","year":"1992","journal-title":"Bits. Ann. Stat."},{"key":"e_1_2_2_47_1","volume-title":"R: A Language and Environment for Statistical Computing","author":"Team R Core","year":"2014"},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISIT.1991.695225"},{"key":"e_1_2_2_49_1","volume-title":"High Precision Discrete Gaussian Sampling on FPGAs. In SAC 2013: Proceedings of the 20th International Conference on Selected Areas in Cryptography","volume":"8282","author":"Roy Sinha S.","year":"2013"},{"key":"e_1_2_2_50_1","volume-title":"Probabilistic Data Analysis with Probabilistic Programming. (Aug","author":"Saad Feras","year":"2016"},{"key":"e_1_2_2_51_1","volume-title":"Proc. ACM Program. Lang. 3, POPL, Article 37 (Jan.","author":"Saad Feras A.","year":"2019"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1948.tb01338.x"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2933575.2935313"},{"key":"e_1_2_2_56_1","first-page":"1","article-title":"Tree Algorithms for Unbiased Coin Tossing with a Biased","volume":"12","author":"Stout Quentin F.","year":"1984","journal-title":"Coin. Ann. Probab."},{"key":"e_1_2_2_57_1","first-page":"10","article-title":"Two Algorithms for Random Number Generation Implemented by Using Arithmetic of Limited Precision","volume":"86","author":"Uyematsu Tomohiko","year":"2003","journal-title":"IEICE Trans. Fund. Elec. Comm. Comp. Sci"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.412679"},{"key":"e_1_2_2_59_1","series-title":"National Bureau of Standards Applied Mathematics Series","volume-title":"Various Techniques Used in Connection with Random Digits","author":"von Neumann John"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/32.92917"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1049\/el:19740097"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1145\/355744.355749"}],"container-title":["Proceedings of the ACM on Programming Languages"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3371104","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,10,6]],"date-time":"2021-10-06T06:27:21Z","timestamp":1633501641000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3371104"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1]]},"references-count":59,"journal-issue":{"issue":"POPL","published-print":{"date-parts":[[2020,1]]}},"alternative-id":["10.1145\/3371104"],"URL":"http:\/\/dx.doi.org\/10.1145\/3371104","relation":{},"ISSN":["2475-1421"],"issn-type":[{"value":"2475-1421","type":"electronic"}],"subject":["Safety, Risk, Reliability and Quality","Software"],"published":{"date-parts":[[2020,1]]}}}