{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T03:24:13Z","timestamp":1740108253902,"version":"3.37.3"},"reference-count":19,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,10,9]],"date-time":"2021-10-09T00:00:00Z","timestamp":1633737600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,10,9]],"date-time":"2021-10-09T00:00:00Z","timestamp":1633737600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Comput Stat"],"published-print":{"date-parts":[[2022,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this paper, we consider several efficient data structures for the problem of sampling from a dynamically changing discrete probability distribution, where some prior information is known on the distribution of the rates, in particular the maximum and minimum rate, and where the number of possible outcomes N is large. We consider three basic data structures, the Acceptance\u2013Rejection method, the Complete Binary Tree and the Alias method. These can be used as building blocks in a multi-level data structure, where at each of the levels, one of the basic data structures can be used, with the top level selecting a group of events, and the bottom level selecting an element from a group. Depending on assumptions on the distribution of the rates of outcomes, different combinations of the basic structures can be used. We prove that for particular data structures the expected time of sampling and update is constant when the rate distribution follows certain conditions. We show that for any distribution, combining a tree structure with the Acceptance\u2013Rejection method, we have an expected time of sampling and update of<jats:inline-formula><jats:alternatives><jats:tex-math>$$O\\left( \\log \\log {r_{max}}\/{r_{min}}\\right) $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mrow><mml:mi>O<\/mml:mi><mml:mfenced><mml:mo>log<\/mml:mo><mml:mo>log<\/mml:mo><mml:msub><mml:mi>r<\/mml:mi><mml:mrow><mml:mi>max<\/mml:mi><\/mml:mrow><\/mml:msub><mml:mo>\/<\/mml:mo><mml:msub><mml:mi>r<\/mml:mi><mml:mrow><mml:mi>min<\/mml:mi><\/mml:mrow><\/mml:msub><\/mml:mfenced><\/mml:mrow><\/mml:math><\/jats:alternatives><\/jats:inline-formula>is possible, where<jats:inline-formula><jats:alternatives><jats:tex-math>$$r_{max}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>r<\/mml:mi><mml:mrow><mml:mi>max<\/mml:mi><\/mml:mrow><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>is the maximum rate and<jats:inline-formula><jats:alternatives><jats:tex-math>$$r_{min}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>r<\/mml:mi><mml:mrow><mml:mi>min<\/mml:mi><\/mml:mrow><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>the minimum rate. We also discuss an implementation of a Two Levels Acceptance\u2013Rejection data structure, that allows expected constant time for sampling, and amortized constant time for updates, assuming that<jats:inline-formula><jats:alternatives><jats:tex-math>$$r_{max}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>r<\/mml:mi><mml:mrow><mml:mi>max<\/mml:mi><\/mml:mrow><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>and<jats:inline-formula><jats:alternatives><jats:tex-math>$$r_{min}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:msub><mml:mi>r<\/mml:mi><mml:mrow><mml:mi>min<\/mml:mi><\/mml:mrow><\/mml:msub><\/mml:math><\/jats:alternatives><\/jats:inline-formula>are known and the number of events is sufficiently large. We also present an experimental verification, highlighting the limits given by the constraints of a real-life setting.<\/jats:p>","DOI":"10.1007\/s00180-021-01159-3","type":"journal-article","created":{"date-parts":[[2021,10,10]],"date-time":"2021-10-10T20:09:44Z","timestamp":1633896584000},"page":"1203-1228","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Dynamic sampling from a discrete probability distribution with a known distribution of rates"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2194-513X","authenticated-orcid":false,"given":"Federico","family":"D\u2019Ambrosio","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hans L.","family":"Bodlaender","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gerard T.","family":"Barkema","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,10,9]]},"reference":[{"key":"1159_CR1","unstructured":"Black PE (2019) Binary heap. In: Dictionary of algorithms and data structures. NIST"},{"issue":"2","key":"1159_CR2","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1016\/0040-6090(95)06946-1","volume":"272","author":"M Breeman","year":"1996","unstructured":"Breeman M, Barkema GT, Langelaar M, Boerma D (1996) Computer simulation of metal-on-metal epitaxy. Thin Solid Films 272(2):195\u2013207","journal-title":"Thin Solid Films"},{"key":"1159_CR3","unstructured":"Cormen T, Rivest R, Leiserson C (2001) Introduction to algorithms, 2nd edn. The MIT Press"},{"key":"1159_CR4","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1016\/B978-0-12-372512-7.00003-1","volume-title":"Heuristic search, ch 3","author":"S Edelkamp","year":"2012","unstructured":"Edelkamp S, Schr\u00f6dl S (2012) Dictionary data structures. In: Edelkamp S, Schr\u00f6dl S (eds) Heuristic search, ch 3. Morgan Kaufmann, San Francisco, pp 89\u2013159"},{"issue":"25","key":"1159_CR5","doi-asserted-by":"publisher","first-page":"2340","DOI":"10.1021\/j100540a008","volume":"81","author":"DT Gillespie","year":"1977","unstructured":"Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81(25):2340\u20132361","journal-title":"J Phys Chem"},{"key":"1159_CR6","series-title":"LNCS","first-page":"253","volume-title":"Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics)","author":"T Hagerup","year":"1993","unstructured":"Hagerup T, Mehlhorn K, Munro JI (1993) Maintaining discrete probability distributions optimally. Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics), vol 700. LNCS. Springer, Berlin, pp 253\u2013264"},{"key":"1159_CR7","unstructured":"Loeve M (1977) Centering at medians and symmetrization. In: Probability theory I. Springer, p\u00a0256"},{"issue":"1","key":"1159_CR8","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1145\/366193.366228","volume":"6","author":"G Marsaglia","year":"1963","unstructured":"Marsaglia G (1963) Generating discrete random variables in a computer. Commun ACM 6(1):37\u201338","journal-title":"Commun ACM"},{"issue":"4","key":"1159_CR9","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1007\/s00224-003-1078-6","volume":"36","author":"Y Matias","year":"2003","unstructured":"Matias Y, Vitter JS, Ni WC (2003) Dynamic generation of discrete random variates. Theory Comput Syst 36(4):329\u2013357","journal-title":"Theory Comput Syst"},{"issue":"9","key":"1159_CR10","first-page":"273","volume":"49","author":"PM Maurer","year":"2017","unstructured":"Maurer PM (2017) Finite random variates using differential search trees. Simul Ser 49(9):273\u2013284","journal-title":"Simul Ser"},{"key":"1159_CR11","doi-asserted-by":"crossref","unstructured":"Newman MEJ, Barkema GT (1999) Monte Carlo simulations in surface science. In: Monte Carlo methods in statistical physics, ch.\u00a011. Oxford University Press","DOI":"10.1093\/oso\/9780198517962.001.0001"},{"issue":"1","key":"1159_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/151527.151529","volume":"3","author":"S Rajasekaran","year":"1993","unstructured":"Rajasekaran S, Ross KW (1993) Fast algorithms for generating discrete random variates with changing distributions. ACM Trans Model Comput Simul 3(1):1\u201319","journal-title":"ACM Trans Model Comput Simul"},{"issue":"20","key":"1159_CR13","doi-asserted-by":"publisher","first-page":"205101","DOI":"10.1063\/1.2919546","volume":"128","author":"A Slepoy","year":"2008","unstructured":"Slepoy A, Thompson AP, Plimpton SJ (2008) A constant-time kinetic Monte Carlo algorithm for simulation of large biochemical reaction networks. J Chem Phys 128(20):205101","journal-title":"J Chem Phys"},{"key":"1159_CR14","unstructured":"van\u00a0de Klundert B (2019) Efficient generation of discrete random variates. Master\u2019s thesis, Utrecht University"},{"issue":"10","key":"1159_CR15","doi-asserted-by":"publisher","first-page":"6819","DOI":"10.1103\/PhysRevB.34.6819","volume":"34","author":"AF Voter","year":"1986","unstructured":"Voter AF (1986) Classically exact overlayer dynamics: diffusion of rhodium clusters on Rh(100). Phys Rev B 34(10):6819\u20136829","journal-title":"Phys Rev B"},{"issue":"8","key":"1159_CR16","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1049\/el:19740097","volume":"10","author":"AJ Walker","year":"1974","unstructured":"Walker AJ (1974) New fast method for generating discrete random numbers with arbitrary frequency distribution. Electron Lett 10(8):8\u20139","journal-title":"Electron Lett"},{"issue":"3","key":"1159_CR17","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1145\/355744.355749","volume":"3","author":"AJ Walker","year":"1977","unstructured":"Walker AJ (1977) An efficient method for generating discrete random variables with general distributions. ACM Trans Math Softw 3(3):253\u2013256","journal-title":"ACM Trans Math Softw"},{"issue":"3","key":"1159_CR18","doi-asserted-by":"publisher","first-page":"419","DOI":"10.1080\/00401706.1962.10490022","volume":"4","author":"B Welford","year":"1962","unstructured":"Welford B (1962) Note on a method for calculating corrected sums of squares and products. Technometrics 4(3):419\u2013420","journal-title":"Technometrics"},{"issue":"1","key":"1159_CR19","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1137\/0209009","volume":"9","author":"CK Wong","year":"1980","unstructured":"Wong CK, Easton MC (1980) An efficient method for weighted sampling without replacement. SIAM J Comput 9(1):111\u2013113","journal-title":"SIAM J Comput"}],"container-title":["Computational Statistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-021-01159-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00180-021-01159-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-021-01159-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,10]],"date-time":"2023-11-10T19:57:26Z","timestamp":1699646246000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00180-021-01159-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,9]]},"references-count":19,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,7]]}},"alternative-id":["1159"],"URL":"https:\/\/doi.org\/10.1007\/s00180-021-01159-3","relation":{},"ISSN":["0943-4062","1613-9658"],"issn-type":[{"type":"print","value":"0943-4062"},{"type":"electronic","value":"1613-9658"}],"subject":[],"published":{"date-parts":[[2021,10,9]]},"assertion":[{"value":"14 July 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 September 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 October 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}