{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:28:54Z","timestamp":1760243334684,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2014,10,13]],"date-time":"2014-10-13T00:00:00Z","timestamp":1413158400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000930","name":"NSF","doi-asserted-by":"publisher","award":["CCF-1018984","CCF-1065632","CCF-1065494","CCF-1346564"],"award-info":[{"award-number":["CCF-1018984","CCF-1065632","CCF-1065494","CCF-1346564"]}],"id":[{"id":"10.13039\/501100000930","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007911","name":"University of California, San Diego","doi-asserted-by":"publisher","award":["CALIT2"],"award-info":[{"award-number":["CALIT2"]}],"id":[{"id":"10.13039\/100007911","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Exchangeable random partition processes are the basis for Bayesian approaches to statistical inference in large alphabet settings. On the other hand, the notion of the pattern of a sequence provides an information-theoretic framework for data compression in large alphabet scenarios. Because data compression and parameter estimation are intimately related, we study the redundancy of Bayes estimators coming from Poisson\u2013Dirichlet priors (or \u201cChinese restaurant processes\u201d) and the Pitman\u2013Yor prior. This provides an understanding of these estimators in the setting of unknown discrete alphabets from the perspective of universal compression. In particular, we identify relations between alphabet sizes and sample sizes where the redundancy is small, thereby characterizing useful regimes for these estimators.<\/jats:p>","DOI":"10.3390\/e16105339","type":"journal-article","created":{"date-parts":[[2014,10,14]],"date-time":"2014-10-14T02:13:08Z","timestamp":1413252788000},"page":"5339-5357","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Redundancy of Exchangeable Estimators"],"prefix":"10.3390","volume":"16","author":[{"given":"Narayana","family":"Santhanam","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering, University of Hawaii at Manoa, 2540 Dole Street, Honolulu, HI 96822, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anand","family":"Sarwate","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Rutgers, The State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854 , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jae","family":"Woo","sequence":"additional","affiliation":[{"name":"Applied Mathematics Program, Yale University, 51 Prospect St, New Haven, CT 06511, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2014,10,13]]},"reference":[{"key":"ref_1","unstructured":"Laplace, P.S. (1995). Philosphical Essay on Probabilities: Translated From the Fifth French Edition of 1825, Springer Verlag."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"De Morgan, A. (1838). An Essay on Probabilities, and on Their Application to Life Contingencies and Insurance Offices, Longman, Orme, Brown, Green & Longmans.","DOI":"10.5962\/bhl.title.29847"},{"key":"ref_3","unstructured":"Smedley, E., Rose, H.J., and Rose, H.J. (1845). Encyclop\u00e6dia Metropolitana, B. Fellowes et. al."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1093\/biomet\/40.3-4.237","article-title":"The population frequencies of species and the estimation of population parameters","volume":"40","author":"Good","year":"1953","journal-title":"Biometrika"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1214\/aos\/1176342372","article-title":"Ferguson distributions via P\u00f3lya urn schemes","volume":"1","author":"Blackwell","year":"1973","journal-title":"Ann. Stat"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1975.tb01024.x","article-title":"Random Discrete Distributions","volume":"37","author":"Kingman","year":"1975","journal-title":"J. R. Stat. Soc. B"},{"key":"ref_7","first-page":"1","article-title":"Random Partitions in Population Genetics","volume":"361","author":"Kingman","year":"1978","journal-title":"Proc. R. Soc"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1112\/jlms\/s2-18.2.374","article-title":"The Representation of Partition Structures","volume":"2","author":"Kingman","year":"1978","journal-title":"J. London Math. Soc"},{"key":"ref_9","first-page":"233","article-title":"De Finetti\u2019s Generalizations of Exchangeability","volume":"2","author":"Jeffrey","year":"1980","journal-title":"Studies in Inductive Logic and Probability"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Aldous, D.J. (1985). \u00c9cole d\u2019\u00c9t\u00e9 de Probabilit\u00e9s de Saint-Flour, XIII\u20141983, Springer.","DOI":"10.1007\/BFb0099420"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1007\/BF00485351","article-title":"Predicting the unpredictable","volume":"90","author":"Zabell","year":"1992","journal-title":"Synthese"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1007\/BF01213386","article-title":"Exchangeable and partially exchangeable random partitions","volume":"102","author":"Pitman","year":"1995","journal-title":"Probab. Theory Relat. Fields"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1214\/aop\/1024404422","article-title":"The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator","volume":"25","author":"Pitman","year":"1997","journal-title":"Ann. Probab"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1109\/18.54897","article-title":"Information theoretic asymptotics of Bayes methods","volume":"36","author":"Clarke","year":"1990","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/0378-3758(94)90153-8","article-title":"Jeffreys\u2019 prior is asymptotically least favorable under entropy risk","volume":"41","author":"Clarke","year":"1994","journal-title":"J. Stat. Plan. Inference"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1109\/TIT.2004.830761","article-title":"Universal compression of memoryless sources over unknown alphabets","volume":"50","author":"Orlitsky","year":"2004","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"427","DOI":"10.1126\/science.1088284","article-title":"Always Good Turing: Asymptotically optimal probability estimation","volume":"302","author":"Orlitsky","year":"2003","journal-title":"Science"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Acharya, J., Das, H., Jafarpour, A., Orlitsky, A., and Suresh, A. (2013, January 7\u201312). Tight Bounds for Universal Compression of Large Alphabets. Istanbul, Turkey.","DOI":"10.1109\/ISIT.2013.6620751"},{"key":"ref_19","unstructured":"Acharya, J., Das, H., and Orlitsky, A. (2012, January 3\u20136). Tight Bounds on Profile Redundancy and Distinguishability. Lake Tahoe, NV, USA."},{"key":"ref_20","unstructured":"Acharya, J., Jafarpour, A., Orlitsky, A., and Suresh, A. (2013, January 12\u201314). Optimal Probability Estimation with Applications to Prediction and Classification. Princeton NJ, USA."},{"key":"ref_21","unstructured":"Gr\u00fcnwald, P., Myllym\u00e4ki, P., Tabus, I., Weinberger, M., and Yu, B. (2008). Festschrift in Honor of Jorma Rissanen on the Occasion of His 75th Birthday, Tampere International Center for Signal Processing."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"3207","DOI":"10.1109\/TIT.2011.2137210","article-title":"Probability Estimation in the Rare-Events Regime","volume":"57","author":"Wagner","year":"2011","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_23","first-page":"229","article-title":"Good, Jelinek, Mercer, and Robins on Turing\u2019s estimate of probabilities","volume":"11","author":"Nadas","year":"1991","journal-title":"Am. J. Math. Manag. Sci"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Oostdijk, N., and de Haan, P. (1994). Corpus Based Research into Language, Rodopi.","DOI":"10.1163\/9789004653566"},{"key":"ref_25","unstructured":"Cesa-Bianchi, N., and Goldman, SA. (1, January 28). On the convergence rate of Good Turing estimators. Palo Alto, CA, USA."},{"key":"ref_26","first-page":"1231","article-title":"Concentration Bounds on Unigrams Language Models","volume":"6","author":"Drukh","year":"2005","journal-title":"J. Mach. Learn. Res"},{"key":"ref_27","unstructured":"Meek, C., and Halpern, J. (2004). On modeling profiles instead of values, AUAI Press."},{"key":"ref_28","unstructured":"Gallager, R. (1976). Technical Report LIDS-P-937, M.I.T."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1109\/TIT.1980.1056167","article-title":"A source matching approach to finding minimax codes","volume":"26","author":"Davisson","year":"1980","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_30","first-page":"134","article-title":"Coding of a source with unknown but ordered probabilities","volume":"15","author":"Ryabko","year":"1979","journal-title":"Probl. Inf. Transm"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Kingman, J. (1980). The Mathematics of Genetic Diversity, SIAM.","DOI":"10.1137\/1.9781611970357"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Earman, J., and Norton, JD. (1997). The Cosmos of Science: Essays of Exploration, The University of Pittsburgh Press.","DOI":"10.2307\/j.ctt5vkh2v"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zabell, S. (2005). Symmetry and Its Discontents: Essays on the History of Inductive Probability, Cambridge University Press.","DOI":"10.1017\/CBO9780511614293"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"525","DOI":"10.2307\/1428070","article-title":"Random discrete distributions invariant under size-biased permutation","volume":"28","author":"Pitman","year":"1996","journal-title":"Adv. Appl. Probab"},{"key":"ref_35","unstructured":"Pitman, J. (2006). \u00c9cole d\u2019\u00c9t\u00e9 de Probabilit\u00e9s de Saint-Flour, XXXII\u20142002, Springer."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1214\/aos\/1176342360","article-title":"A Bayesian analysis of some nonparametric problems","volume":"1","author":"Ferguson","year":"1973","journal-title":"Ann. Stat"},{"key":"ref_37","unstructured":"Ramamoorthi, R., and Srikanth, K. (2007). Encyclopedia of Statistical Sciences, John Wiley and Sons."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1016\/0040-5809(72)90035-4","article-title":"The Sampling Theory of Selectively Neutral Alleles","volume":"3","author":"Ewens","year":"1972","journal-title":"Theor. Popul. Biol"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1016\/0040-5809(72)90036-6","article-title":"Addendum to a Paper of W. Ewens","volume":"3","author":"Karlin","year":"1972","journal-title":"Theor. Popul. Biol"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"463","DOI":"10.2307\/1426228","article-title":"The Sampling Theory of Selectively Neutral Alleles","volume":"6","author":"Watterson","year":"1974","journal-title":"Adv. Appl. Probab"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Santhanam, N., and Madiman, M. (2010, January 13\u201318). Patterns and exchangeability. Austin, TX, USA.","DOI":"10.1109\/ISIT.2010.5513581"},{"key":"ref_42","unstructured":"Carlton, M.A. (1999). Applications of the Two-Parameter Poisson-Dirichlet Distribution. [Ph.D. Thesis, University of California, Los Angeles]."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"2954","DOI":"10.1109\/TIT.2006.876351","article-title":"Limit results on pattern entropy","volume":"52","author":"Orlitsky","year":"2006","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_44","unstructured":"Santhanam, N., Anantharam, V., Kavcic, A., and Szpankowski, W. (July, January 29). Data driven weak universal redundancy. Honolulu, HI, USA."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/16\/10\/5339\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T21:16:52Z","timestamp":1760217412000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/16\/10\/5339"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,10,13]]},"references-count":44,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2014,10]]}},"alternative-id":["e16105339"],"URL":"https:\/\/doi.org\/10.3390\/e16105339","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2014,10,13]]}}}