{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T17:37:12Z","timestamp":1778261832321,"version":"3.51.4"},"reference-count":61,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2008,6,17]],"date-time":"2008-06-17T00:00:00Z","timestamp":1213660800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: The plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator. METHODOLOGY: Three new entropy estimators are introduced; two new LZ-based estimators, and the \u201crenewal entropy estimator,\u201d which is tailored to data generated by a binary renewal process. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters in practice. THEORY: We prove that, unlike their earlier versions, the two new LZ-based estimators are universally consistent, that is, they converge to the entropy rate for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state hidden Markov model (HMM) with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator. SIMULATION: All estimators are applied to a wide range of data generated by numerous different processes with varying degrees of dependence and memory. The main conclusions drawn from these experiments include: (i) For all estimators considered, the main source of error is the bias. (ii) The CTW method is repeatedly and consistently seen to provide the most accurate results. (iii) The performance of the LZ-based estimators is often comparable to that of the plug-in method. 
(iv) The main drawback of the plug-in method is its computational inefficiency; with small word-lengths it fails to detect longer-range structure in the data, and with longer word-lengths the empirical distribution is severely undersampled, leading to large biases.<\/jats:p>","DOI":"10.3390\/entropy-e10020071","type":"journal-article","created":{"date-parts":[[2008,6,17]],"date-time":"2008-06-17T09:20:27Z","timestamp":1213694427000},"page":"71-99","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":90,"title":["Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study"],"prefix":"10.3390","volume":"10","author":[{"given":"Yun","family":"Gao","sequence":"first","affiliation":[{"name":"Knight Equity Markets, L.P., Jersey City, NJ 07310, USA"}]},{"given":"Ioannis","family":"Kontoyiannis","sequence":"additional","affiliation":[{"name":"Department of Informatics, Athens University of Economics and Business, Athens 10434, Greece"}]},{"given":"Elie","family":"Bienenstock","sequence":"additional","affiliation":[{"name":"Division of Applied Mathematics and Department of Neuroscience, Brown University, Providence, RI 02912, USA"}]}],"member":"1968","published-online":{"date-parts":[[2008,6,17]]},"reference":[{"key":"ref_1","unstructured":"Quastler, H. (1955). Information theory in psychology, Free Press."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1137\/1104033","article-title":"On a statistical estimate for the entropy of a sequence of independent random variables","volume":"4","author":"Basharin","year":"1959","journal-title":"Theor. Probability Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"669","DOI":"10.1109\/18.30993","article-title":"Estimating the information content of symbol sequences and efficient codes","volume":"35","author":"Grassberger","year":"1989","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1214\/aop\/1176989934","article-title":"Entropy and prefixes","volume":"20","author":"Shields","year":"1992","journal-title":"Ann. Probab."},{"key":"ref_5","unstructured":"Kelly, F.P. (1994). Proba-bility Statistics and Optimization, Wiley."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1162\/neco.1995.7.2.399","article-title":"The upward bias in measures of information derived from limited data samples","volume":"7","author":"Treves","year":"1995","journal-title":"Neural Comput."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1063\/1.166191","article-title":"Entropy estimation of symbol sequences","volume":"6","author":"Grassberger","year":"1996","journal-title":"Chaos"},{"key":"ref_8","unstructured":"Kontoyiannis, I. (The complexity and entropy of literary styles, 1996). The complexity and entropy of literary styles, [Available from pages.cs.aueb.gr\/users\/yiannisk\/]."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1319","DOI":"10.1109\/18.669425","article-title":"Nonparametric entropy estimation for stationary processes and random fields, with applications to English text","volume":"44","author":"Kontoyiannis","year":"1998","journal-title":"IEEE Trans. Inform. 
Theory"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1109\/18.761290","article-title":"Estimation of the information by an adaptive partitioning of the observation space","volume":"45","author":"Darbellay","year":"1999","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"2797","DOI":"10.1162\/089976600300014728","article-title":"Asymptotic Bias in Information Estimates and the Exponential (Bell) Polynomials","volume":"12","author":"Victor","year":"2000","journal-title":"Neural Comput."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1002\/rsa.10019","article-title":"Convergence properties of functional estimates for discrete distributions","volume":"19","author":"Antos","year":"2001","journal-title":"Random Structures & Algorithms"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1191","DOI":"10.1162\/089976603321780272","article-title":"Estimation of entropy and mutual information","volume":"15","author":"Paninski","year":"2003","journal-title":"Neural Comput."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1109\/TIT.2004.830771","article-title":"Universal entropy estimation via block sorting","volume":"50","author":"Cai","year":"2004","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_15","first-page":"31","article-title":"An estimate of an upper bound for the Entropy of English","volume":"18","author":"Brown","year":"1992","journal-title":"Computational Linguistics"},{"key":"ref_16","unstructured":"Chen, S., and Reif, J. (,  1993). Using difficulty of prediction to decrease computation: Fast sort, priority queue and convex hull on entropy bounded inputs. 34th Symposium on Foundations of Computer Science, Los Alamitos, California."},{"key":"ref_17","unstructured":"(,  1995). On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence. Proceedings of the 1995 Sympos. on Discrete Algorithms."},{"key":"ref_18","unstructured":"Stevens, C., and Zador, A. (NIPS, 1995). Information through a Spiking Neuron, NIPS."},{"key":"ref_19","unstructured":"Teahan, W., and Cleary, J. (,  1996). The entropy of English using PPM-based models. Proc. Data Compression Conf. \u2013 DCC 96, Los Alamitos, California."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1103\/PhysRevLett.80.197","article-title":"Entropy and Information in Neural Spike Trains","volume":"80","author":"Strong","year":"1998","journal-title":"Phys. Rev. 
Lett."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1048","DOI":"10.1121\/1.424990","article-title":"Information entropy of humpback whale song","volume":"105","author":"Suzuki","year":"1999","journal-title":"The Journal of the Acoustical Society of America"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1089\/cmb.1999.6.125","article-title":"Significantly Lower Entropy Estimates for Natural DNA Sequences","volume":"6","author":"Loewenstern","year":"1999","journal-title":"Journal of Computational Biology"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1142\/S0219622003000768","article-title":"Computing the entropy of user navigation in the web","volume":"2","author":"Levene","year":"2000","journal-title":"International Journal of Information Technology and Decision Making"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1016\/S0960-9822(00)00609-6","article-title":"Information theory in the brain","volume":"10","author":"Reinagel","year":"2000","journal-title":"Current Biology"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"332","DOI":"10.1038\/nn826","article-title":"The information efficacy of a synapse","volume":"5","author":"London","year":"2002","journal-title":"Nature Neurosci."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1113\/jphysiol.2003.053264","article-title":"Measuring spike coding in the rat supraoptic nucleus","volume":"555","author":"Bhumbra","year":"2004","journal-title":"The Journal of Physiology"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Nemenman, W., Bialek, W., and de Ruyter van Steveninck, R. (2004). Entropy and information in neural spike trains: Progress on the sampling problem. Physical Review E, 056111.","DOI":"10.1103\/PhysRevE.69.056111"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2336","DOI":"10.1152\/jn.1997.78.5.2336","article-title":"Decoding visual infomation from a population of retinal ganglion cells","volume":"78","author":"Warland","year":"1997","journal-title":"J. of Neurophysiology"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Kennel, M., and Mees, A. (2002). Context-tree modeling of observed symbolic dynamics. Physical Review E, 66.","DOI":"10.1103\/PhysRevE.66.056209"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1162\/089976604322860677","article-title":"Estimating the entropy rate of spike trains via Lempel-Ziv complexity","volume":"16","author":"Wajnryb","year":"2004","journal-title":"Neural Computation"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1683","DOI":"10.1162\/neco.2007.19.7.1683","article-title":"Estimating information rates with confidence intervals in neural spike trains","volume":"19","author":"Shlens","year":"2007","journal-title":"Neural Comput."},{"key":"ref_32","unstructured":"Gao, Y., Kontoyiannis, I., and Bienenstock, E. (,  2003). Lempel-Ziv and CTW entropy estimators for spike trains. Estimation of entropy Workshop, Neural Information Processing Systems Conference (NIPS), Vancouver, BC, Canada."},{"key":"ref_33","unstructured":"Gao, Y. (2004). Division of Applied Mathematics. [Ph.D. thesis, Brown University]."},{"key":"ref_34","unstructured":"Gao, Y., Kontoyiannis, I., and Bienenstock, E. (2006). IEEE Int. Symp. on Inform. Theory."},{"key":"ref_35","unstructured":"Rieke, F., Warland, D., de Ruyter van Steveninck, R., and Bialek, W. (1999). Spikes, MIT Press. 
{"key":"ref_36","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1109\/TIT.1977.1055714","article-title":"A universal algorithm for sequential data compression","volume":"23","author":"Ziv","year":"1977","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1109\/TIT.1978.1055934","article-title":"Compression of individual sequences by variable rate coding","volume":"24","author":"Ziv","year":"1978","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1109\/18.382012","article-title":"Context tree weighting: Basic properties","volume":"41","author":"Willems","year":"1995","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1514","DOI":"10.1109\/18.532891","article-title":"Context weighting for general finite-context sources","volume":"42","author":"Willems","year":"1996","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"792","DOI":"10.1109\/18.661523","article-title":"The context-tree weighting method: Extensions","volume":"44","author":"Willems","year":"1998","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_41","unstructured":"Cover, T., and Thomas, J. (1991). Elements of Information Theory, J. Wiley."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Shields, P. (1996). The ergodic theory of discrete sample paths, American Mathematical Society.","DOI":"10.1090\/gsm\/013"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"2200","DOI":"10.1109\/TIT.2004.833360","article-title":"Estimating entropy on m bins given fewer than m samples","volume":"50","author":"Paninski","year":"2004","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1250","DOI":"10.1109\/18.45281","article-title":"Some asymptotic properties of the entropy of a stationary ergodic data source with applications to data compression","volume":"35","author":"Wyner","year":"1989","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1109\/18.179344","article-title":"Entropy and data compression schemes","volume":"39","author":"Ornstein","year":"1993","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1214\/aop\/1176993000","article-title":"Asymptotical growth of a class of random trees","volume":"13","author":"Pittel","year":"1985","journal-title":"Ann. Probab."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1647","DOI":"10.1109\/18.259648","article-title":"Asymptotic properties of data compression and suffix trees","volume":"39","author":"Szpankowski","year":"1993","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1109\/18.382018","article-title":"Improved redundancy of a version of the Lempel-Ziv algorithm","volume":"35","author":"Wyner","year":"1995","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"1176","DOI":"10.1137\/0222070","article-title":"A generalized suffix tree and its (un)expected asymptotic behaviors","volume":"22","author":"Szpankowski","year":"1993","journal-title":"SIAM J. Comput."},
{"key":"ref_50","doi-asserted-by":"crossref","first-page":"2045","DOI":"10.1109\/18.720530","article-title":"On the role of pattern matching in information theory. (Information theory: 1948\u20131998)","volume":"44","author":"Wyner","year":"1998","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"1303","DOI":"10.1080\/01621459.1994.10476870","article-title":"The stationary bootstrap","volume":"89","author":"Politis","year":"1994","journal-title":"J. Amer. Statist. Assoc."},{"key":"ref_52","unstructured":"Barron, A. (1985). [Ph.D. thesis, Dept. of Electrical Engineering, Stanford University]."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1109\/18.75241","article-title":"Sample converses in source coding theory","volume":"37","author":"Kieffer","year":"1991","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_54","unstructured":"Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry, World Scientific."},{"key":"ref_55","first-page":"177","article-title":"On limit theorems connected with the concept of the entropy of Markov chains","volume":"8","author":"Yushkevich","year":"1953","journal-title":"Uspehi Mat. Nauk"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1137\/1107036","article-title":"Some limit theorems for stationary processes","volume":"7","author":"Ibragimov","year":"1962","journal-title":"Theory Probab. Appl."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"1339","DOI":"10.1109\/18.605604","article-title":"Second-order noiseless source coding theorems","volume":"43","author":"Kontoyiannis","year":"1997","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_58","unstructured":"Volf, P., and Willems, F. (1995). On the context tree maximizing algorithm. Proc. of the IEEE International Symposium on Inform. Theory, Whistler, Canada."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"1518","DOI":"10.1109\/TIT.2002.1003838","article-title":"Hidden Markov processes","volume":"48","author":"Ephraim","year":"2002","journal-title":"IEEE Trans. Inform. Theory"},{"key":"ref_60","unstructured":"Jacquet, P., Seroussi, G., and Szpankowski, W. (2004). On the entropy of a hidden Markov process. Proc. Data Compression Conf. \u2013 DCC 2004, Snowbird, UT."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1007\/BF00534210","article-title":"On the entropy rate of stationary point processes and its discrete approximation","volume":"44","author":"Papangelou","year":"1978","journal-title":"Z. Wahrsch. Verw. Gebiete"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/10\/2\/71\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T22:20:50Z","timestamp":1760221250000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/10\/2\/71"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,6,17]]},"references-count":61,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2008,6]]}},"alternative-id":["e10020071"],"URL":"https:\/\/doi.org\/10.3390\/entropy-e10020071","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,6,17]]}}}