{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T20:39:07Z","timestamp":1760647147575,"version":"build-2065373602"},"reference-count":47,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2020,1,16]],"date-time":"2020-01-16T00:00:00Z","timestamp":1579132800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Information theory provides a mathematical foundation to measure uncertainty in belief. Belief is represented by a probability distribution that captures our understanding of an outcome\u2019s plausibility. Information measures based on Shannon\u2019s concept of entropy include realization information, Kullback\u2013Leibler divergence, Lindley\u2019s information in experiment, cross entropy, and mutual information. We derive a general theory of information from first principles that accounts for evolving belief and recovers all of these measures. Rather than simply gauging uncertainty, information is understood in this theory to measure change in belief. We may then regard entropy as the information we expect to gain upon realization of a discrete latent random variable. This theory of information is compatible with the Bayesian paradigm in which rational belief is updated as evidence becomes available. Furthermore, this theory admits novel measures of information with well-defined properties, which we explored in both analysis and experiment. This view of information illuminates the study of machine learning by allowing us to quantify information captured by a predictive model and distinguish it from residual information contained in training data. 
We gain related insights regarding feature selection, anomaly detection, and novel Bayesian approaches.<\/jats:p>","DOI":"10.3390\/e22010108","type":"journal-article","created":{"date-parts":[[2020,1,17]],"date-time":"2020-01-17T04:14:41Z","timestamp":1579234481000},"page":"108","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Generalizing Information to the Evolution of Rational Belief"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2178-3695","authenticated-orcid":false,"given":"Jed A.","family":"Duersch","sequence":"first","affiliation":[{"name":"Sandia National Laboratories, Livermore, CA 94550, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas A.","family":"Catanach","sequence":"additional","affiliation":[{"name":"Sandia National Laboratories, Livermore, CA 94550, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,1,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A Mathematical Theory of Communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J."},{"key":"ref_2","first-page":"621","article-title":"M\u00e9moire sur la probabilit\u00e9 des causes par les \u00e9v\u00e9nements","volume":"6","author":"LaPlace","year":"1774","journal-title":"M\u00e9moires de l\u2019Academie Royale des Sciences Present\u00e9s par Divers Savan"},{"doi-asserted-by":"crossref","unstructured":"Jeffreys, H. (1998). The Theory of Probability, OUP Oxford.","key":"ref_3","DOI":"10.1093\/oso\/9780198503682.001.0001"},{"doi-asserted-by":"crossref","unstructured":"Jaynes, E.T. (2003). 
Probability Theory: The Logic of Science, Cambridge University Press.","key":"ref_4","DOI":"10.1017\/CBO9780511790423"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1214\/aoms\/1177729694","article-title":"On Information and Sufficiency","volume":"22","author":"Kullback","year":"1951","journal-title":"Ann. Math. Stat."},{"unstructured":"Kullback, S. (1997). Information Theory and Statistics, Courier Corporation.","key":"ref_7"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1214\/aoms\/1177728069","article-title":"On a measure of the information provided by an experiment","volume":"27","author":"Lindley","year":"1956","journal-title":"Ann. Math. Stat."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1119\/1.1990764","article-title":"Probability, frequency and reasonable expectation","volume":"14","author":"Cox","year":"1946","journal-title":"Am. J. Phys."},{"doi-asserted-by":"crossref","unstructured":"Ramsey, F.P. (2016). Truth and probability. Readings in Formal Epistemology, Springer.","key":"ref_10","DOI":"10.1007\/978-3-319-20451-2_3"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"251","DOI":"10.2307\/2268221","article-title":"On confirmation and rational betting","volume":"20","author":"Lehman","year":"1955","journal-title":"J. Symbol. Logic"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1007\/BF02025803","article-title":"On rational betting systems","volume":"6","author":"Adams","year":"1962","journal-title":"Arch. Math. 
Logic"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1214\/aoms\/1177697494","article-title":"Bayes\u2019 method for bookies","volume":"40","author":"Freedman","year":"1969","journal-title":"Ann. Math. Stat."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1086\/289350","article-title":"Dynamic Coherence and Probability Kinematics","volume":"54","author":"Skyrms","year":"1987","journal-title":"Philos. Sci."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1243","DOI":"10.1080\/01621459.1994.10476865","article-title":"Capturing the intangible concept of information","volume":"89","author":"Soofi","year":"1994","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.1080\/01621459.2000.10474346","article-title":"Principal information theoretic approaches","volume":"95","author":"Soofi","year":"2000","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1109\/TIT.2003.821973","article-title":"Information properties of order statistics and spacings","volume":"50","author":"Ebrahimi","year":"2004","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1111\/j.1751-5823.2010.00105.x","article-title":"Information measures in perspective","volume":"78","author":"Ebrahimi","year":"2010","journal-title":"Int. Stat. Rev."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1007\/s11222-013-9416-2","article-title":"Understanding predictive information criteria for Bayesian models","volume":"24","author":"Gelman","year":"2013","journal-title":"Stat. 
Comput."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"911","DOI":"10.1214\/aoms\/1177704014","article-title":"Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables","volume":"34","author":"Good","year":"1963","journal-title":"Ann. Math. Stat."},{"unstructured":"MacKay, D.J.C. (2003). Information Theory, Inference and Learning Algorithms, Cambridge University Press.","key":"ref_21"},{"doi-asserted-by":"crossref","unstructured":"Tishby, N., and Zaslavsky, N. (2015, April 26\u2013May 1). Deep learning and the information bottleneck principle. Proceedings of the 2015 IEEE Information Theory Workshop (ITW), Jerusalem, Israel.","key":"ref_22","DOI":"10.1109\/ITW.2015.7133169"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1111\/j.2517-6161.1951.tb00069.x","article-title":"The theory of information","volume":"13","author":"Barnard","year":"1951","journal-title":"J. R. Stat. Soc. Methodol."},{"unstructured":"R\u00e9nyi, A. (1961). On Measures of Entropy and Information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, University of California Press.","key":"ref_24"},{"key":"ref_25","first-page":"229","article-title":"Information-type measures of difference of probability distributions and indirect observation","volume":"2","year":"1967","journal-title":"Stud. Sci. Math. 
Hung."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1016\/j.physa.2004.03.085","article-title":"Generalized statistics: Yet another generalization","volume":"340","author":"Jizba","year":"2004","journal-title":"Physica A"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"20006","DOI":"10.1209\/0295-5075\/93\/20006","article-title":"A comprehensive classification of complex statistical systems and an axiomatic derivation of their entropy and distribution functions","volume":"93","author":"Hanel","year":"2011","journal-title":"EPL Europhysic. Lett."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1016\/j.physa.2014.05.009","article-title":"Generalized Shannon\u2013Khinchin axioms and uniqueness theorem for pseudo-additive entropies","volume":"411","year":"2014","journal-title":"Physica A"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"021121","DOI":"10.1103\/PhysRevE.84.021121","article-title":"Group entropies, correlation laws, and zeta functions","volume":"84","author":"Tempesta","year":"2011","journal-title":"Phys. Rev. E"},{"unstructured":"Bernoulli, J. (1713). Ars Conjectandi, Basileae Impensis Thurnisiorum Fratrum. Available online: https:\/\/books.google.com.hk\/books?hl=en&lr=&id=XPOf7STJ3y4C&oi=fnd&pg=PA1&dq=Ars+conjectandi%3B+Impensis+Thurnisiorum,+fratrum,+1713.&ots=Lj-EfRRgbu&sig=KCYr2_EIoMa1ui-2fIbrhQAV5aE&redir_esc=y&hl=zh-CN&sourceid=cndr#v=onepage&q=Ars%20conjectandi%3B%20Impensis%20Thurnisiorum%2C%20fratrum%2C%201713.&f=false.","key":"ref_30"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"620","DOI":"10.1103\/PhysRev.106.620","article-title":"Information theory and statistical mechanics","volume":"106","author":"Jaynes","year":"1957","journal-title":"Phys. 
Rev."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/TIT.1980.1056144","article-title":"Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy","volume":"26","author":"Shore","year":"1980","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1109\/TIT.1981.1056373","article-title":"Properties of cross-entropy minimization","volume":"27","author":"Shore","year":"1981","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"120601","DOI":"10.1103\/PhysRevLett.122.120601","article-title":"Maximum Entropy Principle in statistical inference: Case for non-Shannonian entropies","volume":"122","author":"Jizba","year":"2019","journal-title":"Phys. Rev. Lett."},{"doi-asserted-by":"crossref","unstructured":"H\u00e1jek, A. (2008). Dutch book arguments. The Handbook of Rational and Social Choice, Oxford University Press.","key":"ref_35","DOI":"10.1093\/acprof:oso\/9780199290420.003.0008"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1016\/B978-0-444-52936-7.50013-6","article-title":"Varieties of Bayesianism","volume":"Volume 10","author":"Gabbay","year":"2011","journal-title":"Inductive Logic"},{"unstructured":"Fadeev, D. (1957). Zum Begriff der Entropie einer endlichen Wahrscheinlichkeitsschemas. Arbeiten zur Informationstheorie I, Deutscher Verlag der Wissenschaften.","key":"ref_37"},{"unstructured":"Lindley, D.V. (1980). The Bayesian Approach to Statistics, University of California, Berkeley, Operations Research Center. 
Technical Report.","key":"ref_38"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"131","DOI":"10.4064\/fm-15-1-131-179","article-title":"Sur une g\u00e9n\u00e9ralisation des int\u00e9grales de MJ Radon","volume":"15","author":"Nikodym","year":"1930","journal-title":"Fundamenta Mathematicae"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"686","DOI":"10.1214\/aos\/1176344689","article-title":"Expected information as expected utility","volume":"7","author":"Bernardo","year":"1979","journal-title":"Ann. Stat."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1017\/S0305004100009580","article-title":"Theory of statistical estimation","volume":"Volume 22","author":"Fisher","year":"1925","journal-title":"Mathematical Proceedings of the Cambridge Philosophical Society"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"716","DOI":"10.1109\/TAC.1974.1100705","article-title":"A new look at the statistical model identification","volume":"19","author":"Akaike","year":"1974","journal-title":"IEEE Trans. Autom. Control"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1214\/aos\/1176344136","article-title":"Estimating the Dimension of a Model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Ann. Stat."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1071\/WR99107","article-title":"Kullback-Leibler information as a basis for strong inference in ecological studies","volume":"28","author":"Burnham","year":"2001","journal-title":"Wildl. Res."},{"unstructured":"Pearl, J. (2014). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Elsevier.","key":"ref_45"},{"unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). 
Deep Learning, MIT Press.","key":"ref_46"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2307\/1969031","article-title":"On the Distribution Function of Additive Functions","volume":"47","year":"1946","journal-title":"Ann. Math."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/1\/108\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:43:10Z","timestamp":1760362990000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/1\/108"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,16]]},"references-count":47,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,1]]}},"alternative-id":["e22010108"],"URL":"https:\/\/doi.org\/10.3390\/e22010108","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2020,1,16]]}}}