{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,18]],"date-time":"2026-01-18T09:57:31Z","timestamp":1768730251374,"version":"3.49.0"},"reference-count":46,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2018,9,27]],"date-time":"2018-09-27T00:00:00Z","timestamp":1538006400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Kavli Foundation and the Norwegian Research Council's Center of Excellence scheme","award":["223262"],"award-info":[{"award-number":["223262"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Models can be simple for different reasons: because they yield a simple and computationally efficient interpretation of a generic dataset (e.g., in terms of pairwise dependencies)\u2014as in statistical learning\u2014or because they capture the laws of a specific phenomenon\u2014as e.g., in physics\u2014leading to non-trivial falsifiable predictions. In information theory, the simplicity of a model is quantified by the stochastic complexity, which measures the number of bits needed to encode its parameters. In order to understand how simple models look like, we study the stochastic complexity of spin models with interactions of arbitrary order. We show that bijections within the space of possible interactions preserve the stochastic complexity, which allows to partition the space of all models into equivalence classes. We thus found that the simplicity of a model is not determined by the order of the interactions, but rather by their mutual arrangements. Models where statistical dependencies are localized on non-overlapping groups of few variables are simple, affording predictions on independencies that are easy to falsify. 
On the contrary, fully connected pairwise models, which are often used in statistical learning, appear to be highly complex, because of their extended set of interactions, and they are hard to falsify.<\/jats:p>","DOI":"10.3390\/e20100739","type":"journal-article","created":{"date-parts":[[2018,9,28]],"date-time":"2018-09-28T02:54:54Z","timestamp":1538103294000},"page":"739","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["The Stochastic Complexity of Spin Models: Are Pairwise Models Really Simple?"],"prefix":"10.3390","volume":"20","author":[{"given":"Alberto","family":"Beretta","sequence":"first","affiliation":[{"name":"The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, I-34014 Trieste, Italy"}]},{"given":"Claudia","family":"Battistin","sequence":"additional","affiliation":[{"name":"Kavli Institute for Systems Neuroscience and Centre for Neural Computation, Norges Teknisk-Naturvitenskapelige Universitet (NTNU), Olav Kyrres Gate 9, 7030 Trondheim, Norway"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3578-5453","authenticated-orcid":false,"given":"Cl\u00e9lia","family":"De Mulatier","sequence":"additional","affiliation":[{"name":"The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, I-34014 Trieste, Italy"},{"name":"Department of Physics and Astronomy, University of Pennsylvania, 209 South 33rd Street, Philadelphia, PA 19104-6396, USA"}]},{"given":"Iacopo","family":"Mastromatteo","sequence":"additional","affiliation":[{"name":"Capital Fund Management, 23 rue de l\u2019Universit\u00e9, 75007 Paris, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9437-6989","authenticated-orcid":false,"given":"Matteo","family":"Marsili","sequence":"additional","affiliation":[{"name":"The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, I-34014 Trieste, Italy"},{"name":"Istituto Nazionale di Fisica 
Nucleare (INFN) Sezione di Trieste, 34100 Trieste, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2018,9,27]]},"reference":[{"key":"ref_1","unstructured":"Mayer-Schonberger, V., and Cukier, K. (2013). Big Data: A Revolution That Will Transform How We Live, Work and Think, John Murray Publishers."},{"key":"ref_2","unstructured":"Anderson, C. (2018, September 20). The End of Theory: The Data Deluge Makes the Scientific Method Obsolete, 2008. Wired. Available online: https:\/\/www.wired.com\/2008\/06\/pb-theory\/."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"466","DOI":"10.1016\/j.neunet.2010.01.006","article-title":"Are we there yet?","volume":"23","author":"Cristianini","year":"2010","journal-title":"Neural Netw."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"LeCun, Y., Kavukcuoglu, K., and Farabet, C. (June, January 30). Convolutional networks and applications in vision. Proceedings of the 2010 IEEE International Symposium on Circuits and Systems, Paris, France.","DOI":"10.1109\/ISCAS.2010.5537907"},{"key":"ref_5","unstructured":"Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., Prenge, R., Satheesh, S., Sengupta, S., Coates, A., and Ng, A. (arXiv, 2014). Deep Speech: Scaling up end-to-end speech recognition, arXiv."},{"key":"ref_6","unstructured":"Bishop, C. (2006). Pattern Recognition and Machine Learning, Springer-Verlag. (Information Science and Statistics)."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10115-007-0114-2","article-title":"Top 10 algorithms in data mining","volume":"14","author":"Wu","year":"2008","journal-title":"Knowl. Inf. Syst."},{"key":"ref_8","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press. Available online: http:\/\/www.deeplearningbook.org."},{"key":"ref_9","unstructured":"Popper, K. (2002). 
The Logic of Scientific Discovery (Routledge Classics), Taylor & Francis."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/S1364-6613(02)00005-0","article-title":"Simplicity: A unifying principle in cognitive science?","volume":"7","author":"Chater","year":"2003","journal-title":"Trends Cogn. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1006\/jcss.1997.1501","article-title":"Stochastic complexity in learning","volume":"55","author":"Rissanen","year":"1997","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1016\/0005-1098(78)90005-5","article-title":"Modeling by shortest data description","volume":"14","author":"Rissanen","year":"1978","journal-title":"Automatica"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Gr\u00fcnwald, P. (2007). The Minimum Description Length Principle, MIT Press. (Adaptive Computation and Machine Learning).","DOI":"10.7551\/mitpress\/4643.001.0001"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chau Nguyen, H., Zecchina, R., and Berg, J. (arXiv, 2017). Inverse statistical problems: From the inverse Ising problem to data science, arXiv.","DOI":"10.1080\/00018732.2017.1341604"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"428","DOI":"10.1049\/iet-syb.2010.0009","article-title":"Multivariate dependence and genetic networks inference","volume":"4","author":"Margolin","year":"2010","journal-title":"IET Syst. Biol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1294","DOI":"10.1007\/s10955-016-1456-5","article-title":"On the Sufficiency of Pairwise Interactions in Maximum Entropy Models of Networks","volume":"162","author":"Merchan","year":"2016","journal-title":"J. Stat. 
Phys."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.1214\/09-AOS691","article-title":"High-dimensional Ising model selection using \u21131-regularized logistic regression","volume":"38","author":"Ravikumar","year":"2010","journal-title":"Ann. Stat."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"093404","DOI":"10.1088\/1742-5468\/2016\/09\/093404","article-title":"Sparse model selection in the highly under-sampled regime","volume":"2016","author":"Bulso","year":"2016","journal-title":"J. Stat. Mech. Theor. Exp."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1162\/neco.1997.9.2.349","article-title":"Statistical inference, Occam\u2019s razor, and statistical mechanics on the space of probability distributions","volume":"9","author":"Balasubramanian","year":"1997","journal-title":"Neural Comput."},{"key":"ref_20","unstructured":"There is a broader class of models, where subsets \ud835\udcb1 \u2286 \u2133 of operators have the same parameter, i.e., g\u03bc = g\ud835\udcb1 for all \u03bc \u2208 \ud835\udcb1 or g\u03bc are subject to linear constrains. These degenerate models are rarely considered in the inference literature. Here we confine our discussion to non-degenerate models and refer the reader to Section SM-7 of the Supplementary Material for more discussion."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"620","DOI":"10.1103\/PhysRev.106.620","article-title":"Information Theory and Statistical Mechanics","volume":"106","author":"Jaynes","year":"1957","journal-title":"Phys. Rev."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2638","DOI":"10.1103\/PhysRevA.30.2638","article-title":"Alternative approach to maximum-entropy inference","volume":"30","author":"Tikochinsky","year":"1984","journal-title":"Phys. Rev. 
A"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1109\/18.481776","article-title":"Fisher information and stochastic complexity","volume":"42","author":"Rissanen","year":"1996","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1712","DOI":"10.1109\/18.930912","article-title":"Strong optimality of the normalized ML models as universal codes and information in data","volume":"47","author":"Rissanen","year":"2001","journal-title":"IEEE Trans. Inf. Theo."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1214\/aos\/1176344136","article-title":"Estimating the dimension of a model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Ann. Stat."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"11170","DOI":"10.1073\/pnas.170283897","article-title":"Counting probability distributions: Differential geometry and model selection","volume":"97","author":"Myung","year":"2000","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_27","first-page":"453","article-title":"An Invariant Form for the Prior Probability in Estimation Problems","volume":"186","author":"Jeffreys","year":"1946","journal-title":"Proc. R. Soc. Lond. A Math. Phys. Eng. Sci."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Amari, S. (2016). Information Geometry and Its Applications, Springer. (Applied Mathematical Sciences).","DOI":"10.1007\/978-4-431-55978-8"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1343","DOI":"10.1080\/01621459.1996.10477003","article-title":"The selection of prior distributions by formal rules","volume":"91","author":"Kass","year":"1996","journal-title":"J. Am. Stat. 
Assoc."},{"key":"ref_30","unstructured":"A simplicial complex [31], in our notation, is a model such that, for any interaction \u03bc \u2208 \u2133, any interaction that involves any subset \u03bd \u2286 \u03bc of spins is also contained in the model (i.e., \u03bd \u2208 \u2133)."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"062311","DOI":"10.1103\/PhysRevE.93.062311","article-title":"Generalized network structures: The configuration model and the canonical ensemble of simplicial complexes","volume":"93","author":"Courtney","year":"2016","journal-title":"Phys. Rev. E"},{"key":"ref_32","unstructured":"Landau, L., and Lifshitz, E. (2013). Statistical Physics, Elsevier Science. [3rd ed.]."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1103\/PhysRev.60.263","article-title":"Statistics of the Two-Dimensional Ferromagnet. Part II","volume":"60","author":"Kramers","year":"1941","journal-title":"Phys. Rev."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"R309","DOI":"10.1088\/0305-4470\/38\/33\/R01","article-title":"Cluster variation method in statistical Physics and probabilistic graphical models","volume":"38","author":"Pelizzola","year":"2005","journal-title":"J. Phys. A Math. Gen."},{"key":"ref_35","unstructured":"The symmetric difference of two sets \u21131 and \u21132 is defined as the set that contains the elements that occur in \u21131 but not in \u21132 and viceversa: \u21131 \u2295 \u21132 = (\u21131 \u222a \u21132) \\ (\u21131 \u2229 \u21132). It corresponds to the XOR operator between the operators of the two loops."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Amari, S., and Nagaoka, H. (2007). Methods of Information Geometry, American Mathematical Society. 
(Translations of mathematical monographs).","DOI":"10.1090\/mmono\/191"},{"key":"ref_37","first-page":"1","article-title":"Graphical Models, Exponential Families, and Variational Inference","volume":"1","author":"Wainwright","year":"2008","journal-title":"Found. Trends\u00ae Mach. Learn."},{"key":"ref_38","unstructured":"Wainwright, M.J., and Jordan, M.I. (2003, January 1\u20133). Variational inference in graphical models: The view from the marginal polytope. Proceedings of the Forty-First Annual Allerton Conference on Communication, Control, and Computing, Monticello, NY, USA."},{"key":"ref_39","unstructured":"Mastromatteo, I. (arXiv, 2013). On the typical properties of inverse problems in statistical mechanics, arXiv."},{"key":"ref_40","unstructured":"In information geometry [28,36], a model \u2133 defines a manifold in the space of probability distributions. For exponential models (1), the natural metric, in the coordinates g\u03bc, is given by the Fisher Information (5), and the stochastic complexity (4) is the volume of the manifold [26]."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Gresele, L., and Marsili, M. (2017). On maximum entropy and inference. Entropy, 19.","DOI":"10.3390\/e19120642"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/cpa.3160130102","article-title":"The unreasonable effectiveness of mathematics in the natural sciences","volume":"13","author":"Wigner","year":"1960","journal-title":"Commun. Pure Appl. Math."},{"key":"ref_43","unstructured":"In his response to Reference [2] on edge.org, W.D. 
Willis observes that \u201cModels are interesting precisely because they can take us beyond the data\u201d."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1038\/nature04701","article-title":"Weak pairwise correlations imply strongly correlated network states in a neural population","volume":"440","author":"Schneidman","year":"2006","journal-title":"Nature"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1007\/s10955-015-1253-6","article-title":"Statistical mechanics of the US Supreme Court","volume":"160","author":"Lee","year":"2015","journal-title":"J. Stat. Phys."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1103\/RevModPhys.74.47","article-title":"Statistical mechanics of complex networks","volume":"74","author":"Albert","year":"2002","journal-title":"Rev. Mod. Phys."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/10\/739\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:22:48Z","timestamp":1760196168000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/20\/10\/739"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,9,27]]},"references-count":46,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2018,10]]}},"alternative-id":["e20100739"],"URL":"https:\/\/doi.org\/10.3390\/e20100739","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,9,27]]}}}