{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T01:15:50Z","timestamp":1775265350471,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"20","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Motivation: Graphical models are often employed to interpret patterns of correlations observed in data through a network of interactions between the variables. Recently, Ising\/Potts models, also known as Markov random fields, have been productively applied to diverse problems in biology, including the prediction of structural contacts from protein sequence data and the description of neural activity patterns. However, inference of such models is a challenging computational problem that cannot be solved exactly. Here, we describe the adaptive cluster expansion (ACE) method to quickly and accurately infer Ising or Potts models based on correlation data. ACE avoids overfitting by constructing a sparse network of interactions sufficient to reproduce the observed correlation data within the statistical error expected due to finite sampling. When convergence of the ACE algorithm is slow, we combine it with a Boltzmann Machine Learning algorithm (BML). We illustrate this method on a variety of biological and artificial datasets and compare it to state-of-the-art approximate methods such as Gaussian and pseudo-likelihood inference.<\/jats:p>\n                  <jats:p>Results: We show that ACE accurately reproduces the true parameters of the underlying model when they are known, and yields accurate statistical descriptions of both biological and artificial data. Models inferred by ACE more accurately describe the statistics of the data, including both the constrained low-order correlations and unconstrained higher-order correlations, compared to those obtained by faster Gaussian and pseudo-likelihood methods. These alternative approaches can recover the structure of the interaction network but typically not the correct strength of interactions, resulting in less accurate generative models.<\/jats:p>\n                  <jats:p>Availability and implementation: The ACE source code, user manual and tutorials with the example data and filtered correlations described herein are freely available on GitHub at https:\/\/github.com\/johnbarton\/ACE.<\/jats:p>\n                  <jats:p>Contacts: \u00a0jpbarton@mit.edu, cocco@lps.ens.fr<\/jats:p>\n                  <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btw328","type":"journal-article","created":{"date-parts":[[2016,6,22]],"date-time":"2016-06-22T15:42:27Z","timestamp":1466610147000},"page":"3089-3097","source":"Crossref","is-referenced-by-count":84,"title":["ACE: adaptive cluster expansion for maximum entropy graphical model inference"],"prefix":"10.1093","volume":"32","author":[{"given":"J. P.","family":"Barton","sequence":"first","affiliation":[{"name":"1 Departments of Chemical Engineering and Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA"},{"name":"2 Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard, Cambridge, MA 02139, USA"}]},{"given":"E.","family":"De Leonardis","sequence":"additional","affiliation":[{"name":"3 Laboratoire de Physique Statistique de L\u2019Ecole Normale Sup\u00e9rieure, CNRS, Ecole Normale Sup\u00e9rieure & Universit\u00e9 P.&M. Curie, Paris, France"},{"name":"4 Computational and Quantitative Biology, UPMC, UMR 7238, Sorbonne Universit\u00e9, Paris, France"}]},{"given":"A.","family":"Coucke","sequence":"additional","affiliation":[{"name":"4 Computational and Quantitative Biology, UPMC, UMR 7238, Sorbonne Universit\u00e9, Paris, France"},{"name":"5 Laboratoire de Physique Th\u00e9orique de L\u2019Ecole Normale Sup\u00e9rieure, CNRS, Ecole Normale Sup\u00e9rieure & Universit\u00e9 P.&M. Curie, Paris, France"}]},{"given":"S.","family":"Cocco","sequence":"additional","affiliation":[{"name":"3 Laboratoire de Physique Statistique de L\u2019Ecole Normale Sup\u00e9rieure, CNRS, Ecole Normale Sup\u00e9rieure & Universit\u00e9 P.&M. Curie, Paris, France"}]}],"member":"286","published-online":{"date-parts":[[2016,6,21]]},"reference":[{"key":"2023020113470766300_btw328-B1","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1207\/s15516709cog0901_7","article-title":"A learning algorithm for Boltzmann machines","volume":"9","author":"Ackley","year":"1985","journal-title":"Cognit. Sci"},{"key":"2023020113470766300_btw328-B2","first-page":"20","article-title":"Differential geometrical theory of statistics","author":"Amari","year":"1987","journal-title":"IMS Monograph vol. 10, Differential Geometry in Statistical Inference"},{"key":"2023020113470766300_btw328-B3","doi-asserted-by":"crossref","first-page":"090201","DOI":"10.1103\/PhysRevLett.108.090201","article-title":"Inverse Ising inference using all the data","volume":"108","author":"Aurell","year":"2012","journal-title":"Phys. Rev. Lett"},{"key":"2023020113470766300_btw328-B4","doi-asserted-by":"crossref","first-page":"P03002","DOI":"10.1088\/1742-5468\/2013\/03\/P03002","article-title":"Ising models for neural activity inferred via selective cluster expansion: structural and coding properties","volume":"2013","author":"Barton","year":"2013","journal-title":"J. Stat. Mech.: Theory Expe"},{"key":"2023020113470766300_btw328-B5","doi-asserted-by":"crossref","first-page":"012132","DOI":"10.1103\/PhysRevE.90.012132","article-title":"Large pseudocounts and L2-norm penalties are necessary for the mean-field inference of Ising and Potts models","volume":"90","author":"Barton","year":"2014","journal-title":"Phys. Rev. E"},{"key":"2023020113470766300_btw328-B6","doi-asserted-by":"crossref","first-page":"1965","DOI":"10.1073\/pnas.1415386112","article-title":"Scaling laws describe memories of host\u2013pathogen riposte in the HIV population","volume":"112","author":"Barton","year":"2015","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023020113470766300_btw328-B7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10955-015-1441-4","article-title":"On the entropy of protein families","volume":"162","author":"Barton","year":"2016","journal-title":"J. Stat. Phys"},{"key":"2023020113470766300_btw328-B8","doi-asserted-by":"crossref","first-page":"090601","DOI":"10.1103\/PhysRevLett.106.090601","article-title":"Adaptive cluster expansion for inferring Boltzmann machines with noisy data","volume":"106","author":"Cocco","year":"2011","journal-title":"Phys. Rev. Lett"},{"key":"2023020113470766300_btw328-B9","doi-asserted-by":"crossref","first-page":"252","DOI":"10.1007\/s10955-012-0463-4","article-title":"Adaptive cluster expansion for the inverse Ising problem: convergence, algorithm and tests","volume":"147","author":"Cocco","year":"2012","journal-title":"J. Stat. Phys"},{"key":"2023020113470766300_btw328-B10","doi-asserted-by":"crossref","first-page":"14058","DOI":"10.1073\/pnas.0906705106","article-title":"Neuronal couplings between retinal ganglion cells inferred by efficient inverse statistical physics methods","volume":"106","author":"Cocco","year":"2009","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023020113470766300_btw328-B11","doi-asserted-by":"crossref","first-page":"e1003176","DOI":"10.1371\/journal.pcbi.1003176","article-title":"From principal component to direct coupling analysis of coevolution in proteins: low-eigenvalue modes are needed for structure prediction","volume":"9","author":"Cocco","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023020113470766300_btw328-B12","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1093\/bioinformatics\/btm604","article-title":"Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction","volume":"24","author":"Dunn","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020113470766300_btw328-B13","doi-asserted-by":"crossref","first-page":"012707","DOI":"10.1103\/PhysRevE.87.012707","article-title":"Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models","volume":"87","author":"Ekeberg","year":"2013","journal-title":"Phys. Rev. E"},{"key":"2023020113470766300_btw328-B14","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.jcp.2014.07.024","article-title":"Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences","volume":"276","author":"Ekeberg","year":"2014","journal-title":"J. Comput. Phys"},{"key":"2023020113470766300_btw328-B15","doi-asserted-by":"crossref","first-page":"e1003847","DOI":"10.1371\/journal.pcbi.1003847","article-title":"Improving contact prediction along three dimensions","volume":"10","author":"Feinauer","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023020113470766300_btw328-B16","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1016\/j.immuni.2012.11.022","article-title":"Translating HIV sequences into quantitative fitness landscapes predicts viral vulnerabilities for rational immunogen design","volume":"38","author":"Ferguson","year":"2013","journal-title":"Immunity"},{"key":"2023020113470766300_btw328-B17","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1093\/molbev\/msv211","article-title":"Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1","volume":"33","author":"Figliuzzi","year":"2016","journal-title":"Mol. Biol. Evol"},{"key":"2023020113470766300_btw328-B18","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nrmicro3490","article-title":"HIV-1 assembly, release and maturation","volume":"13","author":"Freed","year":"2015","journal-title":"Nat. Rev. Microbiol"},{"key":"2023020113470766300_btw328-B19","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1126\/science.1094068","article-title":"Inferring cellular networks using probabilistic graphical models","volume":"303","author":"Friedman","year":"2004","journal-title":"Science"},{"key":"2023020113470766300_btw328-B20","doi-asserted-by":"crossref","first-page":"1360","DOI":"10.1214\/08-AOAS191","article-title":"A weakly informative default prior distribution for logistic and other regression models","volume":"2","author":"Gelman","year":"2008","journal-title":"Ann. Appl. Stat"},{"key":"2023020113470766300_btw328-B21","doi-asserted-by":"crossref","first-page":"P10021","DOI":"10.1088\/1742-5468\/2011\/10\/P10021","article-title":"The inverse ising problem for one-dimensional chains with arbitrary finite-range couplings","volume":"2011","author":"Gori","year":"2011","journal-title":"J. Stat. Mech.: Theory Exp"},{"key":"2023020113470766300_btw328-B22","author":"Hebb","year":"1949"},{"key":"2023020113470766300_btw328-B23","doi-asserted-by":"crossref","first-page":"1607","DOI":"10.1016\/j.cell.2012.04.012","article-title":"Three-dimensional structures of membrane proteins from genomic sequencing","volume":"149","author":"Hopf","year":"2012","journal-title":"Cell"},{"key":"2023020113470766300_btw328-B24","doi-asserted-by":"crossref","first-page":"e1004889","DOI":"10.1371\/journal.pcbi.1004889","article-title":"Benchmarking inverse statistical approaches for protein structure and design with exactly solvable models","volume":"12","author":"Jacquin","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023020113470766300_btw328-B25","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1109\/PROC.1982.12425","article-title":"On the rationale of maximum-entropy methods","volume":"70","author":"Jaynes","year":"1982","journal-title":"Proc. IEEE"},{"key":"2023020113470766300_btw328-B26","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1162\/089976698300017386","article-title":"Efficient learning in Boltzmann machines using linear response theory","volume":"10","author":"Kappen","year":"1998","journal-title":"Neural Comput"},{"key":"2023020113470766300_btw328-B27","doi-asserted-by":"crossref","first-page":"e1003776","DOI":"10.1371\/journal.pcbi.1003776","article-title":"The fitness landscape of HIV-1 Gag: advanced modeling approaches and validation of model predictions by in vitro testing","volume":"10","author":"Mann","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023020113470766300_btw328-B28","doi-asserted-by":"crossref","first-page":"e28766","DOI":"10.1371\/journal.pone.0028766","article-title":"Protein 3D structure computed from evolutionary sequence variation","volume":"6","author":"Marks","year":"2011","journal-title":"PLoS One"},{"key":"2023020113470766300_btw328-B29","doi-asserted-by":"crossref","first-page":"658","DOI":"10.1007\/s10955-013-0707-y","article-title":"Beyond inverse Ising model: structure of the analytical solution","volume":"150","author":"Mastromatteo","year":"2013","journal-title":"J. Stat. Phys"},{"key":"2023020113470766300_btw328-B30","doi-asserted-by":"crossref","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"Morcos","year":"2011","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023020113470766300_btw328-B31","doi-asserted-by":"crossref","first-page":"P03004","DOI":"10.1088\/1742-5468\/2012\/03\/P03004","article-title":"Bethe\u2013Peierls approximation and the inverse Ising problem","volume":"2012","author":"Nguyen","year":"2012","journal-title":"J. Stat. Mech.: Theory Exp"},{"key":"2023020113470766300_btw328-B32","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1038\/nn.2337","article-title":"Replay of rule-learning related neural patterns in the prefrontal cortex during sleep","volume":"12","author":"Peyrache","year":"2009","journal-title":"Nat. Neurosci"},{"key":"2023020113470766300_btw328-B33","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.1214\/09-AOS691","article-title":"High-dimensional Ising model selection using l1-regularized logistic regression","volume":"38","author":"Ravikumar","year":"2010","journal-title":"Ann. Stat"},{"key":"2023020113470766300_btw328-B34","author":"Riedmiller","year":"1993"},{"key":"2023020113470766300_btw328-B35","doi-asserted-by":"crossref","first-page":"051915","DOI":"10.1103\/PhysRevE.79.051915","article-title":"Ising model for neural data: Model quality and approximate methods for extracting functional connectivity","volume":"79","author":"Roudi","year":"2009","journal-title":"Phys. Rev. E"},{"key":"2023020113470766300_btw328-B36","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1038\/nature04701","article-title":"Weak pairwise correlations imply strongly correlated network states in a neural population","volume":"440","author":"Schneidman","year":"2006","journal-title":"Nature"},{"key":"2023020113470766300_btw328-B37","doi-asserted-by":"crossref","first-page":"055001","DOI":"10.1088\/1751-8113\/42\/5\/055001","article-title":"Small-correlation expansions for the inverse Ising problem","volume":"42","author":"Sessak","year":"2009","journal-title":"J. Phys. A: Math. Theor"},{"key":"2023020113470766300_btw328-B38","doi-asserted-by":"crossref","first-page":"5967","DOI":"10.1063\/1.459480","article-title":"Enumeration of all compact conformations of copolymers with random sequence of links","volume":"93","author":"Shakhnovich","year":"1990","journal-title":"J. Chem. Phys"},{"key":"2023020113470766300_btw328-B39","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"A mathematical theory of communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J"},{"key":"2023020113470766300_btw328-B40","doi-asserted-by":"crossref","first-page":"062705","DOI":"10.1103\/PhysRevE.88.062705","article-title":"Spin models inferred from patient-derived viral sequence data faithfully describe HIV fitness landscapes","volume":"88","author":"Shekhar","year":"2013","journal-title":"Phys. Rev. E"},{"key":"2023020113470766300_btw328-B41","doi-asserted-by":"crossref","first-page":"10340","DOI":"10.1073\/pnas.1207864109","article-title":"Genomics-aided structure prediction","volume":"109","author":"Su\u0142kowska","year":"2012","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023020113470766300_btw328-B42","doi-asserted-by":"crossref","first-page":"201508584","DOI":"10.1073\/pnas.1508584112","article-title":"From residue coevolution to protein conformational ensembles and functional dynamics","volume":"112","author":"Sutto","year":"2015","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023020113470766300_btw328-B43","author":"Tavoni","year":"2015"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/20\/3089\/49021618\/bioinformatics_32_20_3089.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/20\/3089\/49021618\/bioinformatics_32_20_3089.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T18:51:48Z","timestamp":1675277508000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/20\/3089\/2196363"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6,21]]},"references-count":43,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2016,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw328","relation":{"has-review":[{"id-type":"doi","id":"10.3410\/f.726442736.793548472","asserted-by":"object"}],"has-preprint":[{"id-type":"doi","id":"10.1101\/044677","asserted-by":"object"}]},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,10,15]]},"published":{"date-parts":[[2016,6,21]]}}}