{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T14:25:03Z","timestamp":1777386303939,"version":"3.51.4"},"reference-count":43,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2015,4,16]],"date-time":"2015-04-16T00:00:00Z","timestamp":1429142400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>A directed acyclic graph (DAG) partially represents the conditional independence structure among observations of a system if the local Markov condition holds, that is if every variable is independent of its non-descendants given its parents. In general, there is a whole class of DAGs that represents a given set of conditional independence relations. We are interested in properties of this class that can be derived from observations of a subsystem only. To this end, we prove an information-theoretic inequality that allows for the inference of common ancestors of observed parts in any DAG representing some unknown larger system. More explicitly, we show that a large amount of dependence in terms of mutual information among the observations implies the existence of a common ancestor that distributes this information. Within the causal interpretation of DAGs, our result can be seen as a quantitative extension of Reichenbach\u2019s principle of common cause to more than two variables. 
Our conclusions are valid also for non-probabilistic observations, such as binary strings, since we state the proof for an axiomatized notion of \u201cmutual information\u201d that includes the stochastic as well as the algorithmic version.<\/jats:p>","DOI":"10.3390\/e17042304","type":"journal-article","created":{"date-parts":[[2015,4,16]],"date-time":"2015-04-16T10:37:57Z","timestamp":1429180677000},"page":"2304-2327","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":36,"title":["Information-Theoretic Inference of Common Ancestors"],"prefix":"10.3390","volume":"17","author":[{"given":"Bastian","family":"Steudel","sequence":"first","affiliation":[{"name":"Max Planck Institute for Mathematics in the Sciences, Inselstra\u00dfe 22, 04103 Leipzig, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nihat","family":"Ay","sequence":"additional","affiliation":[{"name":"Max Planck Institute for Mathematics in the Sciences, Inselstra\u00dfe 22, 04103 Leipzig, Germany"},{"name":"Faculty of Mathematics and Computer Science, University of Leipzig, PF 100920, 04009 Leipzig, Germany"},{"name":"Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2015,4,16]]},"reference":[{"key":"ref_1","unstructured":"Pearl, J. (2000). Causality, Cambridge University Press."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Spirtes, P., Glymour, C., and Scheines, R. (2001). Causation, Prediction, and Search, The MIT Press. [2nd ed.].","DOI":"10.7551\/mitpress\/1754.001.0001"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lauritzen, S.L. (1996). 
Graphical Models, Oxford University Press.","DOI":"10.1093\/oso\/9780198522195.001.0001"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"5168","DOI":"10.1109\/TIT.2010.2060095","article-title":"Causal inference using the algorithmic Markov condition","volume":"56","author":"Janzing","year":"2010","journal-title":"IEEE Trans. Inf. Theory."},{"key":"ref_5","unstructured":"Steudel, B., Janzing, D., and Sch\u00f6lkopf, B. (2010, January 17\u201319). Causal Markov condition for submodular information measures, Haifa, Israel."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Reichenbach, H. (1956). The Direction of Time, University of California Press.","DOI":"10.1063\/1.3059791"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Cover, T.M., and Thomas, J.A. (2006). Elements of Information Theory, Wiley. [2nd ed.].","DOI":"10.1002\/047174882X"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2443","DOI":"10.1109\/18.945257","article-title":"Algorithmic statistics","volume":"47","author":"Tromp","year":"2001","journal-title":"IEEE Trans. Inf. Theory."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Publishers Inc.","DOI":"10.1016\/B978-0-08-051489-5.50008-4"},{"key":"ref_10","unstructured":"Mutual information of composed quantum systems satisfies the definition as well, because it can be defined in formal analogy to classical information theory if Shannon entropy is replaced by von Neumann entropy of a quantum state. The properties of mutual information stated above have been used to single out quantum physics from a whole class of no-signaling theories [42]."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1979.tb01052.x","article-title":"Conditional independence in statistical theory","volume":"41","author":"Dawid","year":"1979","journal-title":"J. R. Stat. 
Soc. Ser. B (Methodol.)."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2699","DOI":"10.1109\/TIT.2010.2046253","article-title":"Information inequalities for joint distributions, with interpretations and applications","volume":"56","author":"Madiman","year":"2010","journal-title":"IEEE Trans. Inf. Theory."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"11539","DOI":"10.1523\/JNEUROSCI.23-37-11539.2003","article-title":"Synergy, redundancy, and independence in population codes","volume":"23","author":"Schneidman","year":"2003","journal-title":"J. Neurosci."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"5195","DOI":"10.1523\/JNEUROSCI.5319-04.2005","article-title":"Synergy, redundancy, and independence in population codes, revisited","volume":"25","author":"Latham","year":"2005","journal-title":"J. Neurosci."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"238701","DOI":"10.1103\/PhysRevLett.91.238701","article-title":"Network information and connected correlations","volume":"91","author":"Schneidman","year":"2003","journal-title":"Phys. Rev. Lett."},{"key":"ref_16","unstructured":"We formulate the independence assumption as Y \u2aeb X\u0303 | O[n], where X\u0303 denotes all nodes of the DAG-model different from the nodes in O[n] and Y. Note that this assumption does not hold in the original context in which r has been introduced. There, Y is the observation of a stimulus that is presented to some neuronal system and the Oi represent the responses of (areas of) neurons to this stimulus."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Jordan, M.I. (1998). Learning in Graphical Models, Kluwer Academic Publishers.","DOI":"10.1007\/978-94-011-5014-9"},{"key":"ref_18","unstructured":"This terminology is motivated by the general framework of interaction spaces proposed and investigated by Darroch et al. 
[21] and used by Amari [43] within information geometry."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Li, M., and Vit\u00e1nyi, P. (2007). An Introduction to Kolmogorov Complexity and Its Applications (Text and Monographs in Computer Science), Springer.","DOI":"10.1007\/978-0-387-49820-1"},{"key":"ref_20","unstructured":"Pearl, J. (1995, January 18\u201320). On the testability of causal models with latent and instrumental variables, Montreal, QU, USA."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"522","DOI":"10.1214\/aos\/1176345006","article-title":"Markov fields and log-linear interaction models for contingency tables","volume":"8","author":"Darroch","year":"1980","journal-title":"Ann. Stat."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1214\/09-AOS760","article-title":"Trek separation for gaussian graphical models","volume":"38","author":"Sullivant","year":"2010","journal-title":"Ann. Stat."},{"key":"ref_23","unstructured":"Riccomagno, E., and Smith, J.Q. (2007). Algebraic causality: Bayes nets and beyond, arXiv, 0709.3377."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2439","DOI":"10.1016\/j.dam.2008.06.032","article-title":"A refinement of the common cause principle","volume":"157","author":"Ay","year":"2009","journal-title":"Discret. Appl. Math."},{"key":"ref_25","unstructured":"Steudel, B., and Ay, N. (2010). Information-Theoretic Inference of Common Ancestors, arXiv, 1010.5720."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1109\/TIT.2012.2222863","article-title":"Entropic inequalities and marginal problems","volume":"59","author":"Fritz","year":"2013","journal-title":"IEEE Trans. Inf. 
Theory."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"043001","DOI":"10.1088\/1367-2630\/16\/4\/043001","article-title":"Causal structures from entropic information: geometry and novel scenarios","volume":"16","author":"Chaves","year":"2014","journal-title":"New J. Phys."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"103001","DOI":"10.1088\/1367-2630\/14\/10\/103001","article-title":"Beyond Bell\u2019s theorem: correlation scenarios","volume":"14","author":"Fritz","year":"2012","journal-title":"New J. Phys."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chaves, R., Majenz, C., and Gross, D. (2015). Information-theoretic implications of quantum causal structures. Nat. Commun., 6.","DOI":"10.1038\/ncomms6766"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"113043","DOI":"10.1088\/1367-2630\/16\/11\/113043","article-title":"Theory-independent limits on correlations from generalized Bayesian networks","volume":"16","author":"Henson","year":"2014","journal-title":"New J. Phys."},{"key":"ref_31","unstructured":"Kalai, A.T., and Mohri, M. (2010, January 17\u201319). Causal Markov condition for submodular information measures, Haifa, Israel."},{"key":"ref_32","unstructured":"Williams, P., and Beer, R. (2010). Nonnegative decomposition of multivariate information, arXiv, 1004.2515."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2161","DOI":"10.3390\/e16042161","article-title":"Quantifying unique information","volume":"16","author":"Bertschinger","year":"2014","journal-title":"Entropy"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"012130","DOI":"10.1103\/PhysRevE.87.012130","article-title":"Bivariate measure of redundant information","volume":"87","author":"Harder","year":"2013","journal-title":"Phys. Rev. E"},{"key":"ref_35","unstructured":"Griffith, V., and Koch, C. (2013). 
Quantifying synergistic mutual information, arXiv, 1205.4265."},{"key":"ref_36","unstructured":"Ver Steeg, G., and Galstyan, A. (2014, January 8\u201313). Discovering structure in high-dimensional data through correlation explanation, Montr\u00e9al, QC, Canada."},{"key":"ref_37","unstructured":"Ver Steeg, G., and Galstyan, A. Maximally Informative Hierarchical Representations of High-Dimensional Data, San Diego, CA, USA."},{"key":"ref_38","first-page":"845","article-title":"On Solution Sets of Information Inequalities","volume":"48","author":"Ay","year":"2012","journal-title":"Kybernetika"},{"key":"ref_39","first-page":"284","article-title":"Discriminating between causal structures in Bayesian Networks via partial observations","volume":"50","author":"Moritz","year":"2014","journal-title":"Kybernetika"},{"key":"ref_40","unstructured":"In general, additional conditional independence relations may hold among the observations that are not implied by the local Markov condition together with the semi-graphoid axioms. In fact, it is well known that there are so-called non-graphical probability distributions whose conditional independence structure cannot be completely represented by any DAG."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1016\/B978-0-444-88650-7.50011-1","article-title":"Causal networks: Semantics and expressiveness","volume":"4","author":"Verma","year":"1990","journal-title":"Uncertain. Artif. Intell."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1101","DOI":"10.1038\/nature08400","article-title":"Information causality as a physical principle","volume":"461","author":"Paterek","year":"2009","journal-title":"Nature"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1701","DOI":"10.1109\/18.930911","article-title":"Information geometry on hierarchy of probability distributions","volume":"47","author":"Amari","year":"2001","journal-title":"IEEE Trans. Inf. 
Theory."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/17\/4\/2304\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T20:44:53Z","timestamp":1760215493000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/17\/4\/2304"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,4,16]]},"references-count":43,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2015,4]]}},"alternative-id":["e17042304"],"URL":"https:\/\/doi.org\/10.3390\/e17042304","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,4,16]]}}}