{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T08:50:58Z","timestamp":1773478258753,"version":"3.50.1"},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T00:00:00Z","timestamp":1773446400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T00:00:00Z","timestamp":1773446400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Department of Mathematics, Imperial College London"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Stat Comput"],"published-print":{"date-parts":[[2026,6]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Models for categorical sequences typically assume exchangeable or first-order dependent sequence elements. These are common assumptions, for example, in models of computer malware traces and protein sequences. Although such simplifying assumptions lead to computational tractability, these models fail to capture long-range, complex dependence structures that may be harnessed for greater predictive power. To this end, a Bayesian modelling framework is proposed to parsimoniously capture rich dependence structures in categorical sequences, with memory efficiency suitable for real-time processing of data streams. Parsimonious Bayesian context trees are introduced as a form of variable-order Markov model with conjugate prior distributions. The novel framework requires fewer parameters than fixed-order Markov models by dropping redundant dependencies and clustering sequential contexts. Approximate inference on the context tree structure is performed via a computationally efficient model-based agglomerative clustering procedure. The proposed framework is tested on synthetic and real-world data examples, and it outperforms existing sequence models when fitted to real protein sequences and honeypot computer terminal sessions.<\/jats:p>","DOI":"10.1007\/s11222-026-10835-7","type":"journal-article","created":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T07:58:08Z","timestamp":1773475088000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Approximate learning of parsimonious Bayesian\u00a0context\u00a0trees"],"prefix":"10.1007","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8611-9966","authenticated-orcid":false,"given":"Daniyar","family":"Ghani","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8767-0810","authenticated-orcid":false,"given":"Nicholas A.","family":"Heard","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4571-6681","authenticated-orcid":false,"given":"Francesco Sanna","family":"Passino","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,3,14]]},"reference":[{"key":"10835_CR1","doi-asserted-by":"publisher","unstructured":"Aldous, D.J.: Exchangeability and related topics. In: Hennequin, P.L. (ed.) \u00c9cole d\u2019\u00c9t\u00e9 de Probabilit\u00e9s de Saint-Flour XIII \u2014 1983, pp. 1\u2013198. Springer, Berlin, Heidelberg (1985). https:\/\/doi.org\/10.1007\/BFb0099421","DOI":"10.1007\/BFb0099421"},{"key":"10835_CR2","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1007\/978-1-59745-514-5_17","volume":"395","author":"TL Bailey","year":"2007","unstructured":"Bailey, T.L.: Discovering sequence motifs. Methods Mol. Biol. 395, 271\u2013292 (2007). https:\/\/doi.org\/10.1007\/978-1-59745-514-5_17","journal-title":"Methods Mol. Biol."},{"key":"10835_CR3","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1613\/jair.1491","volume":"22","author":"R Begleiter","year":"2004","unstructured":"Begleiter, R., El-Yaniv, R., Yona, G.: On prediction using variable order markov models. J. Artif. Intell. Res. 22, 385\u2013421 (2004). https:\/\/doi.org\/10.1613\/jair.1491","journal-title":"J. Artif. Intell. Res."},{"key":"10835_CR4","doi-asserted-by":"publisher","first-page":"1977","DOI":"10.1007\/s00180-022-01310-8","volume":"38","author":"I Bennett","year":"2023","unstructured":"Bennett, I., Martin, D.E.K., Lahiri, S.N.: Fitting sparse Markov models through a collapsed Gibbs sampler. Comput. Statistics 38, 1977\u20131994 (2023). https:\/\/doi.org\/10.1007\/s00180-022-01310-8","journal-title":"Comput. Statistics"},{"key":"10835_CR5","unstructured":"Bourguignon, P.Y., Robelin, D.: Mod\u00e8les de Markov parcimonieux: s\u00e9lection de mod\u00e8le et estimation. In: Proceedings of the 5e \u00c9dition des Journ\u00e9es Ouvertes en Biologie, Informatique et Math\u00e9matiques, Montr\u00e9al (2004) https:\/\/doi.org\/10.13140\/RG.2.1.2558.6083"},{"key":"10835_CR6","doi-asserted-by":"crossref","unstructured":"Collazo, R.A., G\u00f6rgen, C., Smith, J.Q.: Chain Event Graphs. CRC Computer Science and Data Analysis Series. CRC Press, Boca Raton (2018). https:\/\/doi.org\/10.1201\/9781315120515","DOI":"10.1201\/9781315120515"},{"key":"10835_CR7","unstructured":"Dimitrakakis, C.: Bayesian variable order Markov models. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, vol. 9, pp. 161\u2013168. PMLR, Sardinia, Italy (2010)"},{"key":"10835_CR8","volume-title":"Pattern Classification and Scene Analysis","author":"RO Duda","year":"1973","unstructured":"Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis, vol. 3. Wiley, New York (1973)"},{"key":"10835_CR9","doi-asserted-by":"publisher","unstructured":"Eggeling, R., Gohr, A., Bourguignon, P.-Y., Wingender, E., Grosse, I.: Inhomogeneous Parsimonious Markov Models. In: Machine Learning and Knowledge Discovery in Databases, pp. 321\u2013336. Springer, Berlin, Heidelberg (2013). https:\/\/doi.org\/10.1007\/978-3-642-40988-2_21","DOI":"10.1007\/978-3-642-40988-2_21"},{"issue":"6","key":"10835_CR10","doi-asserted-by":"publisher","first-page":"879","DOI":"10.1007\/s10994-018-5770-9","volume":"108","author":"R Eggeling","year":"2019","unstructured":"Eggeling, R., Grosse, I., Koivisto, M.: Algorithms for learning parsimonious context trees. Mach. Learn. 108(6), 879\u2013911 (2019). https:\/\/doi.org\/10.1007\/s10994-018-5770-9","journal-title":"Mach. Learn."},{"issue":"2","key":"10835_CR11","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1214\/aos\/1176342360","volume":"1","author":"TS Ferguson","year":"1973","unstructured":"Ferguson, T.S.: A Bayesian analysis of some nonparametric problems. Ann. Stat. 1(2), 209\u2013230 (1973)","journal-title":"Ann. Stat."},{"key":"10835_CR12","doi-asserted-by":"crossref","unstructured":"Freeman, G., Smith, J.Q.: Bayesian MAP model selection of chain event graphs. J. Multivariate Anal. 102(7), 1152\u20131165 (2011) https:\/\/doi.org\/10.1016\/j.jmva.2011.03.008","DOI":"10.1016\/j.jmva.2011.03.008"},{"key":"10835_CR13","unstructured":"Garc\u00eda, J.E., Gonz\u00e1lez-L\u00f3pez, V.A.: Minimal Markov models. In: Fourth Workshop on Information Theoretic Methods in Science and Engineering: Proceedings, pp. 25\u201328. University of Helsinki, Helsinki (2011)"},{"issue":"4","key":"10835_CR14","doi-asserted-by":"publisher","first-page":"160","DOI":"10.3390\/e19040160","volume":"19","author":"JE Garc\u00eda","year":"2017","unstructured":"Garc\u00eda, J.E., Gonz\u00e1lez-L\u00f3pez, V.A.: Consistent estimation of partition Markov models. Entropy 19(4), 160 (2017). https:\/\/doi.org\/10.3390\/e19040160","journal-title":"Entropy"},{"issue":"473","key":"10835_CR15","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1198\/016214505000000187","volume":"101","author":"NA Heard","year":"2006","unstructured":"Heard, N.A., Holmes, C.C., Stephens, D.A.: A quantitative study of gene regulation involved in the immune response of Anopheline mosquitoes. J. Am. Stat. Assoc. 101(473), 18\u201329 (2006). https:\/\/doi.org\/10.1198\/016214505000000187","journal-title":"J. Am. Stat. Assoc."},{"key":"10835_CR16","doi-asserted-by":"publisher","unstructured":"Heller, K.A., Ghahramani, Z.: Bayesian hierarchical clustering. In: Proceedings of the 22nd International Conference on Machine Learning. ICML \u201905, pp. 297\u2013304 (2005). https:\/\/doi.org\/10.1145\/1102351.1102389","DOI":"10.1145\/1102351.1102389"},{"issue":"1","key":"10835_CR17","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/BF01908075","volume":"2","author":"L Hubert","year":"1985","unstructured":"Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193\u2013218 (1985). https:\/\/doi.org\/10.1007\/BF01908075","journal-title":"J. Classif."},{"issue":"3","key":"10835_CR18","doi-asserted-by":"publisher","first-page":"639","DOI":"10.1111\/sjos.12053","volume":"41","author":"V J\u00e4\u00e4skinen","year":"2014","unstructured":"J\u00e4\u00e4skinen, V., Xiong, J., Corander, J., Koski, T.: Sparse Markov chains for sequence data. Scand. J. Stat. 41(3), 639\u2013655 (2014). https:\/\/doi.org\/10.1111\/sjos.12053","journal-title":"Scand. J. Stat."},{"issue":"4","key":"10835_CR19","doi-asserted-by":"publisher","first-page":"1287","DOI":"10.1111\/rssb.12511","volume":"84","author":"I Kontoyiannis","year":"2022","unstructured":"Kontoyiannis, I., Mertzanis, L., Panotopoulou, A., Papageorgiou, I., Skoularidou, M.: Bayesian context trees: Modelling and exact inference for discrete time series. J. Roy. Stat. Soc. B 84(4), 1287\u20131323 (2022). https:\/\/doi.org\/10.1111\/rssb.12511","journal-title":"J. Roy. Stat. Soc. B"},{"issue":"2","key":"10835_CR20","doi-asserted-by":"publisher","first-page":"435","DOI":"10.1198\/1061860043524","volume":"13","author":"M M\u00e4chler","year":"2004","unstructured":"M\u00e4chler, M., B\u00fchlmann, P.: Variable length Markov chains: Methodology, computing, and software. J. Comput. Graph. Stat. 13(2), 435\u2013455 (2004). https:\/\/doi.org\/10.1198\/1061860043524","journal-title":"J. Comput. Graph. Stat."},{"key":"10835_CR21","doi-asserted-by":"publisher","unstructured":"Papageorgiou, I., Kontoyiannis, I.: Posterior representations for bayesian context trees: Sampling. Estimation and Convergence. Bayesian Analysis 19(2), 501\u2013529 (2024). https:\/\/doi.org\/10.1214\/23-BA1362","DOI":"10.1214\/23-BA1362"},{"issue":"5","key":"10835_CR22","doi-asserted-by":"publisher","first-page":"656","DOI":"10.1109\/TIT.1983.1056741","volume":"29","author":"J Rissanen","year":"1983","unstructured":"Rissanen, J.: A universal data compression system. IEEE Trans. Inf. Theory 29(5), 656\u2013664 (1983). https:\/\/doi.org\/10.1109\/TIT.1983.1056741","journal-title":"IEEE Trans. Inf. Theory"},{"key":"10835_CR23","doi-asserted-by":"crossref","unstructured":"Sanna Passino, F., Mantziou, A., Ghani, D., Thiede, P., Bevington, R., Heard, N.A.: Nested Dirichlet models for unsupervised attack pattern detection in honeypot data. Ann. Appl. Stat. 19(1), 586\u2013613 (2025) https:\/\/doi.org\/10.1214\/24-AOAS1974","DOI":"10.1214\/24-AOAS1974"},{"key":"10835_CR24","doi-asserted-by":"crossref","unstructured":"Smith, J.Q., Anderson, P.E.: Conditional independence and chain event graphs. Artif. Intell. 172(1), 42\u201368 (2008). https:\/\/doi.org\/10.1016\/j.artint.2007.05.004","DOI":"10.1016\/j.artint.2007.05.004"}],"container-title":["Statistics and Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-026-10835-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11222-026-10835-7","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-026-10835-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T07:58:11Z","timestamp":1773475091000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11222-026-10835-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":24,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,6]]}},"alternative-id":["10835"],"URL":"https:\/\/doi.org\/10.1007\/s11222-026-10835-7","relation":{},"ISSN":["0960-3174","1573-1375"],"issn-type":[{"value":"0960-3174","type":"print"},{"value":"1573-1375","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,14]]},"assertion":[{"value":"27 January 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 January 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 March 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"106"}}