{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T06:30:18Z","timestamp":1780727418372,"version":"3.54.1"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques.<\/jats:p>\n               <jats:p>Results: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. 
We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets.<\/jats:p>\n               <jats:p>Availability and implementation: The proposed methodology is implemented in the R package abcrf available on the CRAN.<\/jats:p>\n               <jats:p>Contact: \u00a0jean-michel.marin@umontpellier.fr<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv684","type":"journal-article","created":{"date-parts":[[2015,11,21]],"date-time":"2015-11-21T02:29:46Z","timestamp":1448072986000},"page":"859-866","source":"Crossref","is-referenced-by-count":334,"title":["Reliable ABC model choice via random forests"],"prefix":"10.1093","volume":"32","author":[{"given":"Pierre","family":"Pudlo","sequence":"first","affiliation":[{"name":"1 Universit\u00e9 de Montpellier, IMAG, Montpellier,"},{"name":"2 Institut de Biologie Computationnelle (IBC), Montpellier,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jean-Michel","family":"Marin","sequence":"additional","affiliation":[{"name":"1 Universit\u00e9 de Montpellier, IMAG, Montpellier,"},{"name":"2 Institut de Biologie Computationnelle (IBC), Montpellier,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Arnaud","family":"Estoup","sequence":"additional","affiliation":[{"name":"2 Institut de Biologie Computationnelle (IBC), Montpellier,"},{"name":"3 CBGP, INRA, Montpellier,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jean-Marie","family":"Cornuet","sequence":"additional","affiliation":[{"name":"3 CBGP, INRA, Montpellier,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mathieu","family":"Gautier","sequence":"additional","affiliation":[{"name":"2 Institut de Biologie Computationnelle (IBC), Montpellier,"},{"name":"3 CBGP, INRA, 
Montpellier,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christian P.","family":"Robert","sequence":"additional","affiliation":[{"name":"4 Universit\u00e9 Paris Dauphine, CEREMADE, Paris, France and"},{"name":"5 University of Warwick, Coventry, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2015,11,20]]},"reference":[{"key":"2023020111554927500_btv684-B1","doi-asserted-by":"crossref","first-page":"1027","DOI":"10.1534\/genetics.112.143164","article-title":"A novel approach for choosing summary statistics in approximate Bayesian computation","volume":"192","author":"Aeschbacher","year":"2012","journal-title":"Genetics"},{"key":"2023020111554927500_btv684-B2","doi-asserted-by":"crossref","first-page":"1109","DOI":"10.1093\/molbev\/msu411","article-title":"CodABC: a computational framework to coestimate recombination, substitution, and molecular adaptation rates by approximate Bayesian computation","volume":"32","author":"Arenas","year":"2015","journal-title":"Mol. Biol. Evol."},{"key":"2023020111554927500_btv684-B3","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1007\/s11222-012-9335-7","article-title":"Considerate approaches to constructing summary statistics for ABC model selection","volume":"22","author":"Barnes","year":"2012","journal-title":"Stat. Comput."},{"key":"2023020111554927500_btv684-B4","first-page":"134","article-title":"Joint determination of topology, divergence time and immigration in population trees","volume-title":"Simulations, Genetics and Human Prehistory","author":"Beaumont","year":"2008"},{"key":"2023020111554927500_btv684-B5","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1146\/annurev-ecolsys-102209-144621","article-title":"Approximate Bayesian computation in evolution and ecology","volume":"41","author":"Beaumont","year":"2010","journal-title":"Annu. Rev. Ecol. Evol. 
Syst."},{"key":"2023020111554927500_btv684-B6","doi-asserted-by":"crossref","first-page":"2025","DOI":"10.1093\/genetics\/162.4.2025","article-title":"Approximate Bayesian computation in population genetics","volume":"162","author":"Beaumont","year":"2002","journal-title":"Genetics"},{"key":"2023020111554927500_btv684-B7","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-4286-2","volume-title":"Statistical Decision Theory and Bayesian Analysis","author":"Berger","year":"1985","edition":"second edition"},{"key":"2023020111554927500_btv684-B8","doi-asserted-by":"crossref","first-page":"2609","DOI":"10.1111\/j.1365-294X.2010.04690.x","article-title":"ABC as a flexible framework to estimate demography over space and time: some cons, many pros","volume":"19","author":"Bertorelle","year":"2010","journal-title":"Mol. Ecol."},{"key":"2023020111554927500_btv684-B9","first-page":"1063","article-title":"Analysis of a random forest model","volume":"13","author":"Biau","year":"2012","journal-title":"J. Machine Learn. Res."},{"key":"2023020111554927500_btv684-B10","first-page":"376","article-title":"New insights into approximate Bayesian computation","volume":"51","author":"Biau","year":"2015","journal-title":"Annales de l\u2019Institut Henri Poincar\u00e9 B Probability Stat."},{"key":"2023020111554927500_btv684-B11","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1214\/12-STS406","article-title":"A comparative review of dimension reduction methods in approximate Bayesian computation","volume":"28","author":"Blum","year":"2013","journal-title":"Stat. 
Sci."},{"key":"2023020111554927500_btv684-B12","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Machine Learn."},{"key":"2023020111554927500_btv684-B13","doi-asserted-by":"crossref","first-page":"2501","DOI":"10.1093\/molbev\/msu187","article-title":"Detecting concerted demographic response across community assemblages using hierarchical approximate Bayesian computation","volume":"31","author":"Chan","year":"2014","journal-title":"Mol. Biol. Evol."},{"key":"2023020111554927500_btv684-B14","doi-asserted-by":"crossref","first-page":"955","DOI":"10.1111\/j.1365-294X.2004.02107.x","article-title":"Estimating admixture proportions with microsatellites: comparison of methods based on simulated data","volume":"13","author":"Choisy","year":"2004","journal-title":"Mol. Ecol."},{"key":"2023020111554927500_btv684-B15","doi-asserted-by":"crossref","first-page":"2713","DOI":"10.1093\/bioinformatics\/btn514","article-title":"Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation","volume":"24","author":"Cornuet","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020111554927500_btv684-B16","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-11-401","article-title":"Inference on population history and model checking using DNA sequence and microsatellite data with the software DIYABC (v1.0)","volume":"11","author":"Cornuet","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023020111554927500_btv684-B17","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1093\/bioinformatics\/btt763","article-title":"DIYABC v2.0: a software to make approximate Bayesian computation inferences about population history using single nucleotide polymorphism, DNA sequence and microsatellite 
data","volume":"30","author":"Cornuet","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020111554927500_btv684-B18","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1016\/j.tree.2010.04.001","article-title":"Approximate Bayesian computation (ABC) in practice","volume":"25","author":"Csill\u00e9ry","year":"2010","journal-title":"Trends Ecol. Evol."},{"key":"2023020111554927500_btv684-B19","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4612-0711-5","volume-title":"A Probabilistic Theory of Pattern Recognition, volume 31 of Applications of Mathematics (New York)","author":"Devroye","year":"1996"},{"key":"2023020111554927500_btv684-B20","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1214\/11-BA602","article-title":"Likelihood-free estimation of model evidence","volume":"6","author":"Didelot","year":"2011","journal-title":"Bayesian Anal."},{"key":"2023020111554927500_btv684-B21","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1111\/j.1755-0998.2012.03153.x","article-title":"Estimation of demo-genetic model probabilities with approximate Bayesian computation using linear discriminant analysis on summary statistics","volume":"12","author":"Estoup","year":"2012","journal-title":"Mol. Ecol. Resour."},{"key":"2023020111554927500_btv684-B22","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pgen.1003905","article-title":"Robust demographic inference from genomic and SNP data","volume":"9","author":"Excoffier","year":"2013","journal-title":"PLoS Genet."},{"key":"2023020111554927500_btv684-B23","doi-asserted-by":"crossref","first-page":"419","DOI":"10.1111\/j.1467-9868.2011.01010.x","article-title":"Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation","volume":"74","author":"Fearnhead","year":"2012","journal-title":"J. R. Stat. Soc. Ser. B (Stat. 
Methodol.)"},{"key":"2023020111554927500_btv684-B24","first-page":"427","article-title":"Likelihood-free methods for model choice in Gibbs random fields","volume":"3","author":"Grelaud","year":"2009","journal-title":"Bayesian Anal."},{"key":"2023020111554927500_btv684-B25","volume-title":"The Elements of Statistical Learning. Data Mining, Inference, and Prediction","author":"Hastie","year":"2009","edition":"second edition"},{"key":"2023020111554927500_btv684-B26","doi-asserted-by":"crossref","first-page":"4654","DOI":"10.1111\/j.1365-294X.2011.05322.x","article-title":"Inferring the origin of populations introduced from a genetically structured native range by approximate Bayesian computation: case study of the invasive ladybird Harmonia axyridis","volume":"20","author":"Lombaert","year":"2011","journal-title":"Mol. Ecol."},{"key":"2023020111554927500_btv684-B27","doi-asserted-by":"crossref","first-page":"1167","DOI":"10.1007\/s11222-011-9288-2","article-title":"Approximate Bayesian computational methods","volume":"22","author":"Marin","year":"2012","journal-title":"Stat. Comput."},{"key":"2023020111554927500_btv684-B28","doi-asserted-by":"crossref","first-page":"833","DOI":"10.1111\/rssb.12056","article-title":"Relevant statistics for Bayesian model choice","volume":"76","author":"Marin","year":"2014","journal-title":"J. R. Stat. Soc. Ser. B (Stat. Methodol.)"},{"key":"2023020111554927500_btv684-B29","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1515\/sagmb-2013-0012","article-title":"Semi-automatic selection of summary statistics for ABC model choice","volume":"13","author":"Prangle","year":"2014","journal-title":"Stat. Appl. Genet. Mol. Biol."},{"key":"2023020111554927500_btv684-B30","doi-asserted-by":"crossref","first-page":"1791","DOI":"10.1093\/oxfordjournals.molbev.a026091","article-title":"Population growth of human Y chromosomes: a study of Y chromosome microsatellites","volume":"16","author":"Pritchard","year":"1999","journal-title":"Mol. Biol. 
Evol."},{"key":"2023020111554927500_btv684-B31","volume-title":"The Bayesian Choice, second edition","author":"Robert","year":"2001"},{"key":"2023020111554927500_btv684-B32","doi-asserted-by":"crossref","first-page":"15112","DOI":"10.1073\/pnas.1102900108","article-title":"Lack of confidence in ABC model choice","volume":"108","author":"Robert","year":"2011","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020111554927500_btv684-B33","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1214\/aos\/1176346785","article-title":"Bayesianly justifiable and relevant frequency calculations for the applied statistician","volume":"12","author":"Rubin","year":"1984","journal-title":"Ann. Stat."},{"key":"2023020111554927500_btv684-B34","doi-asserted-by":"crossref","first-page":"1716","DOI":"10.1214\/15-AOS1321","article-title":"Consistency of random forests","volume":"43","author":"Scornet","year":"2015","journal-title":"Ann. Stat."},{"key":"2023020111554927500_btv684-B35","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1007\/s11222-014-9514-9","article-title":"Adaptive ABC model choice and geometric summary statistics for hidden Gibbs random fields","volume":"25","author":"Stoehr","year":"2015","journal-title":"Stat. 
Comput."},{"key":"2023020111554927500_btv684-B36","doi-asserted-by":"crossref","first-page":"505","DOI":"10.1093\/genetics\/145.2.505","article-title":"Inferring coalescence times from DNA sequence data","volume":"145","author":"Tavar\u00e9","year":"1997","journal-title":"Genetics"},{"key":"2023020111554927500_btv684-B37","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An integrated map of genetic variation from 1\u2009092 human genomes","volume":"491","author":"The 1000 Genomes Project Consortium","year":"2012","journal-title":"Nature"},{"key":"2023020111554927500_btv684-B38","doi-asserted-by":"crossref","first-page":"3653","DOI":"10.1093\/molbev\/mss175","article-title":"Inferring the history of population size change from genome-wide SNP data","volume":"29","author":"Theunert","year":"2012","journal-title":"Mol. Biol. Evol."},{"key":"2023020111554927500_btv684-B39","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1098\/rsif.2008.0172","article-title":"Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems","volume":"6","author":"Toni","year":"2009","journal-title":"J. R. Soc. 
Interface"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/6\/859\/49018353\/bioinformatics_32_6_859.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/6\/859\/49018353\/bioinformatics_32_6_859.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T22:18:46Z","timestamp":1675289926000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/6\/859\/1744513"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,11,20]]},"references-count":39,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2016,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv684","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,3,15]]},"published":{"date-parts":[[2015,11,20]]}}}