{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:05Z","timestamp":1772138045124,"version":"3.50.1"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2021,7,12]],"date-time":"2021-07-12T00:00:00Z","timestamp":1626048000000},"content-version":"vor","delay-in-days":11,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["CCF-1565719"],"award-info":[{"award-number":["CCF-1565719"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["CCF-1714417"],"award-info":[{"award-number":["CCF-1714417"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["DEB-1737898"],"award-info":[{"award-number":["DEB-1737898"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["IOS-1740874"],"award-info":[{"award-number":["IOS-1740874"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"MSU Institute for Cyber-Enabled Research"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,8,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>The standard bootstrap method is used throughout science and engineering to perform general-purpose non-parametric resampling and re-estimation. Among the most widely cited and widely used such applications is the phylogenetic bootstrap method, which Felsenstein proposed in 1985 as a means to place statistical confidence intervals on an estimated phylogeny (or estimate \u2018phylogenetic support\u2019). A key simplifying assumption of the bootstrap method is that input data are independent and identically distributed (i.i.d.). However, the i.i.d. assumption is an over-simplification for biomolecular sequence analysis, as Felsenstein noted.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this study, we introduce a new sequence-aware non-parametric resampling technique, which we refer to as RAWR (\u2018RAndom Walk Resampling\u2019). RAWR consists of random walks that synthesize and extend the standard bootstrap method and the \u2018mirrored inputs\u2019 idea of Landan and Graur. We apply RAWR to the task of phylogenetic support estimation. RAWR\u2019s performance is compared to the state-of-the-art using synthetic and empirical data that span a range of dataset sizes and evolutionary divergence. We show that RAWR support estimates offer comparable or typically superior type I and type II error compared to phylogenetic bootstrap support. We also conduct a re-analysis of large-scale genomic sequence data from a recent study of Darwin\u2019s finches. Our findings clarify phylogenetic uncertainty in a charismatic clade that serves as an important model for complex adaptive evolution.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Data and software are publicly available under open-source software and open data licenses at: https:\/\/gitlab.msu.edu\/liulab\/RAWR-study-datasets-and-scripts.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab263","type":"journal-article","created":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T16:20:16Z","timestamp":1619194816000},"page":"i111-i119","source":"Crossref","is-referenced-by-count":5,"title":["Build a better bootstrap and the RAWR shall beat a random path to your door: phylogenetic support estimation revisited"],"prefix":"10.1093","volume":"37","author":[{"given":"Wei","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Michigan State University , East Lansing, MI 48824, USA"}]},{"given":"Ahmad","family":"Hejasebazzi","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Michigan State University , East Lansing, MI 48824, USA"}]},{"given":"Julia","family":"Zheng","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Michigan State University , East Lansing, MI 48824, USA"},{"name":"Ecology, Evolution, and Behavior Program, Michigan State University , East Lansing, MI 48824, USA"}]},{"given":"Kevin J","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Michigan State University , East Lansing, MI 48824, USA"},{"name":"Ecology, Evolution, and Behavior Program, Michigan State University , East Lansing, MI 48824, USA"},{"name":"Genetics and Genome Sciences Program, Michigan State University , East Lansing, MI 48824, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,7,12]]},"reference":[{"key":"2023062410165811900_btab263-B1","doi-asserted-by":"crossref","first-page":"2340","DOI":"10.1093\/molbev\/msz142","article-title":"Identifying clusters of high confidence homologies in multiple sequence alignments","volume":"36","author":"Ali","year":"2019","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B2","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1080\/10635150600755453","article-title":"Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative","volume":"55","author":"Anisimova","year":"2006","journal-title":"Syst. Biol"},{"key":"2023062410165811900_btab263-B3","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023062410165811900_btab263-B4","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach Learn"},{"key":"2023062410165811900_btab263-B5","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/1471-2105-3-2","article-title":"The Comparative RNA Web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron and other RNAs","volume":"3","author":"Cannone","year":"2002","journal-title":"BMC Bioinformatics"},{"key":"2023062410165811900_btab263-B6","doi-asserted-by":"crossref","first-page":"1972","DOI":"10.1093\/bioinformatics\/btp348","article-title":"trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses","volume":"25","author":"Capella-Guti\u00e9rrez","year":"2009","journal-title":"Bioinformatics"},{"key":"2023062410165811900_btab263-B7","doi-asserted-by":"crossref","first-page":"1625","DOI":"10.1093\/molbev\/msu117","article-title":"TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction","volume":"31","author":"Chang","year":"2014","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B8","article-title":"Incorporating alignment uncertainty into Felsenstein\u2019s phylogenetic bootstrap to improve its reliability","author":"Chang","year":"2019","journal-title":"Bioinformatics"},{"key":"2023062410165811900_btab263-B9","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1093\/sysbio\/syx096","article-title":"Generalized bootstrap supports for phylogenetic analyses of protein sequences incorporating alignment uncertainty","volume":"67","author":"Chatzou","year":"2018","journal-title":"Syst. Biol"},{"key":"2023062410165811900_btab263-B10","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1016\/0165-4896(81)90042-1","article-title":"The complexity of computing metric distances between partitions","volume":"1","author":"Day","year":"1981","journal-title":"Math. Soc. Sci"},{"key":"2023062410165811900_btab263-B11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1214\/aos\/1176344552","article-title":"Bootstrap methods: another look at the jackknife","volume":"7","author":"Efron","year":"1979","journal-title":"Ann. Stat"},{"key":"2023062410165811900_btab263-B12","doi-asserted-by":"crossref","first-page":"783","DOI":"10.2307\/2408678","article-title":"Confidence limits on phylogenies: an approach using the bootstrap","volume":"39","author":"Felsenstein","year":"1985","journal-title":"Evolution"},{"key":"2023062410165811900_btab263-B13","volume-title":"Sinauer Associates","author":"Felsenstein","year":"2004"},{"key":"2023062410165811900_btab263-B14","doi-asserted-by":"crossref","first-page":"1879","DOI":"10.1093\/molbev\/msp098","article-title":"INDELible: a flexible simulator of biological sequence evolution","volume":"26","author":"Fletcher","year":"2009","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B15","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1007\/978-3-030-00834-5_14","volume-title":"Comparative Genomics","author":"Hejase","year":"2018"},{"key":"2023062410165811900_btab263-B16","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1093\/sysbio\/sys062","article-title":"Dendroscope 3: an interactive tool for rooted phylogenetic trees and networks","volume":"61","author":"Huson","year":"2012","journal-title":"Syst. Biol"},{"key":"2023062410165811900_btab263-B17","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1093\/molbev\/mst010","article-title":"MAFFT multiple sequence alignment software version 7: improvements in performance and usability","volume":"30","author":"Katoh","year":"2013","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B18","doi-asserted-by":"crossref","first-page":"6359","DOI":"10.1093\/nar\/gkr334","article-title":"PSAR: measuring multiple sequence alignment reliability by probabilistic sampling","volume":"39","author":"Kim","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023062410165811900_btab263-B19","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1093\/bioinformatics\/btv184","article-title":"ExaML version 3: a tool for phylogenomic analyses on supercomputers","volume":"31","author":"Kozlov","year":"2015","journal-title":"Bioinformatics"},{"key":"2023062410165811900_btab263-B20","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1038\/nature14181","article-title":"Evolution of Darwin\u2019s finches and their beaks revealed by genome sequencing","volume":"518","author":"Lamichhaney","year":"2015","journal-title":"Nature"},{"key":"2023062410165811900_btab263-B21","doi-asserted-by":"crossref","first-page":"1380","DOI":"10.1093\/molbev\/msm060","article-title":"Heads or tails: a simple reliability check for multiple sequence alignments","volume":"24","author":"Landan","year":"2007","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B22","first-page":"15","volume-title":"Biocomputing","author":"Landan","year":"2008"},{"key":"2023062410165811900_btab263-B23","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1038\/s41586-018-0043-0","article-title":"Renewing Felsenstein\u2019s phylogenetic bootstrap in the era of big data","volume":"556","author":"Lemoine","year":"2018","journal-title":"Nature"},{"key":"2023062410165811900_btab263-B26","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.1126\/science.1171243","article-title":"Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees","volume":"324","author":"Liu","year":"2009","journal-title":"Science"},{"key":"2023062410165811900_btab263-B27","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1093\/sysbio\/syr095","article-title":"SAT\u00e9-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees","volume":"61","author":"Liu","year":"2012","journal-title":"Syst. Biol"},{"key":"2023062410165811900_btab263-B28","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1093\/sysbio\/46.3.523","article-title":"Gene trees in species trees","volume":"46","author":"Maddison","year":"1997","journal-title":"Syst. Biol"},{"key":"2023062410165811900_btab263-B29","doi-asserted-by":"crossref","first-page":"i541","DOI":"10.1093\/bioinformatics\/btu462","article-title":"ASTRAL: genome-scale coalescent-based species tree estimation","volume":"30","author":"Mirarab","year":"2014","journal-title":"Bioinformatics"},{"key":"2023062410165811900_btab263-B30","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1089\/cmb.2014.0156","article-title":"PASTA: ultra-large multiple sequence alignment for nucleotide and amino-acid sequences","volume":"22","author":"Mirarab","year":"2015","journal-title":"J. Comput. Biol"},{"key":"2023062410165811900_btab263-B31","first-page":"211","article-title":"The accuracy of fast phylogenetic methods for large datasets","author":"Nakhleh","year":"2002","journal-title":"Pac. Symp. BioComput"},{"key":"2023062410165811900_btab263-B32","first-page":"25","article-title":"The effect of the guide tree on multiple sequence alignments and subsequent phylogenetic analysis","volume":"13","author":"Nelesen","year":"2008","journal-title":"Pac. Symp. Biocomput"},{"key":"2023062410165811900_btab263-B33","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1006\/jmbi.2000.4042","article-title":"T-Coffee: a novel method for fast and accurate multiple sequence alignment","volume":"302","author":"Notredame","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023062410165811900_btab263-B34","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023062410165811900_btab263-B35","doi-asserted-by":"crossref","first-page":"1759","DOI":"10.1093\/molbev\/msq066","article-title":"An alignment confidence score capturing robustness to guide tree uncertainty","volume":"27","author":"Penn","year":"2010","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B36","doi-asserted-by":"crossref","first-page":"689","DOI":"10.1093\/molbev\/mss264","article-title":"A method of alignment masking for refining the phylogenetic signal of multiple sequence alignments","volume":"30","author":"Rajan","year":"2013","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B37","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1109\/TCBB.2006.4","article-title":"A short proof that phylogenetic tree reconstruction by maximum likelihood is hard","volume":"3","author":"Roch","year":"2006","journal-title":"Trans. Comput. Biol. Bioinform"},{"key":"2023062410165811900_btab263-B38","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1016\/S0022-5193(05)80104-3","article-title":"The general stochastic model of nucleotide substitution","volume":"142","author":"Rodriguez","year":"1990","journal-title":"J. Theor. Biol"},{"key":"2023062410165811900_btab263-B39","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1093\/bioinformatics\/19.2.301","article-title":"r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock","volume":"19","author":"Sanderson","year":"2003","journal-title":"Bioinformatics"},{"key":"2023062410165811900_btab263-B41","doi-asserted-by":"crossref","first-page":"W7","DOI":"10.1093\/nar\/gkv318","article-title":"GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters","volume":"43","author":"Sela","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023062410165811900_btab263-B42","doi-asserted-by":"crossref","first-page":"1312","DOI":"10.1093\/bioinformatics\/btu033","article-title":"RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies","volume":"30","author":"Stamatakis","year":"2014","journal-title":"Bioinformatics"},{"key":"2023062410165811900_btab263-B43","doi-asserted-by":"crossref","first-page":"564","DOI":"10.1080\/10635150701472164","article-title":"Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments","volume":"56","author":"Talavera","year":"2007","journal-title":"Syst. Biol"},{"key":"2023062410165811900_btab263-B44","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/BF00163848","article-title":"Inching toward reality: an improved likelihood model of sequence evolution","volume":"34","author":"Thorne","year":"1992","journal-title":"J. Mol. Evol"},{"key":"2023062410165811900_btab263-B45","first-page":"614","article-title":"Bias and confidence in not-quite large samples","volume":"29","author":"Tukey","year":"1958","journal-title":"Ann. Math. Stat"},{"key":"2023062410165811900_btab263-B46","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1038\/514550a","article-title":"The top 100 papers","volume":"514","author":"Van Noorden","year":"2014","journal-title":"Nat. News"},{"key":"2023062410165811900_btab263-B47","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1007\/BF00182747","article-title":"Substitution rate variation among sites in hypervariable region 1 of human mitochondrial DNA","volume":"37","author":"Wakeley","year":"1993","journal-title":"J. Mol. Evol"},{"key":"2023062410165811900_btab263-B48","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1089\/cmb.1994.1.337","article-title":"On the complexity of multiple sequence alignment","volume":"1","author":"Wang","year":"1994","journal-title":"J. Comput. Biol"},{"key":"2023062410165811900_btab263-B49","first-page":"294","article-title":"Non-parametric and semi-parametric support estimation using sequential resampling random walks on biomolecular sequences","volume-title":"RECOMB International Conference on Comparative Genomics","author":"Wang","year":"2018"},{"key":"2023062410165811900_btab263-B50","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1109\/BIBM47256.2019.8983223","article-title":"An application of random walk resampling to phylogenetic HMM inference and learning","author":"Wang","year":"2019","journal-title":"2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)"},{"key":"2023062410165811900_btab263-B51","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1186\/s12864-016-3104-5","article-title":"A performance study of the impact of recombination on species tree analysis","volume":"17","author":"Wang","year":"2016","journal-title":"BMC Genomics"},{"key":"2023062410165811900_btab263-B52","doi-asserted-by":"crossref","first-page":"RRN1308","DOI":"10.1371\/currents.RRN1308","article-title":"Standard maximum likelihood analyses of alignments with gaps can be statistically inconsistent","volume":"4","author":"Warnow","year":"2012","journal-title":"PLoS Curr"},{"key":"2023062410165811900_btab263-B53","doi-asserted-by":"crossref","DOI":"10.1017\/9781316882313","volume-title":"Computational Phylogenetics: An Introduction to Designing Methods for Phylogeny Estimation","author":"Warnow","year":"2017"},{"key":"2023062410165811900_btab263-B54","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1145\/3307339.3342165","article-title":"Scalable statistical introgression mapping using approximate coalescent-based inference","author":"Wuyun","year":"2019","journal-title":"Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Niagara Falls"},{"key":"2023062410165811900_btab263-B55","first-page":"1396","article-title":"Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites","volume":"10","author":"Yang","year":"1993","journal-title":"Mol. Biol. Evol"},{"key":"2023062410165811900_btab263-B56","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1093\/oxfordjournals.molbev.a025811","article-title":"Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method","volume":"14","author":"Yang","year":"1997","journal-title":"Mol. Biol. Evol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i111\/50694074\/btab263.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i111\/50694074\/btab263.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,29]],"date-time":"2024-08-29T03:20:57Z","timestamp":1724901657000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/Supplement_1\/i111\/6319682"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,1]]},"references-count":53,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2021,8,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab263","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.02.02.931063","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,7,1]]},"published":{"date-parts":[[2021,7,1]]}}}