{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T12:38:56Z","timestamp":1777120736827,"version":"3.51.4"},"reference-count":30,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T00:00:00Z","timestamp":1771545600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/100000153","name":"Division of Biological Infrastructure","doi-asserted-by":"crossref","award":["2144121"],"award-info":[{"award-number":["2144121"]}],"id":[{"id":"10.13039\/100000153","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000153","name":"Division of Biological Infrastructure","doi-asserted-by":"crossref","award":["2214038"],"award-info":[{"award-number":["2214038"]}],"id":[{"id":"10.13039\/100000153","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000143","name":"Division of Computing and Communication Foundations","doi-asserted-by":"crossref","award":["1714417"],"award-info":[{"award-number":["1714417"]}],"id":[{"id":"10.13039\/100000143","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Computational Biology"],"published-print":{"date-parts":[[2026,4,1]]},"abstract":"<jats:p>\n                    A fundamental assumption in phylogenetics and phylogenomics is that a single, global evolutionary model can adequately characterize the substitution processes operating across all sites in a molecular sequence alignment. However, this assumption is frequently violated in practice due to heterogeneity in evolutionary processes, leading to local model mis-specification and potential bias in downstream inference. 
While a variety of statistical and machine learning-based approaches have been developed to address this issue, these methods often rely on restrictive model assumptions or are designed for narrowly scoped applications, limiting their generalizability across diverse datasets and evolutionary contexts. Here, we present REVEAL (\u201cREsampling and Visual EvALuation\u201d), a general-purpose statistical framework for detecting and localizing model mis-specification in biomolecular sequence data. REVEAL operates without introducing additional assumptions beyond those inherent to standard global model-based analyses. It employs sequence-aware statistical resampling to construct a local support matrix along the sequence alignment, facilitating the identification of site-level model violations. Through extensive simulation experiments, we demonstrate that REVEAL achieves robust control of both type I and type II errors, with precision of\n                    <jats:inline-formula>\n                      <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\" overflow=\"scroll\">\n                        <mml:mrow>\n                          <mml:mn>90<\/mml:mn>\n                          <mml:mi>%<\/mml:mi>\n                        <\/mml:mrow>\n                      <\/mml:math>\n                    <\/jats:inline-formula>\n                    or greater and recall of\n                    <jats:inline-formula>\n                      <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\" overflow=\"scroll\">\n                        <mml:mrow>\n                          <mml:mn>85<\/mml:mn>\n                          <mml:mi>%<\/mml:mi>\n                        <\/mml:mrow>\n                      <\/mml:math>\n                    <\/jats:inline-formula>\n                    or greater across diverse evolutionary scenarios involving different sources of model heterogeneity, varying dataset sizes in terms of sequence 
length and number of taxa, and other experimental factors. We further apply REVEAL to genomic data from mouse and mosquito, uncovering localized model violations that are consistent with previously reported biological signals. These results establish REVEAL as a flexible and effective tool for evaluating model adequacy in phylogenetic and phylogenomic analyses.\n                  <\/jats:p>","DOI":"10.1177\/15578666261424921","type":"journal-article","created":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T17:05:49Z","timestamp":1771607149000},"page":"482-498","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":0,"title":["A REsampling and Visual EvALuation Method to Detect and Map Local Model Violations During Biomolecular Sequence Analysis"],"prefix":"10.1177","volume":"33","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2584-5718","authenticated-orcid":false,"given":"Meijun","family":"Gao","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan, USA."}]},{"given":"Kevin J.","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan, USA."},{"name":"Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, Michigan, USA."},{"name":"Genetics and Genome Sciences Program, Michigan State University, East Lansing, Michigan, 
USA."}]}],"member":"179","published-online":{"date-parts":[[2026,2,20]]},"reference":[{"key":"e_1_3_3_2_1","doi-asserted-by":"publisher","DOI":"10.1093\/molbev\/msaa038"},{"key":"e_1_3_3_3_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ajhg.2021.08.005"},{"key":"e_1_3_3_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ympev.2023.107905"},{"key":"e_1_3_3_5_1","doi-asserted-by":"publisher","DOI":"10.1534\/genetics.111.137794"},{"key":"e_1_3_3_6_1","doi-asserted-by":"publisher","DOI":"10.1534\/genetics.109.103010"},{"key":"e_1_3_3_7_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1558-5646.1985.tb00420.x"},{"key":"e_1_3_3_8_1","doi-asserted-by":"publisher","DOI":"10.1093\/molbev\/msp098"},{"key":"e_1_3_3_9_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1258524"},{"key":"e_1_3_3_10_1","doi-asserted-by":"crossref","unstructured":"Gao M Liu KJ. Statistical analysis of GC-biased gene conversion and recombination hotspots in eukaryotic genomes: A phylogenetic hidden Markov model-based approach. In: Proceedings of the 12th ACM Conference on Bioinformatics Computational Biology and Health Informatics. ACM; 2021; pp. 1\u201324.","DOI":"10.1145\/3459930.3469509"},{"key":"e_1_3_3_11_1","unstructured":"Garrigan D Geneva A. (2014) msmove: A modified version of Hudson\u2019s coalescent simulator ms allowing for finer control and tracking of migrant genealogies."},{"issue":"1","key":"e_1_3_3_12_1","first-page":"100","article-title":"Algorithm AS 136: A k-means clustering algorithm","volume":"28","author":"Hartigan JA","year":"1979","unstructured":"Hartigan JA, Wong MA. Algorithm AS 136: A k-means clustering algorithm. 
J R Stat Soc Ser C (Appl Stat), 1979; 28(1):100\u2013108.","journal-title":"J R Stat Soc Ser C (Appl Stat)"},{"key":"e_1_3_3_13_1","doi-asserted-by":"publisher","DOI":"10.1093\/oso\/9780198529958.001.0001"},{"key":"e_1_3_3_14_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btn522"},{"key":"e_1_3_3_15_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.0030007"},{"key":"e_1_3_3_16_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/18.2.337"},{"key":"e_1_3_3_17_1","doi-asserted-by":"publisher","DOI":"10.1093\/sysbio\/syu036"},{"key":"e_1_3_3_18_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btac234"},{"key":"e_1_3_3_19_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.1171243"},{"key":"e_1_3_3_20_1","doi-asserted-by":"publisher","DOI":"10.1093\/sysbio\/syr095"},{"key":"e_1_3_3_21_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1406298111"},{"key":"e_1_3_3_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-05269-4_15"},{"key":"e_1_3_3_23_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1010657"},{"key":"e_1_3_3_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0022-5193(05)80104-3"},{"key":"e_1_3_3_25_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/19.2.301"},{"key":"e_1_3_3_26_1","doi-asserted-by":"publisher","DOI":"10.1093\/sysbio\/syy066"},{"key":"e_1_3_3_27_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btab263"},{"key":"e_1_3_3_28_1","doi-asserted-by":"publisher","DOI":"10.1371\/currents.RRN1308"},{"key":"e_1_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.0030217"},{"key":"e_1_3_3_30_1","doi-asserted-by":"crossref","unstructured":"Wuyun Q VanKuren NW Kronforst M et al. Scalable statistical introgression mapping using approximate coalescent-based inference. In: Proceedings of the 10th ACM International Conference on Bioinformatics Computational Biology and Health Informatics. 2019; pp. 
504\u2013513.","DOI":"10.1145\/3307339.3342165"},{"key":"e_1_3_3_31_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btad265"}],"container-title":["Journal of Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/15578666261424921","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/15578666261424921","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/15578666261424921","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/15578666261424921","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T11:49:10Z","timestamp":1777117750000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/15578666261424921"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,20]]},"references-count":30,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,4,1]]}},"alternative-id":["10.1177\/15578666261424921"],"URL":"https:\/\/doi.org\/10.1177\/15578666261424921","relation":{},"ISSN":["1066-5277","1557-8666"],"issn-type":[{"value":"1066-5277","type":"print"},{"value":"1557-8666","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,20]]}}}