{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,24]],"date-time":"2026-01-24T15:41:33Z","timestamp":1769269293328,"version":"3.49.0"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2025,3,23]],"date-time":"2025-03-23T00:00:00Z","timestamp":1742688000000},"content-version":"vor","delay-in-days":22,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>RNA biomarkers enable early and precise disease diagnosis, monitoring, and prognosis, facilitating personalized medicine and targeted therapeutic strategies. However, identification of RNA biomarkers is hindered by the challenge of analyzing relatively small yet high-dimensional transcriptomics datasets, typically comprising fewer than 1000 biospecimens but encompassing hundreds of thousands of RNAs, especially noncoding RNAs. This complexity leads to several limitations in existing methods, such as poor reproducibility on independent datasets, inability to directly process omics data, and difficulty in identifying noncoding RNAs as biomarkers. Additionally, these methods often yield results that lack biological interpretation and clinical utility. To overcome these challenges, we present BAMBI (Biostatistical and Artificial-intelligence Methods for Biomarker Identification), a computational tool integrating biostatistical approaches and machine-learning algorithms. By initially reducing high dimensionality through biologically informed statistical methods followed by machine learning\u2013based feature selection, BAMBI significantly enhances the accuracy and clinical utility of identified RNA biomarkers and also includes noncoding RNA biomarkers that existing methods may overlook. 
BAMBI outperformed existing methods on both real and simulated datasets by identifying individual and panel biomarkers with fewer RNAs while still ensuring superior prediction accuracy. BAMBI was benchmarked on multiple transcriptomics datasets across diseases, including breast cancer, psoriasis, and leukemia. The prognostic biomarkers for acute myeloid leukemia discovered by BAMBI showed significant correlations with patient survival rates in an independent cohort, highlighting its potential for enhancing clinical outcomes. The software is available on GitHub (https:\/\/github.com\/CZhouLab\/BAMBI).<\/jats:p>","DOI":"10.1093\/bib\/bbaf073","type":"journal-article","created":{"date-parts":[[2025,3,23]],"date-time":"2025-03-23T12:58:15Z","timestamp":1742734695000},"source":"Crossref","is-referenced-by-count":4,"title":["BAMBI integrates biostatistical and artificial intelligence methods to improve RNA biomarker discovery"],"prefix":"10.1093","volume":"26","author":[{"given":"Peng","family":"Zhou","sequence":"first","affiliation":[{"name":"Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zixiu","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Feifan","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Euijin","family":"Kwon","sequence":"additional","affiliation":[{"name":"Department of Population and Quantitative Health Sciences, 
University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]},{"name":"Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tien-Chan","family":"Hsieh","sequence":"additional","affiliation":[{"name":"Division of Hematology-Oncology, Department of Medicine, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shangyuan","family":"Ye","sequence":"additional","affiliation":[{"name":"Biostatistics Shared Resource, Knight Cancer Institute, Oregon Health and Science University , 2720 S Moody Ave, Portland, OR 97201 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shobha","family":"Vasudevan","sequence":"additional","affiliation":[{"name":"Brown RNA Center, Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University , Providence, RI 02903 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jung Ae","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Population and Quantitative Health Sciences, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Khanh-Van","family":"Tran","sequence":"additional","affiliation":[{"name":"Division of Cardiology, Department of Medicine, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0351-6235","authenticated-orcid":false,"given":"Chan","family":"Zhou","sequence":"additional","affiliation":[{"name":"Department of Population and Quantitative Health Sciences, 
University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]},{"name":"Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]},{"name":"The RNA Therapeutics Institute, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]},{"name":"UMass Cancer Center, University of Massachusetts Chan Medical School , Worcester, MA 01655 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2025,3,23]]},"reference":[{"key":"2025032311301265300_ref1","author":"FDA","year":"2016"},{"key":"2025032311301265300_ref2","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1038\/nrg.2016.10","article-title":"Translating RNA sequencing into clinical diagnostics: Opportunities and challenges","volume":"17","author":"Byron","year":"2016","journal-title":"Nat Rev Genet"},{"key":"2025032311301265300_ref3","doi-asserted-by":"publisher","DOI":"10.3390\/ncrna3010009","article-title":"RNA biomarkers: Frontier of precision medicine for cancer. 
","volume":"3","author":"Xi","year":"2017","journal-title":"Noncoding RNA"},{"key":"2025032311301265300_ref4","doi-asserted-by":"publisher","DOI":"10.3389\/fgene.2019.00452","article-title":"Large-scale automatic feature selection for biomarker discovery in high-dimensional omics data","volume":"10","author":"Leclercq","year":"2019","journal-title":"Front Genet"},{"key":"2025032311301265300_ref5","doi-asserted-by":"publisher","DOI":"10.1186\/s12859-021-04443-7","article-title":"ILRC: A hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data","volume":"22","author":"Yu","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"2025032311301265300_ref6","doi-asserted-by":"publisher","first-page":"1115","DOI":"10.1093\/bioinformatics\/btaa935","article-title":"ECMarker: Interpretable machine learning model identifies gene expression biomarkers predicting clinical outcomes and reveals molecular mechanisms of human disease in early stages","volume":"37","author":"Jin","year":"2021","journal-title":"Bioinformatics"},{"key":"2025032311301265300_ref7","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1007\/s00441-023-03816-z","article-title":"The benefits and pitfalls of machine learning for biomarker discovery","volume":"394","author":"Ng","year":"2023","journal-title":"Cell Tissue Res"},{"key":"2025032311301265300_ref8","doi-asserted-by":"publisher","first-page":"e1010357","DOI":"10.1371\/journal.pcbi.1010357","article-title":"Ten quick tips for biomarker discovery and validation analyses using machine learning","volume":"18","author":"Diaz-Uriarte","year":"2022","journal-title":"PLoS Comput Biol"},{"key":"2025032311301265300_ref9","doi-asserted-by":"publisher","DOI":"10.3390\/diagnostics13040664","article-title":"Rise of deep learning clinical applications and challenges in omics data: A systematic 
review","volume":"13","author":"Mohammed","year":"2023","journal-title":"Diagnostics"},{"key":"2025032311301265300_ref10","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-95228-4_11","article-title":"Bioinformatic methods and resources for biomarker discovery, validation, development, and integration","author":"Perera-Bel","journal-title":"Predictive Biomarkers in Oncology"},{"key":"2025032311301265300_ref11","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"Lecun","year":"2015","journal-title":"Nature"},{"key":"2025032311301265300_ref12","doi-asserted-by":"publisher","DOI":"10.3390\/biom12121839","article-title":"Deep-learning algorithm and concomitant biomarker identification for NSCLC prediction using multi-omics data integration","volume":"12","author":"Park","year":"2022","journal-title":"Biomolecules"},{"key":"2025032311301265300_ref13","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/s12859-021-04527-4","article-title":"DEGnext: Classification of differentially expressed genes from RNA-seq data using a convolutional neural network with transfer learning","volume":"23","author":"Kakati","year":"2022","journal-title":"BMC Bioinformatics"},{"key":"2025032311301265300_ref14","doi-asserted-by":"publisher","DOI":"10.3390\/cancers14020352","article-title":"Noncoding RNAs and deep learning neural network discriminate multi-cancer types","volume":"14","author":"Wang","year":"2022","journal-title":"Cancers (Basel)"},{"key":"2025032311301265300_ref15","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1038\/nature11247","article-title":"An integrated encyclopedia of DNA elements in the human genome","volume":"489","author":"Dunham","year":"2012","journal-title":"Nature"},{"key":"2025032311301265300_ref16","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/s11523-020-00717-x","article-title":"MicroRNAs (miRNAs) and long non-coding RNAs 
(lncRNAs) as new tools for cancer therapy: First steps from bench to bedside","volume":"15","author":"Ratti","year":"2020","journal-title":"Target Oncol"},{"key":"2025032311301265300_ref17","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1146\/annurev-biochem-051410-092902","article-title":"Genome regulation by long noncoding RNAs","volume":"81","author":"Rinn","year":"2012","journal-title":"Annu Rev Biochem"},{"key":"2025032311301265300_ref18","doi-asserted-by":"publisher","first-page":"1113","DOI":"10.1038\/ng.2764","article-title":"The cancer genome atlas pan-cancer analysis project","volume":"45","author":"The Cancer Genome Atlas Research Network","year":"2013","journal-title":"Nat Genet"},{"key":"2025032311301265300_ref19","doi-asserted-by":"publisher","first-page":"1828","DOI":"10.1038\/jid.2014.28","article-title":"Transcriptome analysis of psoriasis in a large case-control sample: RNA-seq provides insights into disease mechanisms","volume":"134","author":"Li","year":"2014","journal-title":"J Invest Dermatol"},{"key":"2025032311301265300_ref20","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1186\/1751-0473-9-13","article-title":"Biobambam: Tools for read pair collation based algorithms on BAM files","volume":"9","author":"Tischler","year":"2014","journal-title":"Source Code Biol Med"},{"key":"2025032311301265300_ref21","doi-asserted-by":"publisher","first-page":"6745","DOI":"10.1073\/pnas.96.12.6745","article-title":"Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays","volume":"96","author":"Alon","year":"1999","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025032311301265300_ref22","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1016\/S1535-6108(02)00030-2","article-title":"Gene expression correlates of clinical prostate cancer behavior","volume":"1","author":"Singh","year":"2002","journal-title":"Cancer 
Cell"},{"key":"2025032311301265300_ref23","doi-asserted-by":"publisher","first-page":"D128","DOI":"10.1093\/nar\/gky960","article-title":"Lncbook: A curated knowledgebase of human long non-coding RNAs","volume":"47","author":"Ma","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2025032311301265300_ref24","doi-asserted-by":"publisher","first-page":"907","DOI":"10.1038\/s41587-019-0201-4","article-title":"Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype","volume":"37","author":"Kim","year":"2019","journal-title":"Nat Biotechnol"},{"key":"2025032311301265300_ref25","doi-asserted-by":"publisher","first-page":"166","DOI":"10.1093\/bioinformatics\/btu638","article-title":"HTSeq-A python framework to work with high-throughput sequencing data","volume":"31","author":"Anders","year":"2015","journal-title":"Bioinformatics"},{"key":"2025032311301265300_ref26","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1007\/978-1-61779-361-5_15","volume-title":"Bacterial Molecular Networks: Methods and Protocols","author":"Van Dongen","year":"2012"},{"key":"2025032311301265300_ref27","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1186\/s13073-016-0285-0","article-title":"Long noncoding RNAs expressed in human hepatic stellate cells form networks with extracellular matrix proteins","volume":"8","author":"Zhou","year":"2016","journal-title":"Genome Med"},{"key":"2025032311301265300_ref28","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1038\/nprot.2008.211","article-title":"Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources","volume":"4","author":"Huang","year":"2009","journal-title":"Nat Protoc"},{"key":"2025032311301265300_ref29","doi-asserted-by":"publisher","first-page":"1805","DOI":"10.1093\/bioinformatics\/bts251","article-title":"DAVID-WS: A stateful web service to facilitate gene\/protein list 
analysis","volume":"28","author":"Jiao","year":"2012","journal-title":"Bioinformatics"},{"key":"2025032311301265300_ref30","first-page":"4765","article-title":"A unified approach to interpreting model predictions","volume":"30","author":"Lundberg","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"2025032311301265300_ref31","doi-asserted-by":"publisher","first-page":"1458","DOI":"10.1172\/JCI83724","article-title":"Thymic stromal lymphopoietin blocks early stages of breast carcinogenesis","volume":"126","author":"Demehri","year":"2016","journal-title":"J Clin Invest"},{"key":"2025032311301265300_ref43","doi-asserted-by":"publisher","DOI":"10.1084\/jem.20201963","article-title":"CD4+ T helper 2 cells suppress breast cancer by inducing terminal differentiation","volume":"219","author":"Boieri","year":"2022","journal-title":"Journal of Experimental Medicine"},{"key":"2025032311301265300_ref44","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0023772","article-title":"Sprouty 2 is an independent prognostic factor in breast cancer and may be useful in stratifying patients for trastuzumab therapy","volume":"6","author":"Faratian","year":"2011","journal-title":"PLoS One"},{"key":"2025032311301265300_ref45","doi-asserted-by":"publisher","first-page":"1525","DOI":"10.1111\/cas.13999","article-title":"The Sprouty\/Spred family as tumor suppressors: Coming of age","volume":"110","author":"Kawazoe","year":"2019","journal-title":"Cancer Sci"},{"key":"2025032311301265300_ref46","doi-asserted-by":"publisher","first-page":"850","DOI":"10.1038\/ncb867","article-title":"Sprouty1 and Sprouty2 provide a control mechanism for the Ras\/MAPK signalling pathway","volume":"4","author":"Hanafusa","year":"2002","journal-title":"Nat Cell Biol"},{"key":"2025032311301265300_ref32","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1177\/002215540305100513","article-title":"S100 protein subcellular localization during epidermal differentiation and 
psoriasis","volume":"51","author":"Broome","year":"2003","journal-title":"J Histochem Cytochem"},{"key":"2025032311301265300_ref33","doi-asserted-by":"publisher","first-page":"1171","DOI":"10.1016\/j.immuni.2013.11.011","article-title":"S100A8-S100A9 protein complex mediates psoriasis by regulating the expression of complement factor C3","volume":"39","author":"Schonthaler","year":"2013","journal-title":"Immunity"},{"key":"2025032311301265300_ref34","doi-asserted-by":"publisher","first-page":"1678","DOI":"10.1016\/j.jid.2023.02.026","article-title":"S100A9 drives the Chronification of Psoriasiform inflammation by inducing IL-23\/type 3 immunity","volume":"143","author":"Silva de Melo","year":"2023","journal-title":"J Invest Dermatol"},{"key":"2025032311301265300_ref35","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1038\/s41420-021-00769-6","article-title":"Advances in the pathogenesis of psoriasis: From keratinocyte perspective","volume":"8","author":"Zhou","year":"2022","journal-title":"Cell Death Dis"},{"key":"2025032311301265300_ref36","doi-asserted-by":"publisher","DOI":"10.1038\/s41419-020-03305-z","article-title":"Weighted gene coexpression network and experimental analyses identify lncRNA SPRR2C as a regulator of the IL-22-stimulated HaCaT cell phenotype through the miR-330\/STAT1\/S100A7 axis","volume":"12","author":"Luo","year":"2021","journal-title":"Cell Death Dis"},{"key":"2025032311301265300_ref37","doi-asserted-by":"publisher","first-page":"441","DOI":"10.2340\/00015555-2596","article-title":"Overexpression of psoriasin (S100A7) contributes to dysregulated differentiation in psoriasis","volume":"97","author":"Ekman","year":"2017","journal-title":"Acta Derm Venereol"},{"key":"2025032311301265300_ref38","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1111\/j.1346-8138.2004.tb00672.x","article-title":"Unique keratinization process in psoriasis: Late differentiation markers are abolished because of the premature cell 
death","volume":"31","author":"Iizuka","year":"2004","journal-title":"J Dermatol"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/2\/bbaf073\/62524501\/bbaf073.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/2\/bbaf073\/62524501\/bbaf073.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,23]],"date-time":"2025-03-23T12:58:18Z","timestamp":1742734698000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaf073\/8090548"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3]]},"references-count":42,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2025,3,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaf073","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,3]]},"published":{"date-parts":[[2025,3]]},"article-number":"bbaf073"}}