{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T15:46:37Z","timestamp":1753890397422,"version":"3.41.2"},"reference-count":36,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:00:00Z","timestamp":1750291200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Gene signature extraction from transcriptomics datasets has been instrumental to identify sets of co-regulated genes, identify associations with prognosis, and for biomarker discovery. Independent component analysis (ICA) is a powerful tool to extract such signatures to uncover hidden patterns in complex data and identify coherent gene sets. The ICARus package offers a robust pipeline to perform ICA on transcriptome datasets. While other packages perform ICA using one value of the main parameter (i.e., the number of signatures), ICARus identifies a range of near-optimal parameter values, iterates through these values, and assesses the robustness and reproducibility of the signature components identified. To test the performance of ICARus, we analyzed transcriptome datasets obtained from COVID-19 patients with different outcomes and from lung adenocarcinoma. We identified several reproducible gene expression signatures significantly associated with prognosis, temporal patterns, and cell type composition. The GSEA of these signatures matched findings from previous clinical studies and revealed potentially new biological mechanisms. ICARus with a vignette is available on Github <jats:ext-link>https:\/\/github.com\/Zha0rong\/ICArus<\/jats:ext-link>.<\/jats:p>","DOI":"10.3389\/fbinf.2025.1604418","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:32:38Z","timestamp":1750311158000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["ICARus: a pipeline to extract robust gene expression signatures from transcriptome datasets"],"prefix":"10.3389","volume":"5","author":[{"given":"Zhaorong","family":"Li","sequence":"first","affiliation":[]},{"given":"Juan I.","family":"Fuxman Bass","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","author":"Anders","year":"2010","journal-title":"Genome Biol."},{"key":"B2","doi-asserted-by":"publisher","first-page":"519","DOI":"10.1186\/s12859-022-05043-9","article-title":"Robustica: customizable robust independent component analysis","volume":"23","author":"Anglada-Girotto","year":"2022","journal-title":"BMC Bioinforma."},{"key":"B3","doi-asserted-by":"publisher","first-page":"1235","DOI":"10.1016\/j.celrep.2014.10.035","article-title":"Independent component analysis uncovers the landscape of the bladder tumor transcriptome and reveals insights into luminal and basal subtypes","volume":"9","author":"Biton","year":"2014","journal-title":"Cell Rep."},{"key":"B4","doi-asserted-by":"publisher","first-page":"543","DOI":"10.1038\/nature13385","article-title":"Comprehensive molecular profiling of lung adenocarcinoma","volume":"511","year":"2014","journal-title":"Nature"},{"key":"B5","doi-asserted-by":"publisher","first-page":"e325","DOI":"10.1016\/s2665-9913(20)30127-2","article-title":"Interleukin-1 blockade with high-dose anakinra in patients with COVID-19, acute respiratory distress syndrome, and hyperinflammation: a retrospective cohort study","volume":"2","author":"Cavalli","year":"2020","journal-title":"Lancet Rheumatology"},{"key":"B6","doi-asserted-by":"publisher","first-page":"e253","DOI":"10.1016\/s2665-9913(21)00012-6","article-title":"Interleukin-1 and interleukin-6 inhibition compared with standard management in patients with COVID-19 and hyperinflammation: a cohort study","volume":"3","author":"Cavalli","year":"2021","journal-title":"Lancet Rheumatol."},{"key":"B7","doi-asserted-by":"publisher","first-page":"gkaf018","DOI":"10.1093\/nar\/gkaf018","article-title":"edgeR v4: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets","volume":"53","author":"Chen","year":"2025","journal-title":"Nucleic Acids Res."},{"key":"B8","doi-asserted-by":"publisher","first-page":"505","DOI":"10.1038\/s43018-022-00356-3","article-title":"Cell type and gene expression deconvolution with BayesPrism enables Bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology","volume":"3","author":"Chu","year":"2022","journal-title":"Nat. cancer"},{"key":"B9","doi-asserted-by":"publisher","first-page":"e71","DOI":"10.1093\/nar\/gkv1507","article-title":"TCGAbiolinks: an R\/Bioconductor package for integrative analysis of TCGA data","volume":"44","author":"Colaprico","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"B10","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1186\/s13059-016-0881-8","article-title":"A survey of best practices for RNA-seq data analysis","volume":"17","author":"Conesa","year":"2016","journal-title":"Genome Biol."},{"key":"B11","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1109\/NNSP.2003.1318025","article-title":"Icasso: software for investigating the reliability of ICA estimates by clustering and visualization","volume-title":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718), Toulouse, France","author":"Himberg","year":"2003"},{"key":"B12","doi-asserted-by":"publisher","first-page":"e0137782","DOI":"10.1371\/journal.pone.0137782","article-title":"Gene ranking of RNAseq data via discriminant non-negative matrix factorization","volume":"10","author":"Jia","year":"2015","journal-title":"PloS One"},{"key":"B13","doi-asserted-by":"publisher","first-page":"773","DOI":"10.6026\/97320630008773","article-title":"Network analysis of gene lists for finding reproducible prognostic breast cancer gene signatures","volume":"8","author":"Kairov","year":"2012","journal-title":"Bioinformation"},{"key":"B14","doi-asserted-by":"publisher","first-page":"1623","DOI":"10.1038\/s41591-020-1038-6","article-title":"A dynamic COVID-19 immune signature includes associations with poor prognosis","volume":"26","author":"Laing","year":"2020","journal-title":"Nat. Med."},{"key":"B15","doi-asserted-by":"publisher","first-page":"100935","DOI":"10.1016\/j.xcrm.2023.100935","article-title":"Dynamic activity in cis-regulatory elements of leukocytes identifies transcription factor activation and stratifies COVID-19 severity in ICU patients","volume":"4","author":"Lam","year":"2023","journal-title":"Cell Rep. Med."},{"key":"B16","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1186\/1471-2105-9-559","article-title":"WGCNA: an R package for weighted correlation network analysis","volume":"9","author":"Langfelder","year":"2008","journal-title":"BMC Bioinforma."},{"key":"B17","doi-asserted-by":"publisher","first-page":"e24549","DOI":"10.1016\/j.heliyon.2024.e24549","article-title":"Keratin gene signature expression drives epithelial-mesenchymal transition through enhanced TGF-\u03b2 signaling pathway activation and correlates with adverse prognosis in lung adenocarcinoma","volume":"10","author":"Li","year":"2024","journal-title":"Heliyon"},{"key":"B18","doi-asserted-by":"publisher","first-page":"602395","DOI":"10.3389\/fimmu.2020.602395","article-title":"Interleukin-8 as a biomarker for disease prognosis of coronavirus disease-2019 patients","volume":"11","author":"Li","year":"2021","journal-title":"Front. Immunol."},{"key":"B19","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1097\/bs9.0000000000000050","article-title":"T cell response in patients with COVID-19","volume":"2","author":"Liu","year":"2020","journal-title":"Blood Sci."},{"key":"B20","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1038\/ng.2653","article-title":"The genotype-tissue expression (GTEx) project","volume":"45","author":"Lonsdale","year":"2013","journal-title":"Nat. Genet."},{"key":"B21","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1038\/nri.2017.105","article-title":"Neutrophil extracellular traps in immunity and disease","volume":"18","author":"Papayannopoulos","year":"2018","journal-title":"Nat. Rev. Immunol."},{"key":"B22","doi-asserted-by":"publisher","first-page":"500","DOI":"10.1038\/ng0506-500","article-title":"GenePattern 2.0","volume":"38","author":"Reich","year":"2006","journal-title":"Nat. Genet."},{"key":"B23","doi-asserted-by":"publisher","first-page":"5536","DOI":"10.1038\/s41467-019-13483-w","article-title":"The Escherichia coli transcriptome mostly consists of independently regulated modules","volume":"10","author":"Sastry","year":"2019","journal-title":"Nat. Commun."},{"key":"B24","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1109\/ICDCSW.2011.20","article-title":"Finding a \u201ckneedle\u201d in a haystack: detecting knee points in system behavior","volume-title":"2011 31st international conference on distributed computing systems workshops","author":"Satopaa","year":"2011"},{"key":"B25","doi-asserted-by":"publisher","first-page":"1419","DOI":"10.1016\/j.cell.2020.08.001","article-title":"Severe COVID-19 is marked by a dysregulated myeloid cell compartment","volume":"182","author":"Schulte-Schrepping","year":"2020","journal-title":"Cell"},{"key":"B26","doi-asserted-by":"publisher","first-page":"3166","DOI":"10.1182\/blood.2021013565","article-title":"Formation of neutrophil extracellular traps requires actin cytoskeleton rearrangements","volume":"139","author":"Sprenkeler","year":"2022","journal-title":"Blood, J. Am. Soc. Hematol."},{"key":"B27","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/pnas.0506580102","article-title":"Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles","volume":"102","author":"Subramanian","year":"2005","journal-title":"Proc. Natl. Acad. Sci. U. S. A."},{"key":"B28","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1080\/07853890.2020.1858234","article-title":"Increased expression of hypoxia-induced factor 1\u03b1 mRNA and its related genes in myeloid blood cells from critically ill COVID-19 patients","volume":"53","author":"Taniguchi-Ponciano","year":"2021","journal-title":"Ann. Med."},{"key":"B29","doi-asserted-by":"publisher","first-page":"970287","DOI":"10.3389\/fimmu.2022.970287","article-title":"Heterogeneity of neutrophils and inflammatory responses in patients with COVID-19 and healthy controls","volume":"13","author":"Xu","year":"","journal-title":"Front. Immunol."},{"key":"B30","doi-asserted-by":"publisher","first-page":"100013","DOI":"10.1016\/j.immuno.2022.100013","article-title":"Association of pyroptosis and severeness of COVID-19 as revealed by integrated single-cell transcriptome data analysis","volume":"6","author":"Xu","year":"","journal-title":"ImmunoInformatics"},{"key":"B31","doi-asserted-by":"publisher","first-page":"2187","DOI":"10.21037\/tcr-23-1940","article-title":"Comprehensive proteomic profiling of lung adenocarcinoma: development and validation of an innovative prognostic model","volume":"13","author":"Yu","year":"2024","journal-title":"Transl. Cancer Res."},{"key":"B32","doi-asserted-by":"publisher","first-page":"bbac384","DOI":"10.1093\/bib\/bbac384","article-title":"A geometric deep learning framework for drug repositioning over heterogeneous information networks","volume":"23","author":"Zhao","year":"2022","journal-title":"Briefings Bioinforma."},{"key":"B33","doi-asserted-by":"publisher","first-page":"2924","DOI":"10.1016\/j.csbj.2024.06.032","article-title":"A heterogeneous information network learning model with neighborhood-level structural representation for predicting lncRNA-miRNA interactions","volume":"23","author":"Zhao","year":"2024","journal-title":"Comput. Struct. Biotechnol. J."},{"key":"B34","doi-asserted-by":"publisher","first-page":"121360","DOI":"10.1016\/j.ins.2024.121360","article-title":"Regulation-aware graph learning for drug repositioning over heterogeneous biological network","volume":"686","author":"Zhao","year":"2025","journal-title":"Inf. Sci."},{"key":"B35","doi-asserted-by":"publisher","first-page":"e138999","DOI":"10.1172\/jci.insight.138999","article-title":"Neutrophil extracellular traps in COVID-19","volume":"5","author":"Zuo","year":"2020","journal-title":"JCI insight"},{"key":"B36","doi-asserted-by":"publisher","first-page":"446","DOI":"10.1007\/s11239-020-02324-z","article-title":"Neutrophil extracellular traps and thrombosis in COVID-19","volume":"51","author":"Zuo","year":"2021","journal-title":"J. Thrombosis Thrombolysis"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1604418\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T05:32:40Z","timestamp":1750311160000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1604418\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":36,"alternative-id":["10.3389\/fbinf.2025.1604418"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1604418","relation":{},"ISSN":["2673-7647"],"issn-type":[{"type":"electronic","value":"2673-7647"}],"subject":[],"published":{"date-parts":[[2025,6,19]]},"article-number":"1604418"}}