{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T15:22:03Z","timestamp":1776266523222,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2024,11,15]],"date-time":"2024-11-15T00:00:00Z","timestamp":1731628800000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["NSF-III2246796"],"award-info":[{"award-number":["NSF-III2246796"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["NSF-III2152030"],"award-info":[{"award-number":["NSF-III2152030"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Integrating multiple omics datasets can significantly advance our understanding of disease mechanisms, physiology, and treatment responses. However, a major challenge in multi-omics studies is the disparity in sample sizes across different datasets, which can introduce bias and reduce statistical power. To address this issue, we propose a novel framework, OmicsNMF, designed to impute missing omics data and enhance disease phenotype prediction. OmicsNMF integrates Generative Adversarial Networks (GANs) with Non-Negative Matrix Factorization (NMF). 
NMF is a well-established method for uncovering underlying patterns in omics data, while GANs enhance the imputation process by generating realistic data samples. This synergy aims to more effectively address sample size disparity, thereby improving data integration and prediction accuracy.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>For evaluation, we focused on predicting breast cancer subtypes using the imputed data generated by our proposed framework, OmicsNMF. Our results indicate that OmicsNMF consistently outperforms baseline methods. We further assessed the quality of the imputed data through survival analysis, revealing that the imputed omics profiles provide significant prognostic power for both overall survival and disease-free status. Overall, OmicsNMF effectively leverages GANs and NMF to impute missing samples while preserving key biological features. This approach shows potential for advancing precision oncology by improving data integration and analysis.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Source code is available at: https:\/\/github.com\/compbiolabucf\/OmicsNMF.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae674","type":"journal-article","created":{"date-parts":[[2024,11,15]],"date-time":"2024-11-15T17:04:03Z","timestamp":1731690243000},"source":"Crossref","is-referenced-by-count":14,"title":["Optimizing multi-omics data imputation with NMF and GAN synergy"],"prefix":"10.1093","volume":"40","author":[{"given":"Md Istiaq","family":"Ansari","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]},{"name":"Department of Genomics and Bioinformatics Cluster, University of Central Florida , 
Orlando, FL 32816,","place":["United States"]}]},{"given":"Khandakar Tanvir","family":"Ahmed","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]},{"name":"Department of Genomics and Bioinformatics Cluster, University of Central Florida , Orlando, FL 32816,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3605-9373","authenticated-orcid":false,"given":"Wei","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]},{"name":"Department of Genomics and Bioinformatics Cluster, University of Central Florida , Orlando, FL 32816,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2024,11,15]]},"reference":[{"key":"2024112520113998900_btae674-B1","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1093\/bioinformatics\/btab608","article-title":"Multi-omics data integration by generative adversarial network","volume":"38","author":"Ahmed","year":"2021","journal-title":"Bioinformatics"},{"key":"2024112520113998900_btae674-B2","first-page":"469","author":"Ahmed","year":"2023"},{"key":"2024112520113998900_btae674-B3","doi-asserted-by":"crossref","first-page":"bbac537","DOI":"10.1093\/bib\/bbac537","article-title":"Incomplete time-series gene expression in integrative study for islet autoimmunity prediction","volume":"24","author":"Ahmed","year":"2023","journal-title":"Brief 
Bioinform"},{"key":"2024112520113998900_btae674-B4","first-page":"214","author":"Arjovsky","year":"2017"},{"key":"2024112520113998900_btae674-B5","doi-asserted-by":"publisher","author":"Cho","year":"2014","DOI":"10.48550\/arXiv.1406.1078"},{"key":"2024112520113998900_btae674-B6","author":"Davidson-Pilon"},{"key":"2024112520113998900_btae674-B7","doi-asserted-by":"crossref","first-page":"1278","DOI":"10.1093\/bioinformatics\/bty796","article-title":"TOBMI: trans-omics block missing data imputation using a k-nearest neighbor weighted approach","volume":"35","author":"Dong","year":"2019","journal-title":"Bioinformatics"},{"key":"2024112520113998900_btae674-B8","doi-asserted-by":"crossref","first-page":"pl1","DOI":"10.1126\/scisignal.2004088","article-title":"Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal","volume":"6","author":"Gao","year":"2013","journal-title":"Sci Signal"},{"key":"2024112520113998900_btae674-B9","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1038\/s41587-020-0546-8","article-title":"Visualizing and interpreting cancer genomics data via the xena platform","volume":"38","author":"Goldman","year":"2020","journal-title":"Nat Biotechnol"},{"key":"2024112520113998900_btae674-B10","doi-asserted-by":"crossref","first-page":"I1","DOI":"10.1186\/1752-0509-8-S2-I1","article-title":"Data integration in the era of omics: current and future challenges","volume":"8 Suppl 2","author":"Gomez-Cabrero","year":"2014","journal-title":"BMC Syst Biol"},{"key":"2024112520113998900_btae674-B11","first-page":"27","article-title":"Generative adversarial nets","author":"Goodfellow","year":"2014","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2024112520113998900_btae674-B12","doi-asserted-by":"crossref","first-page":"535","DOI":"10.3389\/fgene.2019.00535","article-title":"Inferring interaction networks from multi-omics data","volume":"10","author":"Hawe","year":"2019","journal-title":"Front 
Genet"},{"key":"2024112520113998900_btae674-B13","first-page":"1125","author":"Isola","year":"2017"},{"key":"2024112520113998900_btae674-B14","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2024112520113998900_btae674-B15","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1007\/s11306-018-1451-8","article-title":"NS-kNN: a modified k-nearest neighbors approach for imputing metabolomics data","volume":"14","author":"Lee","year":"2018","journal-title":"Metabolomics"},{"key":"2024112520113998900_btae674-B16","first-page":"1","article-title":"scikit-survival: a library for time-to-event analysis built on top of scikit-learn","volume":"21","author":"P\u00f6lsterl","year":"2020","journal-title":"J Mach Learn Res"},{"key":"2024112520113998900_btae674-B17","volume-title":"Linear Regression Analysis","author":"Seber","year":"2012"},{"key":"2024112520113998900_btae674-B18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v039.i05","article-title":"Regularization paths for cox\u2019s proportional hazards model via coordinate descent","volume":"39","author":"Simon","year":"2011","journal-title":"J Stat Softw"},{"key":"2024112520113998900_btae674-B19","doi-asserted-by":"crossref","first-page":"790","DOI":"10.1016\/j.tig.2018.07.003","article-title":"Enter the matrix: factorization uncovers knowledge from omics","volume":"34","author":"Stein-O\u2019Brien","year":"2018","journal-title":"Trends Genet"},{"key":"2024112520113998900_btae674-B20","doi-asserted-by":"crossref","first-page":"1177932219899051","DOI":"10.1177\/1177932219899051","article-title":"Multi-omics data integration, interpretation, and its application","volume":"14","author":"Subramanian","year":"2020","journal-title":"Bioinform Biol 
Insights"},{"key":"2024112520113998900_btae674-B21","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1038\/nature11412","article-title":"Comprehensive molecular portraits of human breast tumours","volume":"490","author":"The Cancer Genome Atlas Network","year":"2012","journal-title":"Nature"},{"key":"2024112520113998900_btae674-B22","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J R Stat Soc Ser B Stat Methodol"},{"key":"2024112520113998900_btae674-B23","first-page":"1405","author":"Tran","year":"2017"},{"key":"2024112520113998900_btae674-B24","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2024112520113998900_btae674-B25","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1186\/s12859-016-1273-5","article-title":"Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework","volume":"17","author":"Voillet","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2024112520113998900_btae674-B26","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1016\/j.aca.2020.10.038","article-title":"Multi-omics integration in biomedical research\u2013a metabolomics-centric review","volume":"1141","author":"W\u00f6rheide","year":"2021","journal-title":"Anal Chim Acta"},{"key":"2024112520113998900_btae674-B27","doi-asserted-by":"publisher","author":"Wu","year":"2016","DOI":"10.48550\/arXiv.1611.04273"},{"key":"2024112520113998900_btae674-B28","doi-asserted-by":"crossref","first-page":"5787","DOI":"10.3390\/molecules26195787","article-title":"NMF-based approach for missing values imputation of mass spectrometry metabolomics 
data","volume":"26","author":"Xu","year":"2021","journal-title":"Molecules"},{"key":"2024112520113998900_btae674-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/bioinformatics\/btv544","article-title":"A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data","volume":"32","author":"Yang","year":"2016","journal-title":"Bioinformatics"},{"key":"2024112520113998900_btae674-B30","first-page":"5689","author":"Yoon","year":"2018"},{"key":"2024112520113998900_btae674-B31","first-page":"653","author":"Zhang","year":"2022"},{"key":"2024112520113998900_btae674-B32","first-page":"4006","author":"Zhang","year":"2017"},{"key":"2024112520113998900_btae674-B33","doi-asserted-by":"crossref","first-page":"giaa076","DOI":"10.1093\/gigascience\/giaa076","article-title":"Imputing missing RNA-sequencing data from DNA methylation by using a transfer learning\u2013based neural network","volume":"9","author":"Zhou","year":"2020","journal-title":"Gigascience"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae674\/60686175\/btae674.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae674\/60812073\/btae674.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/11\/btae674\/60812073\/btae674.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T20:12:01Z","timestamp":1732565521000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae674\/7901213"}},"subtitle":[],"edi
tor":[{"given":"Jianlin","family":"Cheng","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,11,1]]},"references-count":33,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2024,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae674","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,11,1]]},"article-number":"btae674"}}