{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:11:13Z","timestamp":1772165473499,"version":"3.50.1"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,8,4]],"date-time":"2025-08-04T00:00:00Z","timestamp":1754265600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,4]],"date-time":"2025-08-04T00:00:00Z","timestamp":1754265600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"the National Key Research and Development Program of China","award":["2021YFF1200902"],"award-info":[{"award-number":["2021YFF1200902"]}]},{"DOI":"10.13039\/501100001809","name":"the National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["32270689"],"award-info":[{"award-number":["32270689"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    As single-cell sequencing technology became widely used, scientists found that single-modality data alone could not fully meet the research needs of complex biological systems. To address this issue, researchers began simultaneously collect multi-modal single-cell omics data. But different sequencing technologies often result in datasets where one or more data modalities are missing. Therefore, mosaic datasets are more common when we analyze. However, the high dimensionality and sparsity of the data increase the difficulty, and the presence of batch effects poses an additional challenge. To address these challenges, we proposes a flexible integration framework based on Variational Autoencoder called scGCM. The main task of scGCM is to integrate single-cell multimodal mosaic data and eliminate batch effects. This method was conducted on multiple datasets, encompassing different modalities of single-cell data. The results demonstrate that, compared to state-of-the-art multimodal data integration methods, scGCM offers significant advantages in clustering accuracy and data consistency. The source code of scGCM can be accessed at\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/closmouz\/scCGM\" ext-link-type=\"uri\">https:\/\/github.com\/closmouz\/scCGM<\/jats:ext-link>\n                    .\n                  <\/jats:p>","DOI":"10.1186\/s12859-025-06239-5","type":"journal-article","created":{"date-parts":[[2025,8,4]],"date-time":"2025-08-04T17:46:59Z","timestamp":1754329619000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Semi-supervised contrastive learning variational autoencoder Integrating single-cell multimodal mosaic datasets"],"prefix":"10.1186","volume":"26","author":[{"given":"Zihao","family":"Wang","sequence":"first","affiliation":[]},{"given":"Zeyu","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Minghua","family":"Deng","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,8,4]]},"reference":[{"key":"6239_CR1","doi-asserted-by":"publisher","first-page":"1202","DOI":"10.1038\/s41587-021-00895-7","volume":"39","author":"R Argelaguet","year":"2021","unstructured":"Argelaguet R, Cuomo ASE, Stegle O, Marioni JC. Computational principles and challenges in single-cell data integration. Nat Biotechnol. 2021;39:1202\u201315.","journal-title":"Nat Biotechnol"},{"key":"6239_CR2","doi-asserted-by":"publisher","first-page":"3573","DOI":"10.1016\/j.cell.2021.04.048","volume":"184","author":"Y Hao","year":"2020","unstructured":"...Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, Lee MJ, Wilk AJ, Darby CA, Zagar M, Hoffman PJ, Stoeckius M, Papalexi E, Mimitou EP, Jain J, Srivastava A, Stuart T, Fleming LB, Yeung BZ, Rogers AJ, McElrath JM, Blish CA, Gottardo R, Smibert P, Satija R. Integrated analysis of multimodal single-cell data. Cell. 2020;184:3573\u2013358729.","journal-title":"Cell"},{"key":"6239_CR3","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1038\/s41587-023-01767-y","volume":"42","author":"Y Hao","year":"2022","unstructured":"Hao Y, Stuart T, Kowalski MH, Choudhary SK, Hoffman PJ, Hartman A, Srivastava A, Molla G, Madad S, Fernandez-Granda C, Satija R. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol. 2022;42:293\u2013304.","journal-title":"Nat Biotechnol"},{"key":"6239_CR4","doi-asserted-by":"publisher","first-page":"3632","DOI":"10.1038\/s41596-020-0391-8","volume":"15","author":"J Liu","year":"2020","unstructured":"Liu J, Gao C, Sodicoff JS, Kozareva V, Macosko EZ, Welch JD. Jointly defining cell types from multiple single-cell datasets using liger. Nat Protoc. 2020;15:3632\u201362.","journal-title":"Nat Protoc"},{"key":"6239_CR5","doi-asserted-by":"publisher","first-page":"1289","DOI":"10.1038\/s41592-019-0619-0","volume":"16","author":"I Korsunsky","year":"2019","unstructured":"Korsunsky I, Fan J, Slowikowski K, Zhang F, Wei K, Baglaenko Y, Brenner MB, Loh P-R, Raychaudhuri S. Fast, sensitive, and accurate integration of single cell data with harmony. Nat Methods. 2019;16:1289\u201396.","journal-title":"Nat Methods"},{"key":"6239_CR6","doi-asserted-by":"crossref","unstructured":"Lin X, Tian T, Wei Z, Hakonarson H (2022) Clustering of single-cell multi-omics data with a multimodal deep learning method. Nature Commun13","DOI":"10.1038\/s41467-022-35031-9"},{"key":"6239_CR7","doi-asserted-by":"crossref","unstructured":"Gong B, Zhou Y, Purdom E (2021) Cobolt: integrative analysis of multimodal single-cell sequencing data. Genome Biol22","DOI":"10.1186\/s13059-021-02556-z"},{"issue":"3","key":"6239_CR8","doi-asserted-by":"publisher","first-page":"272","DOI":"10.1038\/s41592-020-01050-x","volume":"18","author":"A Gayoso","year":"2021","unstructured":"Gayoso A, Steier Z, Lopez R, Regier J, Nazor KL, Streets A, Yosef N. Joint probabilistic modeling of single-cell multi-omic data with totalvi. Nat Methods. 2021;18(3):272\u201382.","journal-title":"Nat Methods"},{"issue":"8","key":"6239_CR9","doi-asserted-by":"publisher","first-page":"1222","DOI":"10.1038\/s41592-023-01909-9","volume":"20","author":"T Ashuach","year":"2023","unstructured":"Ashuach T, Gabitto MI, Koodli RV, Saldi G-A, Jordan MI, Yosef N. Multivi: deep generative model for the integration of multimodal data. Nat Methods. 2023;20(8):1222\u201331.","journal-title":"Nat Methods"},{"key":"6239_CR10","unstructured":"Kingma DP (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114"},{"issue":"1","key":"6239_CR11","doi-asserted-by":"publisher","first-page":"7419","DOI":"10.1038\/s41467-022-35094-8","volume":"13","author":"K Cao","year":"2022","unstructured":"Cao K, Gong Q, Hong Y, Wan L. A unified computational framework for single-cell data integration with optimal transport. Nat Commun. 2022;13(1):7419.","journal-title":"Nat Commun"},{"issue":"1","key":"6239_CR12","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1089\/cmb.2021.0446","volume":"29","author":"P Demetci","year":"2022","unstructured":"Demetci P, Santorella R, Sandstede B, Noble WS, Singh R. Scot: single-cell multi-omics alignment with optimal transport. J Comput Biol. 2022;29(1):3\u201318.","journal-title":"J Comput Biol"},{"key":"6239_CR13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-020-02015-1","volume":"21","author":"R Argelaguet","year":"2020","unstructured":"Argelaguet R, Arnol D, Bredikhin D, Deloro Y, Velten B, Marioni JC, Stegle O. Mofa+: a statistical framework for comprehensive integration of multi-modal single-cell data. Genome Biol. 2020;21:1\u201317.","journal-title":"Genome Biol"},{"issue":"8","key":"6239_CR14","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1093\/nar\/gkad157","volume":"51","author":"C Liu","year":"2023","unstructured":"Liu C, Huang H, Yang P. Multi-task learning from multimodal single-cell omics with matilda. Nucleic Acids Res. 2023;51(8):45\u201345.","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"6239_CR15","doi-asserted-by":"publisher","first-page":"1458","DOI":"10.1038\/s41587-022-01284-4","volume":"40","author":"Z-J Cao","year":"2022","unstructured":"Cao Z-J, Gao G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat Biotechnol. 2022;40(10):1458\u201366.","journal-title":"Nat Biotechnol"},{"key":"6239_CR16","doi-asserted-by":"crossref","unstructured":"Ghazanfar S, Guibentif C, Marioni JC (2022) Stabmap: mosaic single cell data integration using non-overlapping features. bioRxiv, 2022\u201302","DOI":"10.1101\/2022.02.24.481823"},{"key":"6239_CR17","unstructured":"Lotfollahi M, Litinetskaya A, Theis FJ (2022) Multigrate: single-cell multi-omic data integration. BioRxiv, 2022\u201303"},{"issue":"6","key":"6239_CR18","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1093\/bib\/bbad391","volume":"24","author":"W Li","year":"2023","unstructured":"Li W, Xiang B, Yang F, Rong Y, Yin Y, Yao J, Zhang H. scmhnn: a novel hypergraph neural network for integrative analysis of single-cell epigenomic, transcriptomic and proteomic data. Brief Bioinform. 2023;24(6):391.","journal-title":"Brief Bioinform"},{"issue":"1","key":"6239_CR19","doi-asserted-by":"publisher","first-page":"7711","DOI":"10.1038\/s41467-023-43019-2","volume":"14","author":"G-J Huizing","year":"2023","unstructured":"Huizing G-J, Deutschmann IM, Peyr\u00e9 G, Cantini L. Paired single-cell multi-omics data integration with mowgli. Nat Commun. 2023;14(1):7711.","journal-title":"Nat Commun"},{"issue":"2","key":"6239_CR20","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1038\/s41592-023-02139-9","volume":"21","author":"K Zhang","year":"2024","unstructured":"Zhang K, Zemke NR, Armand EJ, Ren B. A fast, scalable and versatile tool for analysis of single-cell omics data. Nat Methods. 2024;21(2):217\u201327.","journal-title":"Nat Methods"},{"key":"6239_CR21","doi-asserted-by":"crossref","unstructured":"Shang J, Jiang J, Sun Y. Bacteriophage classification for assembled contigs using graph convolutional network. Bioinformatics 37(Supplement_1), 2021;25\u201333","DOI":"10.1093\/bioinformatics\/btab293"},{"key":"6239_CR22","unstructured":"Chen T, Kornblith S, Norouzi M, Hinton G. A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning, 2020;1597\u20131607. PMLR"},{"key":"6239_CR23","unstructured":"Oord Avd, Li Y, Vinyals O. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 2018"},{"key":"6239_CR24","doi-asserted-by":"crossref","unstructured":"He K, Fan H, Wu Y, Xie S, Girshick R. Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 2020;9729\u20139738","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"6239_CR25","doi-asserted-by":"crossref","unstructured":"McInnes L, Healy J, Melville J. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 2018","DOI":"10.21105\/joss.00861"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-025-06239-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-025-06239-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-025-06239-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,8]],"date-time":"2025-09-08T14:17:24Z","timestamp":1757341044000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-025-06239-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,4]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["6239"],"URL":"https:\/\/doi.org\/10.1186\/s12859-025-06239-5","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-6748098\/v1","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,4]]},"assertion":[{"value":"26 May 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 July 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no Conflict of interest.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"206"}}