{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T06:59:51Z","timestamp":1771484391822,"version":"3.50.1"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T00:00:00Z","timestamp":1729555200000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2023YFD2200104"],"award-info":[{"award-number":["2023YFD2200104"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2023YFD2200102"],"award-info":[{"award-number":["2023YFD2200102"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32270664"],"award-info":[{"award-number":["32270664"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32170327"],"award-info":[{"award-number":["32170327"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Metagenomic analyses facilitate the\u00a0exploration of the microbial world, advancing our understanding of microbial roles in ecological and biological processes. A pivotal aspect of metagenomic analysis involves assessing the quality of metagenome-assembled genomes (MAGs), crucial for accurate biological insights. Current machine learning\u2013based methods often treat completeness and contamination prediction as separate tasks, overlooking their inherent relationship and limiting models\u2019 generalization. In this study, we present DeepCheck, a multitasking deep learning framework for simultaneous prediction of MAG completeness and contamination. DeepCheck consistently outperforms existing tools in accuracy across various experimental settings and demonstrates comparable speed while maintaining high predictive accuracy even for new lineages. Additionally, we employ interpretable machine learning techniques to identify specific genes and pathways that drive the model\u2019s predictions, enabling independent investigation and assessment of these biological elements for deeper insights.<\/jats:p>","DOI":"10.1093\/bib\/bbae539","type":"journal-article","created":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T23:15:57Z","timestamp":1729638957000},"source":"Crossref","is-referenced-by-count":3,"title":["DeepCheck: multitask learning aids in assessing microbial genome quality"],"prefix":"10.1093","volume":"25","author":[{"given":"Guo","family":"Wei","sequence":"first","affiliation":[{"name":"State Key Laboratory of Pharmaceutical Biotechnology , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]},{"name":"Nanjing University , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nannan","family":"Wu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pharmaceutical Biotechnology , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]},{"name":"Nanjing University , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kunyang","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pharmaceutical Biotechnology , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]},{"name":"Nanjing University , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sihai","family":"Yang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pharmaceutical Biotechnology , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]},{"name":"Nanjing University , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]},{"name":"Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University , 159 Panlong road, Xuanwu District, Nanjing 210000 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7067-0558","authenticated-orcid":false,"given":"Long","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pharmaceutical Biotechnology , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]},{"name":"Nanjing University , School of Life Sciences, , 163 Xianlin Avenue, Qixia District, Nanjing 210000 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yan","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Yangzhou University , 196 Huaxi Road, Hanjiang District, Yangzhou 225100 ,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,10,22]]},"reference":[{"key":"2024102223154292600_ref1","doi-asserted-by":"publisher","first-page":"711","DOI":"10.1038\/s41587-021-01130-z","article-title":"Generating lineage-resolved, complete metagenome-assembled genomes from complex microbial communities","volume":"40","author":"Bickhart","year":"2022","journal-title":"Nat Biotechnol"},{"key":"2024102223154292600_ref2","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/nbt.3893","article-title":"Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea","volume":"35","author":"Bowers","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2024102223154292600_ref3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-015-0834-7","article-title":"Metagenome-assembled genomes uncover a global brackish microbiome","volume":"16","author":"Hugerth","year":"2015","journal-title":"Genome Biol"},{"key":"2024102223154292600_ref4","doi-asserted-by":"crossref","first-page":"5235","DOI":"10.1038\/s41467-022-32991-w","article-title":"Dissecting the role of the human microbiome in COVID-19 via metagenome-assembled genomes","volume":"13","author":"Ke","year":"2022","journal-title":"Nat Commun"},{"key":"2024102223154292600_ref5","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1038\/s41596-022-00747-x","article-title":"Metagenome-assembled genome extraction and analysis from microbiomes using KBase","volume":"18","author":"Chivian","year":"2023","journal-title":"Nat Protoc"},{"key":"2024102223154292600_ref6","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1007\/s12275-021-0632-8","article-title":"Application of computational approaches to analyze metagenomic data","volume":"59","author":"Gwak","year":"2021","journal-title":"J Microbiol"},{"key":"2024102223154292600_ref7","doi-asserted-by":"crossref","first-page":"2815","DOI":"10.1038\/s41596-022-00738-y","article-title":"Metagenome analysis using the kraken software suite","volume":"17","author":"Lu","year":"2022","journal-title":"Nat Protoc"},{"key":"2024102223154292600_ref8","doi-asserted-by":"crossref","first-page":"1043","DOI":"10.1101\/gr.186072.114","article-title":"CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes","volume":"25","author":"Parks","year":"2015","journal-title":"Genome Res"},{"key":"2024102223154292600_ref9","doi-asserted-by":"crossref","first-page":"4862","DOI":"10.1093\/bioinformatics\/btz422","article-title":"AlphaFold at CASP13","volume":"35","author":"AlQuraishi","year":"2019","journal-title":"Bioinformatics"},{"key":"2024102223154292600_ref10","doi-asserted-by":"crossref","first-page":"1203","DOI":"10.1038\/s41592-023-01940-w","article-title":"CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning","volume":"20","author":"Chklovski","year":"2023","journal-title":"Nat Methods"},{"key":"2024102223154292600_ref11","doi-asserted-by":"crossref","first-page":"bbab317","DOI":"10.1093\/bib\/bbab317","article-title":"Mol2Context-vec: learning molecular representation from context awareness for drug discovery","volume":"22","author":"Lv","year":"2021","journal-title":"Brief Bioinform"},{"key":"2024102223154292600_ref12","doi-asserted-by":"crossref","DOI":"10.1109\/TNNLS.2024.3359657","article-title":"Meta-molnet: a cross-domain benchmark for few examples drug discovery","author":"Lv","year":"2024","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2024102223154292600_ref13","doi-asserted-by":"publisher","first-page":"11218","DOI":"10.1109\/TNNLS.2023.3250324","article-title":"Meta learning with graph attention networks for low-data drug discovery","volume":"35","author":"Lv","year":"2023","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"2024102223154292600_ref14","doi-asserted-by":"crossref","first-page":"10684","DOI":"10.1039\/D3SC02139D","article-title":"TCMBank: bridges between the largest herbal medicines, chemical ingredients, target proteins, and associated diseases with intelligence text mining","volume":"14","author":"Lv","year":"2023","journal-title":"Chem Sci"},{"key":"2024102223154292600_ref15","doi-asserted-by":"crossref","first-page":"5975","DOI":"10.1007\/s10462-022-10306-1","article-title":"Deep learning in drug discovery: an integrative review and future challenges","volume":"56","author":"Askr","year":"2023","journal-title":"Artif Intell Rev"},{"key":"2024102223154292600_ref16","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.neunet.2023.05.039","article-title":"3D graph neural network with few-shot learning for predicting drug\u2013drug interactions in scaffold-based cold start scenario","volume":"165","author":"Lv","year":"2023","journal-title":"Neural Netw"},{"key":"2024102223154292600_ref17","doi-asserted-by":"crossref","first-page":"bbad235","DOI":"10.1093\/bib\/bbad235","article-title":"Comprehensive evaluation of deep and graph learning on drug\u2013drug interactions prediction","volume":"24","author":"Lin","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024102223154292600_ref18","doi-asserted-by":"crossref","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","article-title":"Improved protein structure prediction using potentials from deep learning","volume":"577","author":"Senior","year":"2020","journal-title":"Nature"},{"key":"2024102223154292600_ref19","doi-asserted-by":"crossref","first-page":"975","DOI":"10.1038\/s41587-023-01917-2","article-title":"Protein remote homology detection and structural alignment using deep learning","volume":"42","author":"Hamamsy","year":"2024","journal-title":"Nat Biotechnol"},{"key":"2024102223154292600_ref20","doi-asserted-by":"crossref","first-page":"5586","DOI":"10.1109\/TKDE.2021.3070203","article-title":"A survey on multi-task learning","volume":"34","author":"Zhang","year":"2021","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2024102223154292600_ref21","first-page":"770","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"He","year":"2016"},{"key":"2024102223154292600_ref22","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.neucom.2021.03.091","article-title":"A review on the attention mechanism of deep learning","volume":"452","author":"Niu","year":"2021","journal-title":"Neurocomputing"},{"key":"2024102223154292600_ref23","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1007\/s41095-022-0271-y","article-title":"Attention mechanisms in computer vision: a survey","volume":"8","author":"Guo","year":"2022","journal-title":"Comput Vis Media"},{"key":"2024102223154292600_ref24","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1109\/RBME.2021.3131358","article-title":"Interpreting deep machine learning models: an easy guide for oncologists","volume":"16","author":"Amorim","year":"2021","journal-title":"IEEE Rev Biomed Eng"},{"key":"2024102223154292600_ref25","doi-asserted-by":"crossref","first-page":"22071","DOI":"10.1073\/pnas.1900654116","article-title":"Definitions, methods, and applications in interpretable machine learning","volume":"116","author":"Murdoch","year":"2019","journal-title":"Proc Natl Acad Sci"},{"key":"2024102223154292600_ref26","doi-asserted-by":"crossref","first-page":"D733","DOI":"10.1093\/nar\/gkv1189","article-title":"Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation","volume":"44","author":"O'Leary","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2024102223154292600_ref27","doi-asserted-by":"crossref","first-page":"5114","DOI":"10.1038\/s41467-018-07641-9","article-title":"High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries","volume":"9","author":"Jain","year":"2018","journal-title":"Nat Commun"},{"key":"2024102223154292600_ref28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-11-119","article-title":"Prodigal: prokaryotic gene recognition and translation initiation site identification","volume":"11","author":"Hyatt","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2024102223154292600_ref29","volume-title":"BBMap: A Fast, Accurate, Splice-Aware Aligner","author":"Bushnell","year":"2014"},{"key":"2024102223154292600_ref30","doi-asserted-by":"crossref","first-page":"D785","DOI":"10.1093\/nar\/gkab776","article-title":"GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy","volume":"50","author":"Parks","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024102223154292600_ref31","volume-title":"GTDB-Tk: A Toolkit to Classify Genomes with the Genome Taxonomy Database","author":"Chaumeil","year":"2020"},{"key":"2024102223154292600_ref32","volume-title":"Artificial Neural Networks","author":"Yegnanarayana","year":"2009"},{"key":"2024102223154292600_ref33","article-title":"Lightgbm: a highly efficient gradient boosting decision tree","volume":"30","author":"Ke","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"2024102223154292600_ref34","doi-asserted-by":"crossref","first-page":"D457","DOI":"10.1093\/nar\/gkv1070","article-title":"KEGG as a reference resource for gene and protein annotation","volume":"44","author":"Kanehisa","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2024102223154292600_ref35","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1038\/s41592-021-01101-x","article-title":"Sensitive protein alignments at tree-of-life scale using DIAMOND","volume":"18","author":"Buchfink","year":"2021","journal-title":"Nat Methods"},{"key":"2024102223154292600_ref36","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2024102223154292600_ref37","first-page":"2292","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wong","year":"2023"},{"key":"2024102223154292600_ref38","first-page":"10745","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Yang","year":"2023"},{"key":"2024102223154292600_ref39","first-page":"33","volume-title":"Transfer Learning for Natural Language Processing Workshop","author":"Bhat","year":"2023"},{"key":"2024102223154292600_ref40","doi-asserted-by":"crossref","first-page":"257","DOI":"10.5626\/JCSE.2011.5.3.257","article-title":"A survey of transfer and multitask learning in bioinformatics","volume":"5","author":"Xu","year":"2011","journal-title":"J Comput Sci Eng"},{"key":"2024102223154292600_ref41","doi-asserted-by":"crossref","first-page":"bbad203","DOI":"10.1093\/bib\/bbad203","article-title":"Improving the identification of miRNA\u2013disease associations with multi-task learning on gene\u2013disease networks","volume":"24","author":"He","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024102223154292600_ref42","doi-asserted-by":"crossref","first-page":"2546","DOI":"10.1038\/s41467-023-37477-x","article-title":"Explainable multi-task learning for multi-modality biological data analysis","volume":"14","author":"Tang","year":"2023","journal-title":"Nat Commun"},{"key":"2024102223154292600_ref43","article-title":"Captum: a unified and generic model interpretability library for pytorch","author":"Kokhlikyan"},{"key":"2024102223154292600_ref44","doi-asserted-by":"crossref","first-page":"2009","DOI":"10.1038\/s41467-021-22203-2","article-title":"Connecting structure to function with the recovery of over 1000 high-quality metagenome-assembled genomes from activated sludge using long-read sequencing","volume":"12","author":"Singleton","year":"2021","journal-title":"Nat Commun"},{"key":"2024102223154292600_ref45","first-page":"e01756","article-title":"The biotechnological potential of the Chloroflexota phylum","author":"Freches","year":"2024","journal-title":"Appl Environ Microbiol"},{"key":"2024102223154292600_ref46","doi-asserted-by":"crossref","first-page":"pgac226","DOI":"10.1093\/pnasnexus\/pgac226","article-title":"Carbon fixation pathways across the bacterial and archaeal tree of life","volume":"1","author":"Garritano","year":"2022","journal-title":"PNAS Nexus"},{"key":"2024102223154292600_ref47","author":"Center. KUB. GenomeNet"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/6\/bbae539\/59963318\/bbae539.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/6\/bbae539\/59963318\/bbae539.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T23:16:08Z","timestamp":1729638968000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae539\/7831257"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,23]]},"references-count":47,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,9,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae539","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,9,23]]},"article-number":"bbae539"}}