{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T08:55:22Z","timestamp":1778057722283,"version":"3.51.4"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2022,3,2]],"date-time":"2022-03-02T00:00:00Z","timestamp":1646179200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Capital\u2019s Funds for Health Improvement and Research","award":["2021-1G-4302"],"award-info":[{"award-number":["2021-1G-4302"]}]},{"DOI":"10.13039\/501100007077","name":"National Institute of Biomedical Innovation","doi-asserted-by":"publisher","award":["18CXZ038"],"award-info":[{"award-number":["18CXZ038"]}],"id":[{"id":"10.13039\/501100007077","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["32070166"],"award-info":[{"award-number":["32070166"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,5,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Explosively emerging SARS-CoV-2 variants challenge current nomenclature schemes based on genetic diversity and biological significance. Genomic composition-based machine learning methods have recently performed well in identifying phenotype\u2013genotype relationships. We introduced a framework involving dinucleotide (DNT) composition representation (DCR) to parse the general human adaptation of RNA viruses and applied a three-dimensional convolutional neural network (3D CNN) analysis to learn the human adaptation of other existing coronaviruses (CoVs) and predict the adaptation of SARS-CoV-2 variants of concern (VOCs). 
A markedly separable, linear DCR distribution was observed in two major genes\u2014receptor-binding glycoprotein and RNA-dependent RNA polymerase (RdRp)\u2014of six families of single-stranded (ssRNA) viruses. Additionally, there was a general host-specific distribution of both the spike proteins and RdRps of CoVs. The 3D CNN based on spike DCR predicted a dominant type II adaptation of most Beta, Delta and Omicron VOCs, with high transmissibility and low pathogenicity. Type I adaptation with opposite transmissibility and pathogenicity was predicted for SARS-CoV-2 Alpha VOCs (77%) and Kappa variants of interest (58%). The identified adaptive determinants included D1118H and A570D mutations and local DNTs. Thus, the 3D CNN model based on DCR features predicts SARS-CoV-2, a major type II human adaptation and is qualified to predict variant adaptation in real time, facilitating the risk-assessment of emerging SARS-CoV-2 variants and COVID-19 control.<\/jats:p>","DOI":"10.1093\/bib\/bbac036","type":"journal-article","created":{"date-parts":[[2022,1,26]],"date-time":"2022-01-26T12:10:29Z","timestamp":1643199029000},"source":"Crossref","is-referenced-by-count":27,"title":["Deep learning based on biologically interpretable genome representation predicts two types of human adaptation of SARS-CoV-2 variants"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6860-6135","authenticated-orcid":false,"given":"Jing","family":"Li","sequence":"first","affiliation":[{"name":"State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China"}]},{"given":"Ya-Nan","family":"Wu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China"}]},{"given":"Sen","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pathogen and Biosecurity, Beijing 
Institute of Microbiology and Epidemiology, Beijing 100071, China"}]},{"given":"Xiao-Ping","family":"Kang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1908-2926","authenticated-orcid":false,"given":"Tao","family":"Jiang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China"}]}],"member":"286","published-online":{"date-parts":[[2022,3,2]]},"reference":[{"key":"2022051813072605600_ref1","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1016\/j.chom.2010.05.009","article-title":"Influenza virus evolution, host adaptation, and pandemic formation","volume":"7","author":"Taubenberger","year":"2010","journal-title":"Cell Host Microbe"},{"key":"2022051813072605600_ref2","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1038\/s41579-018-0120-2","article-title":"Prisoners of war\u2014host adaptation and its constraints on virus evolution","volume":"17","author":"Simmonds","year":"2019","journal-title":"Nat Rev Microbiol"},{"key":"2022051813072605600_ref3","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1038\/s41579-018-0118-9","article-title":"Origin and evolution of pathogenic coronaviruses","volume":"17","author":"Cui","year":"2019","journal-title":"Nat Rev Microbiol"},{"key":"2022051813072605600_ref4","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1146\/annurev-micro-020518-115759","article-title":"Human coronavirus: host-pathogen interaction","volume":"73","author":"Fung","year":"2019","journal-title":"Annu Rev Microbiol"},{"key":"2022051813072605600_ref5","doi-asserted-by":"crossref","DOI":"10.3390\/diseases4030026","article-title":"Human coronaviruses: a review of virus-host 
interactions","volume":"4","author":"Lim","year":"2016","journal-title":"Diseases"},{"key":"2022051813072605600_ref6","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1038\/s41586-020-2012-7","article-title":"A pneumonia outbreak associated with a new coronavirus of probable bat origin","volume":"579","author":"Zhou","year":"2020","journal-title":"Nature"},{"key":"2022051813072605600_ref7","doi-asserted-by":"crossref","first-page":"1952","DOI":"10.1056\/NEJMe2103931","article-title":"Interplay between emerging SARS-CoV-2 variants and pandemic control","volume":"384","author":"Neuzil","year":"2021","journal-title":"N Engl J Med"},{"key":"2022051813072605600_ref8","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/s41423-021-00648-1","article-title":"Emerging SARS-CoV-2 variants reduce neutralization sensitivity to convalescent sera and monoclonal antibodies","volume":"18","author":"Hu","year":"2021","journal-title":"Cell Mol Immunol"},{"key":"2022051813072605600_ref9","doi-asserted-by":"crossref","first-page":"848","DOI":"10.1038\/s41467-021-21118-2","article-title":"SARS-CoV-2 D614G spike mutation increases entry efficiency with enhanced ACE2-binding affinity","volume":"12","author":"Ozono","year":"2021","journal-title":"Nat Commun"},{"key":"2022051813072605600_ref10","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1038\/s41392-021-00502-w","article-title":"Mutation D614G increases SARS-CoV-2 transmission","volume":"6","author":"Arora","year":"2021","journal-title":"Signal Transduct Target Ther"},{"key":"2022051813072605600_ref11","doi-asserted-by":"crossref","DOI":"10.1038\/s41586-021-03471-w","article-title":"Escape of SARS-CoV-2 501Y.V2 from neutralization by convalescent plasma","volume":"593","author":"Cele","year":"2021","journal-title":"Nature"},{"key":"2022051813072605600_ref12","article-title":"Emerging SARS-CoV-2 variants and impact in global vaccination programs against 
SARS-CoV-2\/COVID-19","volume":"9","author":"Gomez","year":"2021","journal-title":"Vaccines (Basel)"},{"key":"2022051813072605600_ref13","doi-asserted-by":"crossref","first-page":"2212","DOI":"10.1056\/NEJMoa2105000","article-title":"Vaccine breakthrough infections with SARS-CoV-2 variants","volume":"384","author":"Hacisuleyman","year":"2021","journal-title":"N Engl J Med"},{"key":"2022051813072605600_ref14","article-title":"Emerging vaccine-breakthrough SARS-CoV-2 variants","volume":"9","author":"Wang","year":"2021","journal-title":"ArXiv"},{"key":"2022051813072605600_ref15","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/s41564-021-00872-5","article-title":"Addendum: a dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology","volume":"6","author":"Rambaut","year":"2021","journal-title":"Nat Microbiol"},{"key":"2022051813072605600_ref16","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1126\/science.abf2303","article-title":"Structural impact on SARS-CoV-2 spike protein by D614G substitution","volume":"372","author":"Zhang","year":"2021","journal-title":"Science"},{"key":"2022051813072605600_ref17","doi-asserted-by":"crossref","first-page":"D962","DOI":"10.1093\/nar\/gkab979","article-title":"CompoDynamics: a comprehensive database for characterizing sequence composition dynamics","volume":"50","author":"Jiang","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2022051813072605600_ref18","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1126\/science.aap9072","article-title":"Predicting reservoir hosts and arthropod vectors from evolutionary signatures in RNA virus genomes","volume":"362","author":"Babayan","year":"2018","journal-title":"Science"},{"key":"2022051813072605600_ref19","doi-asserted-by":"crossref","first-page":"1224","DOI":"10.1093\/molbev\/msz276","article-title":"Machine learning methods for predicting human-adaptive influenza a viruses based on viral nucleotide 
compositions","volume":"37","author":"Li","year":"2020","journal-title":"Mol Biol Evol"},{"key":"2022051813072605600_ref20","article-title":"Effective and scalable single-cell data alignment with non-linear canonical correlation analysis","author":"Hu","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2022051813072605600_ref21","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1126\/science.abd7331","article-title":"Learning the language of viral evolution and escape","volume":"371","author":"Hie","year":"2021","journal-title":"Science"},{"key":"2022051813072605600_ref22","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1038\/s41592-019-0598-1","article-title":"Unified rational protein engineering with sequence-based deep representation learning","volume":"16","author":"Alley","year":"2019","journal-title":"Nat Methods"},{"key":"2022051813072605600_ref23","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"86","author":"Van Der","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2022051813072605600_ref24","first-page":"20150202","article-title":"Principal component analysis: a review and recent developments","volume":"374","author":"Jolliffe","year":"2016","journal-title":"Philos Trans A Math Phys Eng Sci"},{"key":"2022051813072605600_ref25","first-page":"e246102","article-title":"Review of machine learning methods in soft robotics","volume":"16","author":"Kim","year":"2021","journal-title":"PLoS One"},{"key":"2022051813072605600_ref26","doi-asserted-by":"crossref","first-page":"1731","DOI":"10.15585\/mmwr.mm7050e1","article-title":"CDC COVID-19 Response Team. 
SARS-CoV-2 B.1.1.529 (Omicron) variant\u2014United States, December 1\u20138, 2021","volume":"70","year":"2021","journal-title":"Morb Mortal Wkly Rep"},{"key":"2022051813072605600_ref27","doi-asserted-by":"crossref","first-page":"1188","DOI":"10.1038\/s41564-021-00954-4","article-title":"SARS-CoV-2 variant prediction and antiviral drug design are enabled by RBD in vitro evolution","volume":"6","author":"Zahradnik","year":"2021","journal-title":"Nat Microbiol"},{"key":"2022051813072605600_ref28","doi-asserted-by":"crossref","DOI":"10.3390\/v13050935","article-title":"Prediction and evolution of the molecular fitness of SARS-CoV-2 variants: introducing SpikePro","volume":"13","author":"Pucci","year":"2021","journal-title":"Viruses"},{"key":"2022051813072605600_ref29","doi-asserted-by":"crossref","first-page":"6929","DOI":"10.1039\/D1SC01203G","article-title":"Prediction and mitigation of mutation threats to COVID-19 vaccines and antibody therapies","volume":"12","author":"Chen","year":"2021","journal-title":"Chem Sci"},{"key":"2022051813072605600_ref30","doi-asserted-by":"crossref","first-page":"1252","DOI":"10.1093\/molbev\/msg149","article-title":"A codon-based model of host-specific selection in parasites, with an application to the influenza a virus","volume":"20","author":"Forsberg","year":"2003","journal-title":"Mol Biol Evol"},{"key":"2022051813072605600_ref31","doi-asserted-by":"crossref","first-page":"4583","DOI":"10.1093\/nar\/gkl597","article-title":"Codon usage bias and tRNA over-expression in Buchnera aphidicola after aromatic amino acid nutritional stress on its host Acyrthosiphon pisum","volume":"34","author":"Charles","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2022051813072605600_ref32","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1038\/msb.2009.71","article-title":"Viral adaptation to host: a proteome-based analysis of codon usage and amino acid preferences","volume":"5","author":"Bahir","year":"2009","journal-title":"Mol 
Syst Biol"},{"key":"2022051813072605600_ref33","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1038\/s41559-020-1124-7","article-title":"Dissimilation of synonymous codon usage bias in virus-host coevolution due to translational selection","volume":"4","author":"Chen","year":"2020","journal-title":"Nat Ecol Evol"},{"key":"2022051813072605600_ref34","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/s41467-018-07391-8","article-title":"Central dogma rates and the trade-off between precision and economy in gene expression","volume":"10","author":"Hausser","year":"2019","journal-title":"Nat Commun"},{"key":"2022051813072605600_ref35","doi-asserted-by":"crossref","first-page":"13816","DOI":"10.1128\/JVI.02515-13","article-title":"CpG dinucleotide frequencies reveal the role of host methylation capabilities in parvovirus evolution","volume":"87","author":"Upadhyay","year":"2013","journal-title":"J Virol"},{"key":"2022051813072605600_ref36","doi-asserted-by":"crossref","first-page":"e1009603","DOI":"10.1371\/journal.ppat.1009603","article-title":"Characterisation of the Semliki Forest virus-host cell interactome reveals the viral capsid protein as an inhibitor of nonsense-mediated mRNA decay","volume":"17","author":"Contu","year":"2021","journal-title":"PLoS Pathog"},{"key":"2022051813072605600_ref37","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1146\/annurev.genet.42.110807.091442","article-title":"Selection on codon bias","volume":"42","author":"Hershberg","year":"2008","journal-title":"Annu Rev Genet"},{"key":"2022051813072605600_ref38","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/S0168-9525(00)02041-2","article-title":"tRNA gene number and codon usage in the C. 
elegans genome are co-adapted for optimal translation of highly expressed genes","volume":"16","author":"Duret","year":"2000","journal-title":"Trends Genet"},{"key":"2022051813072605600_ref39","doi-asserted-by":"crossref","first-page":"2699","DOI":"10.1093\/molbev\/msaa094","article-title":"Extreme genomic CpG deficiency in SARS-CoV-2 and evasion of host antiviral defense","volume":"37","author":"Xia","year":"2020","journal-title":"Mol Biol Evol"},{"key":"2022051813072605600_ref40","doi-asserted-by":"crossref","first-page":"2706","DOI":"10.1093\/molbev\/msaa178","article-title":"Viral CpG deficiency provides no evidence that dogs were intermediate hosts for SARS-CoV-2","volume":"37","author":"Pollock","year":"2020","journal-title":"Mol Biol Evol"},{"key":"2022051813072605600_ref41","doi-asserted-by":"crossref","first-page":"548275","DOI":"10.3389\/fmicb.2021.548275","article-title":"Base composition and host adaptation of the SARS-CoV-2: insight from the codon usage perspective","volume":"12","author":"Roy","year":"2021","journal-title":"Front Microbiol"},{"key":"2022051813072605600_ref42","doi-asserted-by":"crossref","DOI":"10.1038\/s41592-019-0598-1","article-title":"Unified rational protein engineering with sequence-based deep representation learning","volume":"16","author":"Alley","year":"2019","journal-title":"Nat Methods"},{"key":"2022051813072605600_ref43","first-page":"9689","article-title":"Evaluating protein transfer learning with TAPE","volume":"32","author":"Rao","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"2022051813072605600_ref44","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1038\/s41591-020-0820-9","article-title":"The proximal origin of SARS-CoV-2","volume":"26","author":"Andersen","year":"2020","journal-title":"Nat Med"},{"key":"2022051813072605600_ref45","doi-asserted-by":"crossref","first-page":"D466","DOI":"10.1093\/nar\/gkw857","article-title":"Influenza research database: an integrated bioinformatics 
resource for influenza virus research","volume":"45","author":"Zhang","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2022051813072605600_ref46","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1007\/s003359900728","article-title":"High sequence similarity within ras exons 1 and 2 in different mammalian species and phylogenetic divergence of the ras gene family","volume":"9","author":"Watzinger","year":"1998","journal-title":"Mamm Genome"},{"key":"2022051813072605600_ref47","doi-asserted-by":"crossref","first-page":"276","DOI":"10.3201\/eid2002.131182","article-title":"Replicative capacity of MERS coronavirus in livestock cell lines","volume":"20","author":"Eckerle","year":"2014","journal-title":"Emerg Infect Dis"},{"key":"2022051813072605600_ref48","article-title":"Genetic detection and pathological finding of BVDV and BHV-1 in camel calves","volume-title":"Assiut Vet Med J","author":"Gafer","year":"2015"},{"key":"2022051813072605600_ref49","doi-asserted-by":"crossref","first-page":"40","DOI":"10.3758\/BF03213026","article-title":"Theoretical analysis of an alphabetic confusion matrix","volume":"9","author":"Townsend","year":"1971","journal-title":"Percept Psychophys"},{"key":"2022051813072605600_ref50","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2005","journal-title":"Pattern Recognit Lett"},{"key":"2022051813072605600_ref51","article-title":"Logomaker: beautiful sequence logos in python","volume":"7","author":"Ammar","year":"2019","journal-title":"Bioinformatics"},{"key":"2022051813072605600_ref52","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1002\/gch2.1018","article-title":"Data, disease and diplomacy: GISAID's innovative contribution to global health","volume":"1","author":"Elbe","year":"2017","journal-title":"Global Challenges"}],"container-title":["Briefings in 
Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/3\/bbac036\/43745512\/bbac036.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/3\/bbac036\/43745512\/bbac036.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,18]],"date-time":"2022-05-18T13:09:33Z","timestamp":1652879373000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac036\/6540151"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,2]]},"references-count":52,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,5,13]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac036","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5]]},"published":{"date-parts":[[2022,3,2]]},"article-number":"bbac036"}}