{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:45Z","timestamp":1772138025575,"version":"3.50.1"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T00:00:00Z","timestamp":1729728000000},"content-version":"vor","delay-in-days":31,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NIH NIAID","award":["R01 AI170187"],"award-info":[{"award-number":["R01 AI170187"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The COVID-19 pandemic is marked by the successive emergence of new SARS-CoV-2 variants, lineages, and sublineages that outcompete earlier strains, largely due to factors like increased transmissibility and immune escape. We propose DeepAutoCoV, an unsupervised deep learning anomaly detection system, to predict future dominant lineages (FDLs). We define FDLs as viral (sub)lineages that will constitute &amp;gt;10% of all the viral sequences added to the GISAID, a public database supporting viral genetic sequence sharing, in a given week. DeepAutoCoV is trained and validated by assembling global and country-specific data sets from over 16 million Spike protein sequences sampled over a period of ~4\u00a0years. DeepAutoCoV successfully flags FDLs at very low frequencies (0.01%\u20133%), with median lead times of 4\u201317\u00a0weeks, and predicts FDLs between ~5 and\u2009~25 times better than a baseline approach. For example, the B.1.617.2 vaccine reference strain was flagged as FDL when its frequency was only 0.01%, more than a year before it was considered for an updated COVID-19 vaccine. Furthermore, DeepAutoCoV outputs interpretable results by pinpointing specific mutations potentially linked to increased fitness and may provide significant insights for the optimization of public health \u2018pre-emptive\u2019 intervention strategies.<\/jats:p>","DOI":"10.1093\/bib\/bbae535","type":"journal-article","created":{"date-parts":[[2024,10,8]],"date-time":"2024-10-08T15:27:58Z","timestamp":1728401278000},"source":"Crossref","is-referenced-by-count":11,"title":["Forecasting dominance of SARS-CoV-2 lineages by anomaly detection using deep AutoEncoders"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-4405-1697","authenticated-orcid":false,"given":"Simone","family":"Rancati","sequence":"first","affiliation":[{"name":"Department of Electrical , Computer and Biomedical Engineering, , Via Adolfo Ferrata 5, Pavia, 27100 ,","place":["Italy"]},{"name":"University of Pavia , Computer and Biomedical Engineering, , Via Adolfo Ferrata 5, Pavia, 27100 ,","place":["Italy"]}]},{"given":"Giovanna","family":"Nicora","sequence":"additional","affiliation":[{"name":"Department of Electrical , Computer and Biomedical Engineering, , Via Adolfo Ferrata 5, Pavia, 27100 ,","place":["Italy"]},{"name":"University of Pavia , Computer and Biomedical Engineering, , Via Adolfo Ferrata 5, Pavia, 27100 ,","place":["Italy"]}]},{"given":"Mattia","family":"Prosperi","sequence":"additional","affiliation":[{"name":"Department of Epidemiology , College of Public Health and Health Professions, , 2004 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]},{"name":"University of Florida , College of Public Health and Health Professions, , 2004 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]},{"name":"Emerging Pathogens Institute, University of Florida , 2055 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6974-9808","authenticated-orcid":false,"given":"Riccardo","family":"Bellazzi","sequence":"additional","affiliation":[{"name":"Department of Electrical , Computer and Biomedical Engineering, , Via Adolfo Ferrata 5, Pavia, 27100 ,","place":["Italy"]},{"name":"University of Pavia , Computer and Biomedical Engineering, , Via Adolfo Ferrata 5, Pavia, 27100 ,","place":["Italy"]}]},{"given":"Marco","family":"Salemi","sequence":"additional","affiliation":[{"name":"Emerging Pathogens Institute, University of Florida , 2055 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]},{"name":"Department of Pathology , Immunology and Laboratory Medicine, College of Medicine, , 1600 SW Archer Road, Gainesville, FL 32610 ,","place":["United States"]},{"name":"University of Florida , Immunology and Laboratory Medicine, College of Medicine, , 1600 SW Archer Road, Gainesville, FL 32610 ,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5704-3533","authenticated-orcid":false,"given":"Simone","family":"Marini","sequence":"additional","affiliation":[{"name":"Department of Epidemiology , College of Public Health and Health Professions, , 2004 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]},{"name":"University of Florida , College of Public Health and Health Professions, , 2004 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]},{"name":"Emerging Pathogens Institute, University of Florida , 2055 Mowry Road, Gainesville, FL 32610 ,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2024,10,24]]},"reference":[{"key":"2024102413223033000_ref1","volume-title":"\u2018COVID-19 Deaths | WHO COVID-19 Dashboard\u2019, Datadot","year":"2024"},{"key":"2024102413223033000_ref2","doi-asserted-by":"publisher","first-page":"270","DOI":"10.1038\/s41586-020-2012-7","article-title":"A pneumonia outbreak associated with a new coronavirus of probable bat origin","volume":"579","author":"Zhou","year":"2020","journal-title":"Nature"},{"key":"2024102413223033000_ref3","doi-asserted-by":"publisher","first-page":"265","DOI":"10.1038\/s41586-020-2008-3","article-title":"A new coronavirus associated with human respiratory disease in China","volume":"579","author":"Wu","year":"2020","journal-title":"Nature"},{"key":"2024102413223033000_ref4","doi-asserted-by":"publisher","first-page":"22366","DOI":"10.1038\/s41598-020-79484-8","article-title":"Phylogenetic supertree reveals detailed evolution of SARS-CoV-2","volume":"10","author":"Li","year":"2020","journal-title":"Sci Rep"},{"key":"2024102413223033000_ref5","doi-asserted-by":"publisher","first-page":"1403","DOI":"10.1038\/s41564-020-0770-5","article-title":"A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology","volume":"5","author":"Rambaut","year":"2020","journal-title":"Nat Microbiol"},{"key":"2024102413223033000_ref6","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1186\/s12864-022-08358-2","article-title":"Pango lineage designation and assignment using SARS-CoV-2 spike gene nucleotide sequences","volume":"23","author":"O\u2019Toole","year":"2022","journal-title":"BMC Genomics"},{"key":"2024102413223033000_ref7","volume-title":"\u2018Coronavirus Disease 2019 (COVID-19)\u2019, Centers for Disease Control and Prevention","author":"CDC"},{"key":"2024102413223033000_ref8","doi-asserted-by":"publisher","first-page":"1161","DOI":"10.1038\/s41564-022-01143-7","article-title":"SARS-CoV-2 omicron is an immune escape variant with an altered cell entry pathway","volume":"7","author":"Willett","year":"2022","journal-title":"Nat Microbiol"},{"key":"2024102413223033000_ref9","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1007\/s15010-021-01734-2","article-title":"Waves and variants of SARS-CoV-2: understanding the causes and effect of the COVID-19 catastrophe","volume":"50","author":"Thakur","year":"2022","journal-title":"Infection"},{"key":"2024102413223033000_ref10","doi-asserted-by":"publisher","first-page":"e229","DOI":"10.1016\/S2666-5247(20)30116-6","article-title":"A genomics network established to respond rapidly to public health threats in South Africa","volume":"1","author":"Msomi","year":"2020","journal-title":"Lancet Microbe"},{"key":"2024102413223033000_ref11","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2023.106618","article-title":"Early computational detection of potential high-risk SARS-CoV-2 variants","volume":"155","author":"Beguir","year":"2023","journal-title":"Comput Biol Med"},{"key":"2024102413223033000_ref12","doi-asserted-by":"publisher","first-page":"1110","DOI":"10.1038\/s41591-022-01836-w","article-title":"An early warning system for emerging SARS-CoV-2 variants","volume":"28","author":"Subissi","year":"2022","journal-title":"Nat Med"},{"key":"2024102413223033000_ref13","article-title":"Statement on the antigen composition of COVID-19 vaccines","year":"2024"},{"key":"2024102413223033000_ref14","doi-asserted-by":"publisher","first-page":"106264","DOI":"10.1016\/j.compbiomed.2022.106264","article-title":"TEMPO: a transformer-based mutation prediction framework for SARS-CoV-2 evolution","volume":"152","author":"Zhou","year":"2023","journal-title":"Comput Biol Med"},{"key":"2024102413223033000_ref15","doi-asserted-by":"publisher","first-page":"2007","DOI":"10.1038\/s41591-023-02483-5","article-title":"Deep-learning-enabled protein\u2013protein interaction analysis for prediction of SARS-CoV-2 infectivity and variant evolution","volume":"29","author":"Wang","year":"2023","journal-title":"Nat Med"},{"key":"2024102413223033000_ref16","doi-asserted-by":"publisher","first-page":"06.21.21259286","DOI":"10.1101\/2021.06.21.21259286","article-title":"Predicting the mutational drivers of future SARS-CoV-2 variants of concern","author":"Maher","year":"2021","journal-title":"medRxiv"},{"key":"2024102413223033000_ref17","doi-asserted-by":"publisher","first-page":"3549","DOI":"10.1093\/bioinformatics\/btac370","article-title":"VOC-alarm: mutation-based prediction of SARS-CoV-2 variants of concern","volume":"38","author":"Zhao","year":"2022","journal-title":"Bioinformatics"},{"key":"2024102413223033000_ref18","doi-asserted-by":"publisher","first-page":"4008","DOI":"10.1016\/j.cell.2022.08.024","article-title":"Deep mutational learning predicts ACE2 binding and antibody escape to combinatorial mutations in the SARS-CoV-2 receptor-binding domain","volume":"185","author":"Taft","year":"2022","journal-title":"Cell"},{"key":"2024102413223033000_ref19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/WTS.2018.8363930","volume-title":"2018 Wireless Telecommunications Symposium (WTS)","author":"Chen","year":"2018"},{"key":"2024102413223033000_ref20","doi-asserted-by":"publisher","first-page":"e100643","DOI":"10.1136\/bmjhci-2022-100643","article-title":"Predicting emerging SARS-CoV-2 variants of concern through a one class dynamic anomaly detection algorithm","volume":"29","author":"Nicora","year":"2022","journal-title":"BMJ Health Care Inform"},{"key":"2024102413223033000_ref21","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1038\/s41579-021-00573-0","article-title":"SARS-CoV-2 variants, spike mutations and immune escape","volume":"19","author":"Harvey","year":"2021","journal-title":"Nat Rev Microbiol"},{"key":"2024102413223033000_ref22","doi-asserted-by":"publisher","first-page":"856","DOI":"10.1093\/bioinformatics\/btab725","article-title":"Optimizing viral genome subsampling by genetic diversity and temporal distribution (TARDiS) for phylogenetics","volume":"38","author":"Marini","year":"2022","journal-title":"Bioinformatics"},{"key":"2024102413223033000_ref23","doi-asserted-by":"publisher","DOI":"10.3389\/fmicb.2023.1060891","article-title":"The K-mer antibiotic resistance gene variant analyzer (KARGVA)","volume":"14","author":"Marini","year":"2023","journal-title":"Front Microbiol"},{"key":"2024102413223033000_ref24","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2024102413223033000_ref25","article-title":"\u2018TensorFlow: a system for large-scale machine learning\u2019, arXiv.org","volume":"09","author":"Abadi","year":"2024","journal-title":"Accessed: Jul"},{"key":"2024102413223033000_ref26","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1016\/j.neucom.2015.08.104","article-title":"Auto-encoder based dimensionality reduction","volume":"184","author":"Wang","year":"2016","journal-title":"Neurocomputing"},{"key":"2024102413223033000_ref27","doi-asserted-by":"publisher","first-page":"108756","DOI":"10.1016\/j.knosys.2022.108756","article-title":"CPDGA: change point driven growing auto-encoder for lifelong anomaly detection","volume":"247","author":"Corizzo","year":"2022","journal-title":"Knowl-Based Syst"},{"key":"2024102413223033000_ref28","volume-title":"22 a Model of Evolutionary Change in Proteins","author":"O. M","year":"1978"},{"key":"2024102413223033000_ref29","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1109\/BigDataService55688.2022.00020","volume-title":"2022 IEEE Eighth International Conference on Big Data Computing Service and Applications (BigDataService)","author":"Ali","year":"2022"},{"key":"2024102413223033000_ref30","article-title":"Spike2Vec: an efficient and scalable embedding approach for COVID-19 spike sequences","author":"Ali","year":"2024"},{"key":"2024102413223033000_ref31","doi-asserted-by":"publisher","first-page":"2023.08.02.23293212","DOI":"10.1101\/2023.08.02.23293212","article-title":"Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project","author":"Stenton","year":"2023","journal-title":"medRxiv"},{"key":"2024102413223033000_ref32","doi-asserted-by":"publisher","DOI":"10.1016\/S1473-3099(24)00298-6","article-title":"Virological characteristics of the SARS-CoV-2 KP.2 variant","volume":"24","author":"Kaku","year":"2024","journal-title":"Lancet Infect Dis"},{"key":"2024102413223033000_ref33","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1093\/qjmed\/hcae102","article-title":"The emerging challenge of FLiRT variants: KP.1.1 and KP.2 in the global pandemic landscape","volume":"117","author":"Kumar","year":"2024","journal-title":"QJM"},{"key":"2024102413223033000_ref34","doi-asserted-by":"publisher","first-page":"1230","DOI":"10.1038\/s41591-021-01378-7","article-title":"COVID-19 in Amazonas, Brazil, was driven by the persistence of endemic lineages and P.1 emergence","volume":"27","author":"Naveca","year":"2021","journal-title":"Nat Med"},{"key":"2024102413223033000_ref35","doi-asserted-by":"publisher","first-page":"2057","DOI":"10.1093\/cid\/ciab736","article-title":"Rapid emergence and spread of severe acute respiratory syndrome coronavirus 2 gamma (P.1) variant in Haiti","volume":"74","author":"Tagliamonte","year":"2022","journal-title":"Clin Infect Dis"},{"key":"2024102413223033000_ref36","volume-title":"Pfizer and BioNTech Provide Update on Omicron Variant | Pfizer","year":"2024"},{"key":"2024102413223033000_ref37","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1038\/s41579-023-00878-2","article-title":"The evolution of SARS-CoV-2","volume":"21","author":"Markov","year":"2023","journal-title":"Nat Rev Microbiol"},{"key":"2024102413223033000_ref38","doi-asserted-by":"publisher","first-page":"110829","DOI":"10.1016\/j.celrep.2022.110829","article-title":"Delta spike P681R mutation enhances SARS-CoV-2 fitness over alpha variant","volume":"39","author":"Liu","year":"2022","journal-title":"Cell Rep"},{"key":"2024102413223033000_ref39","doi-asserted-by":"publisher","DOI":"10.3389\/fchem.2022.892093","article-title":"Deciphering the impact of mutations on the binding efficacy of SARS-CoV-2 omicron and Delta variants with human ACE2 receptor","volume":"10","author":"Khan","year":"2022","journal-title":"Front Chem"},{"key":"2024102413223033000_ref40","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1007\/s10462-023-10662-6","article-title":"Autoencoders and their applications in machine learning: a survey","volume":"57","author":"Berahmand","year":"2024","journal-title":"Artif Intell Rev"},{"key":"2024102413223033000_ref41","volume-title":"Deep Learning","author":"Goodfellow","year":"2016"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/6\/bbae535\/60016654\/bbae535.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/6\/bbae535\/60016654\/bbae535.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T09:22:46Z","timestamp":1729761766000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae535\/7833672"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,23]]},"references-count":41,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,9,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae535","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.10.24.563721","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,9,23]]},"article-number":"bbae535"}}