{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:00Z","timestamp":1772138040871,"version":"3.50.1"},"reference-count":12,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2020,10,12]],"date-time":"2020-10-12T00:00:00Z","timestamp":1602460800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"PRAIRIE","award":["ANR-19-P3IA-0001"],"award-info":[{"award-number":["ANR-19-P3IA-0001"]}]},{"name":"Ecole Normale Sup\u00e9rieure"},{"name":"Ecole Doctorale Fronti\u00e8res de l'Innovation en Recherche et Education - Programme Bettencourt"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>The first cases of the COVID-19 pandemic emerged in December 2019. Until the end of February 2020, the number of available genomes was below 1000 and their multiple alignment was easily achieved using standard approaches. Subsequently, the availability of genomes has grown dramatically. Moreover, some genomes are of low quality with sequencing\/assembly errors, making accurate re-alignment of all genomes nearly impossible on a daily basis. A more efficient, yet accurate approach was clearly required to pursue all subsequent bioinformatics analyses of this crucial data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>hCoV-19 genomes are highly conserved, with very few indels and no recombination. 
This makes the profile HMM approach particularly well suited to align new genomes, add them to an existing alignment and filter problematic ones. Using a core of \u223c2500 high quality genomes, we estimated a profile using HMMER, and implemented this profile in COVID-Align, a user-friendly interface to be used online or as standalone via Docker. The alignment of 1000 genomes requires \u223c50 minutes on our cluster. Moreover, COVID-Align provides summary statistics, which can be used to determine the sequencing quality and evolutionary novelty of input genomes (e.g. number of new mutations and indels).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>https:\/\/covalign.pasteur.cloud, hub.docker.com\/r\/evolbioinfo\/covid-align.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa871","type":"journal-article","created":{"date-parts":[[2020,9,24]],"date-time":"2020-09-24T07:57:26Z","timestamp":1600934246000},"page":"1761-1762","source":"Crossref","is-referenced-by-count":8,"title":["COVID-Align: accurate online alignment of hCoV-19 genomes using a profile HMM"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9576-4449","authenticated-orcid":false,"given":"Fr\u00e9d\u00e9ric","family":"Lemoine","sequence":"first","affiliation":[{"name":"Unit\u00e9 de Bioinformatique Evolutive, USR 3756 (DBC\/C3BI), Institut Pasteur & CNRS , 75015 - Paris, France"},{"name":"Hub de Bioinformatique et Biostatistique, USR 3756 (DBC\/C3BI), Institut Pasteur & CNRS , 75015 - Paris, 
France"}]},{"given":"Luc","family":"Blassel","sequence":"additional","affiliation":[{"name":"Unit\u00e9 de Bioinformatique Evolutive, USR 3756 (DBC\/C3BI), Institut Pasteur & CNRS , 75015 - Paris, France"},{"name":"ED515, Sorbonne Universit\u00e9, Colle\u0300ge Doctoral, 75006 - \u00a0Paris, France"}]},{"given":"Jakub","family":"Voznica","sequence":"additional","affiliation":[{"name":"Unit\u00e9 de Bioinformatique Evolutive, USR 3756 (DBC\/C3BI), Institut Pasteur & CNRS , 75015 - Paris, France"},{"name":"Universit\u00e9 de Paris , 75006 Paris, France"}]},{"given":"Olivier","family":"Gascuel","sequence":"additional","affiliation":[{"name":"Unit\u00e9 de Bioinformatique Evolutive, USR 3756 (DBC\/C3BI), Institut Pasteur & CNRS , 75015 - Paris, France"},{"name":"Acad\u00e9mie des Sciences , USR 3756, CNRS, 75015 - Paris, France"}]}],"member":"286","published-online":{"date-parts":[[2020,10,12]]},"reference":[{"key":"2023051709455177200_btaa871-B1","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1093\/bib\/bbv099","article-title":"Multiple sequence alignment modeling: methods and applications","volume":"17","author":"Chatzou","year":"2016","journal-title":"Brief. Bioinform"},{"key":"2023051709455177200_btaa871-B2","author":"De Maio","year":"2020"},{"key":"2023051709455177200_btaa871-B3","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1038\/nbt.3820","article-title":"Nextflow enables reproducible computational workflows","volume":"35","author":"Di Tommaso","year":"2017","journal-title":"Nat. 
Biotechnol"},{"key":"2023051709455177200_btaa871-B4","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids","author":"Durbin","year":"1998"},{"key":"2023051709455177200_btaa871-B5","doi-asserted-by":"crossref","first-page":"2077","DOI":"10.1101\/gr.174920.114","article-title":"Alignathon: a competitive assessment of whole-genome alignment methods","volume":"24","author":"Earl","year":"2014","journal-title":"Genome Res"},{"key":"2023051709455177200_btaa871-B6","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1093\/molbev\/mst010","article-title":"MAFFT multiple sequence alignment software version 7: improvements in performance and usability","volume":"30","author":"Katoh","year":"2013","journal-title":"Mol. Biol. Evol"},{"key":"2023051709455177200_btaa871-B7","doi-asserted-by":"crossref","first-page":"W260","DOI":"10.1093\/nar\/gkz303","article-title":"NGPhylogeny.fr: new generation phylogenetic services for non-specialists","volume":"47","author":"Lemoine","year":"2019","journal-title":"Nuc. 
Acids Res"},{"key":"2023051709455177200_btaa871-B8","author":"Li","year":"2020"},{"key":"2023051709455177200_btaa871-B9","doi-asserted-by":"crossref","first-page":"764","DOI":"10.1186\/s12864-016-3101-8","article-title":"Scaling statistical multiple sequence alignment to large datasets","volume":"17","author":"Nute","year":"2016","journal-title":"BMC Genomics"},{"key":"2023051709455177200_btaa871-B10","doi-asserted-by":"crossref","DOI":"10.2807\/1560-7917.ES.2017.22.13.30494","article-title":"GISAID: global initiative on sharing all influenza data \u2013 from vision to reality","volume":"22","author":"Shu","year":"2017","journal-title":"EuroSurveillance"},{"key":"2023051709455177200_btaa871-B11","first-page":"1012","author":"Xiaolu","year":"2020"},{"key":"2023051709455177200_btaa871-B12","doi-asserted-by":"crossref","first-page":"3501","DOI":"10.1093\/bioinformatics\/btw474","article-title":"MSAViewer: interactive JavaScript visualization of multiple sequence alignments","volume":"32","author":"Yachdav","year":"2016","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa871\/33964790\/btaa871.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1761\/50361266\/btaa871.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1761\/50361266\/btaa871.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T06:29:12Z","timestamp":1684304952000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/12\/1761\/5921170"}},"s
ubtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,10,12]]},"references-count":12,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2021,7,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa871","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.05.25.114884","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,6,15]]},"published":{"date-parts":[[2020,10,12]]}}}