{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T20:31:06Z","timestamp":1770755466983,"version":"3.50.0"},"reference-count":15,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2024,9,26]],"date-time":"2024-09-26T00:00:00Z","timestamp":1727308800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"City University of Hong Kong, Hong Kong Research Grants Council","award":["11206819"],"award-info":[{"award-number":["11206819"]}]},{"name":"City University of Hong Kong, Hong Kong Research Grants Council","award":["11217521"],"award-info":[{"award-number":["11217521"]}]},{"DOI":"10.13039\/501100007156","name":"Hong Kong Innovation and Technology Fund","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007156","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>RNA viruses are ubiquitous across a broad spectrum of ecosystems. Therefore, beyond their significant implications for public health, RNA viruses are also key players in ecological processes. High-throughput sequencing has accelerated the discovery of RNA viruses. Nevertheless, many of these viruses lack taxonomic annotation, posing a challenge to functional inference and evolutionary study. In particular, virus classification at the genus level remains difficult due to the limited reference data and ambiguous boundaries between some closely related genera. 
We introduce VirTAXA, a robust classification tool that combines remote homology search and tree-based validation to enhance the genus-level taxonomic classification of RNA viruses. VirTAXA is able to predict the genus label of an assembled viral contig and provide evidence type for each prediction. It achieves comparable accuracy to state-of-the-art methods while assigning genus labels to a greater number of sequences. Specifically, on the Global Ocean RNA metatranscriptomic data, VirTAXA can assign genus labels for 18% more contigs than the second-best classification tool. Furthermore, we demonstrated that VirTAXA can be conveniently extended to other types of viruses.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The source code and data of VirTAXA are available via https:\/\/github.com\/JudithEllyn\/VirTAXA.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae575","type":"journal-article","created":{"date-parts":[[2024,9,26]],"date-time":"2024-09-26T18:09:46Z","timestamp":1727374186000},"source":"Crossref","is-referenced-by-count":2,"title":["VirTAXA: enhancing RNA virus taxonomic classification with remote homology search and tree-based validation"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-7730-2739","authenticated-orcid":false,"given":"Yilin","family":"Zhu","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering, City University of Hong Kong , Tat Chee Avenue, Kowloon, Hong Kong, 999077, SAR","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1071-4993","authenticated-orcid":false,"given":"Guowei","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, City University of Hong Kong , Tat Chee Avenue, Kowloon, Hong Kong, 999077, 
SAR","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1373-8023","authenticated-orcid":false,"given":"Yanni","family":"Sun","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, City University of Hong Kong , Tat Chee Avenue, Kowloon, Hong Kong, 999077, SAR","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2024,9,26]]},"reference":[{"key":"2024101004160752900_btae575-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"key":"2024101004160752900_btae575-B2","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1038\/s41592-021-01101-x","article-title":"Sensitive protein alignments at tree-of-life scale using DIAMOND","volume":"18","author":"Buchfink","year":"2021","journal-title":"Nat Methods"},{"key":"2024101004160752900_btae575-B3","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden Markov models","volume":"14","author":"Eddy","year":"1998","journal-title":"Bioinformatics"},{"key":"2024101004160752900_btae575-B4","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/nar\/30.7.1575","article-title":"An efficient algorithm for large-scale detection of protein families","volume":"30","author":"Enright","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2024101004160752900_btae575-B5","doi-asserted-by":"crossref","first-page":"bbad408","DOI":"10.1093\/bib\/bbad408","article-title":"PhaGenus: genus-level classification of bacteriophages using a transformer model","volume":"24","author":"Guan","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024101004160752900_btae575-B6","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1016\/j.coviro.2013.03.010","article-title":"Viral surveillance and 
discovery","volume":"3","author":"Lipkin","year":"2013","journal-title":"Curr Opin Virol"},{"key":"2024101004160752900_btae575-B7","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1186\/s40168-020-00900-2","article-title":"Ultrafast and accurate 16s rRNA microbial community analysis using kraken 2","volume":"8","author":"Lu","year":"2020","journal-title":"Microbiome"},{"key":"2024101004160752900_btae575-B8","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1186\/s12864-018-4620-2","article-title":"TreeShrink: fast and accurate detection of outlier long branches in collections of phylogenetic trees","volume":"19","author":"Mai","year":"2018","journal-title":"BMC Genomics"},{"key":"2024101004160752900_btae575-B9","doi-asserted-by":"crossref","first-page":"3029","DOI":"10.1093\/bioinformatics\/btab184","article-title":"Fast and sensitive taxonomic assignment to metagenomic contigs","volume":"37","author":"Mirdita","year":"2021","journal-title":"Bioinformatics"},{"key":"2024101004160752900_btae575-B10","doi-asserted-by":"crossref","first-page":"e1009492","DOI":"10.1371\/journal.pcbi.1009492","article-title":"Constructing benchmark test sets for biological sequence analysis using independent set algorithms","volume":"18","author":"Petti","year":"2022","journal-title":"PLoS Comput Biol"},{"key":"2024101004160752900_btae575-B11","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1186\/s13059-019-1817-x","article-title":"Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT","volume":"20","author":"von Meijenfeldt","year":"2019","journal-title":"Genome Biol"},{"key":"2024101004160752900_btae575-B12","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1038\/nature05775","article-title":"Origins of major human infectious 
diseases","volume":"447","author":"Wolfe","year":"2007","journal-title":"Nature"},{"key":"2024101004160752900_btae575-B13","doi-asserted-by":"crossref","first-page":"761","DOI":"10.1080\/22221751.2020.1747363","article-title":"Transcriptomic characteristics of bronchoalveolar lavage fluid and peripheral blood mononuclear cells in COVID-19 patients","volume":"9","author":"Xiong","year":"2020","journal-title":"Emerg Microbes Infect"},{"key":"2024101004160752900_btae575-B14","doi-asserted-by":"crossref","first-page":"960465","DOI":"10.3389\/fmicb.2022.960465","article-title":"A discussion of RNA virus taxonomy based on the 2020 international committee on taxonomy of viruses report","volume":"13","author":"Yuan","year":"2022","journal-title":"Front Microbiol"},{"key":"2024101004160752900_btae575-B15","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1126\/science.abm5847","article-title":"Cryptic and abundant marine viruses at the evolutionary origins of earth\u2019s RNA virome","volume":"376","author":"Zayed","year":"2022","journal-title":"Science"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae575\/59357677\/btae575.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/10\/btae575\/59694299\/btae575.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/10\/btae575\/59694299\/btae575.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,10]],"date-time":"2024-10-10T11:24:42Z","timestamp":1728559482000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/b
ioinformatics\/btae575\/7777163"}},"subtitle":[],"editor":[{"given":"Christina","family":"Kendziorski","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,9,26]]},"references-count":15,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2024,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae575","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,10]]},"published":{"date-parts":[[2024,9,26]]},"article-number":"btae575"}}