{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T05:50:51Z","timestamp":1776405051196,"version":"3.51.2"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2019,4,26]],"date-time":"2019-04-26T00:00:00Z","timestamp":1556236800000},"content-version":"vor","delay-in-days":1,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Department of Defense: Defense Threat Reduction Agency","award":["DTRA10027-20149"],"award-info":[{"award-number":["DTRA10027-20149"]}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DUE-1259951"],"award-info":[{"award-number":["DUE-1259951"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Computational Science Research Center at SDSU"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Currently there are no tools specifically designed for annotating genes in phages. Several tools are available that have been adapted to run on phage genomes, but due to their underlying design, they are unable to capture the full complexity of phage genomes. Phages have adapted their genomes to be extremely compact, having adjacent genes that overlap and genes completely inside of other longer genes. This non-delineated genome structure makes it difficult for gene prediction using the currently available gene annotators. Here we present PHANOTATE, a novel method for gene calling specifically designed for phage genomes. Although the compact nature of genes in phages is a problem for current gene annotators, we exploit this property by treating a phage genome as a network of paths: where open reading frames are favorable, and overlaps and gaps are less favorable, but still possible. We represent this network of connections as a weighted graph, and use dynamic programing to find the optimal path.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We compare PHANOTATE to other gene callers by annotating a set of 2133 complete phage genomes from GenBank, using PHANOTATE and the three most popular gene callers. We found that the four programs agree on 82% of the total predicted genes, with PHANOTATE predicting more genes than the other three. We searched for these extra genes in both GenBank\u2019s non-redundant protein database and all of the metagenomes in the sequence read archive, and found that they are present at levels that suggest that these are functional protein-coding genes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>https:\/\/github.com\/deprekate\/PHANOTATE<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz265","type":"journal-article","created":{"date-parts":[[2019,4,16]],"date-time":"2019-04-16T07:22:05Z","timestamp":1555399325000},"page":"4537-4542","source":"Crossref","is-referenced-by-count":280,"title":["PHANOTATE: a novel approach to gene identification in phage genomes"],"prefix":"10.1093","volume":"35","author":[{"given":"Katelyn","family":"McNair","sequence":"first","affiliation":[{"name":"Computational Sciences Research Center, San Diego State University , San Diego, CA 92182, USA"}]},{"given":"Carol","family":"Zhou","sequence":"additional","affiliation":[{"name":"Lawrence Livermore National Laboratory , Livermore, CA 94550, USA"}]},{"given":"Elizabeth A","family":"Dinsdale","sequence":"additional","affiliation":[{"name":"Department of Biology, San Diego State University , San Diego, CA 92182, USA"}]},{"given":"Brian","family":"Souza","sequence":"additional","affiliation":[{"name":"Lawrence Livermore National Laboratory , Livermore, CA 94550, USA"}]},{"given":"Robert A","family":"Edwards","sequence":"additional","affiliation":[{"name":"Computational Sciences Research Center, San Diego State University , San Diego, CA 92182, USA"},{"name":"Department of Biology, San Diego State University , San Diego, CA 92182, USA"},{"name":"Viral Information Institute, San Diego State University , San Diego, CA 92182, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,4,25]]},"reference":[{"key":"2023013108322370900_btz265-B1","doi-asserted-by":"crossref","first-page":"e126","DOI":"10.1093\/nar\/gks406","article-title":"PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies","volume":"40","author":"Akhter","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023013108322370900_btz265-B2","doi-asserted-by":"crossref","first-page":"W16","DOI":"10.1093\/nar\/gkw387","article-title":"PHASTER: a better, faster version of the PHAST phage search tool","volume":"44","author":"Arndt","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013108322370900_btz265-B3","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1093\/oxfordjournals.molbev.a026133","article-title":"CRITICA: coding region identification tool invoking comparative analysis","volume":"16","author":"Badger","year":"1999","journal-title":"Mol. Biol. Evol"},{"key":"2023013108322370900_btz265-B4","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1090\/qam\/102435","article-title":"On a routing problem","volume":"16","author":"Bellman","year":"1958","journal-title":"Quart. Appl. Math"},{"key":"2023013108322370900_btz265-B5","doi-asserted-by":"crossref","first-page":"D37","DOI":"10.1093\/nar\/gkw1070","article-title":"GenBank","volume":"45","author":"Benson","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023013108322370900_btz265-B6","doi-asserted-by":"crossref","first-page":"3911","DOI":"10.1093\/nar\/27.19.3911","article-title":"Heuristic approach to deriving models for gene finding","volume":"27","author":"Besemer","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2023013108322370900_btz265-B7","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1534\/g3.116.037192","article-title":"Genetic analysis of the lambda spanins Rz and Rz1: identification of functional domains","volume":"7","author":"Cahill","year":"2017","journal-title":"G3"},{"key":"2023013108322370900_btz265-B8","volume-title":"Statistical Power Analysis for the Behavioral Sciences L.","author":"Cohen","year":"1988"},{"key":"2023013108322370900_btz265-B9","doi-asserted-by":"crossref","first-page":"2928","DOI":"10.1128\/AEM.04058-13","article-title":"Top-down proteomic identification of Shiga toxin 2 subtypes from Shiga toxin-producing Escherichia coli by matrix-assisted laser desorption ionization-tandem time of flight mass spectrometry","volume":"80","author":"Fagerquist","year":"2014","journal-title":"Appl. Environ. Microbiol"},{"key":"2023013108322370900_btz265-B10","volume-title":"Network Flow Theory.","author":"Ford","year":"1956"},{"key":"2023013108322370900_btz265-B11","volume-title":"Practical Statistics for Field Biology","author":"Fowler","year":"1998"},{"key":"2023013108322370900_btz265-B12","doi-asserted-by":"crossref","first-page":"119.","DOI":"10.1186\/1471-2105-11-119","article-title":"Prodigal: prokaryotic gene recognition and translation initiation site identification","volume":"11","author":"Hyatt","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023013108322370900_btz265-B13","author":"Jones","year":"2001"},{"key":"2023013108322370900_btz265-B14","first-page":"114819","article-title":"Prophage genomics reveals patterns in phage genome organization and replication","author":"Kang","year":"2017","journal-title":"bioRxiv"},{"key":"2023013108322370900_btz265-B15","doi-asserted-by":"crossref","first-page":"487","DOI":"10.1101\/gr.113985.110","article-title":"Adaptive seeds tame genomic sequence comparison","volume":"21","author":"Kie\u0142basa","year":"2011","journal-title":"Genome Res"},{"key":"2023013108322370900_btz265-B16","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1007\/978-1-4939-7343-9_17","volume-title":"Bacteriophages: Methods and Protocols","author":"McNair","year":"2018"},{"key":"2023013108322370900_btz265-B17","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/j.coviro.2011.12.004","article-title":"Metagenomics and future perspectives in virus discovery","volume":"2","author":"Mokili","year":"2012","journal-title":"Curr. Opin. Virol"},{"key":"2023013108322370900_btz265-B18","doi-asserted-by":"crossref","first-page":"591","DOI":"10.1111\/j.1469-185X.2007.00027.x","article-title":"Effect size, confidence interval and statistical significance: a practical guide for biologists","volume":"82","author":"Nakagawa","year":"2007","journal-title":"Biol. Rev. Camb. Philos. Soc"},{"key":"2023013108322370900_btz265-B19","doi-asserted-by":"crossref","first-page":"D7","DOI":"10.1093\/nar\/gkv1290","article-title":"Database resources of the National Center for Biotechnology Information","volume":"44","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013108322370900_btz265-B20","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1142\/S0219720004000624","article-title":"Multivariate entropy distance method for prokaryotic gene identification","volume":"2","author":"Ouyang","year":"2004","journal-title":"J. Bioinform. Comput. Biol"},{"key":"2023013108322370900_btz265-B21","doi-asserted-by":"crossref","first-page":"e02145","DOI":"10.1128\/mBio.02145-14","article-title":"Genomics and proteomics of mycobacteriophage patience, an accidental tourist in the Mycobacterium neighborhood","volume":"5","author":"Pope","year":"2014","journal-title":"MBio"},{"key":"2023013108322370900_btz265-B22","doi-asserted-by":"crossref","first-page":"4529","DOI":"10.1128\/JB.184.16.4529-4535.2002","article-title":"The Phage Proteomic Tree: a genome-based taxonomy for phage","volume":"184","author":"Rohwer","year":"2002","journal-title":"J. Bacteriol"},{"key":"2023013108322370900_btz265-B23","doi-asserted-by":"crossref","first-page":"e08490","DOI":"10.7554\/eLife.08490","article-title":"Viral dark matter and virus-host interactions resolved from publicly available microbial genomes","volume":"4","author":"Roux","year":"2015","journal-title":"Elife"},{"key":"2023013108322370900_btz265-B24","first-page":"61","volume-title":"Proceedings of the 9th Python in Science Conference","author":"Seabold","year":"2010"},{"key":"2023013108322370900_btz265-B25","doi-asserted-by":"crossref","first-page":"2068","DOI":"10.1093\/bioinformatics\/btu153","article-title":"Prokka: rapid prokaryotic genome annotation","volume":"30","author":"Seemann","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013108322370900_btz265-B26","doi-asserted-by":"crossref","first-page":"3575","DOI":"10.1093\/bioinformatics\/btu576","article-title":"Frameshift alignment: statistics and post-genomic applications","volume":"30","author":"Sheetlin","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013108322370900_btz265-B27","doi-asserted-by":"crossref","first-page":"1098","DOI":"10.1016\/j.jmb.2007.08.045","article-title":"Rz\/Rz1 lysis gene equivalents in phages of Gram-negative hosts","volume":"373","author":"Summer","year":"2007","journal-title":"J. Mol. Biol"},{"key":"2023013108322370900_btz265-B28","doi-asserted-by":"crossref","first-page":"2389","DOI":"10.1093\/bioinformatics\/btx184","article-title":"PARTIE: a partition engine to separate metagenomic and amplicon projects in the Sequence Read Archive","volume":"33","author":"Torres","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013108322370900_btz265-B29","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/S1476-9271(02)00098-1","article-title":"Protein family classification and functional annotation","volume":"27","author":"Wu","year":"2003","journal-title":"Comput. Biol. Chem"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz265\/28563489\/btz265.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/22\/4537\/48978303\/bioinformatics_35_22_4537.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/22\/4537\/48978303\/bioinformatics_35_22_4537.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T12:39:02Z","timestamp":1675168742000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/22\/4537\/5480131"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,4,25]]},"references-count":29,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2019,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz265","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/265983","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,11,15]]},"published":{"date-parts":[[2019,4,25]]}}}