{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T08:06:04Z","timestamp":1772697964803,"version":"3.50.1"},"reference-count":13,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T00:00:00Z","timestamp":1771286400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006312","name":"BrightFocus Foundation","doi-asserted-by":"publisher","award":["A2021009F"],"award-info":[{"award-number":["A2021009F"]}],"id":[{"id":"10.13039\/100006312","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Tenure Track Clinician Scientist Fellowship","award":["MR\/N008324\/1"],"award-info":[{"award-number":["MR\/N008324\/1"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Summary<\/jats:title>\n                    <jats:p>Accurate annotation of coding sequences and translational features within transcript models is essential for interpreting assembled transcriptomes and their functional potential. Existing open reading frame (ORF) prediction tools typically operate on transcript FASTA files and do not reintegrate coding sequence (CDS) information back into transcript models, limiting their utility in long-read sequencing workflows where GTF\/GFF annotations are the primary output. We present ORFannotate, a lightweight, GTF-native Python command-line tool that predicts ORFs from transcript annotations and reinserts precise, exon-aware CDS and UTR features into the original GTF\/GFF file. In addition, ORFannotate provides biologically informative translational context by annotating Kozak sequence strength, detecting non-overlapping upstream ORFs (uORFs) with coding probabilities, characterising 5\u2032 and 3\u2032 untranslated regions (UTRs), and predicting nonsense-mediated decay (NMD) susceptibility. All annotations are consolidated in a transcript-level summary to support downstream analysis. By generating GTF files with accurate CDS annotations, ORFannotate facilitates reproducible analysis of both long- and short-read transcriptomes and integrates seamlessly with visualization tools, genome browsers, and comparative transcript analysis workflows. ORFannotate is fast, scalable and provides a practical solution for transcriptome annotation beyond coding potential prediction alone.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>ORFannotate is implemented in Python and freely available under the GNU General Public License v3 (GPL-3.0) at: https:\/\/github.com\/egustavsson\/ORFannotate (DOI: https:\/\/doi.org\/10.5281\/zenodo.16812866)<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btag082","type":"journal-article","created":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T12:40:49Z","timestamp":1771072849000},"source":"Crossref","is-referenced-by-count":0,"title":["ORFannotate: reproducible coding sequence annotation of transcriptome assemblies"],"prefix":"10.1093","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4913-5312","authenticated-orcid":false,"given":"Sonia","family":"Garc\u00eda-Ruiz","sequence":"first","affiliation":[{"name":"UK Dementia Research Institute, University of Cambridge , Cambridge,","place":["UK"]},{"name":"Department of Clinical Neurosciences, School of Clinical Medicine, University of Cambridge , Cambridge,","place":["UK"]},{"name":"Department of Neurodegenerative Disease, Queen Square Institute of Neurology, UCL , London,","place":["UK"]}]},{"given":"Hannah","family":"Macpherson","sequence":"additional","affiliation":[{"name":"Department of Neurodegenerative Disease, Queen Square Institute of Neurology, UCL , London,","place":["UK"]},{"name":"Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London , London,","place":["UK"]}]},{"given":"Laura","family":"Caton","sequence":"additional","affiliation":[{"name":"Department of Neurodegenerative Disease, Queen Square Institute of Neurology, UCL , London,","place":["UK"]},{"name":"UK Dementia Research Institute at UCL , London,","place":["UK"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9520-6957","authenticated-orcid":false,"given":"Mina","family":"Ryten","sequence":"additional","affiliation":[{"name":"UK Dementia Research Institute, University of Cambridge , Cambridge,","place":["UK"]},{"name":"Department of Clinical Neurosciences, School of Clinical Medicine, University of Cambridge , Cambridge,","place":["UK"]},{"name":"Department of Genomic Medicine, School of Clinical Medicine, The University of Cambridge , Cambridge,","place":["UK"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0541-7537","authenticated-orcid":false,"given":"Emil K","family":"Gustavsson","sequence":"additional","affiliation":[{"name":"UK Dementia Research Institute, University of Cambridge , Cambridge,","place":["UK"]},{"name":"Department of Clinical Neurosciences, School of Clinical Medicine, University of Cambridge , Cambridge,","place":["UK"]},{"name":"Department of Genetics and Genomic Medicine, Great Ormond Street Institute of Child Health, University College London , London,","place":["UK"]}]}],"member":"286","published-online":{"date-parts":[[2026,2,17]]},"reference":[{"key":"2026030502113112200_btag082-B1","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1038\/nrm.2017.27","article-title":"Alternative splicing as a regulator of development and tissue identity","volume":"18","author":"Baralle","year":"2017","journal-title":"Nat Rev Mol Cell Biol"},{"key":"2026030502113112200_btag082-B2","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1038\/s41592-023-01908-w","article-title":"Context-aware transcript quantification from long-read RNA-seq data with Bambu","volume":"20","author":"Chen","year":"2023","journal-title":"Nat Methods"},{"key":"2026030502113112200_btag082-B3","doi-asserted-by":"crossref","first-page":"3844","DOI":"10.1093\/bioinformatics\/btac409","article-title":"ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2","volume":"38","author":"Gustavsson","year":"2022","journal-title":"Bioinformatics"},{"key":"2026030502113112200_btag082-B4","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1038\/nrg3052","article-title":"Functional consequences of developmentally regulated alternative splicing","volume":"12","author":"Kalsotra","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2026030502113112200_btag082-B5","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1186\/s13059-019-1910-1","article-title":"Transcriptome assembly from long-read RNA-seq alignments with StringTie2","volume":"20","author":"Kovaka","year":"2019","journal-title":"Genome Biol"},{"key":"2026030502113112200_btag082-B6","author":"Mao","year":"2025"},{"key":"2026030502113112200_btag082-B7","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1038\/nature08909","article-title":"Expansion of the eukaryotic proteome by alternative splicing","volume":"463","author":"Nilsen","year":"2010","journal-title":"Nature"},{"key":"2026030502113112200_btag082-B8","doi-asserted-by":"crossref","first-page":"1413","DOI":"10.1038\/ng.259","article-title":"Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing","volume":"40","author":"Pan","year":"2008","journal-title":"Nat Genet"},{"key":"2026030502113112200_btag082-B9","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1038\/s41592-024-02229-2","article-title":"SQANTI3: curation of long-read transcriptomes for accurate identification of known and novel isoforms","volume":"21","author":"Pardo-Palacios","year":"2024","journal-title":"Nat Methods"},{"key":"2026030502113112200_btag082-B10","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1038\/s41587-022-01565-y","article-title":"Accurate isoform discovery with IsoQuant using long reads","volume":"41","author":"Prjibelski","year":"2023","journal-title":"Nat Biotechnol"},{"key":"2026030502113112200_btag082-B11","doi-asserted-by":"crossref","first-page":"1438","DOI":"10.1038\/s41467-020-15171-6","article-title":"Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns","volume":"11","author":"Tang","year":"2020","journal-title":"Nat Commun"},{"key":"2026030502113112200_btag082-B12","doi-asserted-by":"crossref","first-page":"e74","DOI":"10.1093\/nar\/gkt006","article-title":"CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model","volume":"41","author":"Wang","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2026030502113112200_btag082-B13","author":"Wyman","year":"2019"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag082\/66966726\/btag082.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btag082\/66966726\/btag082.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btag082\/66966726\/btag082.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T07:11:41Z","timestamp":1772694701000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btag082\/8489042"}},"subtitle":[],"editor":[{"given":"Laura","family":"Cantini","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,2,17]]},"references-count":13,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,2,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btag082","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,3]]},"published":{"date-parts":[[2026,2,17]]},"article-number":"btag082"}}