{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T06:39:11Z","timestamp":1775284751886,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T00:00:00Z","timestamp":1731974400000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>For more than 25\u2009years, learning-based eukaryotic gene predictors were driven by hidden Markov models (HMMs), which were directly inputted a DNA sequence. Recently, Holst et al. demonstrated with their program Helixer that the accuracy of ab initio eukaryotic gene prediction can be improved by combining deep learning layers with a separate HMM postprocessor.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present Tiberius, a novel deep learning-based ab initio gene predictor that end-to-end integrates convolutional and long short-term memory layers with a differentiable HMM layer. Tiberius uses a custom gene prediction loss and was trained for prediction in mammalian genomes and evaluated on human and two other genomes. It significantly outperforms existing ab initio methods, achieving F1 scores of 62% at gene level for the human genome, compared to 21% for the next best ab initio method. In de novo mode, Tiberius predicts the exon\u2212intron structure of two out of three human genes without error. Remarkably, even Tiberius\u2019s ab initio accuracy matches that of BRAKER3, which uses RNA-seq data and a protein database. Tiberius\u2019s highly parallelized model is the fastest state-of-the-art gene prediction method, processing the human genome in under 2\u2009hours.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>https:\/\/github.com\/Gaius-Augustus\/Tiberius<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae685","type":"journal-article","created":{"date-parts":[[2024,11,13]],"date-time":"2024-11-13T15:21:46Z","timestamp":1731511306000},"source":"Crossref","is-referenced-by-count":29,"title":["Tiberius: end-to-end deep learning with an HMM for gene prediction"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0264-016X","authenticated-orcid":false,"given":"Lars","family":"Gabriel","sequence":"first","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald , Greifswald 17489,","place":["Germany"]},{"name":"Center for Functional Genomics of Microbes, University of Greifswald , Greifswald 17489,","place":["Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6831-8523","authenticated-orcid":false,"given":"Felix","family":"Becker","sequence":"additional","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald , Greifswald 17489,","place":["Germany"]},{"name":"Center for Functional Genomics of Microbes, University of Greifswald , Greifswald 17489,","place":["Germany"]}]},{"given":"Katharina J","family":"Hoff","sequence":"additional","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald , Greifswald 17489,","place":["Germany"]},{"name":"Center for Functional Genomics of Microbes, University of Greifswald , Greifswald 17489,","place":["Germany"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8696-0384","authenticated-orcid":false,"given":"Mario","family":"Stanke","sequence":"additional","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald , Greifswald 17489,","place":["Germany"]},{"name":"Center for Functional Genomics of Microbes, University of Greifswald , Greifswald 17489,","place":["Germany"]}]}],"member":"286","published-online":{"date-parts":[[2024,11,18]]},"reference":[{"key":"2024121404514570200_btae685-B1","doi-asserted-by":"crossref","first-page":"giac104","DOI":"10.1093\/gigascience\/giac104","article-title":"learnMSA: learning and aligning large protein families","volume":"11","author":"Becker","year":"2022","journal-title":"Gigascience"},{"key":"2024121404514570200_btae685-B2","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1093\/nar\/27.2.573","article-title":"Tandem repeats finder: a program to analyze DNA sequences","volume":"27","author":"Benson","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2024121404514570200_btae685-B3","doi-asserted-by":"crossref","first-page":"lqaa108","DOI":"10.1093\/nargab\/lqaa108","article-title":"BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database","volume":"3","author":"Br\u016fna","year":"2021","journal-title":"NAR Genom Bioinform"},{"key":"2024121404514570200_btae685-B4","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1186\/s12859-023-05449-z","article-title":"Galba: genome annotation with miniprot and augustus","volume":"24","author":"Br\u016fna","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"2024121404514570200_btae685-B5","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1101\/gr.278373.123","article-title":"GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes","volume":"34","author":"Br\u016fna","year":"2024","journal-title":"Genome Res"},{"key":"2024121404514570200_btae685-B6","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1006\/geno.1996.0298","article-title":"Evaluation of gene structure prediction programs","volume":"34","author":"Burset","year":"1996","journal-title":"Genomics"},{"key":"2024121404514570200_btae685-B7","first-page":"2023","article-title":"The nucleotide transformer: building and evaluating robust foundation models for human genomics","author":"Dalla-Torre","year":"2023"},{"key":"2024121404514570200_btae685-B8","doi-asserted-by":"crossref","first-page":"9451","DOI":"10.1073\/pnas.1921046117","article-title":"RepeatModeler2 for automated genomic discovery of transposable element families","volume":"117","author":"Flynn","year":"2020","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024121404514570200_btae685-B9","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1101\/gr.278090.123","article-title":"BRAKER3: fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA","volume":"34","author":"Gabriel","year":"2024","journal-title":"Genome Res"},{"key":"2024121404514570200_btae685-B10","first-page":"2023","article-title":"Helixer\u2013de novo prediction of primary eukaryotic gene models combining deep learning and a hidden markov model","author":"Holst","year":"2023"},{"key":"2024121404514570200_btae685-B11","first-page":"161","author":"Keilwagen","year":"2019"},{"key":"2024121404514570200_btae685-B12","author":"Kingma","year":"2014"},{"key":"2024121404514570200_btae685-B13","first-page":"134","author":"Kulp","year":"1996"},{"key":"2024121404514570200_btae685-B14","doi-asserted-by":"crossref","first-page":"msac174","DOI":"10.1093\/molbev\/msac174","article-title":"TimeTree 5: an expanded resource for species divergence times","volume":"39","author":"Kumar","year":"2022","journal-title":"Mol Biol Evol"},{"key":"2024121404514570200_btae685-B15","doi-asserted-by":"crossref","first-page":"e2115639118","DOI":"10.1073\/pnas.2115639118","article-title":"Standards recommendations for the earth BioGenome project","volume":"119","author":"Lawniczak","year":"2022","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024121404514570200_btae685-B16","volume-title":"Proc Natl Acad Sci","author":"Lewin","year":"2022"},{"key":"2024121404514570200_btae685-B17","doi-asserted-by":"crossref","first-page":"6494","DOI":"10.1093\/nar\/gki937","article-title":"Gene identification in novel eukaryotic genomes by self-training algorithm","volume":"33","author":"Lomsadze","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2024121404514570200_btae685-B18","doi-asserted-by":"crossref","first-page":"4647","DOI":"10.1093\/molbev\/msab199","article-title":"BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes","volume":"38","author":"Manni","year":"2021","journal-title":"Mol Biol Evol"},{"key":"2024121404514570200_btae685-B19","volume-title":"The Twelfth International Conference on Learning Representations","author":"Marin","year":"2023"},{"key":"2024121404514570200_btae685-B20","doi-asserted-by":"crossref","first-page":"1857","DOI":"10.1093\/bioinformatics\/btac028","article-title":"End-to-end learning of evolutionary models to find coding regions in genome alignments","volume":"38","author":"Mertsch","year":"2022","journal-title":"Bioinformatics"},{"key":"2024121404514570200_btae685-B21","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1007\/978-1-0716-3838-5_7","volume-title":"Comparative Genomics: Methods and Protocols","author":"Nachtweide","year":"2024"},{"issue":"Suppl. 2","key":"2024121404514570200_btae685-B22","doi-asserted-by":"crossref","first-page":"ii215","DOI":"10.1093\/bioinformatics\/btg1080","article-title":"Gene prediction with a hidden Markov model and a new intron submodel","volume":"19","author":"Stanke","year":"2003","journal-title":"Bioinformatics"},{"key":"2024121404514570200_btae685-B23","doi-asserted-by":"crossref","first-page":"5291","DOI":"10.1093\/bioinformatics\/btaa1044","article-title":"Helixer: cross-species gene annotation of large eukaryotic genomes using deep learning","volume":"36","author":"Stiehler","year":"2021","journal-title":"Bioinformatics"},{"key":"2024121404514570200_btae685-B24","first-page":"5063","author":"Tian","year":"2022"},{"key":"2024121404514570200_btae685-B25","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1038\/nrg3174","article-title":"A beginner\u2019s guide to eukaryotic genome annotation","volume":"13","author":"Yandell","year":"2012","journal-title":"Nat Rev Genet"},{"key":"2024121404514570200_btae685-B26","doi-asserted-by":"crossref","first-page":"240","DOI":"10.1038\/s41586-020-2876-6","article-title":"A comparative genomics multitool for scientific discovery and conservation","volume":"587","author":"Zoonomia Consortium","year":"2020","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae685\/60744079\/btae685.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/12\/btae685\/60924524\/btae685.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/12\/btae685\/60924524\/btae685.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,13]],"date-time":"2024-12-13T23:52:02Z","timestamp":1734133922000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae685\/7903281"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,11,18]]},"references-count":26,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2024,11,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae685","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.07.21.604459","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,12]]},"published":{"date-parts":[[2024,11,18]]},"article-number":"btae685"}}