{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T17:32:41Z","timestamp":1775669561372,"version":"3.50.1"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"22","funder":[{"DOI":"10.13039\/100004440","name":"Wellcome Trust","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004440","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the German National Academic Foundation"},{"name":"the DFG Research Unit","award":["1234"],"award-info":[{"award-number":["1234"]}]},{"name":"the German Research Foundation","award":["GRK 1870\/01"],"award-info":[{"award-number":["GRK 1870\/01"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,11,15]]},"abstract":"<jats:p>Motivation: As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or\u2014if not\u2014where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach.<\/jats:p>\n               <jats:p>Results: The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. 
The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances.<\/jats:p>\n               <jats:p>Availability and implementation: The method is implemented in C++ as part of Augustus and available open source at http:\/\/bioinf.uni-greifswald.de\/augustus\/.<\/jats:p>\n               <jats:p>Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btw494","type":"journal-article","created":{"date-parts":[[2016,7,28]],"date-time":"2016-07-28T02:29:14Z","timestamp":1469672954000},"page":"3388-3395","source":"Crossref","is-referenced-by-count":68,"title":["Simultaneous gene finding in multiple genomes"],"prefix":"10.1093","volume":"32","author":[{"given":"Stefanie","family":"K\u00f6nig","sequence":"first","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, 17487, Germany"}]},{"given":"Lars W.","family":"Romoth","sequence":"additional","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, 17487, Germany"}]},{"given":"Lizzy","family":"Gerischer","sequence":"additional","affiliation":[{"name":"Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, 17487, Germany"}]},{"given":"Mario","family":"Stanke","sequence":"additional","affiliation":[{"name":"Institute of 
Mathematics and Computer Science, University of Greifswald, Greifswald, 17487, Germany"}]}],"member":"286","published-online":{"date-parts":[[2016,7,27]]},"reference":[{"key":"2023020113524260300_btw494-B1","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1007\/s10878-008-9139-z","article-title":"A Lagrangian relaxation approach for the multiple sequence alignment problem","volume":"16","author":"Althaus","year":"2008","journal-title":"J. Comb. Optim"},{"key":"2023020113524260300_btw494-B2","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1093\/nar\/27.2.573","article-title":"Tandem repeats finder: a program to analyze DNA sequences","volume":"27","author":"Benson","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2023020113524260300_btw494-B3","doi-asserted-by":"crossref","first-page":"988","DOI":"10.1101\/gr.1865504","article-title":"GeneWise and Genomewise","volume":"14","author":"Birney","year":"2004","journal-title":"Genome Res"},{"key":"2023020113524260300_btw494-B4","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1101\/gr.1933104","article-title":"Aligning multiple genomic sequences with the threaded blockset aligner","volume":"14","author":"Blanchette","year":"2004","journal-title":"Genome Res"},{"key":"2023020113524260300_btw494-B5","doi-asserted-by":"crossref","first-page":"e84.","DOI":"10.1371\/journal.pcbi.0020084","article-title":"On the estimation of intron evolution","volume":"2","author":"Cs\u0171r\u00f6s","year":"2006","journal-title":"PLoS Comput. 
Biol"},{"key":"2023020113524260300_btw494-B6","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1093\/bioinformatics\/bts635","article-title":"STAR: ultrafast universal RNA-seq aligner","volume":"29","author":"Dobin","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020113524260300_btw494-B7","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/nature06341","article-title":"Evolution of genes and genomes on the Drosophila phylogeny","volume":"450","author":"Drosophila 12 Genomes Consortium","year":"2007","journal-title":"Nature"},{"key":"2023020113524260300_btw494-B8","volume-title":"Inferring Phylogenies","author":"Felsenstein","year":"2003"},{"key":"2023020113524260300_btw494-B9","doi-asserted-by":"crossref","first-page":"659","DOI":"10.1093\/jhered\/esp086","article-title":"Genome 10k: a proposal to obtain whole-genome sequence for 10,000 vertebrate species","volume":"100","author":"Genome 10K Community of Scientists","year":"2009","journal-title":"J. Hered"},{"key":"2023020113524260300_btw494-B10","doi-asserted-by":"crossref","first-page":"965","DOI":"10.1016\/j.infsof.2005.09.005","article-title":"Engineering a software tool for gene structure prediction in higher organisms","volume":"47","author":"Gremme","year":"2005","journal-title":"Inf. Softw. Technol"},{"key":"2023020113524260300_btw494-B11","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1089\/cmb.2006.13.379","article-title":"Using multiple alignments to improve gene prediction","volume":"13","author":"Gross","year":"2006","journal-title":"J. Comp. 
Biol"},{"key":"2023020113524260300_btw494-B12","doi-asserted-by":"crossref","first-page":"R269","DOI":"10.1186\/gb-2007-8-12-r269","article-title":"CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction","volume":"8","author":"Gross","year":"2007","journal-title":"Genome Biol"},{"key":"2023020113524260300_btw494-B13","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.cois.2015.02.008","article-title":"Current methods for automated annotation of protein-coding genes","volume":"7","author":"Hoff","year":"2015","journal-title":"Curr. Opin. Insect Sci"},{"key":"2023020113524260300_btw494-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2197\/ipsjtbio.9.1","article-title":"Prediction of gene structures from RNA-seq data using dual decomposition","volume":"9","author":"Inatsuki","year":"2016","journal-title":"IPSJ Trans. Bioinformatics"},{"key":"2023020113524260300_btw494-B15","doi-asserted-by":"crossref","first-page":"50.","DOI":"10.1186\/1471-2105-4-50","article-title":"Eval: a software package for analysis of genome annotations","volume":"4","author":"Keibler","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023020113524260300_btw494-B16","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1093\/bioinformatics\/btr010","article-title":"A novel hybrid gene prediction method employing protein multiple sequence alignments","volume":"27","author":"Keller","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020113524260300_btw494-B17","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1109\/TPAMI.2010.108","article-title":"MRF energy minimization and beyond via dual decomposition","volume":"33","author":"Komodakis","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell"},{"key":"2023020113524260300_btw494-B18","doi-asserted-by":"crossref","first-page":"S140","DOI":"10.1093\/bioinformatics\/17.suppl_1.S140","article-title":"Integrating genomic homology into gene structure prediction","volume":"17 (Suppl 1)","author":"Korf","year":"2001","journal-title":"Bioinformatics"},{"key":"2023020113524260300_btw494-B19","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1137\/S1052623499362111","article-title":"Incremental subgradient methods for nondifferentiable optimization","volume":"12","author":"Nedic","year":"2001","journal-title":"SIAM J. Optim"},{"key":"2023020113524260300_btw494-B20","doi-asserted-by":"crossref","first-page":"1512","DOI":"10.1101\/gr.123356.111","article-title":"Cactus: algorithms for genome multiple sequence alignment","volume":"21","author":"Paten","year":"2011","journal-title":"Genome Res"},{"key":"2023020113524260300_btw494-B21","doi-asserted-by":"crossref","first-page":"1386","DOI":"10.1126\/science.331.6023.1386","article-title":"Creating a buzz about insect genomes","volume":"331","author":"Robinson","year":"2011","journal-title":"Science"},{"key":"2023020113524260300_btw494-B22","author":"Rush","year":"2010"},{"key":"2023020113524260300_btw494-B23","doi-asserted-by":"crossref","first-page":"31.","DOI":"10.1186\/1471-2105-6-31","article-title":"Automated generation of heuristics for biological sequence comparison","volume":"6","author":"Slater","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023020113524260300_btw494-B24","author":"Smit","year":"2013\u20132015"},{"key":"2023020113524260300_btw494-B25","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1093\/nar\/gkl200","article-title":"Augustus: ab initio prediction of alternative transcripts","volume":"34","author":"Stanke","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023020113524260300_btw494-B26","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1093\/bioinformatics\/btn013","article-title":"Using 
native and syntenically mapped cDNA alignments to improve de novo gene finding","volume":"24","author":"Stanke","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020113524260300_btw494-B27","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1038\/nmeth.2714","article-title":"Assessment of transcript reconstruction methods for RNA-seq","volume":"10","author":"Steijger","year":"2013","journal-title":"Nat. Methods"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/22\/3388\/49026902\/bioinformatics_32_22_3388.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/22\/3388\/49026902\/bioinformatics_32_22_3388.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T23:56:03Z","timestamp":1675295763000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/22\/3388\/2525611"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,7,27]]},"references-count":27,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2016,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw494","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2016,11,15]]},"published":{"date-parts":[[2016,7,27]]}}}