{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T11:02:15Z","timestamp":1768647735041,"version":"3.49.0"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1134,"URL":"http:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the transcriptional landscape, pose profound computational challenges for transcriptome reconstruction.<\/jats:p>\n               <jats:p>Results: We present the novel framework MITIE (Mixed Integer Transcript IdEntification) for simultaneous transcript reconstruction and quantification. We define a likelihood function based on the negative binomial distribution, use a regularization approach to select a few transcripts collectively explaining the observed read data and show how to find the optimal solution using Mixed Integer Programming. MITIE can (i) take advantage of known transcripts, (ii) reconstruct and quantify transcripts simultaneously in multiple samples, and (iii) resolve the location of multi-mapping reads. It is designed for genome- and assembly-based transcriptome reconstruction. We present an extensive study based on realistic simulated RNA-Seq data. When compared with state-of-the-art approaches, MITIE proves to be significantly more sensitive and overall more accurate. Moreover, MITIE yields substantial performance gains when used with multiple samples. We applied our system to 38 Drosophila melanogaster modENCODE RNA-Seq libraries and estimated the sensitivity of reconstructing omitted transcript annotations and the specificity with respect to annotated transcripts. Our results corroborate that a well-motivated objective paired with appropriate optimization techniques lead to significant improvements over the state-of-the-art in transcriptome reconstruction.<\/jats:p>\n               <jats:p>Availability: MITIE is implemented in C++ and is available from http:\/\/bioweb.me\/mitie under the GPL license.<\/jats:p>\n               <jats:p>Contact: \u00a0Jonas_Behr@web.de and raetsch@cbio.mskcc.org<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt442","type":"journal-article","created":{"date-parts":[[2013,8,27]],"date-time":"2013-08-27T00:14:57Z","timestamp":1377562497000},"page":"2529-2538","source":"Crossref","is-referenced-by-count":52,"title":["MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples"],"prefix":"10.1093","volume":"29","author":[{"given":"Jonas","family":"Behr","sequence":"first","affiliation":[{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"},{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andr\u00e9","family":"Kahles","sequence":"additional","affiliation":[{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi","family":"Zhong","sequence":"additional","affiliation":[{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vipin T.","family":"Sreedharan","sequence":"additional","affiliation":[{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Philipp","family":"Drewe","sequence":"additional","affiliation":[{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gunnar","family":"R\u00e4tsch","sequence":"additional","affiliation":[{"name":"1 Computational Biology Center, Sloan-Kettering Institute, 1275 York Avenue, New York, NY 10065, USA and 2Friedrich Miescher Laboratory, Max Planck Society, Spemannstr. 39, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2013,8,25]]},"reference":[{"key":"2023012810473689000_btt442-B1","doi-asserted-by":"crossref","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"Anders","year":"2010","journal-title":"Genome Biol."},{"key":"2023012810473689000_btt442-B2","doi-asserted-by":"crossref","first-page":"2008","DOI":"10.1101\/gr.133744.111","article-title":"Detecting differential usage of exons from RNA-seq data","volume":"22","author":"Anders","year":"2012","journal-title":"Genome Res."},{"key":"2023012810473689000_btt442-B3","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1101\/gr.124107.111","article-title":"Accurate identification of a-to-i rna editing in human by transcriptome sequencing","volume":"22","author":"Bahn","year":"2012","journal-title":"Genome Res."},{"key":"2023012810473689000_btt442-B4","article-title":"Computational methods for high-throughput genomics and transcriptomics","author":"Bohnert","year":"2011"},{"key":"2023012810473689000_btt442-B5","doi-asserted-by":"crossref","first-page":"P5","DOI":"10.1186\/1471-2105-10-S13-P5","article-title":"Transcript quantification with RNA-Seq data","volume":"10","author":"Bohnert","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012810473689000_btt442-B6","doi-asserted-by":"crossref","first-page":"e1001229","DOI":"10.1371\/journal.pbio.1001229","article-title":"Alternative splicing of RNA triplets is often regulated and accelerates proteome evolution","volume":"10","author":"Bradley","year":"2012","journal-title":"PLoS Biol."},{"key":"2023012810473689000_btt442-B7","doi-asserted-by":"crossref","first-page":"927","DOI":"10.1038\/459927a","article-title":"Unlocking the secrets of the genome","volume":"459","author":"Celniker","year":"2009","journal-title":"Nature"},{"key":"2023012810473689000_btt442-B8","doi-asserted-by":"crossref","first-page":"827","DOI":"10.1038\/ejhg.2011.28","article-title":"The gencode exome: sequencing the complete human exome","volume":"19","author":"Coffey","year":"2011","journal-title":"Eur. J. Hum. Genet."},{"key":"2023012810473689000_btt442-B9","doi-asserted-by":"crossref","first-page":"i174","DOI":"10.1093\/bioinformatics\/btn300","article-title":"Optimal spliced alignments of short sequence reads","volume":"24","author":"De Bona","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B10","doi-asserted-by":"crossref","first-page":"R175","DOI":"10.1186\/gb-2008-9-12-r175","article-title":"Annotating genomes with massive-scale RNA sequencing","volume":"9","author":"Denoeud","year":"2008","journal-title":"Genome Biol."},{"key":"2023012810473689000_btt442-B11","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1093\/bioinformatics\/bts635","article-title":"Star: ultrafast universal RNA-Seq aligner","volume":"29","author":"Dobin","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B12","doi-asserted-by":"crossref","first-page":"5189","DOI":"10.1093\/nar\/gkt211","article-title":"Accurate detection of differential rna processing","volume":"41","author":"Drewe","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012810473689000_btt442-B13","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nature11247","article-title":"An integrated encyclopedia of dna elements in the human genome","volume":"489","author":"ENCODE Project Consortium et al.","year":"2012","journal-title":"Nature"},{"key":"2023012810473689000_btt442-B14","doi-asserted-by":"crossref","first-page":"D84","DOI":"10.1093\/nar\/gkr991","article-title":"Ensembl 2012","volume":"40","author":"Flicek","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012810473689000_btt442-B15","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1038\/nbt.1883","article-title":"Full-length transcriptome assembly from RNA-Seq data without a reference genome","volume":"29","author":"Grabherr","year":"2011","journal-title":"Nat. Biotechnol."},{"key":"2023012810473689000_btt442-B16","doi-asserted-by":"crossref","first-page":"10073","DOI":"10.1093\/nar\/gks666","article-title":"Modelling and simulating generic RNA-Seq experiments with the flux simulator","volume":"40","author":"Griebel","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012810473689000_btt442-B17","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/nbt.1633","article-title":"Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs","volume":"28","author":"Guttman","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012810473689000_btt442-B18","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2006-7-s1-s4","article-title":"Gencode: producing a reference annotation for encode","volume":"7","author":"Harrow","year":"2006","journal-title":"Genome Biol."},{"key":"2023012810473689000_btt442-B19","doi-asserted-by":"crossref","first-page":"S181","DOI":"10.1093\/bioinformatics\/18.suppl_1.S181","article-title":"Splicing graphs and est assembly problem","volume":"18","author":"Heber","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B20","first-page":"1","article-title":"Simultaneous isoform discovery and quantification from RNA-Seq","author":"Hiller","year":"2012","journal-title":"Stat. Biosci."},{"key":"2023012810473689000_btt442-B21","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1146\/annurev.ecolsys.28.1.437","article-title":"Phylogeny estimation and hypothesis testing using maximum likelihood","volume":"28","author":"Huelsenbeck","year":"1997","journal-title":"Annu. Revi. Ecol. Syst."},{"key":"2023012810473689000_btt442-B22","doi-asserted-by":"crossref","first-page":"11.6.1","DOI":"10.1002\/0471250953.bi1106s32","article-title":"RNA-Seq read alignments with palmapper","volume":"32","author":"Jean","year":"2010","journal-title":"Curr. Protoc. Bioinform."},{"key":"2023012810473689000_btt442-B23","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1038\/nmeth.1528","article-title":"Analysis and design of rna sequencing experiments for identifying isoform regulation","volume":"7","author":"Katz","year":"2010","journal-title":"Nat. Methods"},{"key":"2023012810473689000_btt442-B24","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-540-87361-7_5","article-title":"Exact transcriptome reconstruction from short sequence reads","volume-title":"Proceedings of the 8th International Workshop on Algorithms in Bioinformatics","author":"Lacroix","year":"2008"},{"key":"2023012810473689000_btt442-B25","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-20036-6_18","article-title":"Isolasso: a lasso regression approach to RNA-Seq based transcriptome assembly","volume-title":"Research in Computational Molecular Biology","author":"Li","year":"2011"},{"key":"2023012810473689000_btt442-B26","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-33122-0_14","article-title":"Cliiq: accurate comparative detection and quantification of expressed isoforms in a population","volume-title":"Algorithms in Bioinformatics","author":"Lin","year":"2012"},{"key":"2023012810473689000_btt442-B27","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1101\/gr.142232.112","article-title":"iReckon: simultaneous isoform discovery and abundance estimation from RNA-Seq","volume":"23","author":"Mezlini","year":"2012","journal-title":"Genome Res."},{"key":"2023012810473689000_btt442-B28","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by RNA-Seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. Methods"},{"key":"2023012810473689000_btt442-B29","first-page":"375","article-title":"Generalized linear models","volume":"135","author":"Nelder","year":"1972","journal-title":"J. R. Stat. Soc."},{"key":"2023012810473689000_btt442-B30","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1038\/nature08909","article-title":"Expansion of the eukaryotic proteome by alternative splicing","volume":"463","author":"Nilsen","year":"2010","journal-title":"Nature"},{"key":"2023012810473689000_btt442-B31","doi-asserted-by":"crossref","first-page":"709","DOI":"10.1056\/NEJMoa1106920","article-title":"Origins of the e. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany","volume":"365","author":"Rasko","year":"2011","journal-title":"N. Engl. J. Med."},{"key":"2023012810473689000_btt442-B32","first-page":"3011","article-title":"Gaussian processes for machine learning (gpml) toolbox","volume":"11","author":"Rasmusen","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"2023012810473689000_btt442-B33","doi-asserted-by":"crossref","first-page":"e20","DOI":"10.1371\/journal.pcbi.0030020","article-title":"Improving the caenorhabditis elegans genome annotation using machine learning","volume":"3","author":"R\u00e4tsch","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023012810473689000_btt442-B34","doi-asserted-by":"crossref","first-page":"909","DOI":"10.1038\/nmeth.1517","article-title":"De novo assembly and analysis of RNA-Seq data","volume":"7","author":"Robertson","year":"2010","journal-title":"Nat. Methods"},{"key":"2023012810473689000_btt442-B35","doi-asserted-by":"crossref","first-page":"1086","DOI":"10.1093\/bioinformatics\/bts094","article-title":"Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels","volume":"28","author":"Schulz","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B36","doi-asserted-by":"crossref","first-page":"2133","DOI":"10.1101\/gr.090597.108","article-title":"mGene: accurate SVM-based gene finding with an application to nematode genomes","volume":"19","author":"Schweikert","year":"2009","journal-title":"Genome Res."},{"key":"2023012810473689000_btt442-B37","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1093\/bioinformatics\/btk028","article-title":"Inferring global levels of alternative splicing isoforms using a generative model of microarray data","volume":"22","author":"Shai","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B38","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1101\/gr.089532.108","article-title":"ABySS: A parallel assembler for short read sequence data","volume":"19","author":"Simpson","year":"2009","journal-title":"Genome Res."},{"key":"2023012810473689000_btt442-B39","doi-asserted-by":"crossref","first-page":"596","DOI":"10.4161\/rna.19683","article-title":"Multiple insert size paired-end sequencing for deconvolution of complex transcriptomes","volume":"9","author":"Smith","year":"2012","journal-title":"RNA Biol."},{"key":"2023012810473689000_btt442-B40","article-title":"Practical bayesian optimization of machine learning algorithms","author":"Snoek","year":"2012"},{"key":"2023012810473689000_btt442-B41","doi-asserted-by":"crossref","first-page":"S7","DOI":"10.1186\/1471-2105-8-S10-S7","article-title":"Accurate splice site prediction using support vector machines","volume":"8","author":"Sonnenburg","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012810473689000_btt442-B42","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1093\/bioinformatics\/btp120","article-title":"TopHat: discovering splice junctions with RNA-Seq","volume":"25","author":"Trapnell","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B43","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/nbt.1621","article-title":"Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation","volume":"28","author":"Trapnell","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012810473689000_btt442-B44","doi-asserted-by":"crossref","first-page":"i315","DOI":"10.1093\/bioinformatics\/btg1044","article-title":"Gene structure-based splice variant deconvolution using a microarry platform","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B45","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"RNA-Seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat. Rev. Genet."},{"key":"2023012810473689000_btt442-B46","doi-asserted-by":"crossref","first-page":"e178","DOI":"10.1093\/nar\/gkq622","article-title":"MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery","volume":"38","author":"Wang","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012810473689000_btt442-B47","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1093\/bioinformatics\/btq057","article-title":"Fast and SNP-tolerant detection of complex variants and splicing in short reads","volume":"26","author":"Wu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012810473689000_btt442-B48","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1186\/1471-2105-12-162","article-title":"NSMAP: a method for spliced isoforms identification and quantification from RNA-Seq","volume":"12","author":"Xia","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023012810473689000_btt442-B49","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1101\/gr.1304504","article-title":"The multiassembly problem: reconstructing multiple transcript isoforms from est fragment mixtures","volume":"14","author":"Xing","year":"2004","journal-title":"Genome Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/20\/2529\/48895014\/bioinformatics_29_20_2529.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/20\/2529\/48895014\/bioinformatics_29_20_2529.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T12:45:30Z","timestamp":1674909930000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/29\/20\/2529\/277475"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,8,25]]},"references-count":49,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2013,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt442","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2013,10,15]]},"published":{"date-parts":[[2013,8,25]]}}}