{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T05:50:53Z","timestamp":1775541053017,"version":"3.50.1"},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2480,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: RNA-Seq is a promising new technology for accurately measuring gene expression levels. Expression estimation with RNA-Seq requires the mapping of relatively short sequencing reads to a reference genome or transcript set. Because reads are generally shorter than transcripts from which they are derived, a single read may map to multiple genes and isoforms, complicating expression analyses. Previous computational methods either discard reads that map to multiple locations or allocate them to genes heuristically.<\/jats:p><jats:p>Results: We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNA-Seq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. Unlike previous methods, our method is capable of modeling non-uniform read distributions. 
Simulations with our method indicate that a read length of 20\u201325 bases is optimal for gene-level expression estimation from mouse and maize RNA-Seq data when sequencing throughput is fixed.<\/jats:p><jats:p>Availability: An initial C++ implementation of our method that was used for the results presented in this article is available at http:\/\/deweylab.biostat.wisc.edu\/rsem.<\/jats:p><jats:p>Contact: \u00a0cdewey@biostat.wisc.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics on<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp692","type":"journal-article","created":{"date-parts":[[2009,12,19]],"date-time":"2009-12-19T02:09:36Z","timestamp":1261188576000},"page":"493-500","source":"Crossref","is-referenced-by-count":999,"title":["RNA-Seq gene expression estimation with read mapping uncertainty"],"prefix":"10.1093","volume":"26","author":[{"given":"Bo","family":"Li","sequence":"first","affiliation":[{"name":"1 Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, 2 Morgridge Institute for Research, Madison, WI 53707 and 3 Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53706, USA"}]},{"given":"Victor","family":"Ruotti","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, 2 Morgridge Institute for Research, Madison, WI 53707 and 3 Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53706, USA"}]},{"given":"Ron M.","family":"Stewart","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, 2 Morgridge Institute for Research, Madison, WI 53707 and 3 Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53706, USA"}]},{"given":"James A.","family":"Thomson","sequence":"additional","affiliation":[{"name":"1 Department of Computer 
Sciences, University of Wisconsin, Madison, WI 53706, 2 Morgridge Institute for Research, Madison, WI 53707 and 3 Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53706, USA"}]},{"given":"Colin N.","family":"Dewey","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, University of Wisconsin, Madison, WI 53706, 2 Morgridge Institute for Research, Madison, WI 53707 and 3 Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53706, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,12,18]]},"reference":[{"issue":"Suppl. 1","key":"2023012508024127200_B1","doi-asserted-by":"crossref","first-page":"i31","DOI":"10.1093\/bioinformatics\/bth924","article-title":"Statistical modeling of sequencing errors in SAGE libraries","volume":"20","author":"Beissbarth","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012508024127200_B2","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1038\/nmeth.1223","article-title":"Stem cell transcriptome profiling via massive-scale mRNA sequencing","volume":"5","author":"Cloonan","year":"2008","journal-title":"Nat. Methods"},{"key":"2023012508024127200_B3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. Ser. 
B (Methodol.)"},{"key":"2023012508024127200_B4","doi-asserted-by":"crossref","first-page":"e105","DOI":"10.1093\/nar\/gkn425","article-title":"Substantial biases in ultra-short read data sets from high-throughput DNA sequencing","volume":"36","author":"Dohm","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012508024127200_B5","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/j.ygeno.2007.11.003","article-title":"A rescue strategy for multimapping short sequence tags refines surveys of transcriptional activity by CAGE","volume":"91","author":"Faulkner","year":"2008","journal-title":"Genomics"},{"key":"2023012508024127200_B6","doi-asserted-by":"crossref","first-page":"1036","DOI":"10.1093\/bioinformatics\/btl048","article-title":"The UCSC known genes","volume":"22","author":"Hsu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012508024127200_B7","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1093\/bioinformatics\/btp113","article-title":"Statistical inferences for isoform expression in RNA-Seq","volume":"25","author":"Jiang","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012508024127200_B8","doi-asserted-by":"crossref","first-page":"2887","DOI":"10.1093\/bioinformatics\/btn571","article-title":"Cross-hybridization modeling on Affymetrix exon arrays","volume":"24","author":"Kapur","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012508024127200_B9","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1007\/978-3-540-87361-7_5","article-title":"Exact transcriptome reconstruction from short sequence reads","volume-title":"Proceedings of the 8th International Workshop on Algorithms in Bioinformatics.","author":"Lacroix","year":"2008"},{"key":"2023012508024127200_B10","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human 
genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012508024127200_B11","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1016\/j.cell.2008.03.029","article-title":"Highly integrated single-base resolution maps of the epigenome in Arabidopsis","volume":"133","author":"Lister","year":"2008","journal-title":"Cell"},{"key":"2023012508024127200_B12","doi-asserted-by":"crossref","first-page":"1509","DOI":"10.1101\/gr.079558.108","article-title":"RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays","volume":"18","author":"Marioni","year":"2008","journal-title":"Genome Res."},{"key":"2023012508024127200_B13","doi-asserted-by":"crossref","first-page":"81","DOI":"10.2144\/000112900","article-title":"Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing","volume":"45","author":"Morin","year":"2008","journal-title":"BioTechniques"},{"key":"2023012508024127200_B14","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by RNA-Seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. 
Methods"},{"key":"2023012508024127200_B15","doi-asserted-by":"crossref","first-page":"1344","DOI":"10.1126\/science.1158441","article-title":"The transcriptional landscape of the yeast genome defined by RNA sequencing","volume":"320","author":"Nagalakshmi","year":"2008","journal-title":"Science"},{"key":"2023012508024127200_B16","doi-asserted-by":"crossref","first-page":"2601","DOI":"10.1093\/nar\/6.7.2601","article-title":"A strategy of DNA sequencing employing computer programs","volume":"6","author":"Staden","year":"1979","journal-title":"Nucleic Acids Res."},{"key":"2023012508024127200_B17","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"RNA-Seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat. Rev. Genet."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/4\/493\/48855011\/bioinformatics_26_4_493.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/4\/493\/48855011\/bioinformatics_26_4_493.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,13]],"date-time":"2025-02-13T22:37:24Z","timestamp":1739486244000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/4\/493\/243395"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,12,18]]},"references-count":17,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2010,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp692","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,2,15]]},"published":{"date-part
s":[[2009,12,18]]}}}