{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,3]],"date-time":"2024-08-03T09:48:59Z","timestamp":1722678539997},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: RNA-sequencing technologies provide a powerful tool for expression analysis at gene and isoform level, but accurate estimation of isoform abundance is still a challenge. Standard assumption of uniform read intensity would yield biased estimates when the read intensity is in fact non-uniform. The problem is that, without strong assumptions, the read intensity pattern is not identifiable from data observed in a single sample.<\/jats:p><jats:p>Results: We develop a joint statistical model that accounts for non-uniform isoform-specific read distribution and gene isoform expression estimation. The main challenge is in dealing with the large number of isoform-specific read distributions, which potentially are as many as the number of splice variants in the genome. A statistical regularization via a smoothing penalty is imposed to control the estimation. Also, for identifiability reasons, the method uses information across samples from the same region. We develop a fast and robust computational procedure based on the iterated-weighted least-squares algorithm, and apply it to simulated data and two real RNA-Seq datasets with reverse transcription\u2013polymerase chain reaction validation. 
Empirical tests show that our model performs better than existing methods in terms of increasing precision in isoform-level estimation.<\/jats:p><jats:p>Availability and implementation: We have implemented our method in an R package called Sequgio as a pipeline for fast processing of RNA-Seq data.<\/jats:p><jats:p>Contact: \u00a0yudi.pawitan@ki.se<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt704","type":"journal-article","created":{"date-parts":[[2013,12,5]],"date-time":"2013-12-05T02:39:46Z","timestamp":1386211186000},"page":"506-513","source":"Crossref","is-referenced-by-count":15,"title":["Joint estimation of isoform expression and isoform-specific read distribution using multisample RNA-Seq data"],"prefix":"10.1093","volume":"30","author":[{"given":"Chen","family":"Suo","sequence":"first","affiliation":[{"name":"1 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 2Department of Molecular and Translational Medicine, University of Brescia, Italy and 3Department of Mathematics and Statistics, La Trobe University, Australia"}]},{"given":"Stefano","family":"Calza","sequence":"additional","affiliation":[{"name":"1 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 2Department of Molecular and Translational Medicine, University of Brescia, 
Italy and 3Department of Mathematics and Statistics, La Trobe University, Australia"}]},{"given":"Agus","family":"Salim","sequence":"additional","affiliation":[{"name":"1 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 2Department of Molecular and Translational Medicine, University of Brescia, Italy and 3Department of Mathematics and Statistics, La Trobe University, Australia"}]},{"given":"Yudi","family":"Pawitan","sequence":"additional","affiliation":[{"name":"1 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, 2Department of Molecular and Translational Medicine, University of Brescia, Italy and 3Department of Mathematics and Statistics, La Trobe University, Australia"}]}],"member":"286","published-online":{"date-parts":[[2013,12,3]]},"reference":[{"key":"2023012710422943700_btt704-B1","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1038\/13810","article-title":"Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2","volume":"23","author":"Amir","year":"1999","journal-title":"Nat. Genet."},{"key":"2023012710422943700_btt704-B2","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1038\/ng803","article-title":"Alternative splicing and genome complexity","volume":"30","author":"Brett","year":"2002","journal-title":"Nat. 
Genet."},{"key":"2023012710422943700_btt704-B3","doi-asserted-by":"crossref","first-page":"1168","DOI":"10.1093\/bioinformatics\/btn100","article-title":"Unequal group variances in microarray data analyses","volume":"24","author":"Demissie","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B4","doi-asserted-by":"crossref","first-page":"1721","DOI":"10.1093\/bioinformatics\/bts260","article-title":"Identifying differentially expressed transcripts from RNA-Seq data with biological variation","volume":"28","author":"Glaus","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B5","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/S0167-6377(99)00074-7","article-title":"On the convergence of the block nonlinear Gauss-Seidel method under convex constraints","volume":"26","author":"Grippo","year":"2000","journal-title":"Oper. Res. Lett."},{"key":"2023012710422943700_btt704-B6","doi-asserted-by":"crossref","first-page":"e131","DOI":"10.1093\/nar\/gkq224","article-title":"Biases in Illumina transcriptome sequencing caused by random hexamer priming","volume":"38","author":"Hansen","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012710422943700_btt704-B7","doi-asserted-by":"crossref","DOI":"10.1109\/BIBM.2009.70","article-title":"Towards reliable isoform quantification using RNA-SEQ data","volume-title":"Int. Conf. 
on Bioinformatics and Biomed","author":"Howard","year":"2009"},{"key":"2023012710422943700_btt704-B8","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1093\/bioinformatics\/btp113","article-title":"Statistical inferences for isoform expression in RNA-Seq","volume":"25","author":"Jiang","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B9","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1038\/nmeth.1311","article-title":"Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes","volume":"6","author":"Kozarewa","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012710422943700_btt704-B10","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012710422943700_btt704-B12","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows-Wheeler Transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B14","doi-asserted-by":"crossref","first-page":"R50","DOI":"10.1186\/gb-2010-11-5-r50","article-title":"Modeling non-uniformity in short-read rates in RNA-Seq data","volume":"11","author":"Li","year":"2010","journal-title":"Genome Biol."},{"key":"2023012710422943700_btt704-B15","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1186\/1471-2105-14-220","article-title":"NURD: an implementation of a new method to estimate isoform expression from non-uniform RNA-Seq data","volume":"14","author":"Ma","year":"2013","journal-title":"BMC 
Bioinformatics"},{"key":"2023012710422943700_btt704-B16","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1038\/nrm1645","article-title":"Understanding alternative splicing: towards a cellular code","volume":"6","author":"Matlin","year":"2005","journal-title":"Nat. Rev. Mol. Cell Biol."},{"key":"2023012710422943700_btt704-B17","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by RNA-Seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. Methods"},{"key":"2023012710422943700_btt704-B18","doi-asserted-by":"crossref","first-page":"3379","DOI":"10.1093\/hmg\/ddi369","article-title":"Detecting tissue-specific alternative splicing and disease-associated aberrant splicing of the PTCH gene with exon junction microarrays","volume":"14","author":"Nagao","year":"2005","journal-title":"Hum. Mol. Genet."},{"key":"2023012710422943700_btt704-B19","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198507659.001.0001","volume-title":"In All Likelihood: Statistical Modelling and Inference Using Likelihood","author":"Pawitan","year":"2001"},{"key":"2023012710422943700_btt704-B20","doi-asserted-by":"crossref","first-page":"r22","DOI":"10.1186\/gb-2011-12-3-r22","article-title":"Improving RNA-Seq expression estimates by correcting for fragment bias","volume":"12","author":"Roberts","year":"2011","journal-title":"Genome Biol."},{"key":"2023012710422943700_btt704-B21","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1038\/nbt1239","article-title":"The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements","volume":"24","author":"MAQC Consortium. et al.","year":"2006","journal-title":"Nat. 
Biotechnol."},{"key":"2023012710422943700_btt704-B22","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1093\/bioinformatics\/btp120","article-title":"TopHat: discovering splice junctions with RNA-Seq","volume":"25","author":"Trapnell","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B23","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/nbt.1621","article-title":"Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation","volume":"28","author":"Trapnell","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012710422943700_btt704-B24","doi-asserted-by":"crossref","first-page":"470","DOI":"10.1038\/nature07509","article-title":"Alternative isoform regulation in human tissue transcriptomes","volume":"456","author":"Wang","year":"2008","journal-title":"Nature"},{"key":"2023012710422943700_btt704-B25","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1093\/bioinformatics\/btg1044","article-title":"Gene structure-based splice variant deconvolution using a microarray platform","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B26","doi-asserted-by":"crossref","first-page":"502","DOI":"10.1093\/bioinformatics\/btq696","article-title":"Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq","volume":"27","author":"Wu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012710422943700_btt704-B27","doi-asserted-by":"crossref","first-page":"e75","DOI":"10.1093\/nar\/gkp282","article-title":"A hierarchical Bayesian model for comparing transcriptomes at the individual transcript isoform level","volume":"37","author":"Zheng","year":"2009","journal-title":"Nucleic Acids 
Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/4\/506\/48917222\/bioinformatics_30_4_506.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/4\/506\/48917222\/bioinformatics_30_4_506.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,20]],"date-time":"2024-05-20T22:51:04Z","timestamp":1716245464000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/4\/506\/203018"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,12,3]]},"references-count":25,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2014,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt704","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,2,15]]},"published":{"date-parts":[[2013,12,3]]}}}