{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T16:04:43Z","timestamp":1758816283298},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1709,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Quantification of sequence abundance in RNA-Seq experiments is often conflated by protocol-specific sequence bias. The exact sources of the bias are unknown, but may be influenced by polymerase chain reaction amplification, or differing primer affinities and mixtures, for example. The result is decreased accuracy in many applications, such as de novo gene annotation and transcript quantification.<\/jats:p>\n               <jats:p>Results: We present a new method to measure and correct for these influences using a simple graphical model. Our model does not rely on existing gene annotations, and model selection is performed automatically making it applicable with few assumptions. We evaluate our method on several datasets, and by multiple criteria, demonstrating that it effectively decreases bias and increases uniformity. 
Additionally, we provide theoretical and empirical results showing that the method is unlikely to have any effect on unbiased data, suggesting it can be applied with little risk of spurious adjustment.<\/jats:p>\n               <jats:p>Availability: The method is implemented in the seqbias R\/Bioconductor package, available freely under the LGPL license from http:\/\/bioconductor.org<\/jats:p>\n               <jats:p>Contact: \u00a0dcjones@cs.washington.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts055","type":"journal-article","created":{"date-parts":[[2012,1,29]],"date-time":"2012-01-29T01:14:57Z","timestamp":1327799697000},"page":"921-928","source":"Crossref","is-referenced-by-count":27,"title":["A new approach to bias correction in RNA-Seq"],"prefix":"10.1093","volume":"28","author":[{"given":"Daniel C.","family":"Jones","sequence":"first","affiliation":[{"name":"1 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, 2Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, 3Fred Hutchinson Cancer Research Center, Seattle, WA 98109 and 4Department of Microbiology, University of Washington, Seattle, WA 98195-7242, USA"}]},{"given":"Walter L.","family":"Ruzzo","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, 2Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, 3Fred Hutchinson Cancer Research Center, Seattle, WA 98109 and 4Department of Microbiology, University of Washington, Seattle, WA 98195-7242, USA"},{"name":"1 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, 2Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, 3Fred Hutchinson Cancer Research Center, Seattle, 
WA 98109 and 4Department of Microbiology, University of Washington, Seattle, WA 98195-7242, USA"},{"name":"1 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, 2Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, 3Fred Hutchinson Cancer Research Center, Seattle, WA 98109 and 4Department of Microbiology, University of Washington, Seattle, WA 98195-7242, USA"}]},{"given":"Xinxia","family":"Peng","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, 2Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, 3Fred Hutchinson Cancer Research Center, Seattle, WA 98109 and 4Department of Microbiology, University of Washington, Seattle, WA 98195-7242, USA"}]},{"given":"Michael G.","family":"Katze","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, 2Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, 3Fred Hutchinson Cancer Research Center, Seattle, WA 98109 and 4Department of Microbiology, University of Washington, Seattle, WA 98195-7242, USA"}]}],"member":"286","published-online":{"date-parts":[[2012,1,28]]},"reference":[{"key":"2023012512221283400_B1","doi-asserted-by":"crossref","first-page":"4570","DOI":"10.1093\/nar\/gkq211","article-title":"Detection of splice junctions from paired-end RNA-seq data by SpliceMap","volume":"38","author":"Au","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B2","doi-asserted-by":"crossref","first-page":"2657","DOI":"10.1093\/bioinformatics\/bti410","article-title":"Identification of transcription factor binding sites with variable-order Bayesian 
networks","volume":"21","author":"Ben-Gal","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512221283400_B3","doi-asserted-by":"crossref","first-page":"817","DOI":"10.1214\/aoms\/1177703581","article-title":"A new proof of the Pearson-Fisher theorem","volume":"35","author":"Birch","year":"1964","journal-title":"Ann. Math. Stat."},{"key":"2023012512221283400_B4","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1186\/1471-2105-11-94","article-title":"Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments","volume":"11","author":"Bullard","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012512221283400_B5","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1093\/bioinformatics\/16.2.152","article-title":"Modeling splice sites with Bayes networks","volume":"16","author":"Cai","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012512221283400_B6","doi-asserted-by":"crossref","first-page":"662","DOI":"10.1016\/j.devcel.2010.02.014","article-title":"Genome-wide MyoD binding in skeletal muscle cells: a potential for broad cellular reprogramming","volume":"18","author":"Cao","year":"2010","journal-title":"Dev. 
Cell"},{"key":"2023012512221283400_B7","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1093\/bioinformatics\/bti025","article-title":"Prediction of splice sites with dependency graphs and their expanded Bayesian networks","volume":"21","author":"Chen","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512221283400_B8","doi-asserted-by":"crossref","first-page":"e105","DOI":"10.1093\/nar\/gkn425","article-title":"Substantial biases in ultra-short read data sets from high-throughput DNA sequencing","volume":"36","author":"Dohm","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B9","doi-asserted-by":"crossref","first-page":"1365","DOI":"10.1002\/sim.1501","article-title":"Multiple additive regression trees with application in epidemiology","volume":"22","author":"Friedman","year":"2003","journal-title":"Stat. Med."},{"key":"2023012512221283400_B10","doi-asserted-by":"crossref","first-page":"W529","DOI":"10.1093\/nar\/gkl212","article-title":"VOMBAT: prediction of transcription factor binding sites using variable order Bayesian trees","volume":"34","author":"Grau","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B11","doi-asserted-by":"crossref","DOI":"10.1145\/1015330.1015339","article-title":"Learning Bayesian network classifiers by maximizing conditional likelihood","volume-title":"Proceedings of the Twenty-first International Conference on Machine Learning (ICML '04).","author":"Grossman","year":"2004"},{"key":"2023012512221283400_B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/nar\/gkq224","article-title":"Biases in Illumina transcriptome sequencing caused by random hexamer priming","volume":"38","author":"Hansen","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B13","doi-asserted-by":"crossref","first-page":"D690","DOI":"10.1093\/nar\/gkn828","article-title":"Ensembl 
2009","volume":"37","author":"Hubbard","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/nar\/gkr693","article-title":"Identification and remediation of biases in the activity of RNA ligases in small-RNA deep sequencing","volume":"39","author":"Jayaprakash","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B15","doi-asserted-by":"crossref","first-page":"D773","DOI":"10.1093\/nar\/gkm966","article-title":"The UCSC Genome Browser Database: 2008 update","volume":"36","author":"Karolchik","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B16","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1214\/aoms\/1177729694","article-title":"On information and sufficiency","volume":"22","author":"Kullback","year":"1951","journal-title":"Ann. Math. Stat."},{"key":"2023012512221283400_B17","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012512221283400_B18","doi-asserted-by":"crossref","first-page":"R50","DOI":"10.1186\/gb-2010-11-5-r50","article-title":"Modeling non-uniformity in short-read rates in RNA-Seq data","volume":"11","author":"Li","year":"2010","journal-title":"Genome Biol."},{"key":"2023012512221283400_B19","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1038\/nmeth.1417","article-title":"FRT-seq: amplification-free, strand-specific transcriptome sequencing","volume":"7","author":"Mamanova","year":"2010","journal-title":"Nat. 
Methods"},{"key":"2023012512221283400_B20","first-page":"105","volume-title":"Conditional Logit Analysis of Qualitative Choice Behavior.","author":"McFadden","year":"1974"},{"key":"2023012512221283400_B21","doi-asserted-by":"crossref","first-page":"621","DOI":"10.1038\/nmeth.1226","article-title":"Mapping and quantifying mammalian transcriptomes by RNA-Seq","volume":"5","author":"Mortazavi","year":"2008","journal-title":"Nat. Methods"},{"key":"2023012512221283400_B22","doi-asserted-by":"crossref","first-page":"3082","DOI":"10.1093\/bioinformatics\/bti477","article-title":"A multiple-feature framework for modelling and predicting transcription factor binding sites","volume":"21","author":"Pudimat","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012512221283400_B23","doi-asserted-by":"crossref","first-page":"R22","DOI":"10.1186\/gb-2011-12-3-r22","article-title":"Improving RNA-Seq expression estimates by correcting for fragment bias","volume":"12","author":"Roberts","year":"2011","journal-title":"Genome Biol."},{"key":"2023012512221283400_B24","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1214\/aos\/1176344136","article-title":"Estimating the Dimension of a Model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Ann. Stat."},{"key":"2023012512221283400_B25","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1038\/nbt1239","article-title":"The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements","volume":"24","author":"Shi","year":"2006","journal-title":"Nat. 
Biotechnol."},{"key":"2023012512221283400_B26","doi-asserted-by":"crossref","first-page":"e170","DOI":"10.1093\/nar\/gkq670","article-title":"A two-parameter generalized Poisson model to improve the analysis of RNA-seq data","volume":"38","author":"Srivastava","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012512221283400_B27","doi-asserted-by":"crossref","first-page":"516","DOI":"10.1038\/nbt.1621","article-title":"Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation","volume":"28","author":"Trapnell","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012512221283400_B28","doi-asserted-by":"crossref","first-page":"R78","DOI":"10.1186\/gb-2010-11-7-r78","article-title":"Identification of novel exons and transcribed regions by chimpanzee transcriptome sequencing","volume":"11","author":"Wetterbom","year":"2010","journal-title":"Genome Biol."},{"key":"2023012512221283400_B29","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1186\/1471-2105-12-290","article-title":"Bias detection and correction in RNA-Sequencing data","volume":"12","author":"Zheng","year":"2011","journal-title":"BMC 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/7\/921\/48879442\/bioinformatics_28_7_921.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/7\/921\/48879442\/bioinformatics_28_7_921.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T15:49:13Z","timestamp":1674661753000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/7\/921\/209263"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,1,28]]},"references-count":29,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2012,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts055","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,4,1]]},"published":{"date-parts":[[2012,1,28]]}}}