{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,16]],"date-time":"2026-06-16T04:59:47Z","timestamp":1781585987387,"version":"3.54.5"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"24","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":395,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Motivation: Because of the advantages of RNA sequencing (RNA-Seq) over microarrays, it is gaining widespread popularity for highly parallel gene expression analysis. For example, RNA-Seq is expected to be able to provide accurate identification and quantification of full-length splice forms. A number of informatics packages have been developed for this purpose, but short reads make it a difficult problem in principle. Sequencing error and polymorphisms add further complications. It has become necessary to perform studies to determine which algorithms perform best and which if any algorithms perform adequately. However, there is a dearth of independent and unbiased benchmarking studies. Here we take an approach using both simulated and experimental benchmark data to evaluate their accuracy.<\/jats:p>\n                  <jats:p>Results: We conclude that most methods are inaccurate even using idealized data, and that no method is highly accurate once multiple splice forms, polymorphisms, intron signal, sequencing errors, alignment errors, annotation errors and other complicating factors are present. These results point to the pressing need for further algorithm development.<\/jats:p>\n                  <jats:p>Availability and implementation: Simulated datasets and other supporting information can be found at http:\/\/bioinf.itmat.upenn.edu\/BEERS\/bp2<\/jats:p>\n                  <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <jats:p>Contact: \u00a0hayer@upenn.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv488","type":"journal-article","created":{"date-parts":[[2015,9,3]],"date-time":"2015-09-03T20:33:36Z","timestamp":1441312416000},"page":"3938-3945","source":"Crossref","is-referenced-by-count":90,"title":["Benchmark analysis of algorithms for determining and quantifying full-length mRNA splice forms from RNA-seq data"],"prefix":"10.1093","volume":"31","author":[{"given":"Katharina E.","family":"Hayer","sequence":"first","affiliation":[{"name":"1 University of Pennsylvania, Institute for Translational Medicine and Therapeutics, Philadelphia, PA 19104,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Angel","family":"Pizarro","sequence":"additional","affiliation":[{"name":"2 Scientific Computing at Amazon Web Services, Seattle, WA 98108,"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Nicholas F.","family":"Lahens","sequence":"additional","affiliation":[{"name":"3 Department of Pharmacology and"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"John B.","family":"Hogenesch","sequence":"additional","affiliation":[{"name":"3 Department of Pharmacology and"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Gregory R.","family":"Grant","sequence":"additional","affiliation":[{"name":"1 University of Pennsylvania, Institute for Translational Medicine and Therapeutics, Philadelphia, PA 19104,"},{"name":"4 Department of Genetics, University of Pennsylvania, Philadelphia, PA 19104, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2015,9,3]]},"reference":[{"key":"2023051307185909900_btv488-B1","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1093\/bioinformatics\/btu638","article-title":"HTSeq\u2014A Python framework to work with high-throughput sequencing data","volume":"31","author":"Anders","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B2","doi-asserted-by":"crossref","first-page":"2529","DOI":"10.1093\/bioinformatics\/btt442","article-title":"MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples","volume":"29","author":"Behr","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B3","doi-asserted-by":"crossref","first-page":"2447","DOI":"10.1093\/bioinformatics\/btu317","article-title":"Efficient RNA isoform identification and quantification from RNA-Seq data with network flows","volume":"30","author":"Bernard","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B4","first-page":"647","article-title":"Benchmarking RNA-Seq quantification tools","volume":"2013","author":"Chandramohan","year":"2013","journal-title":"Conf. Proc. IEEE. Eng. Med. Biol. Soc."},{"key":"2023051307185909900_btv488-B6","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1093\/bioinformatics\/bts635","article-title":"STAR: ultrafast universal RNA-seq aligner","volume":"29","author":"Dobin","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B7","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1038\/nmeth.2722","article-title":"Systematic evaluation of Spliced Alignment Programs for RNA-Seq Data","volume":"10","author":"Engstr\u00f6m","year":"2013","journal-title":"Nat Methods"},{"key":"2023051307185909900_btv488-B8","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1038\/nbt.1883","article-title":"Full-length transcriptome assembly from RNA-Seq data without a reference genome","volume":"29","author":"Grabherr","year":"2011","journal-title":"Nat Biotechnol"},{"key":"2023051307185909900_btv488-B9","doi-asserted-by":"crossref","first-page":"2518","DOI":"10.1093\/bioinformatics\/btr427","article-title":"Comparative Analysis of RNA-Seq Alignment Algorithms and the RNA-Seq Unified Mapper (RUM)","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B10","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/nbt.1633","article-title":"Ab initio reconstruction of cell type\u2013specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs","volume":"28","author":"Guttman","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023051307185909900_btv488-B11","doi-asserted-by":"crossref","first-page":"e20","DOI":"10.1093\/nar\/gkt1304","article-title":"PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution","volume":"42","author":"Hu","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023051307185909900_btv488-B12","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1101\/gr.229102","article-title":"The human genome browser at UCSC","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023051307185909900_btv488-B13","doi-asserted-by":"crossref","first-page":"R36","DOI":"10.1186\/gb-2013-14-4-r36","article-title":"TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions","volume":"14","author":"Kim","year":"2013","journal-title":"Genome Biol."},{"key":"2023051307185909900_btv488-B14","doi-asserted-by":"crossref","first-page":"R86","DOI":"10.1186\/gb-2014-15-6-r86","article-title":"IVT-seq reveals extreme bias in RNA-sequencing","volume":"15","author":"Lahens","year":"2014","journal-title":"Genome Biol."},{"key":"2023051307185909900_btv488-B15","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1186\/1471-2105-12-323","article-title":"RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome","volume":"12","author":"Li","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051307185909900_btv488-B16","doi-asserted-by":"crossref","first-page":"1693","DOI":"10.1089\/cmb.2011.0171","article-title":"IsoLasso a LASSO regression approach to RNA-Seq based transcriptome assembly","volume":"18","author":"Li","year":"2011","journal-title":"J. Comput. Biol."},{"key":"2023051307185909900_btv488-B17","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1101\/gr.142232.112","article-title":"iReckon: simultaneous isoform discovery and abundance estimation from RNA-seq data","volume":"23","author":"Mezlini","year":"2013","journal-title":"Genome Res."},{"key":"2023051307185909900_btv488-B19","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1038\/nature01262","article-title":"Initial sequencing and comparative analysis of the mouse genome","volume":"420","author":"Mouse Genome Sequencing Consortium. et\u00a0al","year":"2002","journal-title":"Nature"},{"key":"2023051307185909900_btv488-B20","first-page":"520","article-title":"Estimation of alternative splicing isoform frequencies from RNA-Seq data","volume":"420","author":"Nicolae","year":"2010","journal-title":"Algorithms Mol. Biol."},{"key":"2023051307185909900_btv488-B21","doi-asserted-by":"crossref","first-page":"1559","DOI":"10.1038\/nprot.2006.236","article-title":"Quantification of mRNA using real-time RT-PCR","volume":"1","author":"Nolan","year":"2006","journal-title":"Nat Protoc"},{"key":"2023051307185909900_btv488-B18","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/nbt.3122","article-title":"StringTie enables improved reconstruction of a transcriptome from RNA-seq reads","volume":"33","author":"Pertea","year":"2015","journal-title":"Nat. Biotechnol."},{"key":"2023051307185909900_btv488-B22","doi-asserted-by":"crossref","first-page":"D756","DOI":"10.1093\/nar\/gkt1114","article-title":"RefSeq: an update on mammalian reference sequences","volume":"42","author":"Pruitt","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023051307185909900_btv488-B23","doi-asserted-by":"crossref","first-page":"909","DOI":"10.1038\/nmeth.1517","article-title":"De\u00a0novo assembly and analysis of RNA-seq data","volume":"7","author":"Robertson","year":"2010","journal-title":"Nat. Methods"},{"key":"2023051307185909900_btv488-B24","doi-asserted-by":"crossref","first-page":"1086","DOI":"10.1093\/bioinformatics\/bts094","article-title":"Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels","volume":"28","author":"Schulz","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B25","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1186\/1471-2105-14-S5-S14","article-title":"CLASS: constrained transcript assembly of RNA-seq reads","volume":"14","author":"Song","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023051307185909900_btv488-B26","first-page":"637","article-title":"Using native and syntenically mapped cDNA alignments to improve de novo gene finding","volume":"24","author":"Stanke","year":"2008"},{"key":"2023051307185909900_btv488-B27","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1038\/nmeth.2714","article-title":"Assessment of transcript reconstruction methods for RNA-seq","volume":"10","author":"Steijger","year":"2013","journal-title":"Nat. Methods"},{"key":"2023051307185909900_btv488-B28","doi-asserted-by":"crossref","first-page":"S15","DOI":"10.1186\/1471-2105-14-S5-S15","article-title":"A novel min-cost flow method for estimating transcript expression with RNA-Seq","volume":"14","author":"Tomescu","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023051307185909900_btv488-B29","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/nbt.1621","article-title":"Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation","volume":"28","author":"Trapnell","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023051307185909900_btv488-B30","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1038\/nbt.2450","article-title":"Differential analysis of gene regulation at transcript resolution with RNA-seq","volume":"31","author":"Trapnell","year":"2013","journal-title":"Nat Biotechnol."},{"key":"2023051307185909900_btv488-B32","doi-asserted-by":"crossref","first-page":"1660","DOI":"10.1093\/bioinformatics\/btu077","article-title":"SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads","volume":"30","author":"Xie","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051307185909900_btv488-B33","doi-asserted-by":"crossref","first-page":"16219","DOI":"10.1073\/pnas.1408886111","article-title":"A circadian gene expression atlas in mammals: implications for biology and medicine","volume":"111","author":"Zhang","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/24\/3938\/50307104\/bioinformatics_31_24_3938.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/24\/3938\/50307104\/bioinformatics_31_24_3938.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T03:21:06Z","timestamp":1683948066000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/24\/3938\/197198"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,9,3]]},"references-count":31,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2015,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv488","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/007088","asserted-by":"object"}]},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,12,15]]},"published":{"date-parts":[[2015,9,3]]}}}