{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T18:33:35Z","timestamp":1774290815336,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"24","license":[{"start":{"date-parts":[[2016,11,16]],"date-time":"2016-11-16T00:00:00Z","timestamp":1479254400000},"content-version":"vor","delay-in-days":74,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000057","name":"NIGMS","doi-asserted-by":"publisher","award":["1R01GM105705"],"award-info":[{"award-number":["1R01GM105705"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007880","name":"Johns Hopkins University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007880","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-1349906"],"award-info":[{"award-number":["IIS-1349906"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>RNA sequencing (RNA-seq) experiments now span hundreds to thousands of samples. Current spliced alignment software is designed to analyze each sample separately. Consequently, no information is gained from analyzing multiple samples together, and it requires extra work to obtain analysis products that incorporate data from across samples.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We describe Rail-RNA, a cloud-enabled spliced aligner that analyzes many samples at once. Rail-RNA eliminates redundant work across samples, making it more efficient as samples are added. For many samples, Rail-RNA is more accurate than annotation-assisted aligners. We use Rail-RNA to align 667 RNA-seq samples from the GEUVADIS project on Amazon Web Services in under 16\u2009h for US$0.91 per sample. Rail-RNA outputs alignments in SAM\/BAM format; but it also outputs (i) base-level coverage bigWigs for each sample; (ii) coverage bigWigs encoding normalized mean and median coverages at each base across samples analyzed; and (iii) exon\u2013exon splice junctions and indels (features) in columnar formats that juxtapose coverages in samples in which a given feature is found. Supplementary outputs are ready for use with downstream packages for reproducible statistical analysis. We use Rail-RNA to identify expressed regions in the GEUVADIS samples and show that both annotated and unannotated (novel) expressed regions exhibit consistent patterns of variation across populations and with respect to known confounding variables.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and Implementation<\/jats:title>\n                    <jats:p>Rail-RNA is open-source software available at http:\/\/rail.bio.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw575","type":"journal-article","created":{"date-parts":[[2016,9,4]],"date-time":"2016-09-04T20:07:40Z","timestamp":1473019660000},"page":"4033-4040","source":"Crossref","is-referenced-by-count":49,"title":["Rail-RNA: scalable analysis of RNA-seq splicing and coverage"],"prefix":"10.1093","volume":"33","author":[{"given":"Abhinav","family":"Nellore","sequence":"first","affiliation":[{"name":"Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA"},{"name":"Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Leonardo","family":"Collado-Torres","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"},{"name":"Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA"}]},{"given":"Andrew E","family":"Jaffe","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"},{"name":"Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA"},{"name":"Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"}]},{"given":"Jos\u00e9","family":"Alquicira-Hern\u00e1ndez","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"},{"name":"Undergraduate Program on Genomic Sciences, National Autonomous University of Mexico, Mexico City, D.F., Mexico"}]},{"given":"Christopher","family":"Wilks","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Jacob","family":"Pritt","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"James","family":"Morton","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA"}]},{"given":"Jeffrey T","family":"Leek","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"}]},{"given":"Ben","family":"Langmead","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA"},{"name":"Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA"},{"name":"Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA"}]}],"member":"286","published-online":{"date-parts":[[2016,9,3]]},"reference":[{"key":"2023020301104817900_btw575-B1","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1038\/nbt.2702","article-title":"Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories","volume":"31","author":"Ac\u2019t Hoen","year":"2013","journal-title":"Nat. Biotechnol"},{"key":"2023020301104817900_btw575-B2","doi-asserted-by":"crossref","first-page":"4570","DOI":"10.1093\/nar\/gkq211","article-title":"Detection of splice junctions from paired-end RNA-seq data by splicemap","volume":"38","author":"Au","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B3","doi-asserted-by":"crossref","first-page":"S9","DOI":"10.1186\/1471-2105-13-S6-S9","article-title":"A context-based approach to identify the most likely mapping for RNA-seq experiments","volume":"13","author":"Bonfert","year":"2012","journal-title":"BMC Bioinf"},{"key":"2023020301104817900_btw575-B4","doi-asserted-by":"crossref","first-page":"1500","DOI":"10.1093\/bioinformatics\/btq206","article-title":"Supersplat spliced RNA-seq alignment","volume":"26","author":"Bryant","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B5","doi-asserted-by":"crossref","first-page":"2615","DOI":"10.1093\/bioinformatics\/btp459","article-title":"RNA-mate: a recursive mapping strategy for high-throughput RNA-sequencing data","volume":"25","author":"Cloonan","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B6","first-page":"015370","article-title":"derfinder: software for annotation-agnostic RNA-seq differential expression analysis","author":"Collado-Torres","year":"2015","journal-title":"bioRxiv"},{"key":"2023020301104817900_btw575-B7","doi-asserted-by":"crossref","first-page":"e869","DOI":"10.7717\/peerj.869","article-title":"Low-cost, low-input RNA-seq protocols perform nearly as well as high-input protocols","volume":"3","author":"Combs","year":"2015","journal-title":"PeerJ PrePrints"},{"key":"2023020301104817900_btw575-B8","doi-asserted-by":"crossref","first-page":"D662","DOI":"10.1093\/nar\/gku1010","article-title":"Ensembl 2015","volume":"43","author":"Cunningham","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B9","doi-asserted-by":"crossref","first-page":"O7","DOI":"10.1186\/1471-2105-9-S10-O7","article-title":"Optimal spliced alignments of short sequence reads","volume":"9","author":"De Bona","year":"2008","journal-title":"BMC Bioinf"},{"key":"2023020301104817900_btw575-B10","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/1327452.1327492","article-title":"Mapreduce: simplified data processing","volume":"51","author":"Dean","year":"2008","journal-title":"Commun. ACM Large Clusters"},{"key":"2023020301104817900_btw575-B11","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1093\/bioinformatics\/bts635","article-title":"Star: ultrafast universal RNA-seq aligner","volume":"29","author":"Dobin","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B12","first-page":"kxt053.","article-title":"Differential expression analysis of RNA-seq data at single-base resolution","author":"Frazee","year":"2014","journal-title":"Biostatistics"},{"key":"2023020301104817900_btw575-B13","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1111\/j.1755-0998.2011.03024.x","article-title":"Field guide to next-generation DNA sequencers","volume":"11","author":"Glenn","year":"2011","journal-title":"Mol. Ecol. Resources"},{"key":"2023020301104817900_btw575-B14","doi-asserted-by":"crossref","first-page":"2518","DOI":"10.1093\/bioinformatics\/btr427","article-title":"Comparative analysis of RNA-seq alignment algorithms and the RNA-seq unified mapper (rum)","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B15","doi-asserted-by":"crossref","first-page":"10073","DOI":"10.1093\/nar\/gks666","article-title":"Modelling and simulating generic RNA-seq experiments with the flux simulator","volume":"40","author":"Griebel","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B16","first-page":"1038","article-title":"Is the $1,000 genome for real?","volume":"10","author":"Hayden","year":"2014","journal-title":"Nat. News"},{"key":"2023020301104817900_btw575-B17","doi-asserted-by":"crossref","first-page":"1933","DOI":"10.1093\/bioinformatics\/bts294","article-title":"Osa: a fast and accurate alignment tool for RNA-seq","volume":"28","author":"Hu","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B18","doi-asserted-by":"crossref","first-page":"46","DOI":"10.3389\/fgene.2011.00046","article-title":"Soapsplice: genome-wide ab initio detection of splice junctions from RNA-seq data","volume":"2","author":"Huang","year":"2011","journal-title":"Front. Genet"},{"key":"2023020301104817900_btw575-B19","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1038\/nn.3898","article-title":"Developmental regulation of human cortex transcription and its clinical relevance at single base resolution","volume":"18","author":"Jaffe","year":"2015","journal-title":"Nat. Neurosci"},{"key":"2023020301104817900_btw575-B20","first-page":"11","article-title":"RNA-seq read alignments with palmapper","author":"Jean","year":"2010","journal-title":"Curr. Protoc. Bioinf"},{"key":"2023020301104817900_btw575-B21","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1101\/gr.229102","article-title":"The human genome browser at UCSC","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res"},{"key":"2023020301104817900_btw575-B22","doi-asserted-by":"crossref","first-page":"2204","DOI":"10.1093\/bioinformatics\/btq351","article-title":"Bigwig and bigbed: enabling browsing of large distributed datasets","volume":"26","author":"Kent","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B23","doi-asserted-by":"crossref","first-page":"R36.","DOI":"10.1186\/gb-2013-14-4-r36","article-title":"Tophat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions","volume":"14","author":"Kim","year":"2013","journal-title":"Genome Biol"},{"key":"2023020301104817900_btw575-B24","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.3317","article-title":"Hisat: a fast spliced aligner with low memory requirements","volume":"12","author":"Kim","year":"2015","journal-title":"Nat. Methods"},{"key":"2023020301104817900_btw575-B25","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"2023020301104817900_btw575-B26","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol"},{"key":"2023020301104817900_btw575-B27","doi-asserted-by":"crossref","first-page":"506","DOI":"10.1038\/nature12531","article-title":"Transcriptome and genome sequencing uncovers functional variation in humans","volume":"501","author":"Lappalainen","year":"2013","journal-title":"Nature"},{"key":"2023020301104817900_btw575-B28","first-page":"gkq967","article-title":"The European nucleotide archive","author":"Leinonen","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B29","first-page":"gkq1019","article-title":"The sequence read archive","author":"Leinonen","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B30","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and samtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B31","doi-asserted-by":"crossref","first-page":"e108e108.","DOI":"10.1093\/nar\/gkt214","article-title":"The subread aligner: fast, accurate and scalable read mapping by seed-and-vote","volume":"41","author":"Liao","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B32","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1038\/ng.2653","article-title":"The genotype-tissue expression (gtex) project","volume":"45","author":"Lonsdale","year":"2013","journal-title":"Nat. Genet"},{"key":"2023020301104817900_btw575-B33","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1038\/nmeth.2221","article-title":"The gem mapper: fast, accurate and versatile alignment by filtration","volume":"9","author":"Marco-Sola","year":"2012","journal-title":"Nat. Methods"},{"key":"2023020301104817900_btw575-B34","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1038\/nrg2934","article-title":"RNA sequencing: advances, challenges and opportunities","volume":"12","author":"Ozsolak","year":"2010","journal-title":"Nat. Rev. Genet"},{"key":"2023020301104817900_btw575-B35","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/MCSE.2007.53","article-title":"Ipython: a system for interactive scientific computing","volume":"9","author":"Perez","year":"2007","journal-title":"Comput. Sci. Eng"},{"key":"2023020301104817900_btw575-B36","doi-asserted-by":"crossref","first-page":"R30.","DOI":"10.1186\/gb-2013-14-3-r30","article-title":"Crac: an integrated approach to the analysis of RNA-seq reads","volume":"14","author":"Philippe","year":"2013","journal-title":"Genome Biol"},{"key":"2023020301104817900_btw575-B37","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1038\/nbt0710-691","article-title":"Cloud computing and the DNA data race","volume":"28","author":"Schatz","year":"2010","journal-title":"Nat. Biotechnol"},{"key":"2023020301104817900_btw575-B38","doi-asserted-by":"crossref","first-page":"207.","DOI":"10.1186\/gb-2010-11-5-207","article-title":"The case for cloud computing in genome informatics","volume":"11","author":"Stein","year":"2010","journal-title":"Genome Biol"},{"key":"2023020301104817900_btw575-B39","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1093\/bioinformatics\/btp120","article-title":"Tophat: discovering splice junctions with RNA-seq","volume":"25","author":"Trapnell","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B40","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"Rna-seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat. Rev. Genet"},{"key":"2023020301104817900_btw575-B41","first-page":"gkq622","article-title":"Mapsplice: accurate mapping of RNA-seq reads for splice junction discovery","author":"Wang","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020301104817900_btw575-B42","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1093\/bioinformatics\/btq057","article-title":"Fast and snp-tolerant detection of complex variants and splicing in short reads","volume":"26","author":"Wu","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020301104817900_btw575-B43","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1093\/bioinformatics\/btr712","article-title":"Passion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-seq data","volume":"28","author":"Zhang","year":"2012","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/24\/4033\/49042019\/bioinformatics_33_24_4033.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/24\/4033\/49042019\/bioinformatics_33_24_4033.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T20:11:19Z","timestamp":1675368679000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/24\/4033\/2525684"}},"subtitle":[],"editor":[{"given":"Gunnar","family":"Ratsch","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2016,9,3]]},"references-count":43,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2017,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw575","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/019067","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,12,15]]},"published":{"date-parts":[[2016,9,3]]}}}