{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T05:17:36Z","timestamp":1774675056545,"version":"3.50.1"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T00:00:00Z","timestamp":1712880000000},"content-version":"vor","delay-in-days":16,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"Next Generation of RNA-Seq Simulators for Benchmarking Analyses","award":["R21-LM012763-01A1"],"award-info":[{"award-number":["R21-LM012763-01A1"]}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["5UL1TR000003"],"award-info":[{"award-number":["5UL1TR000003"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["NHLBI R01HL155934-01A1(SS)"],"award-info":[{"award-number":["NHLBI R01HL155934-01A1(SS)"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["NHLBI-R01HL147472"],"award-info":[{"award-number":["NHLBI-R01HL147472"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["DP2GM146251"],"award-info":[{"award-number":["DP2GM146251"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,3,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  
<jats:p>Simulation of RNA-seq reads is critical in the assessment, comparison, benchmarking and development of bioinformatics tools. Yet the field of RNA-seq simulators has progressed little in the last decade. To address this need, we have developed BEERS2, which combines a flexible and highly configurable design with detailed simulation of the entire library preparation and sequencing pipeline. BEERS2 takes input transcripts (typically full-length messenger RNA transcripts with polyA tails) from either customizable input or CAMPAREE-simulated RNA samples. It produces realistic reads of these transcripts in FASTQ, SAM or BAM format, with the SAM and BAM output containing the true alignment to the reference genome. It also produces true transcript-level quantification values. BEERS2 is designed to include the effects of polyA selection and RiboZero for ribosomal depletion, hexamer priming sequence biases, GC-content biases in polymerase chain reaction (PCR) amplification, barcode read errors and errors during PCR amplification. These characteristics combine to make BEERS2 the most complete simulation of RNA-seq to date. 
Finally, we demonstrate the use of BEERS2 by measuring the effect of several settings on the popular Salmon pseudoalignment algorithm.<\/jats:p>","DOI":"10.1093\/bib\/bbae164","type":"journal-article","created":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T19:44:08Z","timestamp":1711482248000},"source":"Crossref","is-referenced-by-count":5,"title":["BEERS2: RNA-Seq simulation through high fidelity\n                    <i>in silico<\/i>\n                    modeling"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6980-0079","authenticated-orcid":false,"given":"Thomas G","family":"Brooks","sequence":"first","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nicholas F","family":"Lahens","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Antonijo","family":"Mr\u010dela","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dimitra","family":"Sarantopoulou","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"},{"name":"Current address: National Institute on Aging, National Institutes of Health , Baltimore, MD , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Soumyashant","family":"Nayak","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"},{"name":"Current address: Statistics and Mathematics Unit, Indian Statistical Institute , Bengaluru, Karnataka , 
India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Amruta","family":"Naik","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"},{"name":"Children\u2019s Hospital of Philadelphia , Philadelphia, PA , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shaon","family":"Sengupta","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"},{"name":"Children\u2019s Hospital of Philadelphia , Philadelphia, PA , USA"},{"name":"Department of Pediatrics, University of Pennsylvania Perelman School of Medicine , Philadelphia, Pennsylvania , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter S","family":"Choi","sequence":"additional","affiliation":[{"name":"Division of Cancer Pathobiology, Children\u2019s Hospital of Philadelphia , Philadelphia, PA , USA"},{"name":"Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine , Philadelphia, PA , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gregory R","family":"Grant","sequence":"additional","affiliation":[{"name":"Institute for Translational Medicine and Therapeutics, University of Pennsylvania , PA , USA"},{"name":"Department of Genetics, University of Pennsylvania , Philadelphia, PA , USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,4,11]]},"reference":[{"issue":"18","key":"2024041205442038900_ref1","doi-asserted-by":"crossref","first-page":"3372","DOI":"10.1093\/bioinformatics\/btz089","article-title":"How well do RNA-Seq differential gene expression tools perform in a complex eukaryote? 
A case study in Arabidopsis thaliana","volume":"35","author":"Froussios","year":"2019","journal-title":"Bioinformatics"},{"issue":"21","key":"2024041205442038900_ref2","doi-asserted-by":"crossref","first-page":"4994","DOI":"10.1093\/bioinformatics\/btac612","article-title":"Access to ground truth at unconstrained size makes simulated data as indispensable as experimental data for bioinformatics methods development and benchmarking","volume":"38","author":"Sandve","year":"2022","journal-title":"Bioinformatics"},{"issue":"5","key":"2024041205442038900_ref3","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1038\/nbt.3519","article-title":"Near-optimal probabilistic RNA-seq quantification","volume":"34","author":"Bray","year":"2016","journal-title":"Nat Biotechnol"},{"issue":"4","key":"2024041205442038900_ref4","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nmeth.4197","article-title":"Salmon provides fast and bias-aware quantification of transcript expression","volume":"14","author":"Patro","year":"2017","journal-title":"Nat Methods"},{"key":"2024041205442038900_ref5","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1186\/s13059-020-02151-8","article-title":"Alignment and mapping methodology influence transcript abundance estimation","volume":"21","author":"Srivastava","year":"2020","journal-title":"Genome Biol"},{"key":"2024041205442038900_ref6","doi-asserted-by":"crossref","first-page":"e0232271","DOI":"10.1371\/journal.pone.0232271","article-title":"Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data","volume":"15","author":"Baik","year":"2020","journal-title":"PloS One"},{"key":"2024041205442038900_ref7","doi-asserted-by":"crossref","first-page":"e0176185","DOI":"10.1371\/journal.pone.0176185","article-title":"A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq 
data","volume":"12","author":"Li","year":"2017","journal-title":"PloS One"},{"key":"2024041205442038900_ref8","doi-asserted-by":"crossref","DOI":"10.1186\/s13059-018-1466-5","article-title":"Differential gene expression analysis tools exhibit substandard performance for long non-coding RNA-sequencing data","volume":"19","author":"Assefa","year":"2018","journal-title":"Genome Biol"},{"issue":"1","key":"2024041205442038900_ref9","first-page":"bbw092","article-title":"Synthetic data sets for the identification of key ingredients for RNA-seq differential analysis","volume":"19","author":"Rigaill","year":"2018","journal-title":"Brief Bioinform"},{"issue":"13","key":"2024041205442038900_ref10","doi-asserted-by":"crossref","first-page":"2131","DOI":"10.1093\/bioinformatics\/btv124","article-title":"SimSeq: a nonparametric approach to simulation of RNA-sequence datasets","volume":"31","author":"Benidt","year":"2015","journal-title":"Bioinformatics"},{"issue":"18","key":"2024041205442038900_ref11","doi-asserted-by":"crossref","first-page":"2518","DOI":"10.1093\/bioinformatics\/btr427","article-title":"Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM)","volume":"27","author":"Grant","year":"2011","journal-title":"Bioinformatics"},{"key":"2024041205442038900_ref12","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1186\/1471-2105-15-224","article-title":"MAP-RSeq: Mayo Analysis Pipeline for RNA sequencing","volume":"15","author":"Kalari","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2024041205442038900_ref13","doi-asserted-by":"crossref","first-page":"3353","DOI":"10.1038\/s41467-021-23608-9","article-title":"MOCCASIN: a method for correcting for known and unknown confounders in RNA splicing analysis","volume":"12","author":"Slaff","year":"2021","journal-title":"Nat Commun"},{"key":"2024041205442038900_ref14","article-title":"iREAD: a tool for intron retention detection from RNA-seq 
data","volume":"21","author":"Li","year":"2020","journal-title":"BMC Genomics"},{"issue":"17","key":"2024041205442038900_ref15","doi-asserted-by":"crossref","first-page":"2778","DOI":"10.1093\/bioinformatics\/btv272","article-title":"Polyester: simulating RNA-seq datasets with differential transcript expression","volume":"31","author":"Frazee","year":"2015","journal-title":"Bioinformatics"},{"issue":"18","key":"2024041205442038900_ref16","doi-asserted-by":"crossref","first-page":"3008","DOI":"10.1093\/bioinformatics\/btab142","article-title":"ASimulatoR: splice-aware RNA-Seq data simulation","volume":"37","author":"Manz","year":"2021","journal-title":"Bioinformatics"},{"key":"2024041205442038900_ref17","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1186\/1471-2105-12-323","article-title":"RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome","volume":"12","author":"Li","year":"2011","journal-title":"BMC Bioinformatics"},{"issue":"20","key":"2024041205442038900_ref18","doi-asserted-by":"crossref","first-page":"10073","DOI":"10.1093\/nar\/gks666","article-title":"Modelling and simulating generic RNA-Seq experiments with the flux simulator","volume":"40","author":"Griebel","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024041205442038900_ref19","doi-asserted-by":"crossref","first-page":"428","DOI":"10.1186\/s12859-017-1831-5","article-title":"SimBA: a methodology and tools for evaluating the performance of RNA-Seq bioinformatic pipelines","volume":"18","author":"Audoux","year":"2017","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"2024041205442038900_ref20","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1186\/1471-2164-15-264","article-title":"Comparison of mapping algorithms used in high-throughput sequencing: application to ion torrent data","volume":"15","author":"Caboche","year":"2014","journal-title":"BMC Genomics"},{"key":"2024041205442038900_ref21","volume-title":"BBMap short 
read aligner, and other bioinformatic tools","author":"Bushnell"},{"issue":"1","key":"2024041205442038900_ref22","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1093\/bioinformatics\/bts649","article-title":"PBSIM: PacBio reads simulator\u2014toward accurate genome assembly","volume":"29","author":"Ono","year":"2012","journal-title":"Bioinformatics"},{"key":"2024041205442038900_ref23","article-title":"Tim Massingham, Nick Goldman. Realistic simulations reveal extensive sample-specificity of RNA-seq biases","author":"Botond","year":"2013"},{"issue":"6","key":"2024041205442038900_ref24","doi-asserted-by":"crossref","first-page":"770","DOI":"10.1038\/s41588-021-00873-4","article-title":"Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis","volume":"53","author":"Sarkar","year":"2021","journal-title":"Nat Genet"},{"issue":"1","key":"2024041205442038900_ref25","doi-asserted-by":"crossref","first-page":"692","DOI":"10.1186\/s12864-021-07934-2","article-title":"CAMPAREE: a robust and configurable RNA expression simulator","volume":"22","author":"Lahens","year":"2021","journal-title":"BMC Genomics"},{"key":"2024041205442038900_ref26","doi-asserted-by":"crossref","first-page":"33","DOI":"10.12688\/f1000research.29032.2","article-title":"Sustainable data analysis with Snakemake","volume":"10","author":"Molder","year":"2021","journal-title":"F1000Res"},{"key":"2024041205442038900_ref27","doi-asserted-by":"crossref","first-page":"e131","DOI":"10.1093\/nar\/gkq224","article-title":"Biases in Illumina transcriptome sequencing caused by random hexamer priming","volume":"38","author":"Hansen","year":"2010","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"2024041205442038900_ref28","doi-asserted-by":"crossref","first-page":"1884","DOI":"10.1101\/gr.095299.109","article-title":"BayesCall: a model-based base-calling algorithm for high-throughput short-read 
sequencing","volume":"19","author":"Kao","year":"2009","journal-title":"Genome Res"},{"issue":"12","key":"2024041205442038900_ref29","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.1038\/nbt.3682","article-title":"Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation","volume":"34","author":"Love","year":"2016","journal-title":"Nat Biotechnol"},{"key":"2024041205442038900_ref30","doi-asserted-by":"crossref","DOI":"10.1101\/2021.05.05.442755","article-title":"STARsolo: accurate, fast and versatile mapping\/quantification of single-cell and single-nucleus RNA-seq","author":"Kaminow","year":"2021"},{"issue":"1","key":"2024041205442038900_ref31","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1186\/s12859-021-04198-1","article-title":"Comparative evaluation of full-length isoform quantification from RNA-Seq","volume":"22","author":"Sarantopoulou","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"2024041205442038900_ref32","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1186\/s12864-017-4011-0","article-title":"A comparison of Illumina and ion torrent sequencing platforms in the context of differential gene expression","volume":"18","author":"Lahens","year":"2017","journal-title":"BMC Genomics"},{"issue":"1521","key":"2024041205442038900_ref33","doi-asserted-by":"crossref","first-page":"1521","DOI":"10.12688\/f1000research.7563.2","article-title":"Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences [version 2; peer review: 2 approved]","volume":"4","author":"Soneson","year":"2016","journal-title":"F1000Research"},{"key":"2024041205442038900_ref34","doi-asserted-by":"crossref","first-page":"e1008585","DOI":"10.1371\/journal.pcbi.1008585","article-title":"Preprocessing choices affect RNA velocity results for droplet scRNA-seq data","volume":"17","author":"Soneson","year":"2021","journal-title":"PLoS Comput 
Biol"},{"issue":"3","key":"2024041205442038900_ref35","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1038\/s41592-022-01408-3","article-title":"Alevin-fry unlocks rapid, accurate and memory-frugal quantification of single-cell RNA-seq data","volume":"19","author":"He","year":"2022","journal-title":"Nat Methods"},{"issue":"1","key":"2024041205442038900_ref36","doi-asserted-by":"crossref","first-page":"6911","DOI":"10.1038\/s41467-021-27130-w","article-title":"A benchmark study of simulation methods for single-cell RNA sequencing data","volume":"12","author":"Cao","year":"2021","journal-title":"Nat Commun"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae164\/57215931\/bbae164.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae164\/57215931\/bbae164.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T01:44:46Z","timestamp":1712886286000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae164\/7644138"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,27]]},"references-count":36,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae164","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.04.21.537847","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5]]},"published":{"date-parts":[[2024,3,27]]},"article-number":"bbae164"}}