{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T08:47:25Z","timestamp":1775983645366,"version":"3.50.1"},"reference-count":21,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,5,26]],"date-time":"2020-05-26T00:00:00Z","timestamp":1590451200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,5,26]],"date-time":"2020-05-26T00:00:00Z","timestamp":1590451200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Recently, it has become possible to collect next-generation DNA sequencing data sets that are composed of multiple samples from multiple biological units where each of these samples may be from a single cell or bulk tissue. Yet, there does not yet exist a tool for simulating DNA sequencing data from such a nested sampling arrangement with single-cell and bulk samples so that developers of analysis methods can assess accuracy and precision.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We have developed a tool that simulates DNA sequencing data from hierarchically grouped (correlated) samples where each sample is designated bulk or single-cell. Our tool uses a simple configuration file to define the experimental arrangement and can be integrated into software pipelines for testing of variant callers or other genomic tools.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>The DNA sequencing data generated by our simulator is representative of real data and integrates seamlessly with standard downstream analysis tools.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-020-03550-1","type":"journal-article","created":{"date-parts":[[2020,5,26]],"date-time":"2020-05-26T04:02:59Z","timestamp":1590465779000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["SCSIM: Jointly simulating correlated single-cell and bulk next-generation DNA sequencing data"],"prefix":"10.1186","volume":"21","author":[{"given":"Collin","family":"Giguere","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Harsh Vardhan","family":"Dubey","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vishal Kumar","family":"Sarsani","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hachem","family":"Saddiki","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shai","family":"He","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6097-6572","authenticated-orcid":false,"given":"Patrick","family":"Flaherty","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,5,26]]},"reference":[{"issue":"8","key":"3550_CR1","doi-asserted-by":"publisher","first-page":"459","DOI":"10.1038\/nrg.2016.57","volume":"17","author":"M Escalona","year":"2016","unstructured":"Escalona M, Rocha S, Posada D. A comparison of tools for the simulation of genomic next-generation sequencing data. Nat Rev Genet. 2016; 17(8):459\u201369. https:\/\/doi.org\/10.1038\/nrg.2016.57.","journal-title":"Nat Rev Genet"},{"key":"3550_CR2","unstructured":"NCI Division of Cancer Control & Population Sciences. Genetic Simulation Resources. 2018. https:\/\/popmodels.cancercontrol.cancer.gov\/gsr\/. Accessed 27 Nov 2018."},{"issue":"1","key":"3550_CR3","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1186\/1471-2164-13-74","volume":"13","author":"KE McElroy","year":"2012","unstructured":"McElroy KE, Luciani F, Thomas T. GEMSIM: General, error-model based simulator of next-generation sequencing data. BMC Genomics. 2012; 13(1):74. https:\/\/doi.org\/10.1186\/1471-2164-13-74.","journal-title":"BMC Genomics"},{"issue":"11","key":"3550_CR4","doi-asserted-by":"publisher","first-page":"0167047","DOI":"10.1371\/journal.pone.0167047","volume":"11","author":"ZD Stephens","year":"2016","unstructured":"Stephens ZD, Hudson ME, Mainzer LS, Taschuk M, Weber MR, Iyer RK. Simulating next-generation sequencing datasets from empirical mutation and sequencing models. PLOS ONE. 2016; 11(11):0167047. https:\/\/doi.org\/10.1371\/journal.pone.0167047.","journal-title":"PLOS ONE"},{"issue":"1","key":"3550_CR5","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1186\/1471-2105-15-40","volume":"15","author":"S Pattnaik","year":"2014","unstructured":"Pattnaik S, Gupta S, Rao AA, Panda B. SInC: an accurate and fast error-model based simulator for SNPs, indels and CNVs coupled with a read generator for short-read sequence data. BMC Bioinformatics. 2014; 15(1):40. https:\/\/doi.org\/10.1186\/1471-2105-15-40.","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"3550_CR6","doi-asserted-by":"publisher","first-page":"264","DOI":"10.1186\/1471-2164-15-264","volume":"15","author":"S Caboche","year":"2014","unstructured":"Caboche S, Audebert C, Lemoine Y, Hot D. Comparison of mapping algorithms used in high-throughput sequencing: Application to ion torrent data. BMC Genomics. 2014; 15(1):264. https:\/\/doi.org\/10.1186\/1471-2164-15-264.","journal-title":"BMC Genomics"},{"issue":"4","key":"3550_CR7","doi-asserted-by":"publisher","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","volume":"28","author":"W Huang","year":"2012","unstructured":"Huang W, Li L, Myers JR, Marth GT. Art: a next-generation sequencing read simulator. Bioinformatics. 2012; 28(4):593\u20134.","journal-title":"Bioinformatics"},{"issue":"3","key":"3550_CR8","doi-asserted-by":"publisher","first-page":"521","DOI":"10.1093\/bioinformatics\/bty630","volume":"35","author":"Hadrien Gourl\u00e9","year":"2018","unstructured":"Gourl\u00e9 H, Karlsson-Lindsj\u00f6 O, Hayer J, Bongcam-Rudloff E. Simulating Illumina metagenomic data with InSilicoSeq. Bioinformatics. 2018; 35(3):521\u20132. https:\/\/doi.org\/10.1093\/bioinformatics\/bty630 https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/3\/521\/27699758\/bty630.pdf.","journal-title":"Bioinformatics"},{"key":"3550_CR9","doi-asserted-by":"publisher","DOI":"10.1109\/BIBM47256.2019.8983192","volume-title":"2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","author":"S Wang","year":"2019","unstructured":"Wang S, Wang J, Xiao X, Zhang X, Wang X, Zhu X, Lai X. GSDcreator: An Efficient and Comprehensive Simulator for Genarating NGS Data with Population Genetic Information. In: 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). San Diego: IEEE: 2019. p. 1868\u201375. https:\/\/doi.org\/10.1109\/BIBM47256.2019.8983192."},{"key":"3550_CR10","doi-asserted-by":"crossref","unstructured":"Yu Z, Du F, Sun X, Li A. SCSsim: an integrated tool for simulating single-cell genome sequencing data. Bioinformatics. 2019; 36(4):1281\u20132. https:\/\/doi.org\/10.1093\/bioinformatics\/btz713 https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1281\/32527663\/btz713.pdf.","DOI":"10.1093\/bioinformatics\/btz713"},{"issue":"1-2","key":"3550_CR11","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1016\/j.cell.2017.12.007","volume":"172","author":"AK Casasent","year":"2018","unstructured":"Casasent AK, Schalck A, Gao R, Sei E, Long A, Pangburn W, Casasent T, Meric-Bernstam F, Edgerton ME, Navin NE. Multiclonal invasion in breast tumors identified by topographic single cell sequencing. Cell. 2018; 172(1-2):205\u201321712. https:\/\/doi.org\/10.1016\/j.cell.2017.12.007.","journal-title":"Cell"},{"issue":"1","key":"3550_CR12","doi-asserted-by":"publisher","first-page":"12083","DOI":"10.1038\/ncomms12083","volume":"7","author":"J Zhou","year":"2016","unstructured":"Zhou J, Deng Y, Shen L, Wen C, Yan Q, Ning D, Qin Y, Xue K, Wu L, He Z, Voordeckers JW, Nostrand JDV, Buzzard V, Michaletz ST, Enquist BJ, Weiser MD, Kaspari M, Waide R, Yang Y, Brown JH. Temperature mediates continental-scale diversity of microbes in forest soils. Nat Commun. 2016; 7(1):12083. https:\/\/doi.org\/10.1038\/ncomms12083.","journal-title":"Nat Commun"},{"issue":"476","key":"3550_CR13","doi-asserted-by":"publisher","first-page":"1566","DOI":"10.1198\/016214506000000302","volume":"101","author":"YW Teh","year":"2006","unstructured":"Teh YW, Jordan MI, Beal MJ, Blei DM. Hierarchical dirichlet processes. J Am Stat Assoc. 2006; 101(476):1566\u201381. https:\/\/doi.org\/10.1198\/016214506000000302.","journal-title":"J Am Stat Assoc"},{"issue":"6","key":"3550_CR14","doi-asserted-by":"publisher","first-page":"505","DOI":"10.1038\/nmeth.3835","volume":"13","author":"H Zafar","year":"2016","unstructured":"Zafar H, Wang Y, Nakhleh L, Navin N, Chen K. Monovar: single-nucleotide variant detection in single cells. Nat Methods. 2016; 13(6):505\u2013507. https:\/\/doi.org\/10.1038\/nmeth.3835.","journal-title":"Nat Methods"},{"key":"3550_CR15","unstructured":"Homer N. Whole Genome Simulator for Next-Generation Sequencing. 2018. http:\/\/github.com\/nh13\/dwgsim. Accessed 27 Nov 2018."},{"issue":"1","key":"3550_CR16","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1186\/s13059-015-0616-2","volume":"16","author":"ML Leung","year":"2015","unstructured":"Leung ML, Wang Y, Waters J, Navin NE. SNES: single nucleus exome sequencing. Genome Biol. 2015; 16(1):55. https:\/\/doi.org\/10.1186\/s13059-015-0616-2.","journal-title":"Genome Biol"},{"issue":"21","key":"3550_CR17","doi-asserted-by":"publisher","first-page":"2987","DOI":"10.1093\/bioinformatics\/btr509","volume":"27","author":"H Li","year":"2011","unstructured":"Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011; 27(21):2987\u201393. https:\/\/doi.org\/10.1093\/bioinformatics\/btr509.","journal-title":"Bioinformatics"},{"issue":"14","key":"3550_CR18","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","volume":"25","author":"H. Li","year":"2009","unstructured":"Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009; 25(14):1754\u201360. https:\/\/doi.org\/10.1093\/bioinformatics\/btp324 https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/14\/1754\/605544\/btp324.pdf.","journal-title":"Bioinformatics"},{"issue":"50","key":"3550_CR19","doi-asserted-by":"publisher","first-page":"17947","DOI":"10.1073\/pnas.1420822111","volume":"111","author":"Charles Gawad","year":"2014","unstructured":"Gawad C, Koh W, Quake SR. Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics. Proc Natl Acad Sci. 2014; 111(50):17947\u201352. https:\/\/doi.org\/10.1073\/pnas.1420822111 https:\/\/www.pnas.org\/content\/111\/50\/17947.full.pdf.","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"12","key":"3550_CR20","doi-asserted-by":"publisher","first-page":"2975","DOI":"10.1002\/ajmg.a.37297","volume":"167","author":"NE Bowles","year":"2015","unstructured":"Bowles NE, Jou CJ, Arrington CB, Kennedy BJ, Earl A, Matsunami N, Meyers LL, Etheridge SP, Saarel EV, Bleyl SB, Yost HJ, Yandell M, Leppert MF, Tristani-Firouzi M, Gruber PJ. the Baylor Hopkins Centers for Mendelian Genomics: Exome analysis of a family with wolff-parkinson-white syndrome identifies a novel disease locus. Am J Med Genet A. 2015; 167(12):2975\u201384. https:\/\/doi.org\/10.1002\/ajmg.a.37297.","journal-title":"Am J Med Genet A"},{"issue":"1","key":"3550_CR21","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1186\/s12859-019-2928-9","volume":"20","author":"M Kumaran","year":"2019","unstructured":"Kumaran M, Subramanian U, Devarajan B. Performance assessment of variant calling pipelines using human whole exome sequencing and simulated data. BMC Bioinformatics. 2019; 20(1):342. https:\/\/doi.org\/10.1186\/s12859-019-2928-9.","journal-title":"BMC Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03550-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03550-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03550-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,1]],"date-time":"2023-10-01T09:33:04Z","timestamp":1696152784000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03550-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,26]]},"references-count":21,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3550"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03550-1","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.02.03.930354","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,26]]},"assertion":[{"value":"3 February 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 May 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 May 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"215"}}