{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T06:24:58Z","timestamp":1781245498801,"version":"3.54.1"},"reference-count":16,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":655,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Summary: VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing through simulation or real data. In contrast to simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model leverages information such as previously reported mutations to make the synthetic genomes biologically relevant. VarSim simulates and validates a wide range of variants, including single nucleotide variants, small indels and large structural variants. It is an automated, comprehensive compute framework supporting parallel computation and multiple read simulators. Furthermore, we developed a novel map data structure to validate read alignments, a strategy to compare variants binned in size ranges and a lightweight, interactive, graphical report to visualize validation results with detailed statistics. Thus far, it is the most comprehensive validation tool for secondary analysis in next generation sequencing.<\/jats:p>\n               <jats:p>Availability and implementation: Code in Java and Python along with instructions to download the reads and variants is at http:\/\/bioinform.github.io\/varsim.<\/jats:p>\n               <jats:p>Contact: \u00a0rd@bina.com<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu828","type":"journal-article","created":{"date-parts":[[2014,12,19]],"date-time":"2014-12-19T10:40:49Z","timestamp":1418985649000},"page":"1469-1471","source":"Crossref","is-referenced-by-count":73,"title":["VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications"],"prefix":"10.1093","volume":"31","author":[{"given":"John C.","family":"Mu","sequence":"first","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"},{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Marghoob","family":"Mohiyuddin","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jian","family":"Li","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Narges","family":"Bani Asadi","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mark B.","family":"Gerstein","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alexej","family":"Abyzov","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Wing H.","family":"Wong","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"},{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hugo Y.K.","family":"Lam","sequence":"additional","affiliation":[{"name":"1 Department of Electrical Engineering, Stanford University, Stanford, CA 94035, USA, 2Department of Bioinformatics, Bina Technologies, Redwood City, CA 94065, USA, 3Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA, 4Mayo Clinics, Department of Health Sciences Research, Rochester, MN 55902, USA, 5Department of Statistics, Stanford University, Stanford, CA 94035, USA and 6Department of Health Research and Policy, Stanford University, Stanford, CA 94035, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2014,12,17]]},"reference":[{"key":"2023051308502014100_btu828-B1","doi-asserted-by":"crossref","first-page":"974","DOI":"10.1101\/gr.114876.110","article-title":"Cnvnator: an approach to discover, genotype, and characterize typical and atypical cnvs from family and population genome sequencing","volume":"21","author":"Abyzov","year":"2011","journal-title":"Genome Res."},{"key":"2023051308502014100_btu828-B2","doi-asserted-by":"crossref","first-page":"1679","DOI":"10.1093\/bioinformatics\/btt198","article-title":"RSVSim: an R\/Bioconductor package for the simulation of structural variations","volume":"29","author":"Bartenhagen","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051308502014100_btu828-B3","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1038\/nmeth.1363","article-title":"BreakDancer: an algorithm for high-resolution mapping of genomic structural variation","volume":"6","author":"Chen","year":"2009","journal-title":"Nat. Methods"},{"key":"2023051308502014100_btu828-B4","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1038\/nbt.2514","article-title":"Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples","volume":"31","author":"Cibulskis","year":"2013","journal-title":"Nat. Biotechnol."},{"key":"2023051308502014100_btu828-B5","first-page":"D805","article-title":"COSMIC: exploring the world\u2019s knowledge of somatic mutations in human cancer","author":"Forbes","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023051308502014100_btu828-B6","article-title":"Haplotype-based variant detection from short-read sequencing","volume":"arXiv","author":"Garrison","year":"2012","journal-title":"arXiv preprint"},{"key":"2023051308502014100_btu828-B7","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"ART: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051308502014100_btu828-B8","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1038\/nbt.2134","article-title":"Detecting and annotating genetic variations using the HugeSeq pipeline","volume":"30","author":"Lam","year":"2012","journal-title":"Nat. Biotechnol."},{"key":"2023051308502014100_btu828-B9","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","volume":"arXiv","author":"Li","year":"2013","journal-title":"arXiv"},{"key":"2023051308502014100_btu828-B10","doi-asserted-by":"crossref","first-page":"D986","DOI":"10.1093\/nar\/gkt958","article-title":"The Database of Genomic Variants: a curated collection of structural variation in the human genome","volume":"42","author":"MacDonald","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"2023051308502014100_btu828-B11","doi-asserted-by":"crossref","first-page":"1297","DOI":"10.1101\/gr.107524.110","article-title":"The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data","volume":"20","author":"McKenna","year":"2010","journal-title":"Genome Res."},{"key":"2023051308502014100_btu828-B12","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nature09708","article-title":"Mapping copy number variation by population-scale genome sequencing","volume":"470","author":"Mills","year":"2011","journal-title":"Nature"},{"key":"2023051308502014100_btu828-B13","doi-asserted-by":"crossref","first-page":"522","DOI":"10.1038\/msb.2011.54","article-title":"AlleleSeq: analysis of allele-specific expression and binding in a network framework","volume":"7","author":"Rozowsky","year":"2011","journal-title":"Mol. Syst. Biol."},{"key":"2023051308502014100_btu828-B14","doi-asserted-by":"crossref","first-page":"2787","DOI":"10.1093\/bioinformatics\/btu345","article-title":"SMaSH: a benchmarking toolkit for human genome variant calling","volume":"30","author":"Talwalkar","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051308502014100_btu828-B15","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1093\/bioinformatics\/btp394","article-title":"Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads","volume":"25","author":"Ye","year":"2009","journal-title":"Bioinformatics"},{"key":"2023051308502014100_btu828-B16","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1038\/nbt.2835","article-title":"Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls","volume":"32","author":"Zook","year":"2014","journal-title":"Nat. Biotechnol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/9\/1469\/50306493\/bioinformatics_31_9_1469.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/9\/1469\/50306493\/bioinformatics_31_9_1469.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,13]],"date-time":"2023-05-13T08:51:39Z","timestamp":1683967899000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/9\/1469\/200131"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,12,17]]},"references-count":16,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2015,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu828","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,5,1]]},"published":{"date-parts":[[2014,12,17]]}}}