{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T20:25:03Z","timestamp":1772828703431,"version":"3.50.1"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"21","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Analysing next-generation sequencing (NGS) data to detect copy number variations (CNVs) is a relatively new and challenging field, with no accepted standard protocols or quality control measures so far. Several algorithms have now been developed for each of the four broad methods for CNV detection using NGS, namely the depth of coverage (DOC), read-pair, split-read and assembly-based methods. However, because of the complexity of the genome and the short read lengths from NGS technology, there are still many challenges associated with the analysis of NGS data for CNVs, no matter which method or algorithm is used.<\/jats:p>\n               <jats:p>Results: In this review, we describe and discuss areas of potential bias in CNV detection for each of the four methods. In particular, we focus on issues pertaining to (i) mappability, (ii) GC-content bias, (iii) quality control measures of reads and (iv) difficulty in identifying duplications. To gain insight into some of the issues discussed, we also download real data from the 1000 Genomes Project and analyse its DOC data. 
We show examples of how reads in repeated regions can affect CNV detection, demonstrate current GC-correction algorithms, investigate sensitivity of DOC algorithm before and after quality control of reads and discuss reasons for which duplications are harder to detect than deletions.<\/jats:p>\n               <jats:p>Contact: g0801862@nus.edu.sg or agus_salim@nuhs.edu.sg<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts535","type":"journal-article","created":{"date-parts":[[2012,9,1]],"date-time":"2012-09-01T20:37:41Z","timestamp":1346531861000},"page":"2711-2718","source":"Crossref","is-referenced-by-count":202,"title":["Statistical challenges associated with detecting copy number variations with next-generation sequencing"],"prefix":"10.1093","volume":"28","author":[{"given":"Shu Mei","family":"Teo","sequence":"first","affiliation":[{"name":"1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, 2NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore 117456 and 3Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden"}]},
{"given":"Yudi","family":"Pawitan","sequence":"additional","affiliation":[{"name":"1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, 2NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore 117456 and 3Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden"}]},{"given":"Chee Seng","family":"Ku","sequence":"additional","affiliation":[{"name":"1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, 2NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore 117456 and 3Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden"}]},{"given":"Kee Seng","family":"Chia","sequence":"additional","affiliation":[{"name":"1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, 2NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore 117456 and 3Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden"}]},{"given":"Agus","family":"Salim","sequence":"additional","affiliation":[{"name":"1 Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, 2NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore 117456 and 3Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, 
Sweden"}]}],"member":"286","published-online":{"date-parts":[[2012,8,31]]},"reference":[{"key":"2023012513151321300_bts535-B1","doi-asserted-by":"crossref","first-page":"974","DOI":"10.1101\/gr.114876.110","article-title":"CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing","volume":"21","author":"Abyzov","year":"2011","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B2","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1038\/nrg2958","article-title":"Genome structural variation discovery and genotyping","volume":"12","author":"Alkan","year":"2011","journal-title":"Nat. Rev. Genet."},{"key":"2023012513151321300_bts535-B3","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/ng.437","article-title":"Personalized copy number and segmental duplication maps using next-generation sequencing","volume":"41","author":"Alkan","year":"2009","journal-title":"Nat. Genet."},{"key":"2023012513151321300_bts535-B4","doi-asserted-by":"crossref","first-page":"R18","DOI":"10.1186\/gb-2011-12-2-r18","article-title":"Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries","volume":"12","author":"Aird","year":"2011","journal-title":"Genome Biol."},{"key":"2023012513151321300_bts535-B5","doi-asserted-by":"crossref","first-page":"e72","DOI":"10.1093\/nar\/gks001","article-title":"Summarizing and correction for GC-content bias in high throughput sequencing","volume":"40","author":"Benjamini","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012513151321300_bts535-B6","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1093\/bioinformatics\/btr670","article-title":"Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing 
data","volume":"28","author":"Boeva","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012513151321300_bts535-B7","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1038\/nmeth.1363","article-title":"BreakDancer: an algorithm for high resolution mapping of genomic structural variation","volume":"6","author":"Chen","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012513151321300_bts535-B8","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1038\/nmeth.1276","article-title":"High-resolution mapping of copy-number alterations with massively parallel sequencing","volume":"6","author":"Chiang","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012513151321300_bts535-B9","doi-asserted-by":"crossref","first-page":"1001","DOI":"10.1038\/nbt1109-1001","article-title":"Mapping duplicated sequences","volume":"27","author":"Chiang","year":"2009","journal-title":"Nat. Biotechnol."},{"key":"2023012513151321300_bts535-B10","doi-asserted-by":"crossref","first-page":"704","DOI":"10.1038\/nature08516","article-title":"Origins and functional impact of copy number variation in the human genome","volume":"464","author":"Conrad","year":"2010","journal-title":"Nature"},{"key":"2023012513151321300_bts535-B12","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/gm132","article-title":"Inversion variants in the human genome: role in disease and genome architecture","volume":"2","author":"Feuk","year":"2010","journal-title":"Genome Med."},{"key":"2023012513151321300_bts535-B13","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1038\/ng.768","article-title":"Discovery and genotyping of genome structural polymorphism by sequencing on population scale","volume":"43","author":"Handsaker","year":"2011","journal-title":"Nat. 
Genet."},{"key":"2023012513151321300_bts535-B14","doi-asserted-by":"crossref","first-page":"R32","DOI":"10.1186\/gb-2009-10-3-r32","article-title":"Evaluation of next generation sequencing platforms for population targeted sequencing studies","volume":"10","author":"Harismendy","year":"2009","journal-title":"Genome Biol."},{"key":"2023012513151321300_bts535-B15","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1093\/bioinformatics\/btr169","article-title":"Efficient algorithms for tandem copy number variation reconstruction in repeat-rich regions","volume":"27","author":"He","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012513151321300_bts535-B16","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1101\/gr.088633.108","article-title":"Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes","volume":"19","author":"Hormozdiari","year":"2009","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B17","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1038\/ng.1028","article-title":"De novo assembly and genotyping of variants using colored de Bruijn graphs","volume":"44","author":"Iqbal","year":"2011","journal-title":"Nat. 
Genet."},{"key":"2023012513151321300_bts535-B18","doi-asserted-by":"crossref","first-page":"R23","DOI":"10.1186\/gb-2009-10-2-r23","article-title":"PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data","volume":"10","author":"Korbel","year":"2009","journal-title":"Genome Biol."},{"key":"2023012513151321300_bts535-B19","doi-asserted-by":"crossref","first-page":"263","DOI":"10.3390\/genes1020263","article-title":"A computer simulator for assessing different challenges and strategies of de novo sequence assembly","volume":"1","author":"Knudsen","year":"2010","journal-title":"Genes"},{"key":"2023012513151321300_bts535-B20","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B21","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1101\/gr.097261.109","article-title":"De novo assembly of human genomes with massively parallel short read sequencing","volume":"20","author":"Li","year":"2010","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B22","doi-asserted-by":"crossref","first-page":"1166","DOI":"10.1038\/ng.238","article-title":"Integrated detection and population-genetic analysis of SNPs and copy number variation","volume":"40","author":"McCarroll","year":"2008","journal-title":"Nat. 
Genet."},{"key":"2023012513151321300_bts535-B23","doi-asserted-by":"crossref","first-page":"1527","DOI":"10.1101\/gr.091868.109","article-title":"Sequence and structural variation in a human genome uncovered by massively parallel ligation sequencing using two-base encoding","volume":"19","author":"McKernan","year":"2009","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B24","doi-asserted-by":"crossref","first-page":"S13","DOI":"10.1038\/nmeth.1374","article-title":"Computational methods for discovering structural variation with next-generation sequencing","volume":"6","author":"Medvedev","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012513151321300_bts535-B25","doi-asserted-by":"crossref","first-page":"1613","DOI":"10.1101\/gr.106344.110","article-title":"Detecting copy number variation with mated short reads","volume":"20","author":"Medvedev","year":"2010","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B26","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1038\/nrg2626","article-title":"Sequencing technologies\u2014the next generation","volume":"11","author":"Metzker","year":"2010","journal-title":"Nat. Rev. 
Genet."},{"key":"2023012513151321300_bts535-B27","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nature09708","article-title":"Mapping copy number variation by population-scale genome sequencing","volume":"470","author":"Mills","year":"2011","journal-title":"Nature"},{"key":"2023012513151321300_bts535-B28","doi-asserted-by":"crossref","first-page":"R52","DOI":"10.1186\/gb-2010-11-5-r52","article-title":"Towards a comprehensive structural variation map of an individual human genome","volume":"11","author":"Pang","year":"2010","journal-title":"Genome Biol."},{"key":"2023012513151321300_bts535-B29","doi-asserted-by":"crossref","first-page":"9748","DOI":"10.1073\/pnas.171285098","article-title":"An Eulerian path approach to DNA fragment assembly","volume":"98","author":"Pevzner","year":"2001","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012513151321300_bts535-B30","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1093\/bib\/5.3.237","article-title":"Comparative genome assembly","volume":"5","author":"Pop","year":"2004","journal-title":"Brief. Bioinformatics"},{"key":"2023012513151321300_bts535-B31","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1101\/gr.102970.109","article-title":"Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome","volume":"20","author":"Quinlan","year":"2010","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B32","doi-asserted-by":"crossref","first-page":"2648","DOI":"10.1093\/bioinformatics\/btr462","article-title":"Exome sequencing-based copy number variation and loss of heterozygosity detection: ExomeCNV","volume":"27","author":"Sathirapongsasuti","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012513151321300_bts535-B33","doi-asserted-by":"crossref","first-page":"R227","DOI":"10.1093\/hmg\/ddq416","article-title":"A window into third-generation sequencing","volume":"19","author":"Schadt","year":"2010","journal-title":"Hum. Mol. 
Genet."},{"key":"2023012513151321300_bts535-B34","doi-asserted-by":"crossref","first-page":"1165","DOI":"10.1101\/gr.101360.109","article-title":"Assembly of large genomes using second generation sequencing","volume":"20","author":"Schatz","year":"2010","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B35","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1214\/11-AOAS517","article-title":"Change-point model on nonhomogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing","volume":"6","author":"Shen","year":"2012","journal-title":"Ann. Appl. Stat."},{"key":"2023012513151321300_bts535-B36","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1101\/gr.089532.108","article-title":"ABySS: a parallel assembler for short read sequence data","volume":"19","author":"Simpson","year":"2009","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B37","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1126\/science.1197005","article-title":"Diversity of human copy number variation and multicopy genes","volume":"330","author":"Sudmant","year":"2010","journal-title":"Science"},{"key":"2023012513151321300_bts535-B38","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"The 1000 Genomes Project Consortium","year":"2010","journal-title":"Nature"},{"key":"2023012513151321300_bts535-B39","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1038\/nrg3117","article-title":"Repetitive DNA and next-generation sequencing: computational challenges and solutions","volume":"13","author":"Treangen","year":"2012","journal-title":"Nat. Rev. 
Genet."},{"key":"2023012513151321300_bts535-B40","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1101\/gr.6861907","article-title":"PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data","volume":"17","author":"Wang","year":"2007","journal-title":"Genome Res."},{"key":"2023012513151321300_bts535-B41","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1038\/nature07484","article-title":"The diploid genome sequence of an Asian individual","volume":"456","author":"Wang","year":"2008","journal-title":"Nature"},{"key":"2023012513151321300_bts535-B42","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1093\/bfgp\/elq025","article-title":"Detecting structural variations in the human genome using next generation sequencing","volume":"9","author":"Xi","year":"2011","journal-title":"Brief. Funct. Genomics"},{"key":"2023012513151321300_bts535-B43","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1186\/1471-2105-10-80","article-title":"CNV-seq, a new method to detect copy number variation using high-throughput sequencing","volume":"10","author":"Xie","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012513151321300_bts535-B44","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1093\/bioinformatics\/btp394","article-title":"Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads","volume":"25","author":"Ye","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012513151321300_bts535-B45","doi-asserted-by":"crossref","first-page":"1586","DOI":"10.1101\/gr.092981.109","article-title":"Sensitive and accurate detection of copy number variants using read depth of coverage","volume":"19","author":"Yoon","year":"2009","journal-title":"Genome 
Res."},{"key":"2023012513151321300_bts535-B46","doi-asserted-by":"crossref","first-page":"1895","DOI":"10.1093\/bioinformatics\/btq293","article-title":"SVDetect: a tool to identify genomic structural variations from paired-end and mate-pair sequencing data","volume":"26","author":"Zeitouni","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012513151321300_bts535-B47","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1101\/gr.074492.107","article-title":"Velvet: algorithms for de novo short read assembly using de Bruijn graphs","volume":"18","author":"Zerbino","year":"2008","journal-title":"Genome Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/21\/2711\/48874134\/bioinformatics_28_21_2711.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/21\/2711\/48874134\/bioinformatics_28_21_2711.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T19:18:48Z","timestamp":1674674328000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/21\/2711\/237315"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,8,31]]},"references-count":46,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2012,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts535","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,11,1]]},"published":{"date-parts":[[2012,8,31]]}}}