{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,20]],"date-time":"2026-05-20T03:41:11Z","timestamp":1779248471777,"version":"3.51.4"},"reference-count":50,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2021,1,12]],"date-time":"2021-01-12T00:00:00Z","timestamp":1610409600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100006458","name":"Children's Hospital of Philadelphia","doi-asserted-by":"publisher","award":["U01-HG006830"],"award-info":[{"award-number":["U01-HG006830"]}],"id":[{"id":"10.13039\/100006458","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005049","name":"Science and Engineering Research Council","doi-asserted-by":"publisher","award":["CIE170034"],"award-info":[{"award-number":["CIE170034"]}],"id":[{"id":"10.13039\/501100005049","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100008982","name":"National Science Foundation","doi-asserted-by":"publisher","award":["ACI-1548562"],"award-info":[{"award-number":["ACI-1548562"]}],"id":[{"id":"10.13039\/501100008982","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Copy number variations (CNVs) are an important class of variations contributing to the pathogenesis of many disease phenotypes. Detecting CNVs from genomic data remains difficult, and the most currently applied methods suffer from an unacceptably high false positive rate. A common practice is to have human experts manually review original CNV calls for filtering false positives before further downstream analysis or experimental validation. 
Here, we propose DeepCNV, a deep learning-based tool, intended to replace human experts when validating CNV calls, focusing on the calls made by one of the most accurate CNV callers, PennCNV. The sophistication of the deep neural network algorithm is enriched with over 10\u00a0000 expert-scored samples that are split into training and testing sets. Variant confidence, especially for CNVs, is a main roadblock impeding the progress of linking CNVs with the disease. We show that DeepCNV adds to the confidence of the CNV calls with an optimal area under the receiver operating characteristic curve of 0.909, exceeding other machine learning methods. The superiority of DeepCNV was also benchmarked and confirmed using an experimental wet-lab validation dataset. We conclude that the improvement obtained by DeepCNV results in significantly fewer false positive results and failures to replicate the CNV association results.<\/jats:p>","DOI":"10.1093\/bib\/bbaa381","type":"journal-article","created":{"date-parts":[[2021,1,7]],"date-time":"2021-01-07T21:35:33Z","timestamp":1610055333000},"source":"Crossref","is-referenced-by-count":22,"title":["DeepCNV: a deep learning approach for authenticating copy number variations"],"prefix":"10.1093","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5131-2811","authenticated-orcid":false,"given":"Joseph T","family":"Glessner","sequence":"first","affiliation":[{"name":"Center for Applied Genomics, Department of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA"},{"name":"Perelman School of Medicine, Department of Pediatrics, University of Pennsylvania, Philadelphia, PA 19102, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiurui","family":"Hou","sequence":"additional","affiliation":[{"name":"Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cheng","family":"Zhong","sequence":"additional","affiliation":[{"name":"Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jie","family":"Zhang","sequence":"additional","affiliation":[{"name":"Adobe Inc., San Jose, CA 95110, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Munir","family":"Khan","sequence":"additional","affiliation":[{"name":"Center for Applied Genomics, Department of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA"},{"name":"Perelman School of Medicine, Department of Pediatrics, University of Pennsylvania, Philadelphia, PA 19102, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fabian","family":"Brand","sequence":"additional","affiliation":[{"name":"University of Bonn, 53113 Bonn, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Krawitz","sequence":"additional","affiliation":[{"name":"University of Bonn, 53113 Bonn, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patrick M A","family":"Sleiman","sequence":"additional","affiliation":[{"name":"Perelman School of Medicine, Department of Pediatrics, University of Pennsylvania, Philadelphia, PA 19102, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hakon","family":"Hakonarson","sequence":"additional","affiliation":[{"name":"Perelman School of Medicine, Department of Pediatrics, University of Pennsylvania, Philadelphia, PA 19102, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhi","family":"Wei","sequence":"additional","affiliation":[{"name":"Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2021,1,12]]},"reference":[{"key":"2021120509200416500_ref1","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1038\/nature07239","article-title":"Rare chromosomal deletions and duplications increase risk of schizophrenia","volume":"455","author":"Consortium","year":"2008","journal-title":"Nature"},{"key":"2021120509200416500_ref2","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1016\/j.ajhg.2008.10.006","article-title":"Genome-wide copy-number-variation study identified a susceptibility gene, UGT2B17, for osteoporosis","volume":"83","author":"Yang","year":"2008","journal-title":"Am J Hum Genet"},{"key":"2021120509200416500_ref3","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1038\/nbt.1852","article-title":"Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants","volume":"29","author":"Pinto","year":"2011","journal-title":"Nat Biotechnol"},{"key":"2021120509200416500_ref4","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1186\/1471-2164-10-588","article-title":"The pitfalls of platform comparison: DNA copy number array technologies assessed","volume":"10","author":"Curtis","year":"2009","journal-title":"BMC Genom"},{"key":"2021120509200416500_ref5","first-page":"135","article-title":"Comparison of comparative genomic hybridization technologies across microarray platforms","volume":"20","author":"Hester","year":"2009","journal-title":"J Biomol Tech"},{"key":"2021120509200416500_ref6","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1159\/000095923","article-title":"Array-based comparative genomic hybridization and copy number variation in cancer research","volume":"115","author":"Cho","year":"2006","journal-title":"Cytogenet Genome 
Res"},{"key":"2021120509200416500_ref7","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1186\/1479-7364-2-6-403","article-title":"Strategies for the detection of copy number and other structural variants in the human genome","volume":"2","author":"Carson","year":"2006","journal-title":"Hum Genomics"},{"key":"2021120509200416500_ref8","doi-asserted-by":"crossref","first-page":"R52","DOI":"10.1186\/gb-2010-11-5-r52","article-title":"Towards a comprehensive structural variation map of an individual human genome","volume":"11","author":"Pang","year":"2010","journal-title":"Genome Biol"},{"key":"2021120509200416500_ref9","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1016\/j.ajhg.2010.04.006","article-title":"Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies","volume":"86","author":"Miller","year":"2010","journal-title":"Am J Hum Genet"},{"key":"2021120509200416500_ref10","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1101\/gr.6861907","article-title":"PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data","volume":"17","author":"Wang","year":"2007","journal-title":"Genome Res"},{"key":"2021120509200416500_ref11","doi-asserted-by":"crossref","first-page":"2013","DOI":"10.1093\/nar\/gkm076","article-title":"QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data","volume":"35","author":"Colella","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2021120509200416500_ref12","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1093\/bfgp\/elp017","article-title":"Comparing CNV detection methods for SNP arrays","volume":"8","author":"Winchester","year":"2009","journal-title":"Brief Funct Genomic 
Proteomic"},{"key":"2021120509200416500_ref13","doi-asserted-by":"crossref","first-page":"1253","DOI":"10.1038\/ng.237","article-title":"Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs","volume":"40","author":"Korn","year":"2008","journal-title":"Nat Genet"},{"key":"2021120509200416500_ref14","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1093\/bioinformatics\/btm601","article-title":"Sparse representation and Bayesian detection of genome copy number alterations from microarray data","volume":"24","author":"Pique-Regi","year":"2008","journal-title":"Bioinformatics"},{"key":"2021120509200416500_ref15","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature06862","article-title":"Mapping and sequencing of structural variation from eight human genomes","volume":"453","author":"Kidd","year":"2008","journal-title":"Nature"},{"key":"2021120509200416500_ref16","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1038\/ng1562","article-title":"Fine-scale structural variation of the human genome","volume":"37","author":"Tuzun","year":"2005","journal-title":"Nat Genet"},{"key":"2021120509200416500_ref17","doi-asserted-by":"crossref","first-page":"S30","DOI":"10.1038\/ng2042","article-title":"The population genetics of structural variation","volume":"39","author":"Conrad","year":"2007","journal-title":"Nat Genet"},{"key":"2021120509200416500_ref18","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1101\/gr.3677206","article-title":"Copy number variation: new insights in genome diversity","volume":"16","author":"Freeman","year":"2006","journal-title":"Genome Res"},{"key":"2021120509200416500_ref19","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1038\/ng1416","article-title":"Detection of large-scale variation in the human genome","volume":"36","author":"Iafrate","year":"2004","journal-title":"Nat 
Genet"},{"key":"2021120509200416500_ref20","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1038\/ng1307","article-title":"A tiling resolution DNA microarray with complete coverage of the human genome","volume":"36","author":"Ishkanian","year":"2004","journal-title":"Nat Genet"},{"key":"2021120509200416500_ref21","doi-asserted-by":"crossref","first-page":"S7","DOI":"10.1038\/ng2093","article-title":"Challenges and standards in integrating surveys of structural variation","volume":"39","author":"Scherer","year":"2007","journal-title":"Nat Genet"},{"key":"2021120509200416500_ref22","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1086\/510560","article-title":"A comprehensive analysis of common copy-number variations in the human genome","volume":"80","author":"Wong","year":"2007","journal-title":"Am J Hum Genet"},{"key":"2021120509200416500_ref23","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/nature24270","article-title":"Mastering the game of go without human knowledge","volume":"550","author":"Silver","year":"2017","journal-title":"Nature"},{"key":"2021120509200416500_ref24","doi-asserted-by":"crossref","DOI":"10.15252\/msb.20156651","article-title":"Deep learning for computational biology","volume":"12","author":"Angermueller","year":"2016","journal-title":"Mol Syst Biol"},{"key":"2021120509200416500_ref25","doi-asserted-by":"crossref","first-page":"24340","DOI":"10.1109\/ACCESS.2018.2825996","article-title":"DeepPolyA: a convolutional neural network approach for polyadenylation site prediction","volume":"6","author":"Gao","year":"2018","journal-title":"IEEE Access"},{"key":"2021120509200416500_ref26","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1159\/000493215","article-title":"tRNA-DL: a deep learning approach to improve tRNAscan-SE prediction results","volume":"83","author":"Gao","year":"2018","journal-title":"Hum 
Hered"},{"key":"2021120509200416500_ref27","doi-asserted-by":"crossref","first-page":"1201","DOI":"10.3174\/ajnr.A5667","article-title":"Deep-learning convolutional neural networks accurately classify genetic mutations in gliomas","volume":"39","author":"Chang","year":"2018","journal-title":"Am J Neuroradiol"},{"key":"2021120509200416500_ref28","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1038\/s42256-019-0037-0","article-title":"Clustering single-cell RNA-seq data with a model-based deep learning approach","volume":"1","author":"Tian","year":"2019","journal-title":"Nat Mach Intell"},{"key":"2021120509200416500_ref29","doi-asserted-by":"crossref","first-page":"955","DOI":"10.1093\/nar\/25.5.955","article-title":"tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence","volume":"25","author":"Lowe","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2021120509200416500_ref30","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1038\/nature15394","article-title":"An integrated map of structural variation in 2,504 human genomes","volume":"526","author":"Sudmant","year":"2015","journal-title":"Nature"},{"key":"2021120509200416500_ref31","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1186\/s12859-017-1802-x","article-title":"PennCNV in whole-genome sequencing data","volume":"18","author":"Araujo Lima","year":"2017","journal-title":"BMC Bioinform"},{"key":"2021120509200416500_ref32","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2021120509200416500_ref33","first-page":"1097","article-title":"Advances in neural information processing systems","volume-title":"Advances in Neural Information Processing Systems","author":"Krizhevsky","year":"2012"},{"key":"2021120509200416500_ref34","first-page":"2818","article-title":"Rethinking the inception architecture for computer 
vision","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Szegedy","year":"2016"},{"key":"2021120509200416500_ref35","doi-asserted-by":"crossref","DOI":"10.3115\/v1\/D14-1181","article-title":"Convolutional neural networks for sentence classification","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing","author":"Kim","year":"2014"},{"key":"2021120509200416500_ref36","article-title":"Rectifier nonlinearities improve neural network acoustic models, in ICML Workshop on Deep Learning for Audio, Speech and Language Processing","author":"Maas","year":"2013"},{"key":"2021120509200416500_ref37","first-page":"807","article-title":"Rectified linear units improve restricted boltzmann machines","volume-title":"Proceedings of the 27th International Conference on Machine Learning (ICML-10)","author":"Nair","year":"2010"},{"key":"2021120509200416500_ref38","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J Mach Learn Res"},{"key":"2021120509200416500_ref39","article-title":"Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude","volume-title":"COURSERA: Neural Networks for Machine Learning","author":"Tieleman","year":"2012"},{"key":"2021120509200416500_ref40","first-page":"618","article-title":"Grad-cam: Visual explanations from deep networks via gradient-based localization","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Selvaraju","year":"2017"},{"key":"2021120509200416500_ref41","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1023\/A:1018628609742","article-title":"Least squares support vector machine classifiers","volume":"9","author":"Suykens","year":"1999","journal-title":"Neural Proc 
Letters"},{"key":"2021120509200416500_ref42","first-page":"18","article-title":"Classification and regression by randomForest","volume":"2","author":"Liaw","year":"2002","journal-title":"R News"},{"key":"2021120509200416500_ref43","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J Mach Learn Res"},{"key":"2021120509200416500_ref44","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2021120509200416500_ref45","article-title":"Very Deep Convolutional Networks for Large-Scale Image Recognition","volume-title":"International Conference on Learning Representations","author":"Simonyan","year":"2015"},{"key":"2021120509200416500_ref46","doi-asserted-by":"crossref","first-page":"770","DOI":"10.1109\/CVPR.2016.90","article-title":"Deep residual learning for image recognition","volume-title":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"He","year":"2016"},{"key":"2021120509200416500_ref47","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int J Comp Vision"},{"key":"2021120509200416500_ref48","article-title":"Deep learning for drug response prediction in cancer","year":"2020","journal-title":"Briefings in Bioinformatics"},{"key":"2021120509200416500_ref49","article-title":"Genome-wide discovery of pre-miRNAs: comparison of recent approaches based on machine learning","year":"2020","journal-title":"Briefings in Bioinformatics"},{"key":"2021120509200416500_ref50","doi-asserted-by":"crossref","first-page":"2066","DOI":"10.1093\/bib\/bbz144","article-title":"Deep learning of pharmacogenomics resources: moving towards precision 
oncology","volume":"21","year":"2020","journal-title":"Briefings in bioinformatics"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/22\/5\/bbaa381\/41508916\/bbaa381.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/22\/5\/bbaa381\/41508916\/bbaa381.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,5]],"date-time":"2021-12-05T09:21:07Z","timestamp":1638696067000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbaa381\/6082822"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,12]]},"references-count":50,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2021,9,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbaa381","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,9]]},"published":{"date-parts":[[2021,1,12]]},"article-number":"bbaa381"}}