{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T09:16:55Z","timestamp":1775985415910,"version":"3.50.1"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"19","license":[{"start":{"date-parts":[[2019,3,2]],"date-time":"2019-03-02T00:00:00Z","timestamp":1551484800000},"content-version":"vor","delay-in-days":1,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","award":["328154-2014"],"award-info":[{"award-number":["328154-2014"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"NSERC","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Chromatin Immunoprecipitation (ChIP)-seq is used extensively to identify sites of transcription factor binding or regions of epigenetic modifications to the genome. A key step in ChIP-seq analysis is peak calling, where genomic regions enriched for ChIP versus control reads are identified. Many programs have been designed to solve this task, but nearly all fall into the statistical trap of using the data twice\u2014once to determine candidate enriched regions, and again to assess enrichment by classical statistical hypothesis testing. 
This double use of the data invalidates the statistical significance assigned to enriched regions, thus the true significance or reliability of peak calls remains unknown.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Using simulated and real ChIP-seq data, we show that three well-known peak callers, MACS, SICER and diffReps, output biased P-values and false discovery rate estimates that can be many orders of magnitude too optimistic. We propose a wrapper algorithm, RECAP, that uses resampling of ChIP-seq and control data to estimate a monotone transform correcting for biases built into peak calling algorithms. When applied to null hypothesis data, where there is no enrichment between ChIP-seq and control, P-values recalibrated by RECAP are approximately uniformly distributed. On data where there is genuine enrichment, RECAP P-values give a better estimate of the true statistical significance of candidate peaks and better false discovery rate estimates, which correlate better with empirical reproducibility. 
RECAP is a powerful new tool for assessing the true statistical significance of ChIP-seq peak calls.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The RECAP software is available through www.perkinslab.ca or on GitHub at https:\/\/github.com\/theodorejperkins\/RECAP.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz150","type":"journal-article","created":{"date-parts":[[2019,2,27]],"date-time":"2019-02-27T16:45:17Z","timestamp":1551285917000},"page":"3592-3598","source":"Crossref","is-referenced-by-count":7,"title":["RECAP reveals the true statistical significance of ChIP-seq peak calls"],"prefix":"10.1093","volume":"35","author":[{"given":"Justin G","family":"Chitpin","sequence":"first","affiliation":[{"name":"Translational and Molecular Medicine Program, University of Ottawa , Ottawa, ON K1H8M5, Canada"},{"name":"Regenerative Medicine Program, Ottawa Hospital Research Institute , Ottawa, ON K1H8L6, Canada"}]},{"given":"Aseel","family":"Awdeh","sequence":"additional","affiliation":[{"name":"Regenerative Medicine Program, Ottawa Hospital Research Institute , Ottawa, ON K1H8L6, Canada"},{"name":"School of Electrical Engineering and Computer Science, University of Ottawa , Ottawa, ON K1N6N5, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6622-8003","authenticated-orcid":false,"given":"Theodore J","family":"Perkins","sequence":"additional","affiliation":[{"name":"Regenerative Medicine Program, Ottawa Hospital Research Institute , Ottawa, ON K1H8L6, Canada"},{"name":"School of Electrical Engineering and Computer Science, University of Ottawa , Ottawa, ON K1N6N5, 
Canada"},{"name":"Department of Biochemistry, Microbiology and Immunology, University of Ottawa , Ottawa, ON K1H8M5, Canada"}]}],"member":"286","published-online":{"date-parts":[[2019,3,1]]},"reference":[{"key":"2023013108150913900_btz150-B1","doi-asserted-by":"crossref","first-page":"2705","DOI":"10.1093\/bioinformatics\/btt470","article-title":"Identification of transcription factor binding sites from ChIP-seq data at high resolution","volume":"29","author":"Bardet","year":"2013","journal-title":"Bioinformatics"},{"key":"2023013108150913900_btz150-B2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Series B"},{"key":"2023013108150913900_btz150-B3","doi-asserted-by":"crossref","first-page":"2537","DOI":"10.1093\/bioinformatics\/btn480","article-title":"F-Seq: a feature density estimator for high-throughput sequence tags","volume":"24","author":"Boyle","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013108150913900_btz150-B4","doi-asserted-by":"crossref","first-page":"371.","DOI":"10.1038\/nature13985","article-title":"Principles of regulatory information conservation between mouse and human","volume":"515","author":"Cheng","year":"2014","journal-title":"Nature"},{"key":"2023013108150913900_btz150-B5","doi-asserted-by":"crossref","first-page":"57.","DOI":"10.1038\/nature11247","article-title":"An integrated encyclopedia of DNA elements in the human genome","volume":"489","author":"Consortium","year":"2012","journal-title":"Nature"},{"key":"2023013108150913900_btz150-B6","doi-asserted-by":"crossref","first-page":"1351","DOI":"10.1214\/009053606000001460","article-title":"Size, power and false discovery rates","volume":"35","author":"Efron","year":"2007","journal-title":"Ann. 
Stat"},{"key":"2023013108150913900_btz150-B7","doi-asserted-by":"crossref","first-page":"1729","DOI":"10.1093\/bioinformatics\/btn305","article-title":"FindPeaks 3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology","volume":"24","author":"Fejes","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013108150913900_btz150-B8","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1002\/0471250953.bi0214s34","article-title":"Using MACS to identify peaks from chip-seq data","volume":"34","author":"Feng","year":"2011","journal-title":"Curr. Protoc. Bioinformatics"},{"key":"2023013108150913900_btz150-B9","doi-asserted-by":"crossref","first-page":"1728","DOI":"10.1038\/nprot.2012.101","article-title":"Identifying ChIP-seq enrichment using MACS","volume":"7","author":"Feng","year":"2012","journal-title":"Nat. Protoc"},{"key":"2023013108150913900_btz150-B10","doi-asserted-by":"crossref","first-page":"139.","DOI":"10.1186\/1471-2105-12-139","article-title":"PeakRanger: a cloud-enabled peak caller for ChIP-seq data","volume":"12","author":"Feng","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023013108150913900_btz150-B11","doi-asserted-by":"crossref","first-page":"840","DOI":"10.1038\/nrg3306","article-title":"ChIP\u2013seq and beyond: new and improved methodologies to detect and characterize protein\u2013DNA interactions","volume":"13","author":"Furey","year":"2012","journal-title":"Nat. Rev. 
Genet"},{"key":"2023013108150913900_btz150-B12","doi-asserted-by":"crossref","first-page":"91.","DOI":"10.1038\/nature11245","article-title":"Architecture of the human regulatory network derived from ENCODE data","volume":"489","author":"Gerstein","year":"2012","journal-title":"Nature"},{"key":"2023013108150913900_btz150-B13","doi-asserted-by":"crossref","first-page":"e197","DOI":"10.1093\/nar\/gkt831","article-title":"A general approach for discriminative de novo motif discovery from high-throughput data","volume":"41","author":"Grau","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023013108150913900_btz150-B14","doi-asserted-by":"crossref","first-page":"e27","DOI":"10.1093\/nar\/gku1280","article-title":"Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory landscape","volume":"43","author":"Griffon","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023013108150913900_btz150-B15","first-page":"191","author":"Hiranuma","year":"2016"},{"key":"2023013108150913900_btz150-B16","first-page":"278762.","article-title":"AIControl: replacing matched control experiments with machine learning improves ChIP-seq peak identification","author":"Hiranuma","year":"2018","journal-title":"bioRxiv"},{"key":"2023013108150913900_btz150-B17","doi-asserted-by":"crossref","first-page":"2622","DOI":"10.1093\/bioinformatics\/btq488","article-title":"Deep and wide digging for binding motifs in ChIP-seq data","volume":"26","author":"Kulakovskiy","year":"2010","journal-title":"Bioinformatics"},{"key":"2023013108150913900_btz150-B18","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1038\/nature14248","article-title":"Integrative analysis of 111 reference human epigenomes","volume":"518","author":"Kundaje","year":"2015","journal-title":"Nature"},{"key":"2023013108150913900_btz150-B19","doi-asserted-by":"crossref","first-page":"1813","DOI":"10.1101\/gr.136184.111","article-title":"ChIP-seq guidelines and practices of the ENCODE 
and modENCODE consortia","volume":"22","author":"Landt","year":"2012","journal-title":"Genome Res"},{"key":"2023013108150913900_btz150-B20","doi-asserted-by":"crossref","first-page":"1752","DOI":"10.1214\/11-AOAS466","article-title":"Measuring reproducibility of high-throughput experiments","volume":"5","author":"Li","year":"2011","journal-title":"Ann. Appl. Stat"},{"key":"2023013108150913900_btz150-B21","doi-asserted-by":"crossref","first-page":"e95","DOI":"10.1093\/nar\/gku351","article-title":"De novo detection of differentially bound regions for ChIP-seq data using peaks and windows: controlling error rates correctly","volume":"42","author":"Lun","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023013108150913900_btz150-B22","doi-asserted-by":"crossref","first-page":"D142","DOI":"10.1093\/nar\/gkt997","article-title":"JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles","volume":"42","author":"Mathelier","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023013108150913900_btz150-B23","author":"Ramachandran","year":"2013"},{"key":"2023013108150913900_btz150-B24","doi-asserted-by":"crossref","first-page":"33.","DOI":"10.1186\/s13072-015-0028-2","article-title":"BIDCHIPS: bias decomposition and removal from ChIP-seq data clarifies true binding signal and its functional correlates","volume":"8","author":"Ramachandran","year":"2015","journal-title":"Epigenetics Chromatin"},{"key":"2023013108150913900_btz150-B25","doi-asserted-by":"crossref","first-page":"R67.","DOI":"10.1186\/gb-2011-12-7-r67","article-title":"ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions","volume":"12","author":"Rashid","year":"2011","journal-title":"Genome Biol"},{"key":"2023013108150913900_btz150-B26","doi-asserted-by":"crossref","first-page":"1787","DOI":"10.1126\/science.1198374","article-title":"Identification of 
functional elements and regulatory circuits by Drosophila modENCODE","volume":"330","author":"Roy","year":"2010","journal-title":"Science"},{"key":"2023013108150913900_btz150-B27","doi-asserted-by":"crossref","first-page":"e65598.","DOI":"10.1371\/journal.pone.0065598","article-title":"diffReps: detecting differential chromatin modification sites from ChIP-seq data with biological replicates","volume":"8","author":"Shen","year":"2013","journal-title":"PLoS One"},{"key":"2023013108150913900_btz150-B28","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1111\/joim.12231","article-title":"Epigenetics, chromatin and genome organization: recent advances from the ENCODE project","volume":"276","author":"Siggens","year":"2014","journal-title":"J. Internal Med"},{"key":"2023013108150913900_btz150-B29","doi-asserted-by":"crossref","first-page":"299.","DOI":"10.1186\/1471-2105-10-299","article-title":"BayesPeak: bayesian analysis of ChIP-seq data","volume":"10","author":"Spyrou","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023013108150913900_btz150-B30","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/j.cell.2016.11.007","article-title":"The international human epigenome consortium: a blueprint for scientific collaboration and discovery","volume":"167","author":"Stunnenberg","year":"2016","journal-title":"Cell"},{"key":"2023013108150913900_btz150-B31","doi-asserted-by":"crossref","first-page":"e113","DOI":"10.1093\/nar\/gkp536","article-title":"Extracting transcription factor targets from ChIP-seq data","volume":"37","author":"Tuteja","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023013108150913900_btz150-B32","doi-asserted-by":"crossref","first-page":"829","DOI":"10.1038\/nmeth.1246","article-title":"Genome-wide analysis of transcription factor binding sites based on ChIP-seq data","volume":"5","author":"Valouev","year":"2008","journal-title":"Nat. 
Methods"},{"key":"2023013108150913900_btz150-B33","volume-title":"All of Statistics: A Concise Course in Statistical Inference","author":"Wasserman","year":"2013"},{"key":"2023013108150913900_btz150-B34","doi-asserted-by":"crossref","first-page":"e1002613.","DOI":"10.1371\/journal.pcbi.1002613","article-title":"Genome-wide localization of protein-DNA binding and histone modification by a Bayesian change-point method with ChIP-seq data","volume":"8","author":"Xing","year":"2012","journal-title":"PLoS Comput. Biol"},{"key":"2023013108150913900_btz150-B35","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1007\/978-1-4939-0512-6_5","article-title":"Spatial clustering for identification of ChIP-enriched regions (SICER) to map regions of histone methylation patterns in embryonic stem cells","volume":"1150","author":"Xu","year":"2014","journal-title":"Methods Mol. Biol"},{"key":"2023013108150913900_btz150-B36","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btp340","article-title":"A clustering approach for identification of enriched domains from histone modification ChIP-seq data","volume":"25","author":"Zang","year":"2009","journal-title":"Bioinformatics"},{"key":"2023013108150913900_btz150-B37","doi-asserted-by":"crossref","first-page":"R137.","DOI":"10.1186\/gb-2008-9-9-r137","article-title":"Model-based analysis of ChIP-seq (MACS)","volume":"9","author":"Zhang","year":"2008","journal-title":"Genome 
Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/19\/3592\/48976500\/bioinformatics_35_19_3592.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/19\/3592\/48976500\/bioinformatics_35_19_3592.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,15]],"date-time":"2024-07-15T01:28:57Z","timestamp":1721006937000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/19\/3592\/5368012"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,3,1]]},"references-count":37,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2019,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz150","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/260687","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,10,1]]},"published":{"date-parts":[[2019,3,1]]}}}