{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T11:39:45Z","timestamp":1767181185955,"version":"build-2238731810"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1008839","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,10,21]],"date-time":"2021-10-21T00:00:00Z","timestamp":1634774400000}}],"reference-count":20,"publisher":"Public Library of Science (PLoS)","issue":"10","license":[{"start":{"date-parts":[[2021,10,11]],"date-time":"2021-10-11T00:00:00Z","timestamp":1633910400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP180101506"],"award-info":[{"award-number":["DP180101506"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>\n                    Hi-C is a sample preparation method that enables high-throughput sequencing to capture genome-wide spatial interactions between DNA molecules. The technique has been successfully applied to solve challenging problems such as 3D structural analysis of chromatin, scaffolding of large genome assemblies and more recently the accurate resolution of metagenome-assembled genomes (MAGs). Despite continued refinements, however, preparing a Hi-C library remains a complex laboratory protocol. To avoid costly failures and maximise the odds of successful outcomes, diligent quality management is recommended. Current wet-lab methods provide only a crude assay of Hi-C library quality, while key post-sequencing quality indicators used have\u2014thus far\u2014relied upon reference-based read-mapping. When a reference is accessible, this reliance introduces a concern for quality, where an incomplete or inexact reference skews the resulting quality indicators. We propose a new, reference-free approach that infers the total fraction of read-pairs that are a product of proximity ligation. This quantification of Hi-C library quality requires only a modest amount of sequencing data and is independent of other application-specific criteria. The algorithm builds upon the observation that proximity ligation events are likely to create\n                    <jats:italic>k<\/jats:italic>\n                    -mers that would not naturally occur in the sample. Our software tool (qc3C) is to our knowledge the first to implement a reference-free Hi-C QC tool, and also provides reference-based QC, enabling Hi-C to be more easily applied to non-model organisms and environmental samples. We characterise the accuracy of the new algorithm on simulated and real datasets and compare it to reference-based methods.\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1008839","type":"journal-article","created":{"date-parts":[[2021,10,11]],"date-time":"2021-10-11T17:06:07Z","timestamp":1633971967000},"page":"e1008839","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":11,"title":["qc3C: Reference-free quality control for Hi-C sequencing data"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7601-5108","authenticated-orcid":true,"given":"Matthew Z.","family":"DeMaere","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2397-7925","authenticated-orcid":true,"given":"Aaron E.","family":"Darling","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2021,10,11]]},"reference":[{"issue":"5950","key":"pcbi.1008839.ref001","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1126\/science.1181369","article-title":"Comprehensive mapping of long-range interactions reveals folding principles of the human genome","volume":"326","author":"E Lieberman-Aiden","year":"2009","journal-title":"Science"},{"issue":"12","key":"pcbi.1008839.ref002","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1038\/nbt.2727","article-title":"Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions","volume":"31","author":"JN Burton","year":"2013","journal-title":"Nat Biotechnol"},{"issue":"8","key":"pcbi.1008839.ref003","doi-asserted-by":"crossref","first-page":"e1007273","DOI":"10.1371\/journal.pcbi.1007273","article-title":"Integrating Hi-C links with assembly graphs for chromosome-scale assembly","volume":"15","author":"J Ghurye","year":"2019","journal-title":"PLoS Comput Biol"},{"issue":"5","key":"pcbi.1008839.ref004","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1101\/gr.213462.116","article-title":"HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies","volume":"27","author":"P Edge","year":"2017","journal-title":"Genome Res"},{"issue":"1","key":"pcbi.1008839.ref005","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1186\/s13059-019-1643-1","article-title":"bin3C: exploiting Hi-C sequencing data to accurately resolve metagenome-assembled genomes","volume":"20","author":"MZ DeMaere","year":"2019","journal-title":"Genome Biol"},{"key":"pcbi.1008839.ref006","doi-asserted-by":"crossref","unstructured":"Press MO, Wiser AH, Kronenberg ZN, Langford KW, Shakya M, Lo CC, et al. Hi-C deconvolution of a human gut microbiome yields high-quality draft genomes and reveals plasmid-genome interactions; 2017. Available from: https:\/\/www.biorxiv.org\/content\/early\/2017\/10\/05\/198713.","DOI":"10.1101\/198713"},{"issue":"1","key":"pcbi.1008839.ref007","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/giz158","article-title":"Multifaceted Hi-C benchmarking: what makes a difference in chromosome-scale genome scaffolding?","volume":"9","author":"M Kadota","year":"2020","journal-title":"Gigascience"},{"key":"pcbi.1008839.ref008","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1016\/j.ymeth.2018.04.033","article-title":"Iteratively improving Hi-C experiments one step at a time","volume":"142","author":"R Golloshi","year":"2018","journal-title":"Methods"},{"key":"pcbi.1008839.ref009","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1016\/j.ymeth.2017.04.004","article-title":"Hi-C 2.0: An optimized Hi-C procedure for high-resolution genome-wide mapping of chromosome conformation","volume":"123","author":"H Belaghzal","year":"2017","journal-title":"Methods"},{"issue":"3","key":"pcbi.1008839.ref010","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1016\/j.ymeth.2012.05.001","article-title":"Hi-C: a comprehensive technique to capture the conformation of genomes","volume":"58","author":"JM Belton","year":"2012","journal-title":"Methods"},{"issue":"1","key":"pcbi.1008839.ref011","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1186\/s13059-015-0831-x","article-title":"HiC-Pro: an optimized and flexible pipeline for Hi-C data processing","volume":"16","author":"N Servant","year":"2015","journal-title":"Genome Biol"},{"issue":"19","key":"pcbi.1008839.ref012","doi-asserted-by":"crossref","first-page":"3047","DOI":"10.1093\/bioinformatics\/btw354","article-title":"MultiQC: summarize analysis results for multiple tools and samples in a single report","volume":"32","author":"P Ewels","year":"2016","journal-title":"Bioinformatics"},{"key":"pcbi.1008839.ref013","doi-asserted-by":"crossref","first-page":"1310","DOI":"10.12688\/f1000research.7334.1","article-title":"HiCUP: pipeline for mapping and processing Hi-C data","volume":"4","author":"S Wingett","year":"2015","journal-title":"F1000Res"},{"key":"pcbi.1008839.ref014","first-page":"1","article-title":"Jellyfish: A fast k-mer counter","author":"G Marcais","year":"2012","journal-title":"Tutorialis e Manuais"},{"issue":"2","key":"pcbi.1008839.ref015","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1086\/341527","article-title":"A note on the calculation of empirical P values from Monte Carlo procedures","volume":"71","author":"BV North","year":"2002","journal-title":"Am J Hum Genet"},{"issue":"4","key":"pcbi.1008839.ref016","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1038\/nbt.3820","article-title":"Nextflow enables reproducible computational workflows","volume":"35","author":"P Di Tommaso","year":"2017","journal-title":"Nat Biotechnol"},{"issue":"2","key":"pcbi.1008839.ref017","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/gix103","article-title":"sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies","volume":"7","author":"MZ DeMaere","year":"2018","journal-title":"Gigascience"},{"issue":"17","key":"pcbi.1008839.ref018","doi-asserted-by":"crossref","first-page":"i884","DOI":"10.1093\/bioinformatics\/bty560","article-title":"fastp: an ultra-fast all-in-one FASTQ preprocessor","volume":"34","author":"S Chen","year":"2018","journal-title":"Bioinformatics"},{"issue":"W1","key":"pcbi.1008839.ref019","doi-asserted-by":"crossref","first-page":"W11","DOI":"10.1093\/nar\/gky504","article-title":"Galaxy HiCExplorer: a web server for reproducible Hi-C data analysis, quality control and visualization","volume":"46","author":"J Wolff","year":"2018","journal-title":"Nucleic Acids Res"},{"issue":"W1","key":"pcbi.1008839.ref020","doi-asserted-by":"crossref","first-page":"W177","DOI":"10.1093\/nar\/gkaa220","article-title":"Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization","volume":"48","author":"J Wolff","year":"2020","journal-title":"Nucleic Acids Res"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1008839","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,10,21]],"date-time":"2021-10-21T00:00:00Z","timestamp":1634774400000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1008839","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,10,21]],"date-time":"2021-10-21T14:18:55Z","timestamp":1634825935000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1008839"}},"subtitle":[],"editor":[{"given":"Mihaela","family":"Pertea","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,10,11]]},"references-count":20,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2021,10,11]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1008839","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,11]]}}}