{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T07:10:55Z","timestamp":1776841855950,"version":"3.51.2"},"reference-count":78,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T00:00:00Z","timestamp":1700611200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T00:00:00Z","timestamp":1700611200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Los Alamos National Laboratory Directed Research","award":["20210082DR"],"award-info":[{"award-number":["20210082DR"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Correlation metrics are widely utilized in genomics analysis and often implemented with little regard to assumptions of normality, homoscedasticity, and independence of values. This is especially true when comparing values between replicated sequencing experiments that probe chromatin accessibility, such as assays for transposase-accessible chromatin via sequencing (ATAC-seq). Such data can possess several regions across the human genome with little to no sequencing depth and are thus non-normal with a large portion of zero values. Despite distributed use in the epigenomics field, few studies have evaluated and benchmarked how correlation and association statistics behave across ATAC-seq experiments with known differences or the effects of removing specific outliers from the data. 
Here, we developed a computational simulation of ATAC-seq data to elucidate the behavior of correlation statistics and to compare their accuracy under set conditions of reproducibility.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>\n                      Using these simulations, we monitored the behavior of several correlation statistics, including the Pearson\u2019s\n                      <jats:italic>R<\/jats:italic>\n                      and Spearman\u2019s\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$\\rho$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:mi>\u03c1<\/mml:mi>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      coefficients as well as Kendall\u2019s\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$\\tau$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:mi>\u03c4<\/mml:mi>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      and Top\u2013Down correlation. 
We also test the behavior of association measures, including the coefficient of determination\n                      <jats:italic>R<\/jats:italic>\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$^2$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:msup>\n                              <mml:mrow\/>\n                              <mml:mn>2<\/mml:mn>\n                            <\/mml:msup>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      , Kendall\u2019s W, and normalized mutual information. Our experiments reveal an insensitivity of most statistics, including Spearman\u2019s\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$\\rho$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:mi>\u03c1<\/mml:mi>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      , Kendall\u2019s\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$\\tau$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:mi>\u03c4<\/mml:mi>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      , and Kendall\u2019s W, to increasing differences between simulated ATAC-seq replicates. 
The removal of co-zeros (regions lacking mapped sequenced reads) between simulated experiments greatly improves the estimates of correlation and association. After removing co-zeros, the\n                      <jats:italic>R<\/jats:italic>\n                      <jats:inline-formula>\n                        <jats:alternatives>\n                          <jats:tex-math>$$^2$$<\/jats:tex-math>\n                          <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                            <mml:msup>\n                              <mml:mrow\/>\n                              <mml:mn>2<\/mml:mn>\n                            <\/mml:msup>\n                          <\/mml:math>\n                        <\/jats:alternatives>\n                      <\/jats:inline-formula>\n                      coefficient and normalized mutual information display the best performance, having a closer one-to-one relationship with the known portion of shared, enhanced loci between simulated replicates. When comparing values between experimental ATAC-seq data using a random forest model, mutual information best predicts ATAC-seq replicate relationships.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>Collectively, this study demonstrates how measures of correlation and association can behave in epigenomics experiments. 
We provide improved strategies for quantifying relationships in these increasingly prevalent and important chromatin accessibility assays.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-023-05553-0","type":"journal-article","created":{"date-parts":[[2023,11,21]],"date-time":"2023-11-21T21:02:09Z","timestamp":1700600529000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Improved quality metrics for association and reproducibility in chromatin accessibility data using mutual information"],"prefix":"10.1186","volume":"24","author":[{"given":"Cullen","family":"Roth","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vrinda","family":"Venu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vanessa","family":"Job","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nicholas","family":"Lubbers","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Karissa Y.","family":"Sanbonmatsu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christina R.","family":"Steadman","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shawn R.","family":"Starkenburg","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,11,22]]},"reference":[{"issue":"4","key":"5553_CR1","doi-asserted-by":"publisher","first-page":"823","DOI":"10.1016\/j.cell.2007.05.009","volume":"129","author":"A Barski","year":"2007","unstructured":"Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, et al. High-resolution profiling of histone methylations in the human genome. Cell. 
2007;129(4):823\u201337.","journal-title":"Cell"},{"issue":"1","key":"5553_CR2","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1002\/jcb.22077","volume":"107","author":"A Barski","year":"2009","unstructured":"Barski A, Zhao K. Genomic location analysis by ChIP-Seq. J Cell Biochem. 2009;107(1):11\u20138.","journal-title":"J Cell Biochem"},{"issue":"10","key":"5553_CR3","doi-asserted-by":"publisher","first-page":"669","DOI":"10.1038\/nrg2641","volume":"10","author":"PJ Park","year":"2009","unstructured":"Park PJ. ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet. 2009;10(10):669\u201380.","journal-title":"Nat Rev Genet"},{"issue":"1","key":"5553_CR4","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1002\/0471142727.mb2129s109","volume":"109","author":"JD Buenrostro","year":"2015","unstructured":"Buenrostro JD, Wu B, Chang HY, Greenleaf WJ. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr Protoc Mol Biol. 2015;109(1):21\u20139.","journal-title":"Curr Protoc Mol Biol"},{"issue":"2","key":"5553_CR5","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1093\/bib\/bbs017","volume":"14","author":"H Thorvaldsd\u00f3ttir","year":"2013","unstructured":"Thorvaldsd\u00f3ttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14(2):178\u201392.","journal-title":"Brief Bioinform"},{"issue":"9","key":"5553_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/gb-2008-9-9-r137","volume":"9","author":"Y Zhang","year":"2008","unstructured":"Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 
2008;9(9):1\u20139.","journal-title":"Genome Biol"},{"issue":"4","key":"5553_CR7","doi-asserted-by":"publisher","first-page":"576","DOI":"10.1016\/j.molcel.2010.05.004","volume":"38","author":"S Heinz","year":"2010","unstructured":"Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38(4):576\u201389.","journal-title":"Mol Cell"},{"issue":"9","key":"5553_CR8","doi-asserted-by":"publisher","first-page":"1813","DOI":"10.1101\/gr.136184.111","volume":"22","author":"SG Landt","year":"2012","unstructured":"Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012;22(9):1813\u201331.","journal-title":"Genome Res"},{"issue":"1","key":"5553_CR9","doi-asserted-by":"publisher","first-page":"7933","DOI":"10.1038\/s41598-020-64655-4","volume":"10","author":"D Oh","year":"2020","unstructured":"Oh D, Strattan JS, Hur JK, Bento J, Urban AE, Song G, et al. CNN-Peaks: ChIP-Seq peak detection pipeline using convolutional neural networks that imitate human visual inspection. Sci Rep. 2020;10(1):7933.","journal-title":"Sci Rep"},{"key":"5553_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-020-1929-3","volume":"21","author":"F Yan","year":"2020","unstructured":"Yan F, Powell DR, Curtis DJ, Wong NC. From reads to insight: a hitchhiker\u2019s guide to ATAC-seq data analysis. Genome Biol. 2020;21:1\u201316.","journal-title":"Genome Biol"},{"key":"5553_CR11","doi-asserted-by":"crossref","unstructured":"ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 
2012;489(7414):57.","DOI":"10.1038\/nature11247"},{"issue":"D1","key":"5553_CR12","doi-asserted-by":"publisher","first-page":"D882","DOI":"10.1093\/nar\/gkz1062","volume":"48","author":"Y Luo","year":"2020","unstructured":"Luo Y, Hitz BC, Gabdank I, Hilton JA, Kagda MS, Lam B, et al. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Res. 2020;48(D1):D882\u20139.","journal-title":"Nucleic Acids Res"},{"issue":"W1","key":"5553_CR13","doi-asserted-by":"publisher","first-page":"W187","DOI":"10.1093\/nar\/gku365","volume":"42","author":"F Ram\u00edrez","year":"2014","unstructured":"Ram\u00edrez F, D\u00fcndar F, Diehl S, Gr\u00fcning BA, Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014;42(W1):W187\u201391.","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"5553_CR14","doi-asserted-by":"publisher","first-page":"1518","DOI":"10.1038\/s41596-022-00692-9","volume":"17","author":"FC Grandi","year":"2022","unstructured":"Grandi FC, Modi H, Kampman L, Corces MR. Chromatin accessibility profiling by ATAC-seq. Nat Protoc. 2022;17(6):1518\u201352.","journal-title":"Nat Protoc"},{"key":"5553_CR15","doi-asserted-by":"publisher","DOI":"10.7554\/eLife.72792","volume":"11","author":"K Sahinyan","year":"2022","unstructured":"Sahinyan K, Blackburn DM, Simon MM, Lazure F, Kwan T, Bourque G, et al. Application of ATAC-Seq for genome-wide analysis of the chromatin state at single myofiber resolution. Elife. 2022;11: e72792.","journal-title":"Elife"},{"issue":"1","key":"5553_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12967-021-02936-w","volume":"19","author":"Y Zhao","year":"2021","unstructured":"Zhao Y, Li MC, Konat\u00e9 MM, Chen L, Das B, Karlovich C, et al. TPM, FPKM, or normalized counts? A comparative study of quantification measures for the analysis of RNA-seq data from the NCI patient-derived models repository. J Transl Med. 
2021;19(1):1\u201315.","journal-title":"J Transl Med"},{"issue":"5","key":"5553_CR17","doi-asserted-by":"publisher","first-page":"1763","DOI":"10.1213\/ANE.0000000000002864","volume":"126","author":"P Schober","year":"2018","unstructured":"Schober P, Boer C, Schwarte LA. Correlation coefficients: appropriate use and interpretation. Anesth Analg. 2018;126(5):1763\u20138.","journal-title":"Anesth Analg"},{"issue":"1","key":"5553_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/srep25474","volume":"6","author":"P Milani","year":"2016","unstructured":"Milani P, Escalante-Chong R, Shelley BC, Patel-Murray NL, Xin X, Adam M, et al. Cell freezing protocol suitable for ATAC-Seq on motor neurons derived from human induced pluripotent stem cells. Sci Rep. 2016;6(1):1\u201310.","journal-title":"Sci Rep"},{"issue":"1","key":"5553_CR19","doi-asserted-by":"publisher","first-page":"11502","DOI":"10.1038\/s41598-018-29775-y","volume":"8","author":"X Shan","year":"2018","unstructured":"Shan X, Roberts C, Lan Y, Percec I. Age alters chromatin structure and expression of SUMO proteins under stress conditions in human adipose-derived stem cells. Sci Rep. 2018;8(1):11502.","journal-title":"Sci Rep"},{"issue":"6413","key":"5553_CR20","doi-asserted-by":"publisher","first-page":"eaav1898","DOI":"10.1126\/science.aav1898","volume":"362","author":"MR Corces","year":"2018","unstructured":"Corces MR, Granja JM, Shams S, Louie BH, Seoane JA, Zhou W, et al. The chromatin accessibility landscape of primary human cancers. Science. 2018;362(6413):eaav1898.","journal-title":"Science"},{"issue":"1","key":"5553_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-020-61678-9","volume":"10","author":"M Halstead","year":"2020","unstructured":"Halstead M, Kern C, Saelao P, Chanthavixay G, Wang Y, Delany M, et al. Systematic alteration of ATAC-seq for profiling open chromatin in cryopreserved nuclei preparations from livestock tissues. Sci Rep. 
2020;10(1):1\u201312.","journal-title":"Sci Rep"},{"issue":"1","key":"5553_CR22","doi-asserted-by":"publisher","first-page":"1337","DOI":"10.1038\/s41467-021-21583-9","volume":"12","author":"R Fang","year":"2021","unstructured":"Fang R, Preissl S, Li Y, Hou X, Lucero J, Wang X, et al. Comprehensive analysis of single cell ATAC-seq data with SnapATAC. Nat Commun. 2021;12(1):1337.","journal-title":"Nat Commun"},{"issue":"1","key":"5553_CR23","doi-asserted-by":"publisher","first-page":"5506","DOI":"10.1038\/s41598-023-32256-6","volume":"13","author":"YY Wong","year":"2023","unstructured":"Wong YY, Harbison JE, Hope CM, Gundsambuu B, Brown KA, Wong SW, et al. Parallel recovery of chromatin accessibility and gene expression dynamics from frozen human regulatory T cells. Sci Rep. 2023;13(1):5506.","journal-title":"Sci Rep"},{"key":"5553_CR24","volume-title":"Genetics and analysis of quantitative traits","author":"M Lynch","year":"1998","unstructured":"Lynch M, Walsh B, et al. Genetics and analysis of quantitative traits, vol. 1. Sunderland: Sinauer; 1998."},{"issue":"1","key":"5553_CR25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-015-0866-z","volume":"17","author":"A Conesa","year":"2016","unstructured":"Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, et al. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016;17(1):1\u201319.","journal-title":"Genome Biol"},{"issue":"14","key":"5553_CR26","doi-asserted-by":"publisher","first-page":"2199","DOI":"10.1093\/bioinformatics\/btx152","volume":"33","author":"KK Yan","year":"2017","unstructured":"Yan KK, Yard\u0131mc\u0131 GG, Yan C, Noble WS, Gerstein M. HiC-spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps. Bioinformatics. 
2017;33(14):2199\u2013201.","journal-title":"Bioinformatics"},{"issue":"11","key":"5553_CR27","doi-asserted-by":"publisher","first-page":"1939","DOI":"10.1101\/gr.220640.117","volume":"27","author":"T Yang","year":"2017","unstructured":"Yang T, Zhang F, Yard\u0131mc\u0131 GG, Song F, Hardison RC, Noble WS, et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 2017;27(11):1939\u201349.","journal-title":"Genome Res"},{"issue":"2","key":"5553_CR28","doi-asserted-by":"publisher","first-page":"567","DOI":"10.1534\/genetics.118.300996","volume":"209","author":"C Roth","year":"2018","unstructured":"Roth C, Sun S, Billmyre RB, Heitman J, Magwene PM. A high-resolution map of meiotic recombination in Cryptococcus deneoformans demonstrates decreased recombination in unisexual reproduction. Genetics. 2018;209(2):567\u201378.","journal-title":"Genetics"},{"issue":"1","key":"5553_CR29","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-018-2288-x","volume":"19","author":"JC Stansfield","year":"2018","unstructured":"Stansfield JC, Cresswell KG, Vladimirov VI, Dozmorov MG. HiCcompare: an R-package for joint normalization and comparison of HI-C datasets. BMC Bioinform. 2018;19(1):1\u201310.","journal-title":"BMC Bioinform"},{"issue":"1","key":"5553_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-019-1658-7","volume":"20","author":"GG Yard\u0131mc\u0131","year":"2019","unstructured":"Yard\u0131mc\u0131 GG, Ozadam H, Sauria ME, Ursu O, Yan KK, Yang T, et al. Measuring the reproducibility and quality of Hi-C data. Genome Biol. 2019;20(1):1\u201319.","journal-title":"Genome Biol"},{"issue":"W1","key":"5553_CR31","doi-asserted-by":"publisher","first-page":"W160","DOI":"10.1093\/nar\/gkw257","volume":"44","author":"F Ram\u00edrez","year":"2016","unstructured":"Ram\u00edrez F, Ryan DP, Gr\u00fcning B, Bhardwaj V, Kilpert F, Richter AS, et al. 
deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44(W1):W160\u20135.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"5553_CR32","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1038\/s41467-017-02525-w","volume":"9","author":"F Ram\u00edrez","year":"2018","unstructured":"Ram\u00edrez F, Bhardwaj V, Arrigoni L, Lam KC, Gr\u00fcning BA, Villaveces J, et al. High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat Commun. 2018;9(1):189.","journal-title":"Nat Commun"},{"issue":"W1","key":"5553_CR33","doi-asserted-by":"publisher","first-page":"W11","DOI":"10.1093\/nar\/gky504","volume":"46","author":"J Wolff","year":"2018","unstructured":"Wolff J, Bhardwaj V, Nothjunge S, Richard G, Renschler G, Gilsbach R, et al. Galaxy HiCExplorer: a web server for reproducible Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 2018;46(W1):W11\u20136.","journal-title":"Nucleic Acids Res"},{"issue":"W1","key":"5553_CR34","doi-asserted-by":"publisher","first-page":"W177","DOI":"10.1093\/nar\/gkaa220","volume":"48","author":"J Wolff","year":"2020","unstructured":"Wolff J, Rabbani L, Gilsbach R, Richard G, Manke T, Backofen R, et al. Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 2020;48(W1):W177\u201384.","journal-title":"Nucleic Acids Res"},{"key":"5553_CR35","doi-asserted-by":"publisher","first-page":"322","DOI":"10.3389\/fpsyg.2012.00322","volume":"3","author":"KF Nimon","year":"2012","unstructured":"Nimon KF. Statistical assumptions of substantive analyses across the general linear model: a mini-review. Front Psychol. 
2012;3:322.","journal-title":"Front Psychol"},{"key":"5553_CR36","doi-asserted-by":"publisher","first-page":"2789","DOI":"10.1016\/j.csbj.2020.09.014","volume":"18","author":"JD Silverman","year":"2020","unstructured":"Silverman JD, Roche K, Mukherjee S, David LA. Naught all zeros in sequence count data are the same. Comput Struct Biotechnol J. 2020;18:2789\u201398.","journal-title":"Comput Struct Biotechnol J"},{"key":"5553_CR37","doi-asserted-by":"crossref","unstructured":"Student. Probable error of a correlation coefficient. Biometrika. 1908;6(2-3):302\u201310.","DOI":"10.1093\/biomet\/6.2-3.302"},{"key":"5553_CR38","unstructured":"Fisher R. Statistical methods for research workers Oliver and Boyd, London. Reprinted in Statistical Methods, Experimental Design and Scientific Inference; 1925."},{"issue":"1","key":"5553_CR39","first-page":"1","volume":"21","author":"CJ Kowalski","year":"1972","unstructured":"Kowalski CJ. On the effects of non-normality on the distribution of the sample product-moment correlation coefficient. J R Stat Soc Ser C (Appl Stat). 1972;21(1):1\u201312.","journal-title":"J R Stat Soc Ser C (Appl Stat)"},{"key":"5553_CR40","volume-title":"CRC standard probability and statistics tables and formulae","author":"S Kokoska","year":"2000","unstructured":"Kokoska S, Zwillinger D. CRC standard probability and statistics tables and formulae. CRC Press; 2000."},{"key":"5553_CR41","volume-title":"Rank correlation methods","author":"M Kendall","year":"1970","unstructured":"Kendall M. Rank correlation methods. 4th ed. High Wycombe, Bucks: Charles Griffin; 1970.","edition":"4"},{"key":"5553_CR42","unstructured":"Noether GE. Elements of nonparametric statistics. Elements of nonparametric statistics; 1967."},{"issue":"2","key":"5553_CR43","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1016\/S0022-3956(98)90046-2","volume":"33","author":"S Arndt","year":"1999","unstructured":"Arndt S, Turvey C, Andreasen NC. 
Correlating and predicting psychiatric symptom ratings: Spearman\u2019s r versus Kendall\u2019s tau correlation. J Psychiatr Res. 1999;33(2):97\u2013104.","journal-title":"J Psychiatr Res"},{"issue":"1","key":"5553_CR44","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1016\/j.sigpro.2012.08.005","volume":"93","author":"W Xu","year":"2013","unstructured":"Xu W, Hou Y, Hung Y, Zou Y. A comparative analysis of Spearman\u2019s rho and Kendall\u2019s tau in normal and contaminated normal models. Signal Process. 2013;93(1):261\u201376.","journal-title":"Signal Process"},{"key":"5553_CR45","volume-title":"Elements of information theory","author":"TM Cover","year":"1999","unstructured":"Cover TM. Elements of information theory. Wiley; 1999."},{"key":"5553_CR46","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825\u201330.","journal-title":"J Mach Learn Res"},{"issue":"6","key":"5553_CR47","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1002\/widm.1072","volume":"2","author":"AL Boulesteix","year":"2012","unstructured":"Boulesteix AL, Janitza S, Kruppa J, K\u00f6nig IR. Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. Wiley Interdiscip Rev Data Min Knowl Discov. 2012;2(6):493\u2013507.","journal-title":"Wiley Interdiscip Rev Data Min Knowl Discov"},{"issue":"3","key":"5553_CR48","first-page":"351","volume":"29","author":"RL Iman","year":"1987","unstructured":"Iman RL, Conover W. A measure of top-down correlation. Technometrics. 1987;29(3):351\u20137.","journal-title":"Technometrics"},{"issue":"1","key":"5553_CR49","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-022-05054-6","volume":"23","author":"T Fitzgerald","year":"2022","unstructured":"Fitzgerald T, Jones A, Engelhardt BE. 
A Poisson reduced-rank regression model for association mapping in sequencing data. BMC Bioinform. 2022;23(1):1\u201322.","journal-title":"BMC Bioinform"},{"issue":"3","key":"5553_CR50","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1214\/aoms\/1177732186","volume":"10","author":"MG Kendall","year":"1939","unstructured":"Kendall MG, Smith BB. The problem of m rankings. Ann Math Stat. 1939;10(3):275\u201387.","journal-title":"Ann Math Stat"},{"key":"5553_CR51","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2917-7_3","volume-title":"Practical use of the information-theoretic approach","author":"KP Burnham","year":"1998","unstructured":"Burnham KP, Anderson DR. Practical use of the information-theoretic approach. Springer; 1998."},{"issue":"14","key":"5553_CR52","doi-asserted-by":"publisher","first-page":"e497","DOI":"10.1093\/bioinformatics\/btl224","volume":"22","author":"V Varadan","year":"2006","unstructured":"Varadan V, Miller DM III, Anastassiou D. Computational inference of the molecular logic for synaptic connectivity in C. elegans. Bioinformatics. 2006;22(14):e497\u2013506.","journal-title":"Bioinformatics"},{"issue":"1","key":"5553_CR53","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1038\/msb4100124","volume":"3","author":"D Anastassiou","year":"2007","unstructured":"Anastassiou D. Computational analysis of the synergy among multiple interacting genes. Mol Syst Biol. 2007;3(1):83.","journal-title":"Mol Syst Biol"},{"issue":"4","key":"5553_CR54","doi-asserted-by":"publisher","first-page":"630","DOI":"10.1136\/amiajnl-2012-001525","volume":"20","author":"T Hu","year":"2013","unstructured":"Hu T, Chen Y, Kiralis JW, Collins RL, Wejse C, Sirugo G, et al. An information-gain approach to detecting three-way epistatic interactions in genetic association studies. J Am Med Inform Assoc. 
2013;20(4):630\u20136.","journal-title":"J Am Med Inform Assoc"},{"issue":"1","key":"5553_CR55","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12918-016-0331-y","volume":"10","author":"DM Budden","year":"2016","unstructured":"Budden DM, Crampin EJ. Information theoretic approaches for inference of biological networks from continuous-valued data. BMC Syst Biol. 2016;10(1):1\u20137.","journal-title":"BMC Syst Biol"},{"issue":"1","key":"5553_CR56","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1009313","volume":"17","author":"C Roth","year":"2021","unstructured":"Roth C, Murray D, Scott A, Fu C, Averette AF, Sun S, et al. Pleiotropy and epistasis within and between signaling pathways defines the genetic architecture of fungal virulence. PLoS Genet. 2021;17(1): e1009313.","journal-title":"PLoS Genet"},{"issue":"8","key":"5553_CR57","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2122293119","volume":"119","author":"S Sun","year":"2022","unstructured":"Sun S, Roth C, Floyd Averette A, Magwene PM, Heitman J. Epistatic genetic interactions govern morphogenesis during sexual reproduction and infection in a global human fungal pathogen. Proc Natl Acad Sci. 2022;119(8): e2122293119.","journal-title":"Proc Natl Acad Sci"},{"issue":"1","key":"5553_CR58","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-019-1854-5","volume":"20","author":"H Chen","year":"2019","unstructured":"Chen H, Lareau C, Andreani T, Vinyard ME, Garcia SP, Clement K, et al. Assessment of computational methods for the analysis of single-cell ATAC-seq data. Genome Biol. 2019;20(1):1\u201325.","journal-title":"Genome Biol"},{"issue":"2","key":"5553_CR59","doi-asserted-by":"publisher","first-page":"476","DOI":"10.1093\/bioinformatics\/btab706","volume":"38","author":"Y Xu","year":"2022","unstructured":"Xu Y, Das P, McCord RP. SMILE: mutual information learning for integration of single-cell omics data. Bioinformatics. 
2022;38(2):476\u201386.","journal-title":"Bioinformatics"},{"issue":"12","key":"5553_CR60","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-014-0550-8","volume":"15","author":"MI Love","year":"2014","unstructured":"Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):1\u201321.","journal-title":"Genome Biol"},{"issue":"11","key":"5553_CR61","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0167047","volume":"11","author":"ZD Stephens","year":"2016","unstructured":"Stephens ZD, Hudson ME, Mainzer LS, Taschuk M, Weber MR, Iyer RK. Simulating next-generation sequencing datasets from empirical mutation and sequencing models. PLoS ONE. 2016;11(11): e0167047.","journal-title":"PLoS ONE"},{"key":"5553_CR62","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-021-02270-w","volume":"22","author":"Z Navidi","year":"2021","unstructured":"Navidi Z, Zhang L, Wang B. simATAC: a single-cell ATAC-seq simulation framework. Genome Biol. 2021;22:1\u201316.","journal-title":"Genome Biol"},{"issue":"5","key":"5553_CR63","doi-asserted-by":"publisher","first-page":"1417","DOI":"10.1093\/jnci\/51.5.1417","volume":"51","author":"DJ Giard","year":"1973","unstructured":"Giard DJ, Aaronson SA, Todaro GJ, Arnstein P, Kersey JH, Dosik H, et al. In vitro cultivation of human tumors: establishment of cell lines derived from a series of solid tumors. J Natl Cancer Inst. 1973;51(5):1417\u201323.","journal-title":"J Natl Cancer Inst"},{"issue":"2","key":"5553_CR64","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1006\/excr.1998.4172","volume":"243","author":"KA Foster","year":"1998","unstructured":"Foster KA, Oster CG, Mayer MM, Avery ML, Audus KL. Characterization of the A549 cell line as a type II pulmonary epithelial cell model for drug metabolism. Exp Cell Res. 
1998;243(2):359\u201366.","journal-title":"Exp Cell Res"},{"issue":"2","key":"5553_CR65","first-page":"113","volume":"31","author":"KJ Peng","year":"2010","unstructured":"Peng KJ, Wang JH, Su WT, Wang XC, Yang FT, Nie WH, et al. Characterization of two human lung adenocarcinoma cell lines by reciprocal chromosome painting. Dongwuxue Yanjiu. 2010;31(2):113\u201321.","journal-title":"Dongwuxue Yanjiu"},{"issue":"17","key":"5553_CR66","doi-asserted-by":"publisher","first-page":"i884","DOI":"10.1093\/bioinformatics\/bty560","volume":"34","author":"S Chen","year":"2018","unstructured":"Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34(17):i884\u201390.","journal-title":"Bioinformatics"},{"issue":"6588","key":"5553_CR67","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1126\/science.abj6987","volume":"376","author":"S Nurk","year":"2022","unstructured":"Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, et al. The complete sequence of a human genome. Science. 2022;376(6588):44\u201353.","journal-title":"Science"},{"issue":"14","key":"5553_CR68","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","volume":"25","author":"H Li","year":"2009","unstructured":"Li H, Durbin R. Fast and accurate short read alignment with Burrows\u2013Wheeler transform. Bioinformatics. 2009;25(14):1754\u201360.","journal-title":"Bioinformatics"},{"issue":"17","key":"5553_CR69","doi-asserted-by":"publisher","first-page":"2503","DOI":"10.1093\/bioinformatics\/btu314","volume":"30","author":"GG Faust","year":"2014","unstructured":"Faust GG, Hall IM. SAMBLASTER: fast duplicate marking and structural variant read extraction. Bioinformatics. 2014;30(17):2503\u20135.","journal-title":"Bioinformatics"},{"issue":"4","key":"5553_CR70","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.1001046","volume":"9","author":"Consortium EP","year":"2011","unstructured":"Consortium EP. 
A user\u2019s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011;9(4): e1001046.","journal-title":"PLoS Biol"},{"issue":"9","key":"5553_CR71","doi-asserted-by":"publisher","first-page":"1728","DOI":"10.1038\/nprot.2012.101","volume":"7","author":"J Feng","year":"2012","unstructured":"Feng J, Liu T, Qin B, Zhang Y, Liu XS. Identifying ChIP-seq enrichment using MACS. Nat Protoc. 2012;7(9):1728\u201340.","journal-title":"Nat Protoc"},{"issue":"16","key":"5553_CR72","doi-asserted-by":"publisher","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","volume":"25","author":"H Li","year":"2009","unstructured":"Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment\/map format and SAMtools. Bioinformatics. 2009;25(16):2078\u20139.","journal-title":"Bioinformatics"},{"key":"5553_CR73","doi-asserted-by":"crossref","unstructured":"Gaspar JM. Improved peak-calling with MACS2. BioRxiv. 2018;496521.","DOI":"10.1101\/496521"},{"issue":"3","key":"5553_CR74","doi-asserted-by":"publisher","first-page":"1752","DOI":"10.1214\/11-AOAS466","volume":"5","author":"Q Li","year":"2011","unstructured":"Li Q, Brown JB, Huang H, Bickel PJ. Measuring reproducibility of high-throughput experiments. Ann Appl Stat. 2011;5(3):1752\u201379.","journal-title":"Ann Appl Stat"},{"issue":"4","key":"5553_CR75","doi-asserted-by":"publisher","first-page":"1855","DOI":"10.1016\/j.ygeno.2021.04.026","volume":"113","author":"R Newell","year":"2021","unstructured":"Newell R, Pienaar R, Balderson B, Piper M, Essebier A, Bod\u00e9n M. ChIP-R: assembling reproducible sets of ChIP-seq and ATAC-seq peaks from multiple replicates. Genomics. 2021;113(4):1855\u201366.","journal-title":"Genomics"},{"key":"5553_CR76","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","volume":"17","author":"P Virtanen","year":"2020","unstructured":"Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. 
SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17:261\u201372. https:\/\/doi.org\/10.1038\/s41592-019-0686-2.","journal-title":"Nat Methods"},{"issue":"3","key":"5553_CR77","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1093\/biomet\/68.3.589","volume":"68","author":"B Efron","year":"1981","unstructured":"Efron B. Nonparametric estimates of standard error: the jackknife, the bootstrap and other methods. Biometrika. 1981;68(3):589\u201399.","journal-title":"Biometrika"},{"key":"5553_CR78","doi-asserted-by":"crossref","unstructured":"John\u00a0Lu Z. The elements of statistical learning: data mining, inference, and prediction; 2010.","DOI":"10.1111\/j.1467-985X.2010.00646_6.x"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05553-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05553-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05553-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,26]],"date-time":"2023-12-26T00:13:14Z","timestamp":1703549594000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05553-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,22]]},"references-count":78,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5553"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05553-0","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.04.26.538354","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":
"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,22]]},"assertion":[{"value":"13 June 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 October 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 November 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"441"}}