{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T08:07:59Z","timestamp":1776413279002,"version":"3.51.2"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,1,13]],"date-time":"2020-01-13T00:00:00Z","timestamp":1578873600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2020,1,13]],"date-time":"2020-01-13T00:00:00Z","timestamp":1578873600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100014419","name":"EIT Health","doi-asserted-by":"publisher","award":["Campus 19359 HADACA"],"award-info":[{"award-number":["Campus 19359 HADACA"]}],"id":[{"id":"10.13039\/100014419","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Cell-type heterogeneity of tumors is a key factor in tumor progression and response to chemotherapy. Tumor cell-type heterogeneity, defined as the proportion of the various cell-types in a tumor, can be inferred from DNA methylation of surgical specimens. However, confounding factors known to associate with methylation values, such as age and sex, complicate accurate inference of cell-type proportions. While reference-free algorithms have been developed to infer cell-type proportions from DNA methylation, a comparative evaluation of the performance of these methods is still lacking.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here we use simulations to evaluate several computational pipelines based on the software packages MeDeCom, EDec, and RefFreeEWAS. We identify that accounting for confounders, feature selection, and the choice of the number of estimated cell types are critical steps for inferring cell-type proportions. We find that removal of methylation probes which are correlated with confounder variables reduces the error of inference by 30\u201335%, and that selection of cell-type informative probes has similar effect. We show that Cattell\u2019s rule based on the scree plot is a powerful tool to determine the number of cell-types. Once the pre-processing steps are achieved, the three deconvolution methods provide comparable results. We observe that all the algorithms\u2019 performance improves when inter-sample variation of cell-type proportions is large or when the number of available samples is large. We find that under specific circumstances the methods are sensitive to the initialization method, suggesting that averaging different solutions or optimizing initialization is an avenue for future research.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>\n                      Based on the lessons learned, to facilitate pipeline validation and catalyze further pipeline improvement by the community, we develop a benchmark pipeline for inference of cell-type proportions and implement it in the R package\n                      <jats:italic>medepir<\/jats:italic>\n                      .\n                    <\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-019-3307-2","type":"journal-article","created":{"date-parts":[[2020,1,13]],"date-time":"2020-01-13T09:02:49Z","timestamp":1578906169000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":37,"title":["Guidelines for cell-type heterogeneity quantification based on a comparative analysis of reference-free DNA methylation deconvolution software"],"prefix":"10.1186","volume":"21","author":[{"name":"HADACA consortium","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cl\u00e9mentine","family":"Decamps","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Florian","family":"Priv\u00e9","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Raphael","family":"Bacher","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel","family":"Jost","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Arthur","family":"Waguet","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eugene Andres","family":"Houseman","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eugene","family":"Lurie","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pavlo","family":"Lutsik","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Aleksandar","family":"Milosavljevic","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Scherer","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael G. B.","family":"Blum","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3165-3218","authenticated-orcid":false,"given":"Magali","family":"Richard","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,1,13]]},"reference":[{"key":"3307_CR1","doi-asserted-by":"publisher","first-page":"846","DOI":"10.1038\/nm.3915","volume":"21","author":"AA Alizadeh","year":"2015","unstructured":"Alizadeh AA, Aranda V, Bardelli A, Blanpain C, Bock C, Borowski C, et al. Toward understanding and exploiting tumor heterogeneity. Nat Med. 2015;21:846\u201353.","journal-title":"Nat Med"},{"key":"3307_CR2","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1186\/s12859-016-1140-4","volume":"17","author":"EA Houseman","year":"2016","unstructured":"Houseman EA, Kile ML, Christiani DC, Ince TA, Kelsey KT, Marsit CJ. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC Bioinformatics. 2016;17:259.","journal-title":"BMC Bioinformatics."},{"key":"3307_CR3","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1186\/s13059-017-1182-6","volume":"18","author":"P Lutsik","year":"2017","unstructured":"Lutsik P, Slawski M, Gasparoni G, Vedeneev N, Hein M, Walter J. MeDeCom: discovery and quantification of latent components of heterogeneous methylomes. Genome Biol. BioMed Central. 2017;18:55.","journal-title":"Genome Biol. BioMed Central"},{"key":"3307_CR4","doi-asserted-by":"publisher","first-page":"2075","DOI":"10.1016\/j.celrep.2016.10.057","volume":"17","author":"V Onuchic","year":"2016","unstructured":"Onuchic V, Hartmaier RJ, Boone DN, Samuels ML, Patel RY, White WM, et al. Epigenomic Deconvolution of breast tumors reveals metabolic coupling between constituent cell types. Cell Rep. 2016;17:2075\u201386.","journal-title":"Cell Rep"},{"key":"3307_CR5","doi-asserted-by":"publisher","first-page":"484","DOI":"10.1038\/nrg3230","volume":"13","author":"PA Jones","year":"2012","unstructured":"Jones PA. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet. 2012;13:484\u201392.","journal-title":"Nat Rev Genet"},{"key":"3307_CR6","doi-asserted-by":"publisher","first-page":"R216","DOI":"10.1093\/hmg\/ddx275","volume":"26","author":"AJ Titus","year":"2017","unstructured":"Titus AJ, Gallimore RM, Salas LA, Christensen BC. Cell-type deconvolution from DNA methylation: a review of recent applications. Hum Mol Genet. 2017;26:R216\u201324.","journal-title":"Hum Mol Genet"},{"key":"3307_CR7","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1186\/s13059-016-0935-y","volume":"17","author":"K McGregor","year":"2016","unstructured":"McGregor K, Bernatsky S, Colmegna I, Hudson M, Pastinen T, Labbe A, et al. An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies. Genome Biol. 2016;17:84.","journal-title":"Genome Biol"},{"key":"3307_CR8","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1186\/s12859-017-1611-2","volume":"18","author":"A Kaushal","year":"2017","unstructured":"Kaushal A, Zhang H, Karmaus WJJ, Ray M, Torres MA, Smith AK, et al. Comparison of different cell type correction methods for genome-scale epigenetics studies. BMC Bioinformatics. 2017;18:216.","journal-title":"BMC Bioinformatics."},{"key":"3307_CR9","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1186\/1471-2105-13-86","volume":"13","author":"EA Houseman","year":"2012","unstructured":"Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012;13:86.","journal-title":"BMC Bioinformatics"},{"key":"3307_CR10","doi-asserted-by":"publisher","first-page":"R31","DOI":"10.1186\/gb-2014-15-2-r31","volume":"15","author":"AE Jaffe","year":"2014","unstructured":"Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014;15:R31.","journal-title":"Genome Biol"},{"key":"3307_CR11","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1038\/nmeth.3809","volume":"13","author":"E Rahmani","year":"2016","unstructured":"Rahmani E, Zaitlen N, Baran Y, Eng C, Hu D, Galanter J, et al. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies. Nat Meth. 2016;13:443\u20135.","journal-title":"Nat Meth."},{"key":"3307_CR12","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1038\/nmeth.2815","volume":"11","author":"J Zou","year":"2014","unstructured":"Zou J, Lippert C, Heckerman D, Aryee M, Listgarten J. Epigenome-wide association studies without the need for cell-type composition. Nat Meth. 2014;11:309\u201311.","journal-title":"Nat Meth"},{"key":"3307_CR13","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1186\/s12859-017-1511-5","volume":"18","author":"AE Teschendorff","year":"2017","unstructured":"Teschendorff AE, Breeze CE, Zheng SC, Beck S. A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies. BMC Bioinformatics. 2017;18:105.","journal-title":"BMC Bioinformatics"},{"key":"3307_CR14","doi-asserted-by":"publisher","first-page":"1059","DOI":"10.1038\/s41592-018-0213-x","volume":"15","author":"SC Zheng","year":"2018","unstructured":"Zheng SC, Breeze CE, Beck S, Teschendorff AE. Identification of differentially methylated cell types in epigenome-wide association studies. Nat Meth. 2018;15:1059\u201366.","journal-title":"Nat Meth"},{"key":"3307_CR15","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, online YHFTA. Controlling the false discovery rate: A practical and powerful approach to multiple testing. JR Statist Soc B. 1995;57:289\u2013300.","journal-title":"JR Statist Soc B"},{"key":"3307_CR16","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1207\/s15327906mbr0102_10","volume":"1","author":"RB Cattell","year":"2010","unstructured":"Cattell RB. The scree test for the number of factors. Multivariate Behav Res. 2010;1:245\u201376.","journal-title":"Multivariate Behav Res"},{"key":"3307_CR17","doi-asserted-by":"publisher","first-page":"543","DOI":"10.1038\/nature13385","volume":"511","author":"Cancer Genome Atlas Research Network","year":"2014","unstructured":"Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543\u201350.","journal-title":"Nature."},{"key":"3307_CR18","doi-asserted-by":"publisher","first-page":"519","DOI":"10.1038\/nature11404","volume":"489","author":"Cancer Genome Atlas Research Network","year":"2012","unstructured":"Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature. 2012;489:519\u201325.","journal-title":"Nature."},{"key":"3307_CR19","doi-asserted-by":"publisher","first-page":"925","DOI":"10.2217\/epi-2018-0037","volume":"10","author":"SC Zheng","year":"2018","unstructured":"Zheng SC, Webster AP, Dong D, Feber A, Graham DG, Sullivan R, et al. A novel cell-type deconvolution algorithm reveals substantial contamination by immune cells in saliva, buccal and cervix. - PubMed - NCBI. Epigenomics. 2018;10:925\u201340.","journal-title":"Epigenomics."},{"key":"3307_CR20","doi-asserted-by":"publisher","first-page":"2612","DOI":"10.1038\/ncomms3612","volume":"4","author":"K Yoshihara","year":"2013","unstructured":"Yoshihara K, Shahmoradgoli M, Mart\u00ednez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013;4:2612.","journal-title":"Nat Commun"},{"key":"3307_CR21","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1093\/bioinformatics\/btw623","volume":"33","author":"M Alhamdoosh","year":"2017","unstructured":"Alhamdoosh M, Ng M, Wilson NJ, Sheridan JM, Huynh H, Wilson MJ, et al. Combining multiple tools outperforms individual methods in gene set enrichment analyses. Bioinformatics. 2017;33:414\u201324.","journal-title":"Bioinformatics."},{"key":"3307_CR22","doi-asserted-by":"publisher","first-page":"1801","DOI":"10.1093\/bioinformatics\/btm233","volume":"23","author":"M Jakobsson","year":"2007","unstructured":"Jakobsson M, Rosenberg NA. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics. 2007;23:1801\u20136.","journal-title":"Bioinformatics."},{"key":"3307_CR23","doi-asserted-by":"crossref","unstructured":"Teschendorff AE, Marabita F, Lechner M, 2012. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. academic.oup.com.","DOI":"10.1093\/bioinformatics\/bts680"},{"key":"3307_CR24","doi-asserted-by":"publisher","unstructured":"Zhou W, Laird PW, Shen H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2016;45(4):e22. https:\/\/doi.org\/10.1093\/nar\/gkw967.","DOI":"10.1093\/nar\/gkw967"},{"key":"3307_CR25","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1111\/1755-0998.12592","volume":"17","author":"K Luu","year":"2017","unstructured":"Luu K, Bazin E, Blum MGB. pcadapt: an R package to perform genome scans for selection based on principal component analysis. Mol Ecol Resources. 2017;17:67\u201377.","journal-title":"Mol Ecol Resources"},{"key":"3307_CR26","doi-asserted-by":"publisher","first-page":"2781","DOI":"10.1093\/bioinformatics\/bty185","volume":"34","author":"F Priv\u00e9","year":"2018","unstructured":"Priv\u00e9 F, Aschard H, Ziyatdinov A, Blum MGB. Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr. Stegle O, editor. Bioinformatics. 2018;34:2781\u20137.","journal-title":"Bioinformatics."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3307-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-019-3307-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3307-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,29]],"date-time":"2024-07-29T21:38:16Z","timestamp":1722289096000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-019-3307-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,13]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3307"],"URL":"https:\/\/doi.org\/10.1186\/s12859-019-3307-2","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/698050","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1,13]]},"assertion":[{"value":"10 July 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 December 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 January 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"16"}}