{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T15:59:59Z","timestamp":1781711999023,"version":"3.54.5"},"reference-count":103,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,5,1]],"date-time":"2023-05-01T00:00:00Z","timestamp":1682899200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,5,1]],"date-time":"2023-05-01T00:00:00Z","timestamp":1682899200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001602","name":"Science Foundation Ireland","doi-asserted-by":"publisher","award":["18\/CRT\/6183"],"award-info":[{"award-number":["18\/CRT\/6183"]}],"id":[{"id":"10.13039\/501100001602","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Health Research Council of New Zealand,New Zealand"},{"name":"Brain and Behavior Research Foundation,United States"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>The field of epigenomics holds great promise in understanding and treating disease with advances in machine learning (ML) and artificial intelligence being vitally important in this pursuit. Increasingly, research now utilises DNA methylation measures at cytosine\u2013guanine dinucleotides (CpG) to detect disease and estimate biological traits such as aging. Given the challenge of high dimensionality of DNA methylation data, feature-selection techniques are commonly employed to reduce dimensionality and identify the most important subset of features. 
In this study, our aim was to test and compare a range of feature-selection methods and ML algorithms in the development of a novel DNA methylation-based telomere length (TL) estimator. We utilised both nested cross-validation and two independent test sets for the comparisons.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We found that principal component analysis in advance of elastic net regression led to the overall best performing estimator when evaluated using a nested cross-validation analysis and two independent test cohorts. This approach achieved a correlation between estimated and actual TL of 0.295 (83.4% CI [0.201, 0.384]) on the EXTEND test data set. Contrastingly, the baseline model of elastic net regression with no prior feature reduction stage performed less well in general\u2014suggesting a prior feature-selection stage may have important utility. A previously developed TL estimator, DNAmTL, achieved a correlation of 0.216 (83.4% CI [0.118, 0.310]) on the EXTEND data. Additionally, we observed that different DNA methylation-based TL estimators, which have few common CpGs, are associated with many of the same biological entities.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>The variance in performance across tested approaches shows that estimators are sensitive to data set heterogeneity and the development of an optimal DNA methylation-based estimator should benefit from the robust methodological approach used in this study. 
Moreover, our methodology which utilises a range of feature-selection approaches and ML algorithms could be applied to other biological markers and disease phenotypes, to examine their relationship with DNA methylation and predictive value.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-023-05282-4","type":"journal-article","created":{"date-parts":[[2023,5,1]],"date-time":"2023-05-01T08:02:24Z","timestamp":1682928144000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["A comparison of feature selection methodologies and learning algorithms in the development of a DNA methylation-based telomere length estimator"],"prefix":"10.1186","volume":"24","author":[{"given":"Trevor","family":"Doherty","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Emma","family":"Dempster","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Eilis","family":"Hannon","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jonathan","family":"Mill","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Richie","family":"Poulton","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"David","family":"Corcoran","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Karen","family":"Sugden","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ben","family":"Williams","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Avshalom","family":"Caspi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Terrie 
E.","family":"Moffitt","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Sarah Jane","family":"Delany","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Therese M.","family":"Murphy","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2023,5,1]]},"reference":[{"issue":"6","key":"5282_CR1","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1038\/s41576-018-0004-3","volume":"19","author":"S Horvath","year":"2018","unstructured":"Horvath S, Raj K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat Rev Genet. 2018;19(6):371\u201384.","journal-title":"Nat Rev Genet"},{"issue":"7","key":"5282_CR2","doi-asserted-by":"publisher","first-page":"885","DOI":"10.1093\/aje\/kwp215","volume":"170","author":"NL Benowitz","year":"2009","unstructured":"Benowitz NL, et al. Prevalence of smoking assessed biochemically in an urban public hospital: a rationale for routine cotinine screening. Am J Epidemiol. 2009;170(7):885\u201391.","journal-title":"Am J Epidemiol"},{"issue":"1","key":"5282_CR3","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1097\/CCM.0b013e3181fa4196","volume":"39","author":"SJ Hsieh","year":"2011","unstructured":"Hsieh SJ, et al. Biomarkers increase detection of active smoking and secondhand smoke exposure in critically ill patients. Crit Care Med. 2011;39(1):40.","journal-title":"Crit Care Med"},{"issue":"9","key":"5282_CR4","doi-asserted-by":"publisher","first-page":"514","DOI":"10.1038\/s41584-020-0470-9","volume":"16","author":"E Ballestar","year":"2020","unstructured":"Ballestar E, Sawalha AH, Lu Q. Clinical value of DNA methylation markers in autoimmune rheumatic diseases. Nat Rev Rheumatol. 
2020;16(9):514\u201324.","journal-title":"Nat Rev Rheumatol"},{"issue":"10","key":"5282_CR5","doi-asserted-by":"publisher","first-page":"3156","DOI":"10.1186\/gb-2013-14-10-r115","volume":"14","author":"S Horvath","year":"2013","unstructured":"Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14(10):3156.","journal-title":"Genome Biol"},{"issue":"6","key":"5282_CR6","doi-asserted-by":"publisher","first-page":"e14821","DOI":"10.1371\/journal.pone.0014821","volume":"6","author":"S Bocklandt","year":"2011","unstructured":"Bocklandt S, et al. Epigenetic predictor of age. PLoS ONE. 2011;6(6):e14821.","journal-title":"PLoS ONE"},{"issue":"2","key":"5282_CR7","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1016\/j.molcel.2012.10.016","volume":"49","author":"G Hannum","year":"2013","unstructured":"Hannum G, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol Cell. 2013;49(2):359\u201367.","journal-title":"Mol Cell"},{"issue":"11","key":"5282_CR8","doi-asserted-by":"publisher","first-page":"888","DOI":"10.3390\/genes10110888","volume":"10","author":"H Choi","year":"2019","unstructured":"Choi H, Joe S, Nam H. Development of tissue-specific age predictors using DNA methylation data. Genes. 2019;10(11):888.","journal-title":"Genes"},{"key":"5282_CR9","doi-asserted-by":"publisher","first-page":"388","DOI":"10.3389\/fbioe.2019.00388","volume":"7","author":"T Zhu","year":"2019","unstructured":"Zhu T, et al. CancerClock: a DNA methylation age predictor to identify and characterize aging clock in pan-cancer. Front Bioeng Biotechnol. 2019;7:388.","journal-title":"Front Bioeng Biotechnol"},{"key":"5282_CR10","doi-asserted-by":"crossref","unstructured":"Horvath S et al. DNA methylation aging and transcriptomic studies in horses. 
Biorxiv, 2021.","DOI":"10.1101\/2021.03.11.435032"},{"issue":"7","key":"5282_CR11","doi-asserted-by":"publisher","first-page":"1758","DOI":"10.18632\/aging.101508","volume":"10","author":"S Horvath","year":"2018","unstructured":"Horvath S, et al. Epigenetic clock for skin and blood cells applied to Hutchinson Gilford Progeria syndrome and ex vivo studies. Aging (Albany NY). 2018;10(7):1758.","journal-title":"Aging (Albany NY)"},{"issue":"1","key":"5282_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13148-020-00899-1","volume":"12","author":"M Boroni","year":"2020","unstructured":"Boroni M, et al. Highly accurate skin-specific methylome analysis algorithm as a platform to screen and validate therapeutics for healthy aging. Clin Epigenet. 2020;12(1):1\u201316.","journal-title":"Clin Epigenet"},{"issue":"2","key":"5282_CR13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/gb-2014-15-2-r24","volume":"15","author":"CI Weidner","year":"2014","unstructured":"Weidner CI, et al. Aging of blood can be tracked by DNA methylation changes at just three CpG sites. Genome Biol. 2014;15(2):1\u201312.","journal-title":"Genome Biol"},{"issue":"5","key":"5282_CR14","doi-asserted-by":"publisher","first-page":"1252","DOI":"10.14336\/AD.2020.1202","volume":"12","author":"F Galkin","year":"2020","unstructured":"Galkin F, et al. DeepMAge: a methylation aging clock developed with deep learning. Aging Dis. 2020;12(5):1252.","journal-title":"Aging Dis"},{"key":"5282_CR15","doi-asserted-by":"publisher","first-page":"e73420","DOI":"10.7554\/eLife.73420","volume":"11","author":"DW Belsky","year":"2022","unstructured":"Belsky DW, et al. DunedinPACE, a DNA methylation biomarker of the pace of aging. Elife. 
2022;11:e73420.","journal-title":"Elife"},{"issue":"1","key":"5282_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41514-022-00085-y","volume":"8","author":"LP de Lima Camillo","year":"2022","unstructured":"de Lima Camillo LP, Lapierre LR, Singh R. A pan-tissue DNA-methylation epigenetic clock based on deep learning. npj Aging. 2022;8(1):1\u201315.","journal-title":"npj Aging"},{"issue":"13","key":"5282_CR17","doi-asserted-by":"publisher","first-page":"1469","DOI":"10.2217\/epi-2019-0206","volume":"11","author":"S Bollepalli","year":"2019","unstructured":"Bollepalli S, et al. EpiSmokEr: a robust classifier to determine smoking status from DNA methylation data. Epigenomics. 2019;11(13):1469\u201386.","journal-title":"Epigenomics"},{"issue":"5","key":"5282_CR18","first-page":"436","volume":"9","author":"R Joehanes","year":"2016","unstructured":"Joehanes R, et al. Epigenetic signatures of cigarette smoking. Circ: Cardiovasc Genet. 2016;9(5):436\u201347.","journal-title":"Circ: Cardiovasc Genet"},{"issue":"1","key":"5282_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41398-019-0430-9","volume":"9","author":"K Sugden","year":"2019","unstructured":"Sugden K, et al. Establishing a generalized polyepigenetic biomarker for tobacco smoking. Transl Psychiatry. 2019;9(1):1\u201312.","journal-title":"Transl Psychiatry"},{"issue":"9","key":"5282_CR20","doi-asserted-by":"publisher","first-page":"097003","DOI":"10.1289\/EHP6076","volume":"128","author":"S Rauschert","year":"2020","unstructured":"Rauschert S, et al. Machine learning-based DNA methylation score for fetal exposure to maternal smoking: development and validation in samples collected from adolescents and adults. Environ Health Perspect. 
2020;128(9):097003.","journal-title":"Environ Health Perspect"},{"issue":"9","key":"5282_CR21","doi-asserted-by":"publisher","first-page":"1795","DOI":"10.1038\/s41366-018-0262-3","volume":"43","author":"OK Hamilton","year":"2019","unstructured":"Hamilton OK, et al. An epigenetic score for BMI based on DNA methylation correlates with poor physical health and major disease in the Lothian Birth Cohort. Int J Obes. 2019;43(9):1795\u2013802.","journal-title":"Int J Obes"},{"key":"5282_CR22","doi-asserted-by":"crossref","unstructured":"Bellman R. Curse of dimensionality. Adaptive control processes: a guided tour. Princeton, NJ, 1961;3(2).","DOI":"10.1515\/9781400874668"},{"issue":"Mar","key":"5282_CR23","first-page":"1157","volume":"3","author":"I Guyon","year":"2003","unstructured":"Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. 2003;3(Mar):1157\u201382.","journal-title":"J Mach Learn Res"},{"issue":"19","key":"5282_CR24","doi-asserted-by":"publisher","first-page":"2507","DOI":"10.1093\/bioinformatics\/btm344","volume":"23","author":"Y Saeys","year":"2007","unstructured":"Saeys Y, Inza I, Larranaga P. A review of feature selection techniques in bioinformatics. Bioinformatics. 2007;23(19):2507\u201317.","journal-title":"Bioinformatics"},{"key":"5282_CR25","doi-asserted-by":"publisher","first-page":"106839","DOI":"10.1016\/j.csda.2019.106839","volume":"143","author":"A Bommert","year":"2020","unstructured":"Bommert A, et al. Benchmark for filter methods for feature selection in high-dimensional classification data. Comput Stat Data Anal. 2020;143:106839.","journal-title":"Comput Stat Data Anal"},{"key":"5282_CR26","doi-asserted-by":"crossref","unstructured":"Alkuhlani A, Nassef M, Farag I. A comparative study of feature selection and classification techniques for high-throughput DNA methylation data. In International conference on advanced intelligent systems and informatics; 2016. 
Springer.","DOI":"10.1007\/978-3-319-48308-5_76"},{"key":"5282_CR27","doi-asserted-by":"crossref","unstructured":"Jovi\u0107 A, Brki\u0107 K, Bogunovi\u0107 N. A review of feature selection methods with applications. in 2015 38th international convention on information and communication technology, electronics and microelectronics (MIPRO); IEEE. 2015.","DOI":"10.1109\/MIPRO.2015.7160458"},{"key":"5282_CR28","doi-asserted-by":"crossref","unstructured":"Cunningham P. Dimension reduction, in machine learning techniques for multimedia. Springer; 2008. p. 91\u2013112.","DOI":"10.1007\/978-3-540-75171-7_4"},{"issue":"4","key":"5282_CR29","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1504\/IJMIC.2013.053535","volume":"18","author":"A Garg","year":"2013","unstructured":"Garg A, Tai K. Comparison of statistical and machine learning methods in modelling of data with multicollinearity. Int J Model Ident Control. 2013;18(4):295\u2013312.","journal-title":"Int J Model Ident Control"},{"issue":"7","key":"5282_CR30","doi-asserted-by":"publisher","first-page":"644","DOI":"10.1038\/s43587-022-00248-2","volume":"2","author":"AT Higgins-Chen","year":"2022","unstructured":"Higgins-Chen AT, et al. A computational solution for bolstering reliability of epigenetic clocks: implications for clinical trials and longitudinal tracking. Nat Aging. 2022;2(7):644\u201361.","journal-title":"Nat Aging"},{"issue":"2","key":"5282_CR31","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1161\/01.HYP.37.2.381","volume":"37","author":"A Benetos","year":"2001","unstructured":"Benetos A, et al. Telomere length as an indicator of biological aging: the gender effect and relation with pulse pressure and pulse wave velocity. Hypertension. 
2001;37(2):381\u20135.","journal-title":"Hypertension"},{"issue":"3","key":"5282_CR32","doi-asserted-by":"publisher","first-page":"1861","DOI":"10.1007\/s11357-022-00586-4","volume":"44","author":"EE Pearce","year":"2022","unstructured":"Pearce EE, et al. Telomere length and epigenetic clocks as markers of cellular aging: a comparative study. GeroScience. 2022;44(3):1861\u20139.","journal-title":"GeroScience"},{"issue":"1","key":"5282_CR33","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1089\/rej.2021.0045","volume":"25","author":"S Yadav","year":"2022","unstructured":"Yadav S, Maurya PK. Correlation between telomere length and biomarkers of oxidative stress in human aging. Rejuvenation Res. 2022;25(1):25\u20139.","journal-title":"Rejuvenation Res"},{"issue":"16","key":"5282_CR34","doi-asserted-by":"publisher","first-page":"5895","DOI":"10.18632\/aging.102173","volume":"11","author":"AT Lu","year":"2019","unstructured":"Lu AT, et al. DNA methylation-based estimator of telomere length. Aging (Albany NY). 2019;11(16):5895.","journal-title":"Aging (Albany NY)"},{"issue":"2","key":"5282_CR35","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","volume":"67","author":"H Zou","year":"2005","unstructured":"Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc: Ser B (Stat Methodol). 2005;67(2):301\u201320.","journal-title":"J R Stat Soc: Ser B (Stat Methodol)"},{"issue":"1","key":"5282_CR36","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc: Ser B (Methodol). 
1995;57(1):289\u2013300.","journal-title":"J Roy Stat Soc: Ser B (Methodol)"},{"issue":"3","key":"5282_CR37","doi-asserted-by":"publisher","first-page":"479","DOI":"10.1111\/1467-9868.00346","volume":"64","author":"JD Storey","year":"2002","unstructured":"Storey JD. A direct approach to false discovery rates. J R Stat Soc: Ser B (Stat Methodol). 2002;64(3):479\u201398.","journal-title":"J R Stat Soc: Ser B (Stat Methodol)"},{"issue":"1","key":"5282_CR38","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-019-1716-1","volume":"20","author":"K Korthauer","year":"2019","unstructured":"Korthauer K, et al. A practical guide to methods controlling false discoveries in computational biology. Genome Biol. 2019;20(1):1\u201321.","journal-title":"Genome Biol"},{"issue":"10","key":"5282_CR39","doi-asserted-by":"publisher","first-page":"1018","DOI":"10.18632\/aging.100395","volume":"3","author":"CM Koch","year":"2011","unstructured":"Koch CM, Wagner W. Epigenetic-aging-signature to determine age in different tissues. Aging (Albany NY). 2011;3(10):1018.","journal-title":"Aging (Albany NY)"},{"issue":"10","key":"5282_CR40","doi-asserted-by":"publisher","first-page":"922","DOI":"10.1080\/15592294.2015.1080413","volume":"10","author":"B Bekaert","year":"2015","unstructured":"Bekaert B, et al. Improved age determination of blood and teeth samples using a selected set of DNA methylation markers. Epigenetics. 2015;10(10):922\u201330.","journal-title":"Epigenetics"},{"key":"5282_CR41","first-page":"373","volume":"12","author":"P Karir","year":"2019","unstructured":"Karir P, Goel N, Garg VK. Human age prediction using DNA methylation and regression methods. Int J Inf Technol. 2019;12:373\u201381.","journal-title":"Int J Inf Technol"},{"issue":"3","key":"5282_CR42","doi-asserted-by":"publisher","first-page":"791","DOI":"10.1039\/C4MB00659C","volume":"11","author":"Z Cai","year":"2015","unstructured":"Cai Z, et al. 
Classification of lung cancer using ensemble-based feature selection and machine learning methods. Mol BioSyst. 2015;11(3):791\u2013800.","journal-title":"Mol BioSyst"},{"issue":"1","key":"5282_CR43","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41392-018-0034-5","volume":"4","author":"W Xu","year":"2019","unstructured":"Xu W, et al. Integrative analysis of DNA methylation and gene expression identified cervical cancer-specific diagnostic biomarkers. Signal Transduct Target Ther. 2019;4(1):1\u201311.","journal-title":"Signal Transduct Target Ther"},{"issue":"7","key":"5282_CR44","doi-asserted-by":"publisher","first-page":"204","DOI":"10.31083\/j.fbl2707204","volume":"27","author":"L Chen","year":"2022","unstructured":"Chen L, et al. Identification of DNA methylation signature and rules for SARS-CoV-2 associated with age. Front Biosci-Landmark. 2022;27(7):204.","journal-title":"Front Biosci-Landmark"},{"issue":"1","key":"5282_CR45","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/srep17788","volume":"5","author":"C Xu","year":"2015","unstructured":"Xu C, et al. A novel strategy for forensic age prediction by DNA methylation and support vector regression model. Sci Rep. 2015;5(1):1\u201310.","journal-title":"Sci Rep"},{"issue":"5","key":"5282_CR46","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1007\/s00127-015-1048-8","volume":"50","author":"R Poulton","year":"2015","unstructured":"Poulton R, Moffitt TE, Silva PA. The Dunedin multidisciplinary health and development study: overview of the first 40 years, with an eye to the future. Soc Psychiatry Psychiatr Epidemiol. 2015;50(5):679\u201393.","journal-title":"Soc Psychiatry Psychiatr Epidemiol"},{"issue":"4","key":"5282_CR47","doi-asserted-by":"publisher","first-page":"288","DOI":"10.1016\/j.ygeno.2011.07.007","volume":"98","author":"M Bibikova","year":"2011","unstructured":"Bibikova M, et al. High density DNA methylation array with single CpG site resolution. Genomics. 
2011;98(4):288\u201395.","journal-title":"Genomics"},{"issue":"10","key":"5282_CR48","doi-asserted-by":"publisher","first-page":"e47","DOI":"10.1093\/nar\/30.10.e47","volume":"30","author":"RM Cawthon","year":"2002","unstructured":"Cawthon RM. Telomere measurement by quantitative PCR. Nucl Acids Res. 2002;30(10):e47\u2013e47.","journal-title":"Nucl Acids Res"},{"issue":"5","key":"5282_CR49","doi-asserted-by":"publisher","first-page":"576","DOI":"10.1038\/mp.2012.32","volume":"18","author":"I Shalev","year":"2013","unstructured":"Shalev I, et al. Exposure to violence during childhood is associated with telomere erosion from 5 to 10 years of age: a longitudinal study. Mol Psychiatry. 2013;18(5):576\u201381.","journal-title":"Mol Psychiatry"},{"issue":"16","key":"5282_CR50","doi-asserted-by":"publisher","first-page":"2840","DOI":"10.1093\/hmg\/ddy199","volume":"27","author":"B Crawford","year":"2018","unstructured":"Crawford B, et al. DNA methylation and inflammation marker profiles associated with a history of depression. Hum Mol Genet. 2018;27(16):2840\u201350.","journal-title":"Hum Mol Genet"},{"issue":"1","key":"5282_CR51","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-016-1041-x","volume":"17","author":"E Hannon","year":"2016","unstructured":"Hannon E, et al. An integrated genetic-epigenetic analysis of schizophrenia: evidence for co-localization of genetic associations and differential DNA methylation. Genome Biol. 2016;17(1):1\u201316.","journal-title":"Genome Biol"},{"issue":"1","key":"5282_CR52","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1480-9222-13-3","volume":"13","author":"NJ O'Callaghan","year":"2011","unstructured":"O\u2019Callaghan NJ, Fenech M. A quantitative PCR method for measuring absolute telomere length. Biol Proced Online. 2011;13(1):1\u201310.","journal-title":"Biol Proced Online"},{"key":"5282_CR53","unstructured":"Davis SDP et al. 
methylumi: Handle Illumina methylation data; 2015."},{"issue":"1","key":"5282_CR54","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2164-14-293","volume":"14","author":"R Pidsley","year":"2013","unstructured":"Pidsley R, et al. A data-driven approach to preprocessing Illumina 450K methylation array data. BMC Genomics. 2013;14(1):1\u201310.","journal-title":"BMC Genomics"},{"issue":"Oct","key":"5282_CR55","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12(Oct):2825\u201330.","journal-title":"J Mach Learn Res"},{"issue":"1","key":"5282_CR56","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1186\/1471-2105-7-91","volume":"7","author":"S Varma","year":"2006","unstructured":"Varma S, Simon R. Bias in error estimation when using cross-validation for model selection. BMC Bioinform. 2006;7(1):91.","journal-title":"BMC Bioinform"},{"issue":"1","key":"5282_CR57","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-6-10","volume":"6","author":"D Krstajic","year":"2014","unstructured":"Krstajic D, et al. Cross-validation pitfalls when selecting and assessing regression and classification models. J Cheminform. 2014;6(1):1\u201315.","journal-title":"J Cheminform"},{"key":"5282_CR58","doi-asserted-by":"crossref","unstructured":"Dugu\u00e9 P-A et al. DNA methylation\u2013based measures of biological aging, In Epigenetics in human disease, Elsevier; 2018. p. 39\u201364.","DOI":"10.1016\/B978-0-12-812215-0.00003-0"},{"key":"5282_CR59","doi-asserted-by":"crossref","unstructured":"Ogutu JO, Schulz-Streeck T, Piepho H-P. Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions. In BMC proceedings; Springer. 2012.","DOI":"10.1186\/1753-6561-6-S2-S10"},{"key":"5282_CR60","doi-asserted-by":"crossref","unstructured":"Benesty J et al. 
Pearson correlation coefficient, In Noise reduction in speech processing, Springer; 2009. p. 1\u20134.","DOI":"10.1007\/978-3-642-00296-0_5"},{"key":"5282_CR61","unstructured":"Brank J et al. Feature selection using support vector machines. WIT Trans Inf Commun Technol; 2002:28."},{"issue":"1","key":"5282_CR62","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L. Random forests. Mach Learn. 2001;45(1):5\u201332.","journal-title":"Mach Learn"},{"issue":"1","key":"5282_CR63","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13148-015-0108-y","volume":"7","author":"B Quraishi","year":"2015","unstructured":"Quraishi B, et al. Identifying CpG sites associated with eczema via random forest screening of epigenome-scale DNA methylation. Clin Epigenet. 2015;7(1):1\u201311.","journal-title":"Clin Epigenet"},{"key":"5282_CR64","unstructured":"Cunningham P, Kathirgamanathan B, Delany SJ, Feature selection tutorial with python examples. arXiv preprint http:\/\/arxiv.org\/abs\/2106.06437; 2021."},{"issue":"3","key":"5282_CR65","doi-asserted-by":"publisher","first-page":"807","DOI":"10.1016\/j.ejor.2020.08.045","volume":"290","author":"C Gambella","year":"2021","unstructured":"Gambella C, Ghaddar B, Naoum-Sawaya J. Optimization problems for machine learning: a survey. Eur J Oper Res. 2021;290(3):807\u201328.","journal-title":"Eur J Oper Res"},{"key":"5282_CR66","doi-asserted-by":"crossref","unstructured":"Chen T, Guestrin C. Xgboost: a scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. 2016.","DOI":"10.1145\/2939672.2939785"},{"key":"5282_CR67","unstructured":"Brownlee J. Feature importance and feature selection with xgboost in python. Machine Learning Mastery; 2016. https:\/\/machinelearningmastery.com\/feature-importance-and-feature-selection-with-xgboost-in-python. 
Accessed 15 Oct 2021."},{"key":"5282_CR68","unstructured":"DeHan C. BoostARoota. 2017."},{"issue":"4","key":"5282_CR69","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1002\/wics.101","volume":"2","author":"H Abdi","year":"2010","unstructured":"Abdi H, Williams LJ. Principal component analysis. Wiley Interdiscip Rev: Comput Stat. 2010;2(4):433\u201359.","journal-title":"Wiley Interdiscip Rev: Comput Stat"},{"issue":"2","key":"5282_CR70","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1093\/carcin\/bgt391","volume":"35","author":"Z Xu","year":"2014","unstructured":"Xu Z, Taylor JA. Genome-wide age-related DNA methylation changes in blood and other tissues relate to histone modification, expression and cancer. Carcinogenesis. 2014;35(2):356\u201364.","journal-title":"Carcinogenesis"},{"issue":"1","key":"5282_CR71","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13073-015-0213-8","volume":"7","author":"TM Everson","year":"2015","unstructured":"Everson TM, et al. DNA methylation loci associated with atopy and high serum IgE: a genome-wide application of recursive random forest feature selection. Genome Med. 2015;7(1):1\u201316.","journal-title":"Genome Med"},{"issue":"2","key":"5282_CR72","doi-asserted-by":"publisher","first-page":"e0148977","DOI":"10.1371\/journal.pone.0148977","volume":"11","author":"B Baur","year":"2016","unstructured":"Baur B, Bozdag S. A feature selection algorithm to compute gene centric methylation from probe level methylation data. PLoS ONE. 2016;11(2):e0148977.","journal-title":"PLoS ONE"},{"issue":"4","key":"5282_CR73","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1007\/s10654-011-9563-8","volume":"26","author":"MJ Knol","year":"2011","unstructured":"Knol MJ, Pestman WR, Grobbee DE. The (mis) use of overlap of confidence intervals to assess effect modification. Eur J Epidemiol. 
2011;26(4):253\u20134.","journal-title":"Eur J Epidemiol"},{"key":"5282_CR74","unstructured":"Correlation Confidence Interval Calculator. Statistics Kingdom, 2022."},{"key":"5282_CR75","doi-asserted-by":"publisher","first-page":"456","DOI":"10.3389\/fpsyg.2017.00456","volume":"8","author":"JZ Bakdash","year":"2017","unstructured":"Bakdash JZ, Marusich LR. Repeated measures correlation. Front Psychol. 2017;8:456.","journal-title":"Front Psychol"},{"key":"5282_CR76","doi-asserted-by":"publisher","DOI":"10.1111\/1755-0998.13114","volume-title":"Improving comparability between qPCR-based telomere studies","author":"S Verhulst","year":"2020","unstructured":"Verhulst S. Improving comparability between qPCR-based telomere studies. Wiley; 2020."},{"issue":"23","key":"5282_CR77","doi-asserted-by":"publisher","first-page":"9326","DOI":"10.1016\/j.eswa.2015.08.016","volume":"42","author":"ZY Algamal","year":"2015","unstructured":"Algamal ZY, Lee MH. Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification. Expert Syst Appl. 2015;42(23):9326\u201332.","journal-title":"Expert Syst Appl"},{"issue":"11","key":"5282_CR78","doi-asserted-by":"publisher","first-page":"14675","DOI":"10.18632\/aging.203126","volume":"13","author":"EE Pearce","year":"2021","unstructured":"Pearce EE, et al. DNA-methylation-based telomere length estimator: comparisons with measurements from flow FISH and qPCR. Aging (Albany NY). 2021;13(11):14675.","journal-title":"Aging (Albany NY)"},{"key":"5282_CR79","unstructured":"Kelleher J, Mac Namee B, D\u2019Arcy A. Machine learning for predictive data analytics. Fundamentals of machine learning for predictive data analytics: algorithms, worked examples, and case studies; 2015. p. 1\u201319."},{"issue":"11","key":"5282_CR80","doi-asserted-by":"publisher","first-page":"2744","DOI":"10.1002\/1878-0261.12767","volume":"14","author":"M Li","year":"2020","unstructured":"Li M, et al. 
Identification and validation of novel DNA methylation markers for early diagnosis of lung adenocarcinoma. Mol Oncol. 2020;14(11):2744\u201358.","journal-title":"Mol Oncol"},{"key":"5282_CR81","doi-asserted-by":"crossref","unstructured":"Raweh AA, Nassef M, Badr A, Feature selection and extraction framework for DNA methylation in cancer. Int J Adv Comp Science & Appl.;2017:8(7).","DOI":"10.14569\/IJACSA.2017.080705"},{"issue":"1","key":"5282_CR82","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-022-04651-9","volume":"23","author":"V Halla-Aho","year":"2022","unstructured":"Halla-Aho V, L\u00e4hdesm\u00e4ki H. Probabilistic modeling methods for cell-free DNA methylation based cancer classification. BMC Bioinform. 2022;23(1):1\u201324.","journal-title":"BMC Bioinform"},{"issue":"1","key":"5282_CR83","doi-asserted-by":"publisher","first-page":"194","DOI":"10.1067\/mva.2002.125015","volume":"36","author":"PC Austin","year":"2002","unstructured":"Austin PC, Hux JE. A brief note on overlapping confidence intervals. J Vasc Surg. 2002;36(1):194\u20135.","journal-title":"J Vasc Surg"},{"issue":"20","key":"5282_CR84","doi-asserted-by":"publisher","first-page":"5273","DOI":"10.1080\/01431160903130937","volume":"30","author":"GM Foody","year":"2009","unstructured":"Foody GM. Sample size determination for image classification accuracy assessment and comparison. Int J Rem Sens. 2009;30(20):5273\u201391.","journal-title":"Int J Rem Sens"},{"key":"5282_CR85","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1016\/j.rse.2011.11.020","volume":"118","author":"DC Duro","year":"2012","unstructured":"Duro DC, Franklin SE, Dub\u00e9 MG. A comparison of pixel-based and object-based image analysis with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-5 HRG imagery. Rem Sens Environ. 
2012;118:259\u201372.","journal-title":"Rem Sens Environ"},{"issue":"9","key":"5282_CR86","doi-asserted-by":"publisher","first-page":"e0184098","DOI":"10.1371\/journal.pone.0184098","volume":"12","author":"CL Dagnall","year":"2017","unstructured":"Dagnall CL, et al. Effect of pre-analytic variables on the reproducibility of qPCR relative telomere length measurement. PLoS ONE. 2017;12(9):e0184098.","journal-title":"PLoS ONE"},{"issue":"3","key":"5282_CR87","doi-asserted-by":"publisher","first-page":"312","DOI":"10.1093\/gerona\/glq223","volume":"66","author":"W Chen","year":"2011","unstructured":"Chen W, et al. Longitudinal versus cross-sectional evaluations of leukocyte telomere length dynamics: age-dependent telomere shortening is the rule. J Gerontol Ser A: Biomed Sci Med Sci. 2011;66(3):312\u20139.","journal-title":"J Gerontol Ser A: Biomed Sci Med Sci"},{"issue":"4","key":"5282_CR88","doi-asserted-by":"publisher","first-page":"478","DOI":"10.1111\/joim.12282","volume":"277","author":"A Baragetti","year":"2015","unstructured":"Baragetti A, et al. Telomere shortening over 6 years is associated with increased subclinical carotid vascular damage and worse cardiovascular prognosis in the general population. J Intern Med. 2015;277(4):478\u201387.","journal-title":"J Intern Med"},{"issue":"2","key":"5282_CR89","doi-asserted-by":"publisher","first-page":"509","DOI":"10.1016\/j.arr.2013.01.003","volume":"12","author":"A M\u00fcezzinler","year":"2013","unstructured":"M\u00fcezzinler A, Zaineddin AK, Brenner H. A systematic review of leukocyte telomere length and age in adults. Ageing Res Rev. 2013;12(2):509\u201319.","journal-title":"Ageing Res Rev"},{"issue":"6","key":"5282_CR90","doi-asserted-by":"publisher","first-page":"1725","DOI":"10.1093\/ije\/dyp273","volume":"38","author":"S Ehrlenbach","year":"2009","unstructured":"Ehrlenbach S, et al. 
Influences on the reduction of relative telomere length over 10 years in the population-based Bruneck study: introduction of a well-controlled high-throughput assay. Int J Epidemiol. 2009;38(6):1725\u201334.","journal-title":"Int J Epidemiol"},{"issue":"6","key":"5282_CR91","doi-asserted-by":"publisher","first-page":"1060","DOI":"10.1038\/s41390-019-0699-7","volume":"87","author":"J-H Kim","year":"2020","unstructured":"Kim J-H, et al. Heritability of telomere length across three generations of Korean families. Pediatr Res. 2020;87(6):1060\u20135.","journal-title":"Pediatr Res"},{"issue":"5","key":"5282_CR92","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1375\/twin.8.5.433","volume":"8","author":"C Bischoff","year":"2005","unstructured":"Bischoff C, et al. The heritability of telomere length among the elderly and oldest-old. Twin Res Hum Genet. 2005;8(5):433\u20139.","journal-title":"Twin Res Hum Genet"},{"issue":"10","key":"5282_CR93","doi-asserted-by":"publisher","first-page":"1163","DOI":"10.1038\/ejhg.2012.303","volume":"21","author":"L Broer","year":"2013","unstructured":"Broer L, et al. Meta-analysis of telomere length in 19 713 subjects reveals high heritability, stronger maternal inheritance and a paternal age effect. Eur J Hum Genet. 2013;21(10):1163\u20138.","journal-title":"Eur J Hum Genet"},{"issue":"5","key":"5282_CR94","doi-asserted-by":"publisher","first-page":"297","DOI":"10.1136\/jmedgenet-2014-102736","volume":"52","author":"JB Hjelmborg","year":"2015","unstructured":"Hjelmborg JB, et al. The heritability of leucocyte telomere length dynamics. J Med Genet. 2015;52(5):297\u2013302.","journal-title":"J Med Genet"},{"issue":"10","key":"5282_CR95","doi-asserted-by":"publisher","first-page":"2785","DOI":"10.1016\/j.neurobiolaging.2015.06.017","volume":"36","author":"LS Honig","year":"2015","unstructured":"Honig LS, et al. Heritability of telomere length in a study of long-lived families. Neurobiol Aging. 
2015;36(10):2785\u201390.","journal-title":"Neurobiol Aging"},{"issue":"2","key":"5282_CR96","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1161\/01.HYP.36.2.195","volume":"36","author":"E Jeanclos","year":"2000","unstructured":"Jeanclos E, et al. Telomere length inversely correlates with pulse pressure and is highly familial. Hypertension. 2000;36(2):195\u2013200.","journal-title":"Hypertension"},{"issue":"1","key":"5282_CR97","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13148-016-0186-5","volume":"8","author":"LP Breitling","year":"2016","unstructured":"Breitling LP, et al. Frailty is associated with the epigenetic clock but not with telomere length in a German cohort. Clin Epigenet. 2016;8(1):1\u20138.","journal-title":"Clin Epigenet"},{"issue":"2","key":"5282_CR98","doi-asserted-by":"publisher","first-page":"424","DOI":"10.1093\/ije\/dyw041","volume":"45","author":"RE Marioni","year":"2016","unstructured":"Marioni RE, et al. The epigenetic clock and telomere length are independently associated with chronological age and mortality. Int J Epidemiol. 2016;45(2):424\u201332.","journal-title":"Int J Epidemiol"},{"issue":"6","key":"5282_CR99","doi-asserted-by":"crossref","first-page":"1220","DOI":"10.1093\/aje\/kwy060","volume":"187","author":"DW Belsky","year":"2018","unstructured":"Belsky DW, et al. Eleven telomere, epigenetic clock, and biomarker-composite quantifications of biological aging: do they measure the same thing? Am J Epidemiol. 2018;187(6):1220\u201330.","journal-title":"Am J Epidemiol"},{"issue":"5","key":"5282_CR100","doi-asserted-by":"publisher","first-page":"1688","DOI":"10.1093\/ije\/dyv165","volume":"44","author":"C Dalg\u00e5rd","year":"2015","unstructured":"Dalg\u00e5rd C, et al. Leukocyte telomere length dynamics in women and men: menopause vs age effects. Int J Epidemiol. 
2015;44(5):1688\u201395.","journal-title":"Int J Epidemiol"},{"issue":"1","key":"5282_CR101","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-13-86","volume":"13","author":"EA Houseman","year":"2012","unstructured":"Houseman EA, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinform. 2012;13(1):1\u201316.","journal-title":"BMC Bioinform"},{"issue":"1","key":"5282_CR102","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-016-1030-0","volume":"17","author":"S Horvath","year":"2016","unstructured":"Horvath S, et al. An epigenetic clock analysis of race\/ethnicity, sex, and coronary heart disease. Genome Biol. 2016;17(1):1\u201323.","journal-title":"Genome Biol"},{"issue":"9","key":"5282_CR103","doi-asserted-by":"publisher","first-page":"1983","DOI":"10.18632\/aging.101293","volume":"9","author":"BH Chen","year":"2017","unstructured":"Chen BH, et al. Leukocyte telomere length, T cell composition and DNA methylation age. Aging (Albany NY). 
2017;9(9):1983.","journal-title":"Aging (Albany NY)"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05282-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05282-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05282-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,19]],"date-time":"2024-10-19T10:45:13Z","timestamp":1729334713000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05282-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,1]]},"references-count":103,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5282"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05282-4","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.04.02.486242","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,1]]},"assertion":[{"value":"29 March 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 April 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 May 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2023","order":4,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"Update","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"In the XML Figs has been corrected to Fig. The article has been updated to rectify the error.","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All experiments were performed in accordance with national guidelines and regulations as well as the Declaration of Helsinki.\u00a0The Dunedin study\u00a0participants gave written informed consent, and study protocols were approved by the NZ-HDEC (New Zealand Health and Disability Ethics Committee).\u00a0For both the EXTEND and TWIN studies,\u00a0informed consent was obtained from participants\u00a0and ethical approval was obtained from the University of Exeter Medical School Research Ethics Board.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors confirm no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"178"}}