{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T08:22:12Z","timestamp":1772785332690,"version":"3.50.1"},"reference-count":77,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2023,12,22]],"date-time":"2023-12-22T00:00:00Z","timestamp":1703203200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,12,22]],"date-time":"2023-12-22T00:00:00Z","timestamp":1703203200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"crossref","award":["427806116"],"award-info":[{"award-number":["427806116"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100003484","name":"Heinrich-Heine-Universit\u00e4t D\u00fcsseldorf","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003484","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2024,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Interactions between predictors play an important role in many applications. Popular and successful tree-based supervised learning methods such as random forests or logic regression can incorporate interactions associated with the considered outcome without specifying which variables might interact. 
Nonetheless, these algorithms suffer from certain drawbacks such as limited interpretability of model predictions and difficulties with negligible marginal effects in the case of random forests or not being able to incorporate interactions with continuous variables, being restricted to additive structures between Boolean terms, and not directly considering conjunctions that reveal the interactions in the case of logic regression. We, therefore, propose a novel method called logic decision trees (logicDT) that is specifically tailored to binary input data and helps to overcome the drawbacks of existing methods. The main idea consists of considering sets of Boolean conjunctions, using these terms as input variables for decision trees, and searching for the best performing model. logicDT is also accompanied by a framework for estimating the importance of identified terms, i.e., input variables and interactions between input variables. This new method is compared to other popular statistical learning algorithms in simulations and real data applications. 
As these evaluations show, logicDT is able to yield high prediction performances while maintaining interpretability.<\/jats:p>","DOI":"10.1007\/s10994-023-06488-6","type":"journal-article","created":{"date-parts":[[2023,12,22]],"date-time":"2023-12-22T17:02:03Z","timestamp":1703264523000},"page":"933-992","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["logicDT: a procedure for identifying response-associated interactions between binary predictors"],"prefix":"10.1007","volume":"113","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5327-8351","authenticated-orcid":false,"given":"Michael","family":"Lau","sequence":"first","affiliation":[]},{"given":"Tamara","family":"Schikowski","sequence":"additional","affiliation":[]},{"given":"Holger","family":"Schwender","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,12,22]]},"reference":[{"issue":"4","key":"6488_CR1","first-page":"193","volume":"40","author":"E Aarts","year":"1985","unstructured":"Aarts, E., & Van Laarhoven, P. (1985). Statistical cooling: A general approach to combinatorial optimization problems. Philips Journal of Research, 40(4), 193\u2013226.","journal-title":"Philips Journal of Research"},{"key":"6488_CR2","doi-asserted-by":"crossref","unstructured":"Aglin, G., Nijssen, S., & Schaus, P. (2020). Learning optimal decision trees using caching branch-and-bound search. In Proceedings of the AAAI conference on artificial intelligence, (Vol. 34, pp. 3146\u20133153).","DOI":"10.1609\/aaai.v34i04.5711"},{"key":"6488_CR3","doi-asserted-by":"crossref","unstructured":"Aglin, G., Nijssen, S., & Schaus, P. (2020b). PyDL8.5: A library for learning optimal decision trees. In Proceedings of the twenty-ninth international joint conference on artificial intelligence, IJCAI-20 (pp. 5222\u20135224). 
International Joint Conferences on Artificial Intelligence Organization.","DOI":"10.24963\/ijcai.2020\/750"},{"key":"6488_CR4","doi-asserted-by":"publisher","first-page":"907","DOI":"10.1186\/s12889-017-4914-3","volume":"17","author":"C Bellinger","year":"2017","unstructured":"Bellinger, C., Mohomed Jabbar, M. S., Za\u00efane, O., & Osornio-Vargas, A. (2017). A systematic review of data mining and machine learning for air pollution epidemiology. BMC Public Health, 17, 907. https:\/\/doi.org\/10.1186\/s12889-017-4914-3","journal-title":"BMC Public Health"},{"key":"6488_CR5","unstructured":"B\u00e9nard, C., Biau, G., da Veiga, S., & Scornet, E. (2021). Interpretable random forests via rule extraction. In Proceedings of the 24th international conference on artificial intelligence and statistics, Volume 130 of Proceedings of machine learning research (pp. 937\u2013945). PMLR."},{"issue":"1","key":"6488_CR6","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289\u2013300. https:\/\/doi.org\/10.1111\/j.2517-6161.1995.tb02031.x","journal-title":"Journal of the Royal Statistical Society: Series B (Methodological)"},{"key":"6488_CR7","unstructured":"Bergstra, J., Bardenet, R., Bengio, Y., & K\u00e9gl, B. (2011). Algorithms for hyper-parameter optimization. In Proceedings of the 24th International Conference on Neural Information Processing Systems, NIPS\u201911 (pp. 2546\u20132554). Curran Associates Inc."},{"key":"6488_CR8","doi-asserted-by":"publisher","first-page":"1039","DOI":"10.1007\/s10994-017-5633-9","volume":"106","author":"D Bertsimas","year":"2017","unstructured":"Bertsimas, D., & Dunn, J. (2017). Optimal classification trees. 
Machine Learning, 106, 1039\u20131082. https:\/\/doi.org\/10.1007\/s10994-017-5633-9","journal-title":"Machine Learning"},{"issue":"1","key":"6488_CR9","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1016\/S0004-3702(98)00034-4","volume":"101","author":"H Blockeel","year":"1998","unstructured":"Blockeel, H., & De Raedt, L. (1998). Top-down induction of first-order logical decision trees. Artificial Intelligence, 101(1), 285\u2013297. https:\/\/doi.org\/10.1016\/S0004-3702(98)00034-4","journal-title":"Artificial Intelligence"},{"issue":"2","key":"6488_CR10","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1007\/BF00058655","volume":"24","author":"L Breiman","year":"1996","unstructured":"Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123\u2013140. https:\/\/doi.org\/10.1007\/BF00058655","journal-title":"Machine Learning"},{"issue":"1","key":"6488_CR11","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5\u201332. https:\/\/doi.org\/10.1023\/A:1010933404324","journal-title":"Machine Learning"},{"key":"6488_CR12","volume-title":"Classification and regression trees","author":"L Breiman","year":"1984","unstructured":"Breiman, L., Friedman, J. H., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees. CRC Press."},{"issue":"2","key":"6488_CR13","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1002\/gepi.20041","volume":"28","author":"A Bureau","year":"2005","unstructured":"Bureau, A., Dupuis, J., Falls, K., Lunetta, K. L., Hayward, B., Keith, T. P., & Van Eerdewegh, P. (2005). Identifying SNPs predictive of phenotype using random forests. Genetic Epidemiology, 28(2), 171\u2013182. 
https:\/\/doi.org\/10.1002\/gepi.20041","journal-title":"Genetic Epidemiology"},{"key":"6488_CR14","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1007\/s11750-021-00594-1","volume":"29","author":"E Carrizosa","year":"2021","unstructured":"Carrizosa, E., Molero-R\u00edo, C., & Romero Morales, D. (2021). Mathematical optimization in classification and regression trees. TOP, 29, 5\u201333. https:\/\/doi.org\/10.1007\/s11750-021-00594-1","journal-title":"TOP"},{"key":"6488_CR15","doi-asserted-by":"publisher","first-page":"138","DOI":"10.3389\/fgene.2013.00138","volume":"4","author":"R Che","year":"2013","unstructured":"Che, R., & Motsinger-Reif, A. (2013). Evaluation of genetic risk score models in the presence of interaction and linkage disequilibrium. Frontiers in Genetics, 4, 138. https:\/\/doi.org\/10.3389\/fgene.2013.00138","journal-title":"Frontiers in Genetics"},{"issue":"6","key":"6488_CR16","doi-asserted-by":"publisher","first-page":"1580","DOI":"10.1109\/TCBB.2011.46","volume":"8","author":"CC Chen","year":"2011","unstructured":"Chen, C. C., Schwender, H., Keith, J., Nunkesser, R., Mengersen, K., & Macrossan, P. (2011). Methods for identifying SNP interactions: A review on variations of logic regression, random forest and Bayesian logistic regression. IEEE\/ACM Transactions on Computational Biology and Bioinformatics, 8(6), 1580\u20131591. https:\/\/doi.org\/10.1109\/TCBB.2011.46","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"key":"6488_CR17","doi-asserted-by":"crossref","unstructured":"Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, KDD \u201916, New York, NY, USA (pp. 785\u2013794). 
Association for Computing Machinery.","DOI":"10.1145\/2939672.2939785"},{"issue":"5","key":"6488_CR18","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1186\/ar2781","volume":"11","author":"A Clarke","year":"2009","unstructured":"Clarke, A., & Vyse, T. J. (2009). Genetics of rheumatic disease. Arthritis Research & Therapy, 11(5), 248. https:\/\/doi.org\/10.1186\/ar2781","journal-title":"Arthritis Research & Therapy"},{"issue":"26","key":"6488_CR19","first-page":"1","volume":"23","author":"E Demirovi\u0107","year":"2022","unstructured":"Demirovi\u0107, E., Lukina, A., Hebrard, E., Chan, J., Bailey, J., Leckie, C., Ramamohanarao, K., & Stuckey, P. J. (2022). MurTree: optimal decision trees via dynamic programming and search. Journal of Machine Learning Research, 23(26), 1\u201347.","journal-title":"Journal of Machine Learning Research"},{"issue":"4","key":"6488_CR20","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1159\/000446581","volume":"80","author":"F Dudbridge","year":"2015","unstructured":"Dudbridge, F., & Newcombe, P. J. (2015). Accuracy of gene scores when pruning markers by linkage disequilibrium. Human Heredity, 80(4), 178\u2013186. https:\/\/doi.org\/10.1159\/000446581","journal-title":"Human Heredity"},{"issue":"12","key":"6488_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v092.i12","volume":"92","author":"M Fokkema","year":"2020","unstructured":"Fokkema, M. (2020). Fitting prediction rule ensembles with R package pre. Journal of Statistical Software, 92(12), 1\u201330. https:\/\/doi.org\/10.18637\/jss.v092.i12","journal-title":"Journal of Statistical Software"},{"issue":"5","key":"6488_CR22","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1214\/aos\/1013203451","volume":"29","author":"JH Friedman","year":"2001","unstructured":"Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189\u20131232. 
https:\/\/doi.org\/10.1214\/aos\/1013203451","journal-title":"The Annals of Statistics"},{"issue":"3","key":"6488_CR23","doi-asserted-by":"publisher","first-page":"916","DOI":"10.1214\/07-AOAS148","volume":"2","author":"JH Friedman","year":"2008","unstructured":"Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916\u2013954. https:\/\/doi.org\/10.1214\/07-AOAS148","journal-title":"The Annals of Applied Statistics"},{"issue":"1","key":"6488_CR24","doi-asserted-by":"publisher","first-page":"72","DOI":"10.1016\/j.geb.2005.03.002","volume":"55","author":"K Fujimoto","year":"2006","unstructured":"Fujimoto, K., Kojadinovic, I., & Marichal, J. L. (2006). Axiomatic characterizations of probabilistic and cardinal-probabilistic interaction indices. Games and Economic Behavior, 55(1), 72\u201399. https:\/\/doi.org\/10.1016\/j.geb.2005.03.002","journal-title":"Games and Economic Behavior"},{"key":"6488_CR25","doi-asserted-by":"publisher","DOI":"10.1007\/b97848","volume-title":"A distribution-free theory of nonparametric regression","author":"L Gy\u00f6rfi","year":"2002","unstructured":"Gy\u00f6rfi, L., Kohler, M., Krzy\u017cak, A., & Walk, H. (2002). A distribution-free theory of nonparametric regression. Springer."},{"key":"6488_CR26","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The elements of statistical learning: Data mining, inference, and prediction","author":"T Hastie","year":"2009","unstructured":"Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer."},{"key":"6488_CR27","doi-asserted-by":"publisher","DOI":"10.3389\/fgene.2019.00267","author":"DSW Ho","year":"2019","unstructured":"Ho, D. S. W., Schierding, W., Wake, M., Saffery, R., & O\u2019Sullivan, J. (2019). Machine learning SNP based prediction for precision medicine. Frontiers in Genetics. 
https:\/\/doi.org\/10.3389\/fgene.2019.00267","journal-title":"Frontiers in Genetics"},{"issue":"1","key":"6488_CR28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s42979-021-00920-1","volume":"3","author":"R Hornung","year":"2022","unstructured":"Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science, 3(1), 1\u201316. https:\/\/doi.org\/10.1007\/s42979-021-00920-1","journal-title":"SN Computer Science"},{"key":"6488_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.csda.2022.107460","volume":"171","author":"R Hornung","year":"2022","unstructured":"Hornung, R., & Boulesteix, A. L. (2022). Interaction forests: Identifying and exploiting interpretable quantitative and qualitative interaction effects. Computational Statistics & Data Analysis, 171, 107460. https:\/\/doi.org\/10.1016\/j.csda.2022.107460","journal-title":"Computational Statistics & Data Analysis"},{"key":"6488_CR30","unstructured":"Huang, M., Romeo, F., & Sangiovanni-Vincentelli, A. (1986). An efficient general cooling schedule for simulated annealing. In Proceedings of the IEEE international conference on computer-aided design, Santa Clara, California, USA (pp. 381\u2013384). IEEE Computer Society."},{"issue":"4598","key":"6488_CR31","doi-asserted-by":"publisher","first-page":"671","DOI":"10.1126\/science.220.4598.671","volume":"220","author":"S Kirkpatrick","year":"1983","unstructured":"Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671\u2013680. https:\/\/doi.org\/10.1126\/science.220.4598.671","journal-title":"Science"},{"key":"6488_CR32","unstructured":"Kooperberg, C., & Ruczinski, I. (2022). LogicReg: Logic regression. 
R Package Version 1.6.5."},{"issue":"9","key":"6488_CR33","doi-asserted-by":"publisher","first-page":"1273","DOI":"10.1289\/ehp.0901689","volume":"118","author":"U Kr\u00e4mer","year":"2010","unstructured":"Kr\u00e4mer, U., Herder, C., Sugiri, D., Strassburger, K., Schikowski, T., Ranft, U., & Rathmann, W. (2010). Traffic-related air pollution and incident type 2 diabetes: Results from the salia cohort study. Environmental Health Perspectives, 118(9), 1273\u20131279. https:\/\/doi.org\/10.1289\/ehp.0901689","journal-title":"Environmental Health Perspectives"},{"key":"6488_CR34","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-015-7744-1","volume-title":"Simulated annealing: Theory and applications","author":"P Van Laarhoven","year":"1987","unstructured":"Van Laarhoven, P., & Aarts, E. (1987). Simulated annealing: Theory and applications. Springer."},{"key":"6488_CR35","unstructured":"Lau, M. (2023). logicDT: Identifying interactions between binary predictors. R Package Version 1.0.3."},{"key":"6488_CR36","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1186\/s12859-022-04634-w","volume":"23","author":"M Lau","year":"2022","unstructured":"Lau, M., Wigmann, C., Kress, S., Schikowski, T., & Schwender, H. (2022). Evaluation of tree-based statistical learning methods for constructing genetic risk scores. BMC Bioinformatics, 23, 97. https:\/\/doi.org\/10.1186\/s12859-022-04634-w","journal-title":"BMC Bioinformatics"},{"key":"6488_CR37","doi-asserted-by":"crossref","unstructured":"Li, R. H., & Belford, G. G. (2002). Instability of decision tree classification algorithms. In Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining, New York, NY, USA (pp. 570\u2013575). Association for Computing Machinery.","DOI":"10.1145\/775047.775131"},{"key":"6488_CR38","unstructured":"Louppe, G. (2014). Understanding random forests: From theory to practice. 
Dissertation, University of Li\u00e8ge, Department of Electrical Engineering & Computer Science. arXiv:1407.7502."},{"key":"6488_CR39","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1038\/s42256-019-0138-9","volume":"2","author":"SM Lundberg","year":"2020","unstructured":"Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., & Lee, S. I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2, 56\u201367. https:\/\/doi.org\/10.1038\/s42256-019-0138-9","journal-title":"Nature Machine Intelligence"},{"issue":"1","key":"6488_CR40","doi-asserted-by":"publisher","first-page":"74","DOI":"10.3414\/ME00-01-0052","volume":"51","author":"JD Malley","year":"2012","unstructured":"Malley, J. D., Kruppa, J., Dasgupta, A., Malley, K. G., & Ziegler, A. (2012). Probability machines: Consistent probability estimation using nonparametric learning machines. Methods of Information in Medicine, 51(1), 74\u201381. https:\/\/doi.org\/10.3414\/ME00-01-0052","journal-title":"Methods of Information in Medicine"},{"issue":"4","key":"6488_CR41","doi-asserted-by":"publisher","first-page":"2049","DOI":"10.1214\/10-AOAS367","volume":"4","author":"N Meinshausen","year":"2010","unstructured":"Meinshausen, N. (2010). Node harvest. The Annals of Applied Statistics, 4(4), 2049\u20132072. https:\/\/doi.org\/10.1214\/10-AOAS367","journal-title":"The Annals of Applied Statistics"},{"issue":"26","key":"6488_CR42","first-page":"1","volume":"17","author":"L Mentch","year":"2016","unstructured":"Mentch, L., & Hooker, G. (2016). Quantifying uncertainty in random forests via confidence intervals and hypothesis tests. Journal of Machine Learning Research, 17(26), 1\u201341.","journal-title":"Journal of Machine Learning Research"},{"key":"6488_CR43","doi-asserted-by":"crossref","unstructured":"Menze, B. H., Kelm, B. M., Splitthoff, D. N., Koethe, U., & Hamprecht, F. A. (2011). 
On oblique random forests. In Proceedings of the joint European conference on machine learning and knowledge discovery in databases, Berlin, Heidelberg (pp. 453\u2013469). Springer.","DOI":"10.1007\/978-3-642-23783-6_29"},{"key":"6488_CR44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1613\/jair.63","volume":"2","author":"SK Murthy","year":"1994","unstructured":"Murthy, S. K., Kasif, S., & Salzberg, S. (1994). A system for induction of oblique decision trees. Journal of Artificial Intelligence Research, 2, 1\u201332. https:\/\/doi.org\/10.1613\/jair.63","journal-title":"Journal of Artificial Intelligence Research"},{"key":"6488_CR45","unstructured":"Murthy, S. K., & Salzberg, S. (1995). Decision tree induction: How effective is the greedy heuristic? In Proceedings of the first international conference on knowledge discovery and data mining, KDD\u201995 (pp. 222\u2013227). AAAI Press."},{"key":"6488_CR46","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1007\/s10618-010-0174-x","volume":"21","author":"S Nijssen","year":"2010","unstructured":"Nijssen, S., & Fromont, E. (2010). Optimal constraint-based decision tree induction from itemset lattices. Data Mining and Knowledge Discovery, 21, 9\u201351. https:\/\/doi.org\/10.1007\/s10618-010-0174-x","journal-title":"Data Mining and Knowledge Discovery"},{"issue":"6","key":"6488_CR47","doi-asserted-by":"publisher","first-page":"764","DOI":"10.1006\/pmed.1996.0117","volume":"25","author":"R Ottman","year":"1996","unstructured":"Ottman, R. (1996). Gene-environment interaction: Definitions and study design. Preventive Medicine, 25(6), 764\u2013770. https:\/\/doi.org\/10.1006\/pmed.1996.0117","journal-title":"Preventive Medicine"},{"issue":"3","key":"6488_CR48","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1023\/A:1024099825458","volume":"52","author":"F Provost","year":"2003","unstructured":"Provost, F., & Domingos, P. (2003). Tree Induction for probability-based ranking. 
Machine Learning, 52(3), 199\u2013215. https:\/\/doi.org\/10.1023\/A:1024099825458","journal-title":"Machine Learning"},{"issue":"3","key":"6488_CR49","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1086\/519795","volume":"81","author":"S Purcell","year":"2007","unstructured":"Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., Maller, J., Sklar, P., De Bakker, P. I., Daly, M. J., & Pak, C. S. (2007). PLINK: A tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 81(3), 559\u2013575. https:\/\/doi.org\/10.1086\/519795","journal-title":"The American Journal of Human Genetics"},{"key":"6488_CR50","volume-title":"C4.5: Programs for machine learning","author":"JR Quinlan","year":"1993","unstructured":"Quinlan, J. R. (1993). C4.5: Programs for machine learning. Morgan Kaufmann Publishers Inc."},{"key":"6488_CR51","volume-title":"R: A language and environment for statistical computing","author":"R Core Team","year":"2022","unstructured":"R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing."},{"issue":"3","key":"6488_CR52","doi-asserted-by":"publisher","first-page":"475","DOI":"10.1198\/1061860032238","volume":"12","author":"I Ruczinski","year":"2003","unstructured":"Ruczinski, I., Kooperberg, C., & LeBlanc, M. (2003). Logic regression. Journal of Computational and Graphical Statistics, 12(3), 475\u2013511. https:\/\/doi.org\/10.1198\/1061860032238","journal-title":"Journal of Computational and Graphical Statistics"},{"issue":"1","key":"6488_CR53","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1016\/j.jmva.2004.02.010","volume":"90","author":"I Ruczinski","year":"2004","unstructured":"Ruczinski, I., Kooperberg, C., & LeBlanc, M. (2004). Exploring interactions in high-dimensional genomic data: An overview of logic regression, with applications. 
Journal of Multivariate Analysis, 90(1), 178\u2013195. https:\/\/doi.org\/10.1016\/j.jmva.2004.02.010","journal-title":"Journal of Multivariate Analysis"},{"issue":"7","key":"6488_CR54","doi-asserted-by":"publisher","first-page":"1301","DOI":"10.1080\/00949655.2012.658804","volume":"83","author":"T Rusch","year":"2013","unstructured":"Rusch, T., & Zeileis, A. (2013). Gaining insight with recursive partitioning of generalized linear models. Journal of Statistical Computation and Simulation, 83(7), 1301\u20131315. https:\/\/doi.org\/10.1080\/00949655.2012.658804","journal-title":"Journal of Statistical Computation and Simulation"},{"key":"6488_CR55","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1186\/1465-9921-6-152","volume":"6","author":"T Schikowski","year":"2005","unstructured":"Schikowski, T., Sugiri, D., Ranft, U., Gehring, U., Heinrich, J., Wichmann, H. E., & Kr\u00e4mer, U. (2005). Long-term air pollution exposure and living close to busy roads are associated with COPD in women. Respiratory Research, 6, 152. https:\/\/doi.org\/10.1186\/1465-9921-6-152","journal-title":"Respiratory Research"},{"issue":"1","key":"6488_CR56","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1093\/biostatistics\/kxm024","volume":"9","author":"H Schwender","year":"2007","unstructured":"Schwender, H., & Ickstadt, K. (2007). Identification of SNP interactions using logic regression. Biostatistics, 9(1), 187\u2013198. https:\/\/doi.org\/10.1093\/biostatistics\/kxm024","journal-title":"Biostatistics"},{"issue":"4","key":"6488_CR57","doi-asserted-by":"publisher","first-page":"1716","DOI":"10.1214\/15-aos1321","volume":"43","author":"E Scornet","year":"2015","unstructured":"Scornet, E., Biau, G., & Vert, J. P. (2015). Consistency of random forests. The Annals of Statistics, 43(4), 1716\u20131741. 
https:\/\/doi.org\/10.1214\/15-aos1321","journal-title":"The Annals of Statistics"},{"key":"6488_CR58","doi-asserted-by":"publisher","first-page":"41262","DOI":"10.1038\/srep41262","volume":"7","author":"HC So","year":"2017","unstructured":"So, H. C., & Sham, P. C. (2017). Improving polygenic risk prediction from summary statistics by an empirical Bayes approach. Scientific Reports, 7, 41262. https:\/\/doi.org\/10.1038\/srep41262","journal-title":"Scientific Reports"},{"key":"6488_CR59","doi-asserted-by":"crossref","unstructured":"Sorokina, D., Caruana, R., Riedewald, M., & Fink, D. (2008). Detecting statistical interactions with additive groves of trees. In Proceedings of the 25th international conference on machine learning, ICML \u201908, New York, NY, USA (pp. 1000\u20131007). Association for Computing Machinery.","DOI":"10.1145\/1390156.1390282"},{"key":"6488_CR60","unstructured":"Tang, C., Garreau, D., & von Luxburg, U. (2018). When do random forests fail? In Proceedings of the 32nd international conference on neural information processing systems, NIPS\u201918, Montr\u00e9al, Canada (pp. 2987\u20132997)."},{"key":"6488_CR61","unstructured":"Therneau, T., & Atkinson, B. (2019). rpart: Recursive Partitioning and Regression Trees. R package version 4.1-15."},{"issue":"1","key":"6488_CR62","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","volume":"58","author":"R Tibshirani","year":"1996","unstructured":"Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267\u2013288. https:\/\/doi.org\/10.1111\/j.2517-6161.1996.tb02080.x","journal-title":"Journal of the Royal Statistical Society: Series B (Methodological)"},{"issue":"104","key":"6488_CR63","first-page":"1","volume":"21","author":"TM Tomita","year":"2020","unstructured":"Tomita, T. M., Browne, J., Shen, C., Chung, J., Patsolic, J. L., Falk, B., Priebe, C. 
E., Yim, J., Burns, R., Maggioni, M., & Vogelstein, J. T. (2020). Sparse projection oblique randomer forests. Journal of Machine Learning Research, 21(104), 1\u201339.","journal-title":"Journal of Machine Learning Research"},{"issue":"1","key":"6488_CR64","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1016\/j.ejor.2004.03.035","volume":"166","author":"E Triki","year":"2005","unstructured":"Triki, E., Collette, Y., & Siarry, P. (2005). A theoretical study on the behavior of simulated annealing leading to a new cooling schedule. European Journal of Operational Research, 166(1), 77\u201392. https:\/\/doi.org\/10.1016\/j.ejor.2004.03.035","journal-title":"European Journal of Operational Research"},{"key":"6488_CR65","volume-title":"Statistical learning theory","author":"VN Vapnik","year":"1998","unstructured":"Vapnik, V. N. (1998). Statistical learning theory. Wiley-Interscience."},{"key":"6488_CR66","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-3264-1","volume-title":"The nature of statistical learning theory","author":"VN Vapnik","year":"2000","unstructured":"Vapnik, V. N. (2000). The nature of statistical learning theory. Springer."},{"key":"6488_CR67","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1186\/1471-2105-7-91","volume":"7","author":"S Varma","year":"2006","unstructured":"Varma, S., & Simon, R. (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 7, 91. https:\/\/doi.org\/10.1186\/1471-2105-7-91","journal-title":"BMC Bioinformatics"},{"key":"6488_CR68","doi-asserted-by":"publisher","first-page":"2107","DOI":"10.1007\/s10994-021-06030-6","volume":"110","author":"DS Watson","year":"2021","unstructured":"Watson, D. S., & Wright, M. N. (2021). Testing conditional independence in supervised learning algorithms. Machine Learning, 110, 2107\u20132129. 
https:\/\/doi.org\/10.1007\/s10994-021-06030-6","journal-title":"Machine Learning"},{"issue":"1","key":"6488_CR69","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1214\/aoms\/1177732360","volume":"9","author":"SS Wilks","year":"1938","unstructured":"Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. The Annals of Mathematical Statistics, 9(1), 60\u201362. https:\/\/doi.org\/10.1214\/aoms\/1177732360","journal-title":"The Annals of Mathematical Statistics"},{"key":"6488_CR70","unstructured":"Wilson, S. (2021). ParBayesianOptimization: Parallel Bayesian optimization of hyperparameters. R Package Version 1.2.4."},{"key":"6488_CR71","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1186\/1471-2105-13-164","volume":"13","author":"SJ Winham","year":"2012","unstructured":"Winham, S. J., Colby, C. L., Freimuth, R. R., Wang, X., de Andrade, M., Huebner, M., & Biernacka, J. M. (2012). SNP interaction detection with random forests in high-dimensional genetic data. BMC Bioinformatics, 13, 164. https:\/\/doi.org\/10.1186\/1471-2105-13-164","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"6488_CR72","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v077.i01","volume":"77","author":"MN Wright","year":"2017","unstructured":"Wright, M. N., & Ziegler, A. (2017). ranger: A fast implementation of random forests for high dimensional data in C++ and R. Journal of Statistical Software, 77(1), 1\u201317. https:\/\/doi.org\/10.18637\/jss.v077.i01","journal-title":"Journal of Statistical Software"},{"key":"6488_CR73","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1186\/s12859-016-0995-8","volume":"17","author":"MN Wright","year":"2016","unstructured":"Wright, M. N., Ziegler, A., & K\u00f6nig, I. R. (2016). Do little interactions get lost in dark random forests? BMC Bioinformatics, 17, 145. 
https:\/\/doi.org\/10.1186\/s12859-016-0995-8","journal-title":"BMC Bioinformatics"},{"key":"6488_CR74","doi-asserted-by":"crossref","unstructured":"Yang, B. B., Shen, S. Q., & Gao, W. (2019). Weighted oblique decision trees. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, pp. 5621\u20135627).","DOI":"10.1609\/aaai.v33i01.33015621"},{"issue":"2","key":"6488_CR75","doi-asserted-by":"publisher","first-page":"492","DOI":"10.1198\/106186008X319331","volume":"17","author":"A Zeileis","year":"2008","unstructured":"Zeileis, A., Hothorn, T., & Hornik, K. (2008). Model-based recursive partitioning. Journal of Computational and Graphical Statistics, 17(2), 492\u2013514. https:\/\/doi.org\/10.1198\/106186008X319331","journal-title":"Journal of Computational and Graphical Statistics"},{"key":"6488_CR76","doi-asserted-by":"publisher","first-page":"72","DOI":"10.1016\/j.ympev.2015.06.007","volume":"92","author":"S Zhi","year":"2015","unstructured":"Zhi, S., Li, Q., Yasui, Y., Edge, T., Topp, E., & Neumann, N. F. (2015). Assessing host-specificity of Escherichia coli using a supervised learning logic-regression-based analysis of single nucleotide polymorphisms in intergenic regions. Molecular Phylogenetics and Evolution, 92, 72\u201381. https:\/\/doi.org\/10.1016\/j.ympev.2015.06.007","journal-title":"Molecular Phylogenetics and Evolution"},{"key":"6488_CR77","unstructured":"Zhu, H., Murali, P., Phan, D., Nguyen, L., & Kalagnanam, J. (2020). A scalable MIP-based method for learning optimal multivariate decision trees. In Advances in neural information processing systems (Vol.\u00a033, pp. 1771\u20131781). 
Curran Associates, Inc."}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-023-06488-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-023-06488-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-023-06488-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,1,18]],"date-time":"2024-01-18T19:12:33Z","timestamp":1705605153000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-023-06488-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,22]]},"references-count":77,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,2]]}},"alternative-id":["6488"],"URL":"https:\/\/doi.org\/10.1007\/s10994-023-06488-6","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,22]]},"assertion":[{"value":"3 November 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 October 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 November 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 December 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no competing interests 
to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The SALIA cohort study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committees of the Ruhr-University Bochum and the Heinrich Heine University D\u00fcsseldorf. We received written informed consent from all participants.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}}]}}