{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T07:12:28Z","timestamp":1773126748334,"version":"3.50.1"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"1","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62176117,62306104"],"award-info":[{"award-number":["62176117,62306104"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Jiangsu Science Foundation","award":["BK20230949"],"award-info":[{"award-number":["BK20230949"]}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"crossref","award":["2023TQ0104"],"award-info":[{"award-number":["2023TQ0104"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Jiangsu Excellent Postdoctoral Program","award":["2023ZB140"],"award-info":[{"award-number":["2023ZB140"]}]},{"name":"Collaborative Innovation Center of Novel Software Technology and Industrialization"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>Deep forest is a non-differentiable deep model that has achieved impressive empirical success across a wide variety of applications, especially on categorical\/symbolic or mixed modeling tasks. Many of the application fields prefer explainable models, such as random forests with feature contributions that can provide a local explanation for each prediction, and Mean Decrease Impurity (MDI) that can provide global feature importance. However, deep forest, as a cascade of random forests, possesses interpretability only at the first layer. From the second layer on, many of the tree splits occur on the new features generated by the previous layer, which makes existing explaining tools for random forests inapplicable. To disclose the impact of the original features in the deep layers, we design a calculation method with an estimation step followed by a calibration step for each layer, and propose our feature contribution and MDI feature importance calculation tools for deep forest. Experimental results on both simulated data and real-world data verify the effectiveness of our methods.<\/jats:p>","DOI":"10.1145\/3641108","type":"journal-article","created":{"date-parts":[[2024,1,19]],"date-time":"2024-01-19T07:11:05Z","timestamp":1705648265000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Interpreting Deep Forest through Feature Contribution and MDI Feature Importance"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9518-5313","authenticated-orcid":false,"given":"Yi-Xiao","family":"He","sequence":"first","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, and School of Artificial Intelligence, Nanjing University","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0173-8408","authenticated-orcid":false,"given":"Shen-Huan","family":"Lyu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Water Big Data Technology of Ministry of Water Resources, and College of Computer Science and Software Engineering, Hohai University","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-8093-4826","authenticated-orcid":false,"given":"Yuan","family":"Jiang","sequence":"additional","affiliation":[{"name":"National Key Laboratory for Novel Software Technology, and School of Artificial Intelligence, Nanjing University","place":["Nanjing, China"]}]}],"member":"320","published-online":{"date-parts":[[2025,12,11]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"342","volume-title":"Proceedings of the 38th International Conference on Machine Learning","author":"Arnould Ludovic","year":"2021","unstructured":"Ludovic Arnould, Claire Boyer, and Erwan Scornet. 2021. Analyzing the tree-layer structure of deep forests. In Proceedings of the 38th International Conference on Machine Learning. 342\u2013350."},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2019.12.012"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.dss.2010.08.008"},{"key":"e_1_3_1_5_2","article-title":"Consistency of random forests and other averaging classifiers.","volume":"9","author":"Biau G\u00e9rard","year":"2008","unstructured":"G\u00e9rard Biau, Luc Devroye, and G\u00e4bor Lugosi. 2008. Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research 9 (2008), 2015\u20132033.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/LGRS.2019.2911855"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"e_1_3_1_8_2","volume-title":"Classification and Regression Trees","author":"Breiman Leo","year":"1984","unstructured":"Leo Breiman, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. 1984. Classification and Regression Trees. Boca Raton, FL: Chapman and Hall\/CRC."},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939785"},{"key":"e_1_3_1_10_2","first-page":"1036","volume-title":"Proceedings of the 21st IEEE International Conference on Data Mining","author":"Chen Yi-He","year":"2021","unstructured":"Yi-He Chen, Shen-Huan Lyu, and Yuan Jiang. 2021. Improving deep forest by exploiting high-order interactions. In Proceedings of the 21st IEEE International Conference on Data Mining. 1036\u20131041."},{"key":"e_1_3_1_11_2","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1007\/978-1-4419-9326-7_5","volume-title":"Proceedings of the Ensemble Machine Learning: Methods and Applications","author":"Cutler Adele","year":"2012","unstructured":"Adele Cutler, D. Richard Cutler, and John R. Stevens. 2012. Random forests. In Proceedings of the Ensemble Machine Learning: Methods and Applications. 157\u2013175."},{"key":"e_1_3_1_12_2","article-title":"Event labeling combining ensemble detectors and background knowledge","author":"Fanaee-T. Hadi","year":"2014","unstructured":"Hadi Fanaee-T. and Joao Gama. 2014. Event labeling combining ensemble detectors and background knowledge. Progress in Artificial Intelligence 2 (2014), 113\u2013127.","journal-title":"Progress in Artificial Intelligence"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1013203451"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-006-6226-1"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11634-016-0276-4"},{"issue":"6","key":"e_1_3_1_16_2","first-page":"593","article-title":"Interpretation of QSAR models based on random forest methods","volume":"30","author":"Kuz\u2019min Victor E.","year":"2011","unstructured":"Victor E. Kuz\u2019min, Pavel G. Polishchuk, Anatoly G. Artemenko, and Sergey A. Andronati. 2011. Interpretation of QSAR models based on random forest methods. Molecular Informatics 30, 6\u20137 (2011), 593\u2013603.","journal-title":"Molecular Informatics"},{"key":"e_1_3_1_17_2","unstructured":"Xiao Li Yu Wang Sumanta Basu Karl Kumbier and Bin Yu. 2019. A debiased MDI feature importance measure for random forests. Advances in Neural Information Processing Systems 32 (2019) 8049\u20138059."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1049\/cje.2022.00.178"},{"key":"e_1_3_1_19_2","first-page":"29719","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Lyu Shen-Huan","year":"2022","unstructured":"Shen-Huan Lyu, Yi-Xiao He, and Zhi-Hua Zhou. 2022. Depth is more powerful than width with prediction concatenation in deep forest. In Proceedings of the Advances in Neural Information Processing Systems. 29719\u201329732."},{"key":"e_1_3_1_20_2","first-page":"5530","volume-title":"Proceedings of the Advances in Neural Information Processing Systems","author":"Lyu Shen-Huan","year":"2019","unstructured":"Shen-Huan Lyu, Liang Yang, and Zhi-Hua Zhou. 2019. A refined margin distribution analysis for forest representation learning. In Proceedings of the Advances in Neural Information Processing Systems. 5530\u20135540."},{"key":"e_1_3_1_21_2","article-title":"HW-Forest: Deep forest with hashing screening and window screening","author":"Ma Pengfei","year":"2022","unstructured":"Pengfei Ma, Youxi Wu, Yan Li, Lei Guo, He Jiang, Xingquan Zhu, and Xindong Wu. 2022. HW-Forest: Deep forest with hashing screening and window screening. ACM Transactions on Knowledge Discovery from Data 16, 6 (2022), 1\u201324.","journal-title":"ACM Transactions on Knowledge Discovery from Data"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbr016"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btp331"},{"key":"e_1_3_1_24_2","first-page":"193","volume-title":"Proceedings of the Integration of Reusable Systems","author":"Palczewska Anna","year":"2013","unstructured":"Anna Palczewska, Jan Palczewski, Richard Marchese Robinson, and Daniel Neagu. 2013. Interpreting random forest classification models using a feature contribution method. In Proceedings of the Integration of Reusable Systems. 193\u2013218."},{"key":"e_1_3_1_25_2","first-page":"1194","volume-title":"Proceeding of the 18th IEEE International Conference on Data Mining","author":"Pang Ming","year":"2018","unstructured":"Ming Pang, Kai-Ming Ting, Peng Zhao, and Zhi-Hua Zhou. 2018. Improving deep forest by confidence screening. In Proceeding of the 18th IEEE International Conference on Data Mining. 1194\u20131199."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2020.3038799"},{"key":"e_1_3_1_27_2","volume-title":"Interpreting Random Forests","author":"Saabas Ando","year":"2014","unstructured":"Ando Saabas. 2014. Interpreting Random Forests. Retrieved May 25, 2022 from https:\/\/blog.datadive.net\/interpreting-random-forests\/"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/8291.001.0001"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1214\/15-AOS1321"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-9-307"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-8-25"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ymeth.2019.02.009"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2017.10.006"},{"issue":"2","key":"e_1_3_1_34_2","first-page":"1950007:1\u201319500","article-title":"Discriminative metric learning with deep forest","volume":"28","author":"Utkin Lev V.","year":"2019","unstructured":"Lev V. Utkin and Mikhail A. Ryabinin. 2019. Discriminative metric learning with deep forest. International Journal on Artificial Intelligence Tools 28, 2 (2019), 1950007:1\u20131950007:19.","journal-title":"International Journal on Artificial Intelligence Tools"},{"key":"e_1_3_1_35_2","first-page":"6251","volume-title":"Proceedings of the 34th AAAI Conference on Artificial Intelligence","author":"Wang Qian-Wei","year":"2020","unstructured":"Qian-Wei Wang, Liang Yang, and Yu-Feng Li. 2020. Learning from weak-label data: A deep forest expedition. In Proceedings of the 34th AAAI Conference on Artificial Intelligence. 6251\u20136258."},{"key":"e_1_3_1_36_2","first-page":"1634","volume-title":"Proceedings of the 24th European Conference on Artificial Intelligence","author":"Yang Liang","year":"2020","unstructured":"Liang Yang, Xi-Zhu Wu, Yuan Jiang, and Zhi-Hua Zhou. 2020. Multi-label learning with deep forest. In Proceedings of the 24th European Conference on Artificial Intelligence. 1634\u20131641."},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.3390\/diagnostics12020237"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3342241"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.06.005"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3429445"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.5555\/3172077.3172386"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1093\/nsr\/nwy108"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3641108","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,11]],"date-time":"2025-12-11T14:09:03Z","timestamp":1765462143000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3641108"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,30]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3641108"],"URL":"https:\/\/doi.org\/10.1145\/3641108","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,30]]},"assertion":[{"value":"2022-12-16","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-01-03","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-11","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}