{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,5]],"date-time":"2026-04-05T05:26:04Z","timestamp":1775366764686,"version":"3.50.1"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"10","license":[{"start":{"date-parts":[[2020,5,2]],"date-time":"2020-05-02T00:00:00Z","timestamp":1588377600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,5,2]],"date-time":"2020-05-02T00:00:00Z","timestamp":1588377600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Comput Aided Mol Des"],"published-print":{"date-parts":[[2020,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Difficulties in interpreting machine learning (ML) models and their predictions limit the practical applicability of and confidence in ML in pharmaceutical research. There is a need for agnostic approaches aiding in the interpretation of ML models regardless of their complexity that is also applicable to deep neural network (DNN) architectures and model ensembles. To these ends, the SHapley Additive exPlanations (SHAP) methodology has recently been introduced. The SHAP approach enables the identification and prioritization of features that determine compound classification and activity prediction using any ML model. Herein, we further extend the evaluation of the SHAP methodology by investigating a variant for exact calculation of Shapley values for decision tree methods and systematically compare this variant in compound activity and potency value predictions with the model-independent SHAP method. 
Moreover, new applications of the SHAP analysis approach are presented including interpretation of DNN models for the generation of multi-target activity profiles and ensemble regression models for potency prediction.<\/jats:p>","DOI":"10.1007\/s10822-020-00314-0","type":"journal-article","created":{"date-parts":[[2020,5,2]],"date-time":"2020-05-02T05:02:30Z","timestamp":1588395750000},"page":"1013-1026","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":608,"title":["Interpretation of machine learning models using shapley values: application to compound potency and multi-target activity predictions"],"prefix":"10.1007","volume":"34","author":[{"given":"Raquel","family":"Rodr\u00edguez-P\u00e9rez","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0557-5714","authenticated-orcid":false,"given":"J\u00fcrgen","family":"Bajorath","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,5,2]]},"reference":[{"key":"314_CR1","doi-asserted-by":"publisher","first-page":"1413","DOI":"10.1021\/ci200409x","volume":"52","author":"A Varnek","year":"2012","unstructured":"Varnek A, Baskin I (2012) Machine learning methods for property prediction in cheminformatics: quo vadis? J Chem Inf Model 52:1413\u20131437","journal-title":"J Chem Inf Model"},{"key":"314_CR2","doi-asserted-by":"publisher","first-page":"4977","DOI":"10.1021\/jm4004285","volume":"57","author":"A Cherkasov","year":"2014","unstructured":"Cherkasov A, Muratov E, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R, Consonni V, Kuzmin VE, Cramer R, Benigni R, Yang C, Rathman J, Terfloth L, Gasteiger J, Richard A, Tropsha A (2014) QSAR modeling: where have you been? Where are you going to? 
J Med Chem 57:4977\u20135010","journal-title":"J Med Chem"},{"key":"314_CR3","doi-asserted-by":"publisher","first-page":"318","DOI":"10.1016\/j.drudis.2014.10.012","volume":"20","author":"A Lavecchia","year":"2015","unstructured":"Lavecchia A (2015) Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 20:318\u2013331","journal-title":"Drug Discov Today"},{"key":"314_CR4","doi-asserted-by":"publisher","first-page":"1538","DOI":"10.1016\/j.drudis.2018.05.010","volume":"23","author":"Y Lo","year":"2018","unstructured":"Lo Y, Rensi SE, Torng W, Altman RB (2018) Machine learning in chemoinformatics and drug discovery. Drug Discov Today 23:1538\u20131546","journal-title":"Drug Discov Today"},{"key":"314_CR5","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1002\/minf.201100059","volume":"30","author":"K Hansen","year":"2011","unstructured":"Hansen K, Baehrens D, Schroeter T, Rupp M, M\u00fcller K-R (2011) Visual interpretation of kernel-based prediction models. Mol Inform 30:817\u2013826","journal-title":"Mol Inform"},{"key":"314_CR6","doi-asserted-by":"publisher","first-page":"2451","DOI":"10.1021\/ci500410g","volume":"54","author":"J Balfer","year":"2014","unstructured":"Balfer J, Bajorath J (2014) Introduction of a methodology for visualization and graphical interpretation of Bayesian classification models. J Chem Inf Model 54:2451\u20132468","journal-title":"J Chem Inf Model"},{"key":"314_CR7","doi-asserted-by":"publisher","first-page":"1136","DOI":"10.1021\/acs.jcim.5b00175","volume":"55","author":"J Balfer","year":"2015","unstructured":"Balfer J, Bajorath J (2015) Visualization and interpretation of support vector machine activity predictions. J Chem Inf Model 55:1136\u20131147","journal-title":"J Chem Inf Model"},{"key":"314_CR8","doi-asserted-by":"crossref","unstructured":"Ribeiro MT, Singh S, Guestrin C. (2016) \u201cWhy should I trust you?\u201d Explaining the predictions of any classifier. 
In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining 1:1135\u20131144","DOI":"10.1145\/2939672.2939778"},{"key":"314_CR9","doi-asserted-by":"publisher","first-page":"2618","DOI":"10.1021\/acs.jcim.7b00274","volume":"57","author":"P Polishchuk","year":"2017","unstructured":"Polishchuk P (2017) Interpretation of quantitative structure-activity relationship models: Past, present, and future. J Chem Inf Model 57:2618\u20132639","journal-title":"J Chem Inf Model"},{"key":"314_CR10","unstructured":"Nielsen MA (2015) Neural networks and deep learning. Determination Press"},{"key":"314_CR11","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-3264-1","volume-title":"The nature of statistical learning theory","author":"VN Vapnik","year":"2000","unstructured":"Vapnik VN (2000) The nature of statistical learning theory, 2nd edn. Springer, New York","edition":"2"},{"key":"314_CR12","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45:5\u201332","journal-title":"Mach Learn"},{"key":"314_CR13","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1023\/B:STCO.0000035301.49549.88","volume":"14","author":"AJ Smola","year":"2004","unstructured":"Smola AJ, Sch\u00f6lkopf B (2004) A tutorial on support vector regression. Stat Comput 14:199\u2013222","journal-title":"Stat Comput"},{"key":"314_CR14","doi-asserted-by":"publisher","first-page":"6371","DOI":"10.1021\/acsomega.7b01079","volume":"2","author":"R Rodr\u00edguez-P\u00e9rez","year":"2017","unstructured":"Rodr\u00edguez-P\u00e9rez R, Vogt M, Bajorath J (2017) Support vector machine classification and regression prioritize different structural features for binary compound activity and potency value prediction. 
ACS Omega 2:6371\u20136379","journal-title":"ACS Omega"},{"key":"314_CR15","first-page":"1","volume-title":"Handbook of uncertainty quantification","author":"B Iooss","year":"2016","unstructured":"Iooss B, Saltelli A (2016) Introduction to sensitivity analysis. In: Ghanem R, Higdon D, Owhadi H (eds) Handbook of uncertainty quantification. Springer International Publishing, Cham, pp 1\u201320"},{"key":"314_CR16","doi-asserted-by":"publisher","first-page":"3201","DOI":"10.1021\/jm00095a016","volume":"35","author":"SS So","year":"1992","unstructured":"So SS, Richards WG (1992) Application of neural networks: quantitative structure-activity relationships of the derivatives of 2,4-diamino-5-(substituted-benzyl)pyrimidines as DHFR Inhibitors. J Med Chem 35:3201\u20133207","journal-title":"J Med Chem"},{"key":"314_CR17","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1080\/10629360290002073","volume":"13","author":"II Baskin","year":"2002","unstructured":"Baskin II, Ait AO, Halberstam NM, Palyulin VA, Zefirov NS (2002) An approach to the interpretation of backpropagation neural network models in QSAR studies. SAR QSAR Environ Res 13:35\u201341","journal-title":"SAR QSAR Environ Res"},{"key":"314_CR18","doi-asserted-by":"publisher","first-page":"647","DOI":"10.4155\/fmc.11.23","volume":"3","author":"U Johansson","year":"2011","unstructured":"Johansson U, S\u00f6nstr\u00f6d C, Norinder U, Bostr\u00f6m H (2011) Trade-off between accuracy and interpretability for predictive in silico modeling. Fut Med Chem 3:647\u2013663","journal-title":"Fut Med Chem"},{"key":"314_CR19","doi-asserted-by":"publisher","DOI":"10.1021\/acs.jmedchem.9b01101","author":"R Rodr\u00edguez-P\u00e9rez","year":"2020","unstructured":"Rodr\u00edguez-P\u00e9rez R, Bajorath J (2020) Interpretation of compound activity predictions from complex machine learning models using local approximations and Shapley values. J Med Chem. 
https:\/\/doi.org\/10.1021\/acs.jmedchem.9b01101","journal-title":"J Med Chem"},{"key":"314_CR20","unstructured":"Lundberg SM, Lee S (2017) A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems 30 (NIPS)"},{"key":"314_CR21","first-page":"307","volume-title":"Annals of mathematical studies","author":"LS Shapley","year":"1953","unstructured":"Shapley LS (1953) A value for N-person games. Contributions to the theory of games. In: Kuhn HW, Tucker AW (eds) Annals of mathematical studies. Princeton University Press, Princeton, pp 307\u2013317"},{"key":"314_CR22","volume-title":"A course in game theory","author":"MJ Osborne","year":"1994","unstructured":"Osborne MJ, Rubinstein A (1994) A course in game theory. The MIT Press, Cambridge, MA"},{"key":"314_CR23","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1007\/BF01769885","volume":"14","author":"HP Young","year":"1985","unstructured":"Young HP (1985) Monotonic solutions of cooperative games. Int J Game Theory 14:65\u201372","journal-title":"Int J Game Theory"},{"key":"314_CR24","doi-asserted-by":"publisher","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","volume":"40","author":"A Gaulton","year":"2012","unstructured":"Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100\u2013D1107","journal-title":"Nucleic Acids Res"},{"key":"314_CR25","doi-asserted-by":"publisher","first-page":"2324","DOI":"10.1021\/acs.jcim.5b00559","volume":"55","author":"T Sterling","year":"2015","unstructured":"Sterling T, Irwin JJ (2015) ZINC 15\u2014ligand discovery for everyone. 
J Chem Inf Model 55:2324\u20132337","journal-title":"J Chem Inf Model"},{"key":"314_CR26","doi-asserted-by":"publisher","first-page":"730","DOI":"10.3390\/molecules22050730","volume":"22","author":"D Dimova","year":"2017","unstructured":"Dimova D, Bajorath J (2017) Assessing scaffold diversity of kinase inhibitors using alternative scaffold concepts and estimating the scaffold hopping potential for different kinases. Molecules 22:730\u2013740","journal-title":"Molecules"},{"key":"314_CR27","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended connectivity fingerprints. J Chem Inf Model 50:742\u2013754","journal-title":"J Chem Inf Model"},{"key":"314_CR28","volume-title":"OpenEye scientific software","author":"OEChem Toolkit","year":"2019","unstructured":"OEChem Toolkit (2019) OpenEye scientific software. OEChem Toolkit, Santa Fe, NM"},{"key":"314_CR29","doi-asserted-by":"publisher","first-page":"7667","DOI":"10.1021\/acs.jmedchem.6b00906","volume":"59","author":"D Stumpfe","year":"2016","unstructured":"Stumpfe D, Dimova D, Bajorath J (2016) Computational method for the systematic identification of analog series and key compounds representing series and their biological activity profiles. J Med Chem 59:7667\u20137676","journal-title":"J Med Chem"},{"key":"314_CR30","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"B Matthews","year":"1975","unstructured":"Matthews B (1975) Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 405:442\u2013451","journal-title":"Biochim Biophys Acta"},{"key":"314_CR31","doi-asserted-by":"crossref","unstructured":"Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010) The balanced accuracy and its posterior distribution. 
In: Proceedings of the 20th international conference on pattern recognition (ICPR) 1:3121\u20133124","DOI":"10.1109\/ICPR.2010.764"},{"key":"314_CR32","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"314_CR33","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","volume":"63","author":"P Geurts","year":"2006","unstructured":"Geurts P, Wehenkel ED (2006) Extremely randomized trees. Mach Learn 63:3\u201342","journal-title":"Mach Learn"},{"key":"314_CR34","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1214\/aos\/1013203451","volume":"29","author":"J Friedman","year":"2001","unstructured":"Friedman J (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189\u20131232","journal-title":"Ann Stat"},{"key":"314_CR35","doi-asserted-by":"publisher","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","volume":"38","author":"J Friedman","year":"2002","unstructured":"Friedman J (2002) Stochastic gradient boosting. Comput Stat Data Anal 38:367\u2013378","journal-title":"Comput Stat Data Anal"},{"key":"314_CR36","volume-title":"Pattern recognition and machine learning","author":"CM Bishop","year":"2006","unstructured":"Bishop CM (2006) Pattern recognition and machine learning. Springer, New York"},{"key":"314_CR37","volume-title":"Pattern classification","author":"RO Duda","year":"2000","unstructured":"Duda RO, Hart PE, Stork DG (2000) Pattern classification, 2nd edn. 
Wiley, New York","edition":"2"},{"key":"314_CR38","unstructured":"Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, Kudlur M, Levenberg J, Monga R, Moore S, Murray DG, Steiner B, Tucker P, Vasudevan V, Warden P, Wicke M, Yu Y, Zheng X (2016) TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX Symposium on operating systems design and implementation (OSDI 16), Savannah, GA"},{"key":"314_CR39","unstructured":"Chollet F (2015) Keras. https:\/\/github.com\/keras-team\/keras"},{"key":"314_CR40","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1038\/s42256-019-0138-9","volume":"2","author":"SM Lundberg","year":"2020","unstructured":"Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee S (2020) From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2:56\u201367","journal-title":"Nat Mach Intell"},{"key":"314_CR41","doi-asserted-by":"publisher","first-page":"4367","DOI":"10.1021\/acsomega.9b00298","volume":"4","author":"R Rodr\u00edguez-P\u00e9rez","year":"2019","unstructured":"Rodr\u00edguez-P\u00e9rez R, Bajorath J (2019) Multitask machine learning for classifying highly and weakly potent kinase inhibitors. 
ACS Omega 4:4367\u20134375","journal-title":"ACS Omega"}],"container-title":["Journal of Computer-Aided Molecular Design"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10822-020-00314-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10822-020-00314-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10822-020-00314-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,5,3]],"date-time":"2021-05-03T17:27:28Z","timestamp":1620062848000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10822-020-00314-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,2]]},"references-count":41,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2020,10]]}},"alternative-id":["314"],"URL":"https:\/\/doi.org\/10.1007\/s10822-020-00314-0","relation":{},"ISSN":["0920-654X","1573-4951"],"issn-type":[{"value":"0920-654X","type":"print"},{"value":"1573-4951","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,2]]},"assertion":[{"value":"6 March 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 April 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 May 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}