{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:44:33Z","timestamp":1753875873673,"version":"3.41.2"},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2023,8,12]],"date-time":"2023-08-12T00:00:00Z","timestamp":1691798400000},"content-version":"vor","delay-in-days":11,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"UK Engineering and Physical Sciences Research Council","doi-asserted-by":"crossref","award":["EP\/R022925\/2","EP\/W004801\/1","EP\/S026347\/1"],"award-info":[{"award-number":["EP\/R022925\/2","EP\/W004801\/1","EP\/S026347\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100012338","name":"Alan Turing Institute","doi-asserted-by":"publisher","award":["EP\/N510129\/1"],"award-info":[{"award-number":["EP\/N510129\/1"]}],"id":[{"id":"10.13039\/100012338","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000765","name":"University College London","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000765","id-type":"DOI","asserted-by":"publisher"}]},{"name":"China Scholarship Council under the UCL-CSC scholarship","award":["201908060002"],"award-info":[{"award-number":["201908060002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Molecular docking is a commonly used approach for estimating binding conformations and their resultant binding affinities. Machine learning has been successfully deployed to enhance such affinity estimations. Many methods of varying complexity have been developed making use of some or all the spatial and categorical information available in these structures. The evaluation of such methods has mainly been carried out using datasets from PDBbind. Particularly the Comparative Assessment of Scoring Functions (CASF) 2007, 2013, and 2016 datasets with dedicated test sets. This work demonstrates that only a small number of simple descriptors is necessary to efficiently estimate binding affinity for these complexes without the need to know the exact binding conformation of a ligand.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The developed approach of using a small number of ligand and protein descriptors in conjunction with gradient boosting trees demonstrates high performance on the CASF datasets. This includes the commonly used benchmark CASF2016 where it appears to perform better than any other approach. This methodology is also useful for datasets where the spatial relationship between the ligand and protein is unknown as demonstrated using a large ChEMBL-derived dataset.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Code and data uploaded to https:\/\/github.com\/abbiAR\/PLBAffinity.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad502","type":"journal-article","created":{"date-parts":[[2023,8,12]],"date-time":"2023-08-12T15:44:38Z","timestamp":1691855078000},"source":"Crossref","is-referenced-by-count":2,"title":["Protein\u2013ligand binding affinity prediction exploiting sequence constituent homology"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3067-7365","authenticated-orcid":false,"given":"Abbi","family":"Abdel-Rehim","sequence":"first","affiliation":[{"name":"Department of Chemical Engineering and Biotechnology, University of Cambridge , Cambridge CB3 0AS, United Kingdom"}]},{"given":"Oghenejokpeme","family":"Orhobor","sequence":"additional","affiliation":[{"name":"The National Institute of Agricultural Botany , Cambridge CB3 0LE, United Kingdom"}]},{"given":"Lou","family":"Hang","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University College London , London WC1H 0AY, United Kingdom"}]},{"given":"Hao","family":"Ni","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University College London , London WC1H 0AY, United Kingdom"},{"name":"The Alan Turing Institute , London NW1 2DB, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7208-4387","authenticated-orcid":false,"given":"Ross D","family":"King","sequence":"additional","affiliation":[{"name":"Department of Chemical Engineering and Biotechnology, University of Cambridge , Cambridge CB3 0AS, United Kingdom"},{"name":"The Alan Turing Institute , London NW1 2DB, United Kingdom"},{"name":"Department of Biology and Biological Engineering, Chalmers University of Technology , Gothenburg 412 96, Sweden"},{"name":"Department of Computer Science and Engineering, Chalmers University of Technology , Gothenburg 412 96, Sweden"}]}],"member":"286","published-online":{"date-parts":[[2023,8,12]]},"reference":[{"key":"2023082911044167900_btad502-B1","doi-asserted-by":"crossref","first-page":"758","DOI":"10.1093\/bioinformatics\/btz665","article-title":"Learning from the ligand: using ligand-based features to improve binding affinity prediction","volume":"36","author":"Boyles","year":"2020","journal-title":"Bioinformatics"},{"key":"2023082911044167900_btad502-B2","doi-asserted-by":"crossref","first-page":"1079","DOI":"10.1021\/ci9000053","article-title":"Comparative assessment of scoring functions on a diverse test set","volume":"49","author":"Cheng","year":"2009","journal-title":"J Chem Inf Model"},{"first-page":"3371","year":"2018","author":"Gao","key":"2023082911044167900_btad502-B3"},{"key":"2023082911044167900_btad502-B4","doi-asserted-by":"crossref","first-page":"1616","DOI":"10.1021\/ja01062a035","article-title":"p-\u03c3-\u03c0 analysis. A method for the correlation of biological activity and chemical structure","volume":"86","author":"Hansch","year":"1964","journal-title":"J Am Chem Soc"},{"key":"2023082911044167900_btad502-B5","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1186\/s13321-018-0293-8","article-title":"PubChem chemical structure standardization","volume":"10","author":"H\u00e4hnke","year":"2018","journal-title":"J Cheminform"},{"key":"2023082911044167900_btad502-B6","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1021\/acs.jcim.7b00650","article-title":"KDEEP: protein\u2013ligand absolute binding affinity prediction via 3D-convolutional neural networks","volume":"58","author":"Jim\u00e9nez","year":"2018","journal-title":"J Chem Inf Model"},{"key":"2023082911044167900_btad502-B7","doi-asserted-by":"crossref","first-page":"3329","DOI":"10.1093\/bioinformatics\/btz111","article-title":"DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks","volume":"35","author":"Karimi","year":"2019","journal-title":"Bioinformatics"},{"key":"2023082911044167900_btad502-B8","doi-asserted-by":"crossref","first-page":"312","DOI":"10.2174\/138920307781369382","article-title":"Structure-based drug design: docking and scoring","volume":"8","author":"Kroemer","year":"2007","journal-title":"Curr Protein Pept Sci"},{"key":"2023082911044167900_btad502-B9","doi-asserted-by":"crossref","first-page":"e1465","DOI":"10.1002\/wcms.1465","article-title":"Machine-learning scoring functions for structure-based drug lead optimization","volume":"10","author":"Li","year":"2020","journal-title":"Wiley Interdiscip Rev Comput Mol Sci"},{"key":"2023082911044167900_btad502-B10","doi-asserted-by":"crossref","first-page":"1700","DOI":"10.1021\/ci500080q","article-title":"Comparative assessment of scoring functions on an updated benchmark: 1. Compilation of the test set","volume":"54","author":"Li","year":"2014","journal-title":"J Chem Inf Model"},{"key":"2023082911044167900_btad502-B11","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1021\/ci500731a","article-title":"Classification of current scoring functions","volume":"55","author":"Liu","year":"2015","journal-title":"J Chem Inf Model"},{"key":"2023082911044167900_btad502-B12","doi-asserted-by":"crossref","first-page":"3525","DOI":"10.1039\/D0CS00098A","article-title":"QSAR without borders","volume":"49","author":"Muratov","year":"2020","journal-title":"Chem Soc Rev"},{"key":"2023082911044167900_btad502-B13","doi-asserted-by":"crossref","first-page":"3291","DOI":"10.1021\/acs.jcim.9b00334","article-title":"AGL-score: algebraic graph learning score for protein\u2013ligand binding scoring, ranking, docking, and screening","volume":"59","author":"Nguyen","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2023082911044167900_btad502-B14","doi-asserted-by":"crossref","first-page":"e3179","DOI":"10.1002\/cnm.3179","article-title":"DG-GL: differential geometry-based geometric learning of molecular datasets","volume":"35","author":"Nguyen","year":"2019","journal-title":"Int J Numer Methods Biomed Eng"},{"key":"2023082911044167900_btad502-B15","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/1758-2946-3-33","article-title":"Open babel: an open chemical toolbox","volume":"3","author":"O'Boyle","year":"2011","journal-title":"J Cheminform"},{"key":"2023082911044167900_btad502-B16","doi-asserted-by":"crossref","first-page":"211745","DOI":"10.1098\/rsos.211745","article-title":"A simple spatial extension to the extended connectivity interaction features for binding affinity prediction","volume":"9","author":"Orhobor","year":"2022","journal-title":"R Soc Open Sci"},{"key":"2023082911044167900_btad502-B17","doi-asserted-by":"crossref","first-page":"1376","DOI":"10.1093\/bioinformatics\/btaa982","article-title":"Extended connectivity interaction features: improving binding affinity prediction through chemical description","volume":"37","author":"S\u00e1nchez-Cruz","year":"2021","journal-title":"Bioinformatics"},{"key":"2023082911044167900_btad502-B18","doi-asserted-by":"crossref","first-page":"895","DOI":"10.1021\/acs.jcim.8b00545","article-title":"Comparative assessment of scoring functions: the CASF-2016 update","volume":"59","author":"Su","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2023082911044167900_btad502-B19","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1093\/bioinformatics\/bty535","article-title":"Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences","volume":"35","author":"Tsubaki","year":"2019","journal-title":"Bioinformatics"},{"key":"2023082911044167900_btad502-B20","doi-asserted-by":"crossref","first-page":"7946","DOI":"10.1021\/acs.jmedchem.2c00487","article-title":"On the frustration to predict binding affinities from protein\u2013ligand structures with deep neural networks","volume":"65","author":"Volkov","year":"2022","journal-title":"J Med Chem"},{"key":"2023082911044167900_btad502-B21","doi-asserted-by":"crossref","first-page":"69","DOI":"10.3389\/fphar.2020.00069","article-title":"Predicting or pretending: artificial intelligence for protein-ligand interactions lack of sufficiently large and unbiased datasets","volume":"11","author":"Yang","year":"2020","journal-title":"Front Pharmacol"},{"key":"2023082911044167900_btad502-B22","doi-asserted-by":"crossref","first-page":"i821","DOI":"10.1093\/bioinformatics\/bty593","article-title":"DeepDTA: deep drug-target binding affinity prediction","volume":"34","author":"\u00d6zt\u00fcrk","year":"2018","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad502\/51102642\/btad502.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/8\/btad502\/51278938\/btad502.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/8\/btad502\/51278938\/btad502.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,29]],"date-time":"2023-08-29T11:05:05Z","timestamp":1693307105000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad502\/7241686"}},"subtitle":[],"editor":[{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,8,1]]},"references-count":22,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2023,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad502","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2023,8,1]]},"published":{"date-parts":[[2023,8,1]]},"article-number":"btad502"}}