{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,16]],"date-time":"2026-04-16T22:47:33Z","timestamp":1776379653969,"version":"3.51.2"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2019,2,15]],"date-time":"2019-02-15T00:00:00Z","timestamp":1550188800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R35GM124952"],"award-info":[{"award-number":["R35GM124952"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000185","name":"Defense Advanced Research Projects Agency","doi-asserted-by":"publisher","award":["FA8750-18-2-0027"],"award-info":[{"award-number":["FA8750-18-2-0027"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100019038","name":"Texas A&M High Performance Research Computing","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100019038","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Drug discovery demands rapid quantification of compound\u2013protein interaction (CPI). However, there is a lack of methods that can predict compound\u2013protein affinity from sequences alone with high applicability, accuracy and interpretability.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We present a seamless integration of domain knowledges and learning-based approaches. Under novel representations of structurally annotated protein sequences, a semi-supervised deep learning model that unifies recurrent and convolutional neural networks has been proposed to exploit both unlabeled and labeled data, for jointly encoding molecular representations and predicting affinities. Our representations and models outperform conventional options in achieving relative error in IC50 within 5-fold for test cases and 20-fold for protein classes not included for training. Performances for new protein classes with few labeled data are further improved by transfer learning. Furthermore, separate and joint attention mechanisms are developed and embedded to our model to add to its interpretability, as illustrated in case studies for predicting and explaining selective drug\u2013target interactions. Lastly, alternative representations using protein sequences or compound graphs and a unified RNN\/GCNN-CNN model using graph CNN (GCNN) are also explored to reveal algorithmic challenges ahead.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Data and source codes are available at https:\/\/github.com\/Shen-Lab\/DeepAffinity.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz111","type":"journal-article","created":{"date-parts":[[2019,2,15]],"date-time":"2019-02-15T09:12:38Z","timestamp":1550221958000},"page":"3329-3338","source":"Crossref","is-referenced-by-count":450,"title":["DeepAffinity: interpretable deep learning of compound\u2013protein affinity through unified recurrent and convolutional neural networks"],"prefix":"10.1093","volume":"35","author":[{"given":"Mostafa","family":"Karimi","sequence":"first","affiliation":[{"name":"Department of Electrical and Computer Engineering , College Station, TX, USA"},{"name":"TEES\u2013AgriLife Center for Bioinformatics and Genomic Systems Engineering, College Station , TX, USA"}]},{"given":"Di","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering , College Station, TX, USA"}]},{"given":"Zhangyang","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Texas A&M University, College Station , TX, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1703-7796","authenticated-orcid":false,"given":"Yang","family":"Shen","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering , College Station, TX, USA"},{"name":"TEES\u2013AgriLife Center for Bioinformatics and Genomic Systems Engineering, College Station , TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,2,15]]},"reference":[{"key":"2023020108350783100_btz111-B1","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1002\/wcms.1225","article-title":"Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening","volume":"5","author":"Ain","year":"2015","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci"},{"key":"2023020108350783100_btz111-B2","doi-asserted-by":"crossref","first-page":"29988","DOI":"10.1074\/jbc.271.47.29988","article-title":"X-ray structure of active site-inhibited clotting factor xa implications for drug design and substrate recognition","volume":"271","author":"Brandstetter","year":"1996","journal-title":"J. Biol. Chem"},{"key":"2023020108350783100_btz111-B3","doi-asserted-by":"crossref","first-page":"e1005690.","DOI":"10.1371\/journal.pcbi.1005690","article-title":"TopologyNet: topology based deep convolutional and multi-task neural networks for biomolecular property predictions","volume":"13","author":"Cang","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023020108350783100_btz111-B4","doi-asserted-by":"crossref","first-page":"e1000938","DOI":"10.1371\/journal.pcbi.1000938","article-title":"Drug off-target effects predicted using structural analysis in the context of a metabolic network model","volume":"6","author":"Chang","year":"2010","journal-title":"PLoS Comput. Biol"},{"key":"2023020108350783100_btz111-B5","doi-asserted-by":"crossref","first-page":"696","DOI":"10.1093\/bib\/bbv066","article-title":"Drug\u2013target interaction prediction: databases, web servers and computational models","volume":"17","author":"Chen","year":"2016","journal-title":"Brief. Bioinf"},{"key":"2023020108350783100_btz111-B6","doi-asserted-by":"crossref","first-page":"2373","DOI":"10.1039\/c2mb25110h","article-title":"Prediction of chemical\u2013protein interactions: multitarget-qsar versus computational chemogenomic methods","volume":"8","author":"Cheng","year":"2012","journal-title":"Mol. BioSyst"},{"key":"2023020108350783100_btz111-B7","doi-asserted-by":"crossref","first-page":"W72","DOI":"10.1093\/nar\/gki396","article-title":"Scratch: a protein structure and structural feature prediction server","volume":"33","author":"Cheng","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B8","doi-asserted-by":"crossref","first-page":"1832","DOI":"10.1109\/TCBB.2016.2570211","article-title":"Effectively identifying compound\u2013protein interactions by learning from positive and unlabeled examples","volume":"15","author":"Cheng","year":"2016","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinf"},{"key":"2023020108350783100_btz111-B9","doi-asserted-by":"crossref","first-page":"103","DOI":"10.3115\/v1\/W14-4012","article-title":"On the properties of neural machine translation: encoder\u2013decoder approaches","author":"Cho","year":"2014","journal-title":"Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation"},{"key":"2023020108350783100_btz111-B10","doi-asserted-by":"crossref","first-page":"1757","DOI":"10.1021\/acs.jcim.6b00601","article-title":"Convolutional embedding of attributed molecular graphs for physical property prediction","volume":"57","author":"Coley","year":"2017","journal-title":"J. Chem. Inf. Model"},{"key":"2023020108350783100_btz111-B11","doi-asserted-by":"crossref","first-page":"D292","DOI":"10.1093\/nar\/gkt940","article-title":"Pdbsum additions","volume":"42","author":"De Beer","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B12","doi-asserted-by":"crossref","first-page":"391.","DOI":"10.1002\/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9","article-title":"Indexing by latent semantic analysis","volume":"41","author":"Deerwester","year":"1990","journal-title":"J. Am. Soc. Inf. Sci"},{"key":"2023020108350783100_btz111-B13","doi-asserted-by":"crossref","first-page":"D222","DOI":"10.1093\/nar\/gkt1223","article-title":"Pfam: the protein families database","volume":"42","author":"Finn","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B14","doi-asserted-by":"crossref","first-page":"W30","DOI":"10.1093\/nar\/gkv397","article-title":"Hmmer web server: 2015 update","volume":"43","author":"Finn","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B15","first-page":"3371","author":"Gao","year":"2018"},{"key":"2023020108350783100_btz111-B16","first-page":"1263","article-title":"Neural message passing for quantum chemistry","volume":"70","author":"Gilmer","year":"2017","journal-title":"Proceedings of the 34th International Conference on Machine Learning, Sydney"},{"key":"2023020108350783100_btz111-B17","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1146\/annurev.biophys.36.040306.132550","article-title":"Calculation of protein\u2013ligand binding affinities","volume":"36","author":"Gilson","year":"2007","journal-title":"Annu. Rev. Biophys. Biomol. Struct"},{"key":"2023020108350783100_btz111-B18","article-title":"Atomic convolutional networks for predicting protein\u2013ligand binding affinity","author":"Gomes","year":"2017","journal-title":"arXiv Preprint arXiv: 1703.10603"},{"key":"2023020108350783100_btz111-B19","doi-asserted-by":"crossref","first-page":"1424","DOI":"10.1021\/jm2010332","article-title":"Rational approaches to improving selectivity in drug design","volume":"55","author":"Huggins","year":"2012","journal-title":"J. Med. Chem"},{"key":"2023020108350783100_btz111-B20","doi-asserted-by":"crossref","first-page":"10300","DOI":"10.1074\/jbc.275.14.10300","article-title":"Structure-based design of a low molecular weight, nonphosphorus, nonpeptide, and highly selective inhibitor of protein\u2013tyrosine phosphatase 1b","volume":"275","author":"Iversen","year":"2000","journal-title":"J. Biol. Chem"},{"key":"2023020108350783100_btz111-B21","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1021\/acs.jcim.7b00650","article-title":"KDEEP: protein\u2013ligand absolute binding affinity prediction via 3D-convolutional neural networks","volume":"58","author":"Jimenez","year":"2018","journal-title":"J. Chem. Inf. Model"},{"key":"2023020108350783100_btz111-B22","first-page":"2323","article-title":"Junction tree variational autoencoder for molecular graph generation","author":"Jin","year":"2018","journal-title":"Proceedings of the 35th International Conference on Machine Learning"},{"key":"2023020108350783100_btz111-B23","first-page":"1700","volume-title":"Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing","author":"Kalchbrenner","year":"2013"},{"key":"2023020108350783100_btz111-B24","doi-asserted-by":"crossref","first-page":"175.","DOI":"10.1038\/nature08506","article-title":"Predicting new molecular targets for known drugs","volume":"462","author":"Keiser","year":"2009","journal-title":"Nature"},{"key":"2023020108350783100_btz111-B25","first-page":"1885","author":"Koh","year":"2017"},{"key":"2023020108350783100_btz111-B26","doi-asserted-by":"crossref","first-page":"D684","DOI":"10.1093\/nar\/gkm795","article-title":"Stitch: interaction networks of chemicals and proteins","volume":"36","author":"Kuhn","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B27","doi-asserted-by":"crossref","first-page":"5851","DOI":"10.1021\/jm060999m","article-title":"Prediction of protein\u2013ligand interactions. Docking and scoring: successes and gaps","volume":"49","author":"Leach","year":"2006","journal-title":"J. Med. Chem"},{"key":"2023020108350783100_btz111-B28","article-title":"Independently recurrent neural network (indrnn): building A longer and deeper RNN","author":"Li","year":"2018","journal-title":"CoRR"},{"key":"2023020108350783100_btz111-B29","doi-asserted-by":"crossref","first-page":"D198","DOI":"10.1093\/nar\/gkl999","article-title":"Bindingdb: a web-accessible database of experimentally determined protein\u2013ligand binding affinities","volume":"35","author":"Liu","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B30","first-page":"289","author":"Lu","year":"2016"},{"key":"2023020108350783100_btz111-B31","doi-asserted-by":"crossref","first-page":"573.","DOI":"10.1038\/s41467-017-00680-8","article-title":"A network integration approach for drug\u2013target interaction prediction and computational drug repositioning from heterogeneous information","volume":"8","author":"Luo","year":"2017","journal-title":"Nat. Commun"},{"key":"2023020108350783100_btz111-B32","doi-asserted-by":"crossref","first-page":"2592","DOI":"10.1093\/bioinformatics\/btu352","article-title":"Sspro\/accpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity","volume":"30","author":"Magnan","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020108350783100_btz111-B33","doi-asserted-by":"crossref","first-page":"80.","DOI":"10.3389\/fenvs.2015.00080","article-title":"Deeptox: toxicity prediction using deep learning","volume":"3","author":"Mayr","year":"2016","journal-title":"Front. Environ. Sci"},{"key":"2023020108350783100_btz111-B34","article-title":"Efficient estimation of word representations in vector space","author":"Mikolov","year":"2013","journal-title":"arXiv Preprint arXiv: 1301.3781"},{"key":"2023020108350783100_btz111-B35","doi-asserted-by":"crossref","first-page":"2063","DOI":"10.1001\/jama.2014.3002","article-title":"Genomics-enabled drug repositioning and repurposing: insights from an IOM Roundtable activity","volume":"311","author":"Power","year":"2014","journal-title":"JAMA"},{"key":"2023020108350783100_btz111-B36","first-page":"1135","author":"Ribeiro","year":"2016"},{"key":"2023020108350783100_btz111-B37","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1038\/nrd.2016.230","article-title":"A comprehensive map of molecular drug targets","volume":"16","author":"Santos","year":"2017","journal-title":"Nat. Rev. Drug Discov"},{"key":"2023020108350783100_btz111-B38","first-page":"41","author":"Shi","year":"2013"},{"key":"2023020108350783100_btz111-B39","first-page":"1139","author":"Sutskever","year":"2013"},{"key":"2023020108350783100_btz111-B40","first-page":"3104","author":"Sutskever","year":"2014"},{"key":"2023020108350783100_btz111-B41","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"Uniref clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"Suzek","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020108350783100_btz111-B42","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/1752-0509-7-S6-S3","article-title":"Scalable prediction of compound\u2013protein interactions using minwise hashing","volume":"7","author":"Tabei","year":"2013","journal-title":"BMC Syst. Biol"},{"key":"2023020108350783100_btz111-B43","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1016\/j.ymeth.2016.06.024","article-title":"Boosting compound\u2013protein interaction prediction by deep learning","volume":"110","author":"Tian","year":"2016","journal-title":"Methods"},{"key":"2023020108350783100_btz111-B44","article-title":"Atomnet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery","author":"Wallach","year":"2015","journal-title":"arXiv Preprint arXiv: 1510.02855"},{"key":"2023020108350783100_btz111-B45","first-page":"086033","article-title":"Deep learning with feature embedding for compound\u2013protein interaction prediction","author":"Wan","year":"2016","journal-title":"bioRxiv"},{"key":"2023020108350783100_btz111-B46","doi-asserted-by":"crossref","first-page":"i126","DOI":"10.1093\/bioinformatics\/btt234","article-title":"Predicting drug\u2013target interactions using restricted Boltzmann machines","volume":"29","author":"Wang","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020108350783100_btz111-B47","doi-asserted-by":"crossref","first-page":"W623","DOI":"10.1093\/nar\/gkp456","article-title":"Pubchem: a public information system for analyzing bioactivities of small molecules","volume":"37","author":"Wang","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B48","doi-asserted-by":"crossref","first-page":"W430","DOI":"10.1093\/nar\/gkw306","article-title":"Raptorx-property: a web server for protein structure property prediction","volume":"44","author":"Wang","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023020108350783100_btz111-B49","first-page":"4792","author":"Wang","year":"2016"},{"key":"2023020108350783100_btz111-B50","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J. Chem. Inf. Comput. Sci"},{"key":"2023020108350783100_btz111-B51","first-page":"285","author":"Xu","year":"2017"},{"key":"2023020108350783100_btz111-B52","doi-asserted-by":"crossref","first-page":"e37608","DOI":"10.1371\/journal.pone.0037608","article-title":"A systematic prediction of multiple drug\u2013target interactions from chemical, genomic, and pharmacological data","volume":"7","author":"Yu","year":"2012","journal-title":"PLoS One"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/18\/3329\/48975803\/bioinformatics_35_18_3329.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/18\/3329\/48975803\/bioinformatics_35_18_3329.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T14:42:19Z","timestamp":1675262539000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/18\/3329\/5320555"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,2,15]]},"references-count":52,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2019,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz111","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/351601","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,9,15]]},"published":{"date-parts":[[2019,2,15]]}}}