{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T10:40:20Z","timestamp":1780396820624,"version":"3.54.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1011036","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,4,12]],"date-time":"2023-04-12T00:00:00Z","timestamp":1681257600000}}],"reference-count":54,"publisher":"Public Library of Science (PLoS)","issue":"3","license":[{"start":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T00:00:00Z","timestamp":1680220800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Drug-target binding affinity prediction plays a key role in the early stage of drug discovery. Numerous experimental and data-driven approaches have been developed for predicting drug-target binding affinity. However, experimental methods highly rely on the limited structural-related information from drug-target pairs, domain knowledge, and time-consuming assays. On the other hand, learning-based methods have shown an acceptable prediction performance. However, most of them utilize several simple and complex types of proteins and drug compounds data, ranging from the protein sequences to the topology of a graph representation of drug compounds, employing multiple deep neural networks for encoding and feature extraction, and so, leads to the computational overheads. In this study, we propose a unified measure for protein sequence encoding, named BiComp, which provides compression-based and evolutionary-related features from the protein sequences. Specifically, we employ Normalized Compression Distance and Smith-Waterman measures for capturing complementary information from the algorithmic information theory and biological domains, respectively. We utilize the proposed measure to encode the input proteins feeding a new deep neural network-based method for drug-target binding affinity prediction, named BiComp-DTA. BiComp-DTA is evaluated utilizing four benchmark datasets for drug-target binding affinity prediction. Compared to the state-of-the-art methods, which employ complex models for protein encoding and feature extraction, BiComp-DTA provides superior efficiency in terms of accuracy, runtime, and the number of trainable parameters. The latter achievement facilitates execution of BiComp-DTA on a normal desktop computer in a fast fashion. As a comparative study, we evaluate BiComp\u2019s efficiency against its components for drug-target binding affinity prediction. The results have shown superior accuracy of BiComp due to the orthogonality and complementary nature of Smith-Waterman and Normalized Compression Distance measures for protein sequences. Such a protein sequence encoding provides efficient representation with no need for multiple sources of information, deep domain knowledge, and complex neural networks.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1011036","type":"journal-article","created":{"date-parts":[[2023,3,31]],"date-time":"2023-03-31T18:17:02Z","timestamp":1680286622000},"page":"e1011036","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":52,"title":["BiComp-DTA: Drug-target binding affinity prediction through complementary biological-related and compression-based featurization approach"],"prefix":"10.1371","volume":"19","author":[{"given":"Mahmood","family":"Kalemati","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Mojtaba","family":"Zamani Emani","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3105-2511","authenticated-orcid":true,"given":"Somayyeh","family":"Koohi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"340","published-online":{"date-parts":[[2023,3,31]]},"reference":[{"issue":"4","key":"pcbi.1011036.ref001","doi-asserted-by":"crossref","first-page":"696","DOI":"10.1093\/bib\/bbv066","article-title":"Drug\u2013target interaction prediction: databases, web servers and computational models","volume":"17","author":"X Chen","year":"2016","journal-title":"Briefings in bioinformatics"},{"issue":"2","key":"pcbi.1011036.ref002","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1006\/meth.1999.0852","article-title":"Isothermal titration calorimetry of protein\u2013protein interactions.","volume":"19","author":"MM Pierce","year":"1999","journal-title":"Methods."},{"issue":"1","key":"pcbi.1011036.ref003","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1517\/17460441.2011.537322","article-title":"Fluorescence polarization assays in small molecule screening.","volume":"6","author":"WA Lea","year":"2011","journal-title":"Expert opinion on drug discovery."},{"issue":"1\u20132","key":"pcbi.1011036.ref004","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/S0925-4005(98)00321-9","article-title":"Surface plasmon resonance sensors.","volume":"54","author":"J Homola","year":"1999","journal-title":"Sensors and actuators B: Chemical"},{"key":"pcbi.1011036.ref005","doi-asserted-by":"crossref","first-page":"113608","DOI":"10.1016\/j.ab.2020.113608","article-title":"A systematic approach to quantitative Western blot analysis","volume":"593","author":"L Pillai-Kastoori","year":"2020","journal-title":"Analytical biochemistry"},{"issue":"9","key":"pcbi.1011036.ref006","doi-asserted-by":"crossref","first-page":"1193","DOI":"10.1007\/s12272-016-0791-z","article-title":"Target identification for biologically active small molecules using chemical biology approaches","volume":"39","author":"H Lee","year":"2016","journal-title":"Archives of pharmacal research"},{"issue":"1","key":"pcbi.1011036.ref007","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.drudis.2015.08.001","article-title":"Identifying compound efficacy targets in phenotypic drug discovery","volume":"21","author":"M Schirle","year":"2016","journal-title":"Drug discovery today"},{"issue":"2","key":"pcbi.1011036.ref008","doi-asserted-by":"crossref","first-page":"325","DOI":"10.1093\/bib\/bbu010","article-title":"Toward more realistic drug\u2013target interaction predictions","volume":"16","author":"T Pahikkala","year":"2015","journal-title":"Briefings in bioinformatics"},{"issue":"1","key":"pcbi.1011036.ref009","first-page":"1","article-title":"SimBoost: a read-across approach for predicting drug\u2013target binding affinities using gradient boosting machines","volume":"9","author":"T He","year":"2017","journal-title":"Journal of cheminformatics"},{"issue":"1","key":"pcbi.1011036.ref010","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-021-83679-y","article-title":"Prediction of drug\u2013target binding affinity using similarity-based convolutional neural network.","volume":"11","author":"J Shim","year":"2021","journal-title":"Scientific Reports."},{"key":"pcbi.1011036.ref011","doi-asserted-by":"crossref","first-page":"115810","DOI":"10.1016\/j.eswa.2021.115810","article-title":"Drug-target continuous binding affinity prediction using multiple sources of information","volume":"186","author":"B Tanoori","year":"2021","journal-title":"Expert Systems with Applications"},{"issue":"7","key":"pcbi.1011036.ref012","doi-asserted-by":"crossref","first-page":"1964","DOI":"10.1093\/bioinformatics\/btac048","article-title":"NerLTR-DTA: drug\u2013target binding affinity prediction based on neighbor relationship and learning to rank","volume":"38","author":"X Ru","year":"2022","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1011036.ref013","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"TF Smith","year":"1981","journal-title":"Journal of molecular biology"},{"key":"pcbi.1011036.ref014","article-title":"ABSOLUTE MACHINE LEARNING: Answer Every Question","author":"P. Jain","year":"2020","journal-title":"Prachi Jain;"},{"issue":"17","key":"pcbi.1011036.ref015","doi-asserted-by":"crossref","first-page":"i821","DOI":"10.1093\/bioinformatics\/bty593","article-title":"DeepDTA: deep drug\u2013target binding affinity prediction","volume":"34","author":"H \u00d6zt\u00fcrk","year":"2018","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1011036.ref016","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules","volume":"28","author":"D. Weininger","year":"1988","journal-title":"Journal of chemical information and computer sciences"},{"key":"pcbi.1011036.ref017","article-title":"WideDTA: prediction of drug-target binding affinity","author":"H \u00d6zt\u00fcrk","year":"2019","journal-title":"arXiv preprint arXiv:1902.04166"},{"key":"pcbi.1011036.ref018","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1109\/BIBM47256.2019.8983125","article-title":"AttentionDTA: prediction of drug\u2013target binding affinity using attention model.","author":"Q Zhao","year":"2019","journal-title":"In2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)"},{"key":"pcbi.1011036.ref019","article-title":"Deep drug-target binding affinity prediction with multiple attention blocks","author":"Y Zeng","year":"2021","journal-title":"Briefings in bioinformatics"},{"key":"pcbi.1011036.ref020","article-title":"A Mutual Attention Model for Drug Target Binding Affinity Prediction","author":"N. Aleb","year":"2021","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"issue":"6","key":"pcbi.1011036.ref021","doi-asserted-by":"crossref","first-page":"4659","DOI":"10.1007\/s11063-021-10617-4","article-title":"Multilevel Attention Models for Drug Target Binding Affinity Prediction","volume":"53","author":"N. Aleb","year":"2021","journal-title":"Neural Processing Letters"},{"key":"pcbi.1011036.ref022","first-page":"30","article-title":"Attention is all you need","author":"A Vaswani","year":"2017","journal-title":"Advances in neural information processing systems"},{"key":"pcbi.1011036.ref023","unstructured":"Shin B, Park S, Kang K, Ho JC. Self-attention based molecule representation for predicting drug-target interaction. In Machine Learning for Healthcare Conference 2019 Oct 28 (pp. 230\u2013248). PMLR."},{"issue":"1","key":"pcbi.1011036.ref024","doi-asserted-by":"crossref","first-page":"bbab506","DOI":"10.1093\/bib\/bbab506","article-title":"FusionDTA: attention-based feature polymerizer and knowledge distillation for drug-target binding affinity prediction","author":"W Yuan","year":"2022","journal-title":"Briefings in Bioinformatics"},{"issue":"15","key":"pcbi.1011036.ref025","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"A Rives","year":"2021","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"pcbi.1011036.ref026","article-title":"Reformer: The efficient transformer.","author":"N Kitaev","year":"2020","journal-title":"arXiv preprint arXiv:2001.04451"},{"issue":"8","key":"pcbi.1011036.ref027","doi-asserted-by":"crossref","first-page":"1140","DOI":"10.1093\/bioinformatics\/btaa921","article-title":"GraphDTA: Predicting drug\u2013target binding affinity with graph neural networks","volume":"37","author":"T Nguyen","year":"2021","journal-title":"Bioinformatics"},{"issue":"35","key":"pcbi.1011036.ref028","doi-asserted-by":"crossref","first-page":"20701","DOI":"10.1039\/D0RA02297G","article-title":"Drug\u2013target affinity prediction using graph neural network and contact maps.","volume":"10","author":"M Jiang","year":"2020","journal-title":"RSC Advances"},{"key":"pcbi.1011036.ref029","article-title":"MGraphDTA: deep multiscale graph neural network for explainable drug\u2013target binding affinity prediction.","author":"Z Yang","year":"2022","journal-title":"Chemical Science"},{"key":"pcbi.1011036.ref030","doi-asserted-by":"crossref","first-page":"170433","DOI":"10.1109\/ACCESS.2020.3024238","article-title":"DeepH-DTA: deep learning for predicting drug-target interactions: a case study of COVID-19 drug repurposing.","volume":"8","author":"M Abdel-Basset","year":"2020","journal-title":"Ieee Access."},{"key":"pcbi.1011036.ref031","doi-asserted-by":"crossref","first-page":"2022","DOI":"10.1145\/3308558.3313562","article-title":"Heterogeneous graph attention network","author":"X Wang","year":"2019","journal-title":"InThe world wide web conference"},{"key":"pcbi.1011036.ref032","article-title":"Convolutional LSTM network: A machine learning approach for precipitation nowcasting","volume":"28","author":"X Shi","year":"2015","journal-title":"Advances in neural information processing systems"},{"key":"pcbi.1011036.ref033","unstructured":"Knyazev, B., Taylor, G.W., Amer, M., 2019. Understanding attention and generalization ingraph neural networks. In: Proceedings of NeurIPS, pp. 4202\u20134212."},{"key":"pcbi.1011036.ref034","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.aiopen.2021.01.001","article-title":"Graph neural networks: A review of methods and applications.","volume":"1","author":"J Zhou","year":"2020","journal-title":"AI Open."},{"key":"pcbi.1011036.ref035","unstructured":"Garg V, Jegelka S, Jaakkola T. Generalization and representational limits of graph neural networks. In International Conference on Machine Learning 2020 Nov 21 (pp. 3419\u20133430). PMLR."},{"key":"pcbi.1011036.ref036","article-title":"7z format\u201d, http:\/\/www.7zip.org\/7z.html","author":"Igor Pavlov","year":"2022","journal-title":"Last visited"},{"issue":"9","key":"pcbi.1011036.ref037","doi-asserted-by":"crossref","first-page":"1396","DOI":"10.1093\/bioinformatics\/btv006","article-title":"Integrating alignment-based and alignment-free sequence similarity measures for biological sequence classification","volume":"31","author":"I Borozan","year":"2015","journal-title":"Bioinformatics"},{"issue":"4","key":"pcbi.1011036.ref038","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1093\/bioinformatics\/bti806","article-title":"Application of compression-based distance measures to protein sequence classification: a methodological study","volume":"22","author":"A Kocsor","year":"2006","journal-title":"Bioinformatics"},{"issue":"3","key":"pcbi.1011036.ref039","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1002\/qsar.200710043","article-title":"On some aspects of variable selection for partial least squares regression models","volume":"27","author":"PP Roy","year":"2008","journal-title":"QSAR & Combinatorial Science"},{"issue":"11","key":"pcbi.1011036.ref040","doi-asserted-by":"crossref","first-page":"1046","DOI":"10.1038\/nbt.1990","article-title":"Comprehensive analysis of kinase inhibitor selectivity","volume":"29","author":"MI Davis","year":"2011","journal-title":"Nature biotechnology"},{"issue":"3","key":"pcbi.1011036.ref041","doi-asserted-by":"crossref","first-page":"735","DOI":"10.1021\/ci400709d","article-title":"Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis","volume":"54","author":"J Tang","year":"2014","journal-title":"Journal of Chemical Information and Modeling"},{"key":"pcbi.1011036.ref042","article-title":"Therapeutics data commons: Machine learning datasets and tasks for drug discovery and development.","author":"K Huang","year":"2021","journal-title":"arXiv preprint arXiv:2102.09548."},{"issue":"12","key":"pcbi.1011036.ref043","doi-asserted-by":"crossref","first-page":"2977","DOI":"10.1021\/jm030580l","article-title":"The PDBbind database: Collection of binding affinities for protein\u2212 ligand complexes with known three-dimensional structures","volume":"47","author":"R Wang","year":"2004","journal-title":"Journal of medicinal chemistry"},{"key":"pcbi.1011036.ref044","article-title":"Mitigating cold start problems in drug-target affinity prediction with interaction knowledge transferring","author":"TM Nguyen","year":"2022","journal-title":"arXiv preprint arXiv:2202.01195"},{"issue":"10","key":"pcbi.1011036.ref045","doi-asserted-by":"crossref","first-page":"2819","DOI":"10.1021\/acschembio.8b00881","article-title":"Adversarial Controls for Scientific Machine Learning.","volume":"13","author":"KV Chuang","year":"2018","journal-title":"ACS Chemical Biology"},{"issue":"12","key":"pcbi.1011036.ref046","doi-asserted-by":"crossref","first-page":"5957","DOI":"10.1021\/acs.jcim.0c00565","article-title":"Adding stochastic negative examples into machine learning improves molecular bioactivity prediction","volume":"60","author":"EL C\u00e1ceres","year":"2020","journal-title":"Journal of Chemical Information and Modeling"},{"key":"pcbi.1011036.ref047","volume-title":"Bioinformatics: Sequence and Genome Analysis","author":"DM Mount","year":"2004"},{"issue":"4","key":"pcbi.1011036.ref048","doi-asserted-by":"crossref","first-page":"1523","DOI":"10.1109\/TIT.2005.844059","article-title":"Clustering by compression","volume":"51","author":"R Cilibrasi","year":"2005","journal-title":"IEEE Transactions on Information theory"},{"issue":"2","key":"pcbi.1011036.ref049","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1093\/bioinformatics\/17.2.149","article-title":"An information-based sequence distance and its application to whole mitochondrial genome phylogeny","volume":"17","author":"M Li","year":"2001","journal-title":"Bioinformatics"},{"issue":"6","key":"pcbi.1011036.ref050","doi-asserted-by":"crossref","first-page":"393","DOI":"10.3390\/e20060393","article-title":"Comparison of compression-based measures with application to the evolution of primate genomes","volume":"20","author":"D Pratas","year":"2018","journal-title":"Entropy"},{"issue":"1","key":"pcbi.1011036.ref051","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-017-1319-7","article-title":"Alignment-free sequence comparison: benefits, applications, and tools","volume":"18","author":"A Zielezinski","year":"2017","journal-title":"Genome biology"},{"key":"pcbi.1011036.ref052","volume-title":"An introduction to Kolmogorov complexity and its applications.","author":"Springer","year":"2008"},{"issue":"1","key":"pcbi.1011036.ref053","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1007\/BF01386329","article-title":"The norm of the Schur product operation","volume":"4","author":"C. Davis","year":"1962","journal-title":"Numerische Mathematik."},{"key":"pcbi.1011036.ref054","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/S0024-3795(98)10162-3","article-title":"Hadamard inverses, square roots and products of almost semidefinite matrices","volume":"288","author":"R. Reams","year":"1999","journal-title":"Linear Algebra and its Applications"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1011036","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,4,12]],"date-time":"2023-04-12T00:00:00Z","timestamp":1681257600000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011036","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,12]],"date-time":"2023-04-12T17:44:33Z","timestamp":1681321473000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011036"}},"subtitle":[],"editor":[{"given":"Avner","family":"Schlessinger","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2023,3,31]]},"references-count":54,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,3,31]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1011036","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1011036","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,31]]}}}