{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T05:22:38Z","timestamp":1775798558410,"version":"3.50.1"},"reference-count":48,"publisher":"Oxford University Press (OUP)","issue":"9","license":[{"start":{"date-parts":[[2024,9,2]],"date-time":"2024-09-02T00:00:00Z","timestamp":1725235200000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["NSF-III2246796"],"award-info":[{"award-number":["NSF-III2246796"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["NSF-III2152030"],"award-info":[{"award-number":["NSF-III2152030"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The identification and understanding of drug\u2013target interactions (DTIs) play a pivotal role in the drug discovery and development process. Sequence representations of drugs and proteins in computational models offer advantages such as widespread availability, easier input quality control, and reduced computational resource requirements. These advantages make them efficient and accessible tools for various computational biology and drug discovery applications. Many sequence-based DTI prediction methods have been developed over the years. 
Despite these methodological advances, cold start DTI prediction involving an unknown drug or protein remains a challenging task, particularly for sequence-based models. We introduce DTI-LM, a novel framework that leverages advanced pretrained language models, harnessing their exceptional context-capturing abilities along with neighborhood information to predict DTIs. DTI-LM is specifically designed to rely solely on sequence representations for drugs and proteins, aiming to bridge the gap between warm start and cold start predictions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Large-scale experiments on four datasets show that DTI-LM can achieve state-of-the-art performance on DTI predictions. Notably, it excels in overcoming the common challenges faced by sequence-based models in cold start predictions for proteins, yielding impressive results. The incorporation of neighborhood information through a graph attention network further enhances prediction accuracy. Nevertheless, a disparity persists between cold start predictions for proteins and drugs. 
A detailed examination of DTI-LM reveals that language models exhibit contrasting capabilities in capturing similarities between drugs and proteins.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Source code is available at: https:\/\/github.com\/compbiolabucf\/DTI-LM.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae533","type":"journal-article","created":{"date-parts":[[2024,9,2]],"date-time":"2024-09-02T13:47:35Z","timestamp":1725284855000},"source":"Crossref","is-referenced-by-count":37,"title":["DTI-LM: language model powered drug\u2013target interaction prediction"],"prefix":"10.1093","volume":"40","author":[{"given":"Khandakar Tanvir","family":"Ahmed","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]},{"name":"Genomics and Bioinformatics Cluster, University of Central Florida , Orlando, FL 32816,","place":["United States"]}]},{"given":"Md Istiaq","family":"Ansari","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]},{"name":"Genomics and Bioinformatics Cluster, University of Central Florida , Orlando, FL 32816,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3605-9373","authenticated-orcid":false,"given":"Wei","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]},{"name":"Genomics and Bioinformatics Cluster, University of Central Florida , Orlando, FL 32816,","place":["United 
States"]}]}],"member":"286","published-online":{"date-parts":[[2024,9,2]]},"reference":[{"key":"2024102914113465300_btae533-B1","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2024102914113465300_btae533-B2","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbae293","article-title":"Hierarchical multimodal self-attention-based graph neural network for DTI prediction","volume":"25","author":"Bian","year":"2024","journal-title":"Brief Bioinform"},{"key":"2024102914113465300_btae533-B3","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","article-title":"ProteinBERT: a universal deep-learning model of protein sequence and function","volume":"38","author":"Brandes","year":"2022","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B4","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1038\/d41573-021-00190-9","article-title":"Clinical development times for innovative drugs","volume":"21","author":"Brown","year":"2021","journal-title":"Nat Rev Drug Discov"},{"key":"2024102914113465300_btae533-B5","doi-asserted-by":"crossref","first-page":"4406","DOI":"10.1093\/bioinformatics\/btaa524","article-title":"TransformerCPI: improving compound\u2013protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments","volume":"36","author":"Chen","year":"2020","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B6","doi-asserted-by":"crossref","first-page":"2208","DOI":"10.1109\/TCBB.2021.3077905","article-title":"Drug\u2013target interaction prediction using multi-head self-attention and graph attention network","volume":"19","author":"Cheng","year":"2022","journal-title":"IEEE\/ACM Trans Comput Biol 
Bioinform"},{"key":"2024102914113465300_btae533-B7","author":"Chithrananda","year":"2020"},{"key":"2024102914113465300_btae533-B8","author":"Devlin","year":"2018"},{"key":"2024102914113465300_btae533-B9","doi-asserted-by":"crossref","first-page":"7112","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"ProtTrans: toward understanding the language of life through self-supervised learning","volume":"44","author":"Elnaggar","year":"2021","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2024102914113465300_btae533-B10","doi-asserted-by":"crossref","first-page":"1297","DOI":"10.1038\/s42256-023-00740-3","article-title":"Neural scaling of deep chemical models","volume":"5","author":"Frey","year":"2023","journal-title":"Nat Mach Intell"},{"key":"2024102914113465300_btae533-B11","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1093\/bioinformatics\/btaa880","article-title":"MolTrans: molecular interaction transformer for drug\u2013target interaction prediction","volume":"37","author":"Huang","year":"2021","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B12","author":"HuggingFace"},{"key":"2024102914113465300_btae533-B13","doi-asserted-by":"crossref","first-page":"bbac016","DOI":"10.1093\/bib\/bbac016","article-title":"Identifying drug\u2013target interactions via heterogeneous graph attention networks combined with cross-modal similarities","volume":"23","author":"Jiang","year":"2022","journal-title":"Brief Bioinform"},{"key":"2024102914113465300_btae533-B14","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2024102914113465300_btae533-B15","doi-asserted-by":"crossref","first-page":"2706","DOI":"10.1021\/acsomega.1c05203","article-title":"TransDTI: transformer-based language models for estimating DTIs and building a drug recommendation 
workflow","volume":"7","author":"Kalakoti","year":"2022","journal-title":"ACS Omega"},{"key":"2024102914113465300_btae533-B16","doi-asserted-by":"crossref","first-page":"1710","DOI":"10.3390\/pharmaceutics14081710","article-title":"Fine-tuning of Bert model to accurately predict drug\u2013target interactions","volume":"14","author":"Kang","year":"2022","journal-title":"Pharmaceutics"},{"key":"2024102914113465300_btae533-B17","doi-asserted-by":"crossref","first-page":"D1373","DOI":"10.1093\/nar\/gkac956","article-title":"PubChem 2023 update","volume":"51","author":"Kim","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024102914113465300_btae533-B18","doi-asserted-by":"crossref","first-page":"D1091","DOI":"10.1093\/nar\/gkt1068","article-title":"DrugBank 4.0: shedding new light on drug metabolism","volume":"42","author":"Law","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2024102914113465300_btae533-B19","doi-asserted-by":"crossref","first-page":"3582","DOI":"10.1093\/bioinformatics\/btac377","article-title":"Effective drug\u2013target interaction prediction with mutual interaction neural network","volume":"38","author":"Li","year":"2022","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B20","author":"Liaw","year":"2018"},{"key":"2024102914113465300_btae533-B21","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2024102914113465300_btae533-B22","doi-asserted-by":"crossref","first-page":"D198","DOI":"10.1093\/nar\/gkl999","article-title":"BindingDB: a web-accessible database of experimentally determined protein\u2013ligand binding affinities","volume":"35","author":"Liu","year":"2007","journal-title":"Nucleic Acids 
Res"},{"key":"2024102914113465300_btae533-B23","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1038\/s41467-017-00680-8","article-title":"A network integration approach for drug\u2013target interaction prediction and computational drug repositioning from heterogeneous information","volume":"8","author":"Luo","year":"2017","journal-title":"Nat Commun"},{"key":"2024102914113465300_btae533-B24","doi-asserted-by":"crossref","first-page":"1679","DOI":"10.1093\/bib\/bbaa012","article-title":"Biological applications of knowledge graph embedding models","volume":"22","author":"Mohamed","year":"2021","journal-title":"Brief Bioinform"},{"key":"2024102914113465300_btae533-B25","doi-asserted-by":"crossref","first-page":"bbac269","DOI":"10.1093\/bib\/bbac269","article-title":"Mitigating cold-start problems in drug\u2013target affinity prediction with interaction knowledge transferring","volume":"23","author":"Nguyen","year":"2022","journal-title":"Brief Bioinform"},{"key":"2024102914113465300_btae533-B26","doi-asserted-by":"crossref","first-page":"i821","DOI":"10.1093\/bioinformatics\/bty593","article-title":"DeepDTA: deep drug\u2013target binding affinity prediction","volume":"34","author":"\u00d6zt\u00fcrk","year":"2018","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B27","doi-asserted-by":"crossref","first-page":"942","DOI":"10.1021\/acs.jcim.6b00740","article-title":"Protein\u2013ligand scoring with convolutional neural networks","volume":"57","author":"Ragoza","year":"2017","journal-title":"J Chem Inf Model"},{"key":"2024102914113465300_btae533-B28","author":"RDKit"},{"key":"2024102914113465300_btae533-B29","doi-asserted-by":"crossref","first-page":"1256","DOI":"10.1038\/s42256-022-00580-7","article-title":"Large-scale chemical language representations capture molecular structure and properties","volume":"4","author":"Ross","year":"2022","journal-title":"Nat Mach 
Intell"},{"key":"2024102914113465300_btae533-B30","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1038\/msb.2011.75","article-title":"Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega","volume":"7","author":"Sievers","year":"2011","journal-title":"Mol Syst Biol"},{"key":"2024102914113465300_btae533-B31","doi-asserted-by":"crossref","first-page":"3666","DOI":"10.1093\/bioinformatics\/bty374","article-title":"Development and evaluation of a deep learning model for protein\u2013ligand binding affinity prediction","volume":"34","author":"Stepniewska-Dziubinska","year":"2018","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B32","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"Suzek","year":"2015","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B33","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1186\/s13321-020-00447-2","article-title":"DTiGEMS+: drug\u2013target interaction prediction using graph embedding, graph mining, and similarity-based techniques","volume":"12","author":"Thafar","year":"2020","journal-title":"J Cheminform"},{"key":"2024102914113465300_btae533-B34","doi-asserted-by":"crossref","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"UniProt: The Universal Protein Knowledgebase in 2023","volume":"51","author":"The UniProt Consortium","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024102914113465300_btae533-B35","author":"Wallach","year":"2015"},{"key":"2024102914113465300_btae533-B36","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1093\/bioinformatics\/bty543","article-title":"NeoDTI: neural integration of neighbor information from a heterogeneous network for discovering new drug\u2013target 
interactions","volume":"35","author":"Wan","year":"2019","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B37","author":"Wang","year":"2021"},{"key":"2024102914113465300_btae533-B38","doi-asserted-by":"crossref","first-page":"bbad516","DOI":"10.1093\/bib\/bbae516","article-title":"Predicting drug\u2013target binding affinity with cross-scale graph contrastive learning","volume":"25","author":"Wang","year":"2024","journal-title":"Brief Bioinform"},{"key":"2024102914113465300_btae533-B39","first-page":"246","author":"Wang","year":"2023"},{"key":"2024102914113465300_btae533-B40","doi-asserted-by":"crossref","first-page":"1401","DOI":"10.1021\/acs.jproteome.6b00618","article-title":"Deep-learning-based drug\u2013target interaction prediction","volume":"16","author":"Wen","year":"2017","journal-title":"J Proteome Res"},{"key":"2024102914113465300_btae533-B41","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1001\/jama.2020.1166","article-title":"Estimated research and development investment needed to bring a new medicine to market, 2009-2018","volume":"323","author":"Wouters","year":"2020","journal-title":"JAMA"},{"key":"2024102914113465300_btae533-B42","doi-asserted-by":"publisher","author":"Wu","year":"2022","DOI":"10.1101\/2022.07.21.500999"},{"key":"2024102914113465300_btae533-B43","author":"Khodabandeh Yalabadi"},{"key":"2024102914113465300_btae533-B44","doi-asserted-by":"crossref","first-page":"i232","DOI":"10.1093\/bioinformatics\/btn162","article-title":"Prediction of drug\u2013target interaction networks from the integration of chemical and genomic spaces","volume":"24","author":"Yamanishi","year":"2008","journal-title":"Bioinformatics"},{"key":"2024102914113465300_btae533-B45","doi-asserted-by":"crossref","first-page":"6775","DOI":"10.1038\/s41467-021-27137-3","article-title":"A unified drug\u2013target interaction prediction framework based on knowledge graph and recommendation 
system","volume":"12","author":"Ye","year":"2021","journal-title":"Nat Commun"},{"key":"2024102914113465300_btae533-B46","doi-asserted-by":"crossref","first-page":"bbad079","DOI":"10.1093\/bib\/bbad079","article-title":"MHTAN-DTI: metapath-based hierarchical transformer and attention network for drug\u2013target interaction prediction","volume":"24","author":"Zhang","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024102914113465300_btae533-B47","doi-asserted-by":"crossref","first-page":"8993","DOI":"10.3390\/ijms22168993","article-title":"SAG-DTA: prediction of drug\u2013target affinity using self-attention graph network","volume":"22","author":"Zhang","year":"2021","journal-title":"Int J Mol Sci"},{"key":"2024102914113465300_btae533-B48","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1038\/s42256-020-0152-y","article-title":"Predicting drug\u2013protein interaction using quasi-visual question answering system","volume":"2","author":"Zheng","year":"2020","journal-title":"Nat Mach 
Intell"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae533\/58995571\/btae533.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/9\/btae533\/60195121\/btae533.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/9\/btae533\/60195121\/btae533.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,29]],"date-time":"2024-10-29T14:11:54Z","timestamp":1730211114000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae533\/7747660"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,9]]},"references-count":48,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2024,9,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae533","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,9]]},"published":{"date-parts":[[2024,9]]},"article-number":"btae533"}}