{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,18]],"date-time":"2026-05-18T20:00:35Z","timestamp":1779134435943,"version":"3.51.4"},"reference-count":74,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T00:00:00Z","timestamp":1717718400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T00:00:00Z","timestamp":1717718400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["32070662"],"award-info":[{"award-number":["32070662"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61832019"],"award-info":[{"award-number":["61832019"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["32030063"],"award-info":[{"award-number":["32030063"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Identification of interactions between chemical compounds and proteins is crucial for various applications, including drug discovery, target identification, network pharmacology, and elucidation of protein functions. 
Deep neural network-based approaches are becoming increasingly popular in efficiently identifying compound-protein interactions with high-throughput capabilities, narrowing down the scope of candidates for traditional labor-intensive, time-consuming and expensive experimental techniques. In this study, we proposed an end-to-end approach termed SPVec-SGCN-CPI, which utilized simplified graph convolutional network (SGCN) model with low-dimensional and continuous features generated from our previously developed model SPVec and graph topology information to predict compound-protein interactions. The SGCN technique, dividing the local neighborhood aggregation and nonlinearity layer-wise propagation steps, effectively aggregates K-order neighbor information while avoiding neighbor explosion and expediting training. The performance of the SPVec-SGCN-CPI method was assessed across three datasets and compared against four machine learning- and deep learning-based methods, as well as six state-of-the-art methods. Experimental results revealed that SPVec-SGCN-CPI outperformed all these competing methods, particularly excelling in unbalanced data scenarios. By propagating node features and topological information to the feature space, SPVec-SGCN-CPI effectively incorporates interactions between compounds and proteins, enabling the fusion of heterogeneity. Furthermore, our method scored all unlabeled data in ChEMBL, confirming the top five ranked compound-protein interactions through molecular docking and existing evidence. These findings suggest that our model can reliably uncover compound-protein interactions within unlabeled compound-protein pairs, carrying substantial implications for drug re-profiling and discovery. 
In summary, SPVec-SGCN demonstrates its efficacy in accurately predicting compound-protein interactions, showcasing potential to enhance target identification and streamline drug discovery processes.<\/jats:p><jats:p><jats:bold>Scientific contributions<\/jats:bold><\/jats:p><jats:p>The methodology presented in this work not only enables the comparatively accurate prediction of compound-protein interactions but also, for the first time, takes sample imbalance, which is very common in the real world, and computational efficiency into consideration simultaneously, accelerating the target identification and drug discovery process.<\/jats:p>","DOI":"10.1186\/s13321-024-00862-9","type":"journal-article","created":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T16:02:30Z","timestamp":1717776150000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["An end-to-end method for predicting compound-protein interactions based on simplified homogeneous graph convolutional network and pre-trained language 
model"],"prefix":"10.1186","volume":"16","author":[{"given":"Yufang","family":"Zhang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiayi","family":"Li","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shenggeng","family":"Lin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianwei","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi","family":"Xiong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dong-Qing","family":"Wei","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,6,7]]},"reference":[{"key":"862_CR1","doi-asserted-by":"publisher","first-page":"1315","DOI":"10.1007\/s11030-021-10217-3","volume":"25","author":"R Gupta","year":"2021","unstructured":"Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P (2021) Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 25:1315\u20131360","journal-title":"Mol Divers"},{"issue":"4","key":"862_CR2","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1038\/nchembio.1199","volume":"9","author":"M Schenone","year":"2013","unstructured":"Schenone M, Dan\u010d\u00edk V, Wagner BK, Clemons PA (2013) Target identification and mechanism of action in chemical biology and drug discovery. Nat Chem Biol 9(4):232\u2013240","journal-title":"Nat Chem Biol"},{"issue":"2","key":"862_CR3","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1016\/S0167-6296(02)00126-1","volume":"22","author":"JA DiMasi","year":"2003","unstructured":"DiMasi JA, Hansen RW, Grabowski HG (2003) The price of innovation: new estimates of drug development costs. 
J Health Econ 22(2):151\u2013185","journal-title":"J Health Econ"},{"key":"862_CR4","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1016\/j.isprsjprs.2016.01.011","volume":"114","author":"M Belgiu","year":"2016","unstructured":"Belgiu M, Dr\u0103gu\u0163 L (2016) Random forest in remote sensing: a review of applications and future directions. ISPRS J Photogramm Remote Sens 114:24\u201331","journal-title":"ISPRS J Photogramm Remote Sens"},{"issue":"12","key":"862_CR5","doi-asserted-by":"publisher","first-page":"1565","DOI":"10.1038\/nbt1206-1565","volume":"24","author":"WS Noble","year":"2006","unstructured":"Noble WS (2006) What is a support vector machine? Nat Biotechnol 24(12):1565\u20131567","journal-title":"Nat Biotechnol"},{"issue":"12","key":"862_CR6","doi-asserted-by":"publisher","first-page":"2295","DOI":"10.1109\/JPROC.2017.2761740","volume":"105","author":"V Sze","year":"2017","unstructured":"Sze V, Chen Y-H, Yang T-J, Emer JS (2017) Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE 105(12):2295\u20132329","journal-title":"Proc IEEE"},{"issue":"1","key":"862_CR7","first-page":"3149","volume":"30","author":"G Ke","year":"2017","unstructured":"Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W et al (2017) Lightgbm: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 30(1):3149\u20133157","journal-title":"Adv Neural Inf Process Syst"},{"issue":"2","key":"862_CR8","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1093\/bioinformatics\/bty535","volume":"35","author":"M Tsubaki","year":"2019","unstructured":"Tsubaki M, Tomii K, Sese J (2019) Compound\u2013protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. 
Bioinformatics 35(2):309\u2013318","journal-title":"Bioinformatics"},{"issue":"16","key":"862_CR9","doi-asserted-by":"publisher","first-page":"4406","DOI":"10.1093\/bioinformatics\/btaa524","volume":"36","author":"L Chen","year":"2020","unstructured":"Chen L, Tan X, Wang D, Zhong F, Liu X, Yang T et al (2020) TransformerCPI: improving compound\u2013protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics 36(16):4406\u20134414","journal-title":"Bioinformatics"},{"issue":"4","key":"862_CR10","doi-asserted-by":"publisher","first-page":"308","DOI":"10.1016\/j.cels.2020.03.002","volume":"10","author":"S Li","year":"2020","unstructured":"Li S, Wan F, Shu H, Jiang T, Zhao D, Zeng J (2020) MONN: a multi-objective neural network for predicting compound-protein interactions and affinities. Cell Syst 10(4):308\u2013322","journal-title":"Cell Syst"},{"issue":"9","key":"862_CR11","doi-asserted-by":"publisher","first-page":"2531","DOI":"10.1039\/C9SC03414E","volume":"11","author":"AS Rifaioglu","year":"2020","unstructured":"Rifaioglu AS, Nalbat E, Atalay V, Martin MJ, Cetin-Atalay R, Do\u011fan T (2020) DEEPScreen: high performance drug\u2013target interaction prediction with convolutional neural networks using 2-D structural compound representations. Chem Sci 11(9):2531\u20132557","journal-title":"Chem Sci"},{"key":"862_CR12","first-page":"100044","volume":"3","author":"P V\u00e4th","year":"2022","unstructured":"V\u00e4th P, M\u00fcnch M, Raab C, Schleif F-M (2022) PROVAL: a framework for comparison of protein sequence embeddings. J Comput Math 3:100044","journal-title":"J Comput Math"},{"issue":"2","key":"862_CR13","first-page":"025004","volume":"1","author":"G Lambard","year":"2020","unstructured":"Lambard G, Gracheva E (2020) SMILES-X: autonomous molecular compounds characterization for small datasets without descriptors. 
Mach Learn: Sci Technol 1(2):025004","journal-title":"Mach Learn: Sci Technol"},{"issue":"2","key":"862_CR14","first-page":"1","volume":"23","author":"G Di Gennaro","year":"2021","unstructured":"Di Gennaro G, Buonanno A, Palmieri FA (2021) Considerations about learning Word2Vec. J Supercomput 23(2):1\u201316","journal-title":"J Supercomput"},{"issue":"8","key":"862_CR15","doi-asserted-by":"publisher","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","volume":"38","author":"N Brandes","year":"2022","unstructured":"Brandes N, Ofer D, Peleg Y, Rappoport N, Linial M (2022) ProteinBERT: a universal deep-learning model of protein sequence and function. Bioinformatics 38(8):2102\u20132110. https:\/\/doi.org\/10.1093\/bioinformatics\/btac020","journal-title":"Bioinformatics"},{"key":"862_CR16","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbac131","author":"Z Wu","year":"2022","unstructured":"Wu Z, Jiang D, Wang J, Zhang X, Du H, Pan L et al (2022) Knowledge-based BERT: a method to extract molecular features like computational chemists. Brief Bioinform. https:\/\/doi.org\/10.1093\/bib\/bbac131","journal-title":"Brief Bioinform"},{"issue":"3","key":"862_CR17","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1093\/bib\/bbac142","volume":"23","author":"A Villegas-Morcillo","year":"2022","unstructured":"Villegas-Morcillo A, Gomez AM, Sanchez V (2022) An analysis of protein language model embeddings for fold prediction. Brief Bioinform 23(3):142","journal-title":"Brief Bioinform"},{"issue":"1","key":"862_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40649-019-0069-y","volume":"6","author":"S Zhang","year":"2019","unstructured":"Zhang S, Tong H, Xu J, Maciejewski R (2019) Graph convolutional networks: a comprehensive review. 
Comput Soc Netw 6(1):1\u201323","journal-title":"Comput Soc Netw"},{"issue":"35","key":"862_CR19","doi-asserted-by":"publisher","first-page":"20701","DOI":"10.1039\/D0RA02297G","volume":"10","author":"M Jiang","year":"2020","unstructured":"Jiang M, Li Z, Zhang S, Wang S, Wang X, Yuan Q et al (2020) Drug\u2013target affinity prediction using graph neural network and contact maps. RSC Adv 10(35):20701\u201320712","journal-title":"RSC Adv"},{"issue":"8","key":"862_CR20","doi-asserted-by":"publisher","first-page":"1140","DOI":"10.1093\/bioinformatics\/btaa921","volume":"7","author":"T Nguyen","year":"2021","unstructured":"Nguyen T, Le H, Quinn TP, Nguyen T, Le TD, Venkatesh S (2021) GraphDTA: predicting drug-target binding affinity with graph neural networks. Bioinformatics 7(8):1140\u20131147. https:\/\/doi.org\/10.1093\/bioinformatics\/btaa921","journal-title":"Bioinformatics"},{"issue":"2","key":"862_CR21","doi-asserted-by":"publisher","first-page":"016","DOI":"10.1093\/bib\/bbac016","volume":"23","author":"L Jiang","year":"2022","unstructured":"Jiang L, Sun J, Wang Y, Ning Q, Luo N, Yin M (2022) Identifying drug\u2013target interactions via heterogeneous graph attention networks combined with cross-modal similarities. Brief Bioinform 23(2):016. https:\/\/doi.org\/10.1093\/bib\/bbac016","journal-title":"Brief Bioinform"},{"issue":"9","key":"862_CR22","doi-asserted-by":"publisher","first-page":"3981","DOI":"10.1021\/acs.jcim.9b00387","volume":"59","author":"J Lim","year":"2019","unstructured":"Lim J, Ryu S, Park K, Choe YJ, Ham J, Kim WY (2019) Predicting drug-target interaction using a novel graph neural network with 3D structure-embedded graph representation. J Chem Inf Model 59(9):3981\u20133988. 
https:\/\/doi.org\/10.1021\/acs.jcim.9b00387","journal-title":"J Chem Inf Model"},{"key":"862_CR23","doi-asserted-by":"crossref","unstructured":"Purkayastha S, Mondal I, Sarkar S, Goyal P, Pillai JK (2019) Drug-Drug Interactions Prediction Based on Drug Embedding and Graph Auto-Encoder. Paper presented at 19th international conference on bioinformatics and bioengineering, Athens, Greece, 28\u201330 October 2019.","DOI":"10.1109\/BIBE.2019.00104"},{"key":"862_CR24","doi-asserted-by":"crossref","unstructured":"Xiong W, Li F, Yu H, Ji D (2019) Extracting Drug-drug Interactions with a Dependency-based Graph Convolution Neural Network. Paper presented at 19th international conference on bioinformatics and bioengineering, Athens, Greece, 28\u201330 October 2019.","DOI":"10.1109\/BIBM47256.2019.8983150"},{"issue":"2","key":"862_CR25","doi-asserted-by":"publisher","first-page":"819","DOI":"10.1109\/TCBB.2020.3017547","volume":"19","author":"Y Zhang","year":"2022","unstructured":"Zhang Y, Chen L, Li S (2022) CIPHER-SC: disease-gene association inference using graph convolution on a context-aware network with single-cell data. IEEE\/ACM Trans Comput Biol Bioinform 19(2):819\u2013829. https:\/\/doi.org\/10.1109\/TCBB.2020.3017547","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"862_CR26","doi-asserted-by":"publisher","DOI":"10.3390\/cells8090977","author":"C Li","year":"2019","unstructured":"Li C, Liu H, Hu Q, Que J, Yao J (2019) A novel computational model for predicting microRNA-disease associations based on heterogeneous graph convolutional networks. Cells. 
https:\/\/doi.org\/10.3390\/cells8090977","journal-title":"Cells"},{"issue":"1","key":"862_CR27","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1186\/s12920-018-0372-8","volume":"11","author":"A Rao","year":"2018","unstructured":"Rao A, Vg S, Joseph T, Kotte S, Sivadasan N, Srinivasan R (2018) Phenotype-driven gene prioritization for rare diseases using graph convolution on heterogeneous networks. BMC Med Genomics 11(1):57. https:\/\/doi.org\/10.1186\/s12920-018-0372-8","journal-title":"BMC Med Genomics"},{"key":"862_CR28","doi-asserted-by":"publisher","first-page":"108696","DOI":"10.1016\/j.patcog.2022.108696","volume":"128","author":"T Zhang","year":"2022","unstructured":"Zhang T, Shan HR, Little MA (2022) Causal GraphSAGE: a robust graph method for classification based on causal sampling. Pattern Recogn 128:108696. https:\/\/doi.org\/10.1016\/j.patcog.2022.108696","journal-title":"Pattern Recogn"},{"key":"862_CR29","doi-asserted-by":"crossref","unstructured":"Ying R, He R, Chen K, Eksombatchai P, Hamilton WL, Leskovec J. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. Paper presented at proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, London, United Kingdom, 19\u201323 August 2018.","DOI":"10.1145\/3219819.3219890"},{"key":"862_CR30","unstructured":"Chen J, Zhu J, Song L (2018) Stochastic Training of Graph Convolutional Networks with Variance Reduction. Paper presented at 35th international conference on machine learning, Stockholmsm\u00e4ssan, Stockholm, 10\u201315 July 2018."},{"key":"862_CR31","doi-asserted-by":"publisher","DOI":"10.4855\/arXiv.1801.10247","author":"J Chen","year":"2018","unstructured":"Chen J, Ma T, Xiao C (2018) Fastgcn: fast learning with graph convolutional networks via importance sampling. arXiv preprint. 
https:\/\/doi.org\/10.4855\/arXiv.1801.10247","journal-title":"arXiv preprint"},{"key":"862_CR32","doi-asserted-by":"crossref","unstructured":"Zhang C, Li QC, Song DW (2019) Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks. Paper presented at proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, Hong Kong, China, 4 November 2019.","DOI":"10.18653\/v1\/D19-1464"},{"key":"862_CR33","unstructured":"Hamilton WL, Ying R, Leskovec J (2017) Inductive representation learning on large graphs. Paper presented at proceedings of the 31st international conference on neural information processing systems, Long Beach, California, 4\u20137 December 2017."},{"key":"862_CR34","doi-asserted-by":"crossref","unstructured":"Li C, Yang Y, Feng M, Chakradhar S, Zhou H (2016) Optimizing Memory Efficiency for Deep Convolutional Neural Networks on GPUs. Paper presented at SC '16: proceedings of the international conference for high performance computing, networking, storage and analysis, Salt Lake City, Utah, 13\u201318 November 2016.","DOI":"10.1109\/SC.2016.53"},{"issue":"5","key":"862_CR35","doi-asserted-by":"publisher","first-page":"1350","DOI":"10.1016\/j.drudis.2022.02.023","volume":"27","author":"B-X Du","year":"2022","unstructured":"Du B-X, Qin Y, Jiang Y-F, Xu Y, Yiu S-M, Yu H et al (2022) Compound\u2013protein interaction prediction by deep learning: databases, descriptors and models. Drug Discov Today 27(5):1350\u20131366","journal-title":"Drug Discov Today"},{"key":"862_CR36","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1016\/j.ymeth.2016.06.024","volume":"110","author":"K Tian","year":"2016","unstructured":"Tian K, Shao M, Wang Y, Guan J, Zhou S (2016) Boosting compound-protein interaction prediction by deep learning. 
Methods 110:64\u201372","journal-title":"Methods"},{"issue":"12","key":"862_CR37","doi-asserted-by":"publisher","first-page":"i221","DOI":"10.1093\/bioinformatics\/btv256","volume":"31","author":"H Liu","year":"2015","unstructured":"Liu H, Sun J, Guan J, Zheng J, Zhou S (2015) Improving compound\u2013protein interaction prediction by building up highly credible negative samples. Bioinformatics 31(12):i221\u2013i229","journal-title":"Bioinformatics"},{"issue":"12","key":"862_CR38","doi-asserted-by":"publisher","first-page":"1339","DOI":"10.1016\/j.patrec.2013.04.019","volume":"34","author":"T Putthiporn","year":"2013","unstructured":"Putthiporn T, Chidchanok L (2013) Handling imbalanced data sets with synthetic boundary data generation using bootstrap re-sampling and AdaBoost techniques. Pattern Recognit Lett 34(12):1339\u20131347","journal-title":"Pattern Recognit Lett"},{"key":"862_CR39","doi-asserted-by":"publisher","first-page":"895","DOI":"10.3389\/fchem.2019.00895","volume":"7","author":"Y-F Zhang","year":"2020","unstructured":"Zhang Y-F, Wang X, Kaushik AC, Chu Y, Shan X, Zhao M-Z et al (2020) SPVec: a Word2vec-inspired feature representation method for drug-target interaction prediction. Front Chem 7:895","journal-title":"Front Chem"},{"key":"862_CR40","doi-asserted-by":"crossref","unstructured":"Zeng H, Zhou H, Srivastava A, Kannan R, Prasanna V (2019) Accurate, efficient and scalable graph embedding. Paper presented at 2019 IEEE international parallel and distributed processing symposium, Rio de Janeiro, Brazil, 20\u201324 May 2019.","DOI":"10.1109\/IPDPS.2019.00056"},{"issue":"D1","key":"862_CR41","doi-asserted-by":"publisher","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","volume":"40","author":"A Gaulton","year":"2012","unstructured":"Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A et al (2012) ChEMBL: a large-scale bioactivity database for drug discovery. 
Nucl Acids Res 40(D1):D1100\u2013D1107","journal-title":"Nucl Acids Res"},{"issue":"D1","key":"862_CR42","doi-asserted-by":"publisher","first-page":"D1045","DOI":"10.1093\/nar\/gkv1072","volume":"44","author":"MK Gilson","year":"2016","unstructured":"Gilson MK, Liu T, Baitaluk M, Nicola G, Hwang L, Chong J (2016) BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucl Acids Res 44(D1):D1045\u2013D1053","journal-title":"Nucl Acids Res"},{"issue":"D1","key":"862_CR43","doi-asserted-by":"publisher","first-page":"D1102","DOI":"10.1093\/nar\/gky1033","volume":"47","author":"S Kim","year":"2019","unstructured":"Kim S, Chen J, Cheng T, Gindulyte A, He J, He S et al (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res 47(D1):D1102\u2013D1109","journal-title":"Nucleic Acids Res"},{"key":"862_CR44","doi-asserted-by":"publisher","first-page":"W441","DOI":"10.1093\/nar\/gkp253","volume":"37","author":"RZ Cer","year":"2009","unstructured":"Cer RZ, Mudunuri U, Stephens R, Lebeda FJ (2009) IC50-to-Ki: a web-based tool for converting IC50 to Ki values for inhibitors of enzyme activity and ligand binding. Nucl Acids Res 37:W441-445","journal-title":"Nucl Acids Res"},{"key":"862_CR45","volume-title":"Database systems for advanced applications","author":"Y Zheng","year":"2023","unstructured":"Zheng Y, Tang P, Qiu W, Wang H, Guo J, Huang Z (2023) A novel deep learning framework for interpretable drug-target interaction prediction with attention and multi-task mechanism. In: Wang X, Sapino ML, Han W-S, El Abbadi A, Dobbie G, Feng Z, Shao Y, Yin H (eds) Database systems for advanced applications. 
Springer Nature Switzerland, Cham"},{"issue":"1","key":"862_CR46","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1186\/s13321-016-0130-x","volume":"8","author":"Z Wang","year":"2016","unstructured":"Wang Z, Liang L, Yin Z, Lin J (2016) Improving chemical similarity ensemble approach in target prediction. J Cheminform 8(1):20","journal-title":"J Cheminform"},{"issue":"1","key":"862_CR47","doi-asserted-by":"publisher","first-page":"1989","DOI":"10.1038\/s41467-023-37572-z","volume":"14","author":"A Chatterjee","year":"2023","unstructured":"Chatterjee A, Walters R, Shafi Z, Ahmed OS, Sebek M, Gysi D et al (2023) Improving the generalizability of protein-ligand binding predictions with AI-Bind. Nat Commun 14(1):1989","journal-title":"Nat Commun"},{"issue":"9","key":"862_CR48","doi-asserted-by":"publisher","first-page":"3981","DOI":"10.1021\/acs.jcim.9b00387","volume":"59","author":"J Lim","year":"2019","unstructured":"Lim J, Ryu S, Park K, Choe YJ, Ham J, Kim WY (2019) Predicting drug-target interaction using a novel graph neural network with 3D structure-embedded graph representation. J Chem Inf Model 59(9):3981\u20133988","journal-title":"J Chem Inf Model"},{"issue":"10","key":"862_CR49","doi-asserted-by":"publisher","first-page":"4131","DOI":"10.1021\/acs.jcim.9b00628","volume":"59","author":"W Torng","year":"2019","unstructured":"Torng W, Altman RB (2019) Graph convolutional neural networks for predicting drug-target interactions. J Chem Inf Model 59(10):4131\u20134149","journal-title":"J Chem Inf Model"},{"issue":"9","key":"862_CR50","doi-asserted-by":"publisher","first-page":"2805","DOI":"10.1093\/bioinformatics\/btaa010","volume":"36","author":"X Zeng","year":"2020","unstructured":"Zeng X, Zhu S, Hou Y, Zhang P, Li L, Li J et al (2020) Network-based prediction of drug-target interactions using an arbitrary-order proximity embedded deep forest. 
Bioinformatics 36(9):2805\u20132812","journal-title":"Bioinformatics"},{"issue":"7","key":"862_CR51","doi-asserted-by":"publisher","first-page":"1775","DOI":"10.1039\/C9SC04336E","volume":"11","author":"X Zeng","year":"2020","unstructured":"Zeng X, Zhu S, Lu W, Liu Z, Huang J, Zhou Y et al (2020) Target identification among known drugs by deep learning from heterogeneous networks. Chem Sci 11(7):1775\u20131797","journal-title":"Chem Sci"},{"issue":"6","key":"862_CR52","doi-asserted-by":"publisher","first-page":"e1007129","DOI":"10.1371\/journal.pcbi.1007129","volume":"15","author":"I Lee","year":"2019","unstructured":"Lee I, Keum J, Nam H (2019) DeepConv-DTI: prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS Comput Biol 15(6):e1007129. https:\/\/doi.org\/10.1371\/journal.pcbi.1007129","journal-title":"PLoS Comput Biol"},{"issue":"18","key":"862_CR53","doi-asserted-by":"publisher","first-page":"14061","DOI":"10.3390\/ijms241814061","volume":"24","author":"Y Huang","year":"2023","unstructured":"Huang Y, Huang H-Y, Chen Y, Lin Y-C-D, Yao L, Lin T et al (2023) A robust drug-target interaction prediction framework with capsule network and transfer learning. Int J Mol Sci 24(18):14061","journal-title":"Int J Mol Sci"},{"key":"862_CR54","doi-asserted-by":"publisher","first-page":"108339","DOI":"10.1016\/j.compbiomed.2024.108339","volume":"173","author":"M Gao","year":"2024","unstructured":"Gao M, Zhang D, Chen Y, Zhang Y, Wang Z, Wang X et al (2024) GraphormerDTI: a graph transformer-based approach for drug-target interaction prediction. Comput Biol Med 173:108339","journal-title":"Comput Biol Med"},{"key":"862_CR55","doi-asserted-by":"publisher","DOI":"10.4855\/arXiv.1711.11027","author":"A Bra\u017einskas","year":"2017","unstructured":"Bra\u017einskas A, Havrylov S, Titov I (2017) Embedding words as distributions with a Bayesian skip-gram model. arXiv preprint. 
https:\/\/doi.org\/10.4855\/arXiv.1711.11027","journal-title":"arXiv preprint"},{"key":"862_CR56","doi-asserted-by":"publisher","DOI":"10.4855\/arXiv.1301.3781","author":"T Mikolov","year":"2013","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint. https:\/\/doi.org\/10.4855\/arXiv.1301.3781","journal-title":"arXiv preprint"},{"issue":"3","key":"862_CR57","doi-asserted-by":"publisher","first-page":"2627","DOI":"10.1007\/s11063-019-10043-7","volume":"50","author":"K Ghiasi-Shirazi","year":"2019","unstructured":"Ghiasi-Shirazi K (2019) Generalizing the convolution operator in convolutional neural networks. Neural Process Lett 50(3):2627\u20132646","journal-title":"Neural Process Lett"},{"key":"862_CR58","doi-asserted-by":"crossref","unstructured":"Hariharan B, Arbel\u00e1ez P, Girshick R, Malik J (2015) Hypercolumns for object segmentation and fine-grained localization. Paper presented at 2015 IEEE conference on computer vision and pattern recognition, Boston, Massachusetts, 7\u201312 June 2015.","DOI":"10.1109\/CVPR.2015.7298642"},{"key":"862_CR59","doi-asserted-by":"crossref","unstructured":"Huang G, Sun Y, Liu Z, Sedra D, Weinberger KQ (2016) Deep networks with stochastic depth. Paper presented at computer vision\u2013ECCV 2016: 14th European conference, Amsterdam, The Netherlands, 11\u201314 October 2016.","DOI":"10.1007\/978-3-319-46493-0_39"},{"key":"862_CR60","doi-asserted-by":"publisher","first-page":"1","DOI":"10.31031\/acsr.2023.04.000578","volume":"4","author":"AS Lang","year":"2023","unstructured":"Lang AS, Chong WK, W\u00f6rner JH (2023) Fine-tuning ChemBERTa-2 for aqueous solubility prediction. Ann Chem Sci Res 4:1\u20133. 
https:\/\/doi.org\/10.31031\/acsr.2023.04.000578","journal-title":"Ann Chem Sci Res"},{"issue":"6637","key":"862_CR61","doi-asserted-by":"publisher","first-page":"1123","DOI":"10.1126\/science.ade2574","volume":"379","author":"Z Lin","year":"2023","unstructured":"Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W et al (2023) Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379(6637):1123\u20131130. https:\/\/doi.org\/10.1126\/science.ade2574","journal-title":"Science"},{"key":"862_CR62","doi-asserted-by":"publisher","first-page":"1297","DOI":"10.1038\/s42256-023-00740-3","volume":"5","author":"NC Frey","year":"2023","unstructured":"Frey NC, Soklaski R, Axelrod S et al (2023) Neural scaling of deep chemical models. Nat Mach Intell 5:1297\u20131305. https:\/\/doi.org\/10.1038\/s42256-023-00740-3","journal-title":"Nat Mach Intell"},{"key":"862_CR63","doi-asserted-by":"publisher","first-page":"4348","DOI":"10.1038\/s41467-022-32007-7","volume":"13","author":"N Ferruz","year":"2022","unstructured":"Ferruz N, Schmidt S, H\u00f6cker B (2022) ProtGPT2 is a deep unsupervised language model for protein design. Nat Commun 13:4348. https:\/\/doi.org\/10.1038\/s41467-022-32007-7","journal-title":"Nat Commun"},{"key":"862_CR64","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1186\/s13321-023-00767-z","volume":"15","author":"N Song","year":"2023","unstructured":"Song N, Dong R, Pu Y et al (2023) PMF-CPI: assessing drug selectivity with a pretrained multi-functional model for compound\u2013protein interactions. J Cheminform 15:97. https:\/\/doi.org\/10.1186\/s13321-023-00767-z","journal-title":"J Cheminform"},{"key":"862_CR65","doi-asserted-by":"publisher","unstructured":"Quan Z, Guo Y, Lin X, Wang Z-Y, Zeng X (2019) GraphCPI: Graph Neural Representation Learning for Compound-Protein Interaction. Paper presented at 2019 IEEE international conference on bioinformatics and biomedicine, San Diego, California, 18\u201321 November 2019. 
https:\/\/doi.org\/10.1109\/BIBM47256.2019.8983267.","DOI":"10.1109\/BIBM47256.2019.8983267"},{"issue":"21","key":"862_CR66","doi-asserted-by":"publisher","first-page":"3577","DOI":"10.3390\/rs12213577","volume":"12","author":"S Chen","year":"2020","unstructured":"Chen S, Wang X, Guo H, Xie P, Wang J, Hao X (2020) A conditional probability interpolation method based on a space-time cube for MODIS snow cover products gap filling. Remote Sens 12(21):3577. https:\/\/doi.org\/10.3390\/rs12213577","journal-title":"Remote Sens"},{"key":"862_CR67","doi-asserted-by":"publisher","first-page":"1292869","DOI":"10.3389\/fchem.2023.1292869","volume":"11","author":"W Shan","year":"2023","unstructured":"Shan W, Chen L, Xu H, Zhong Q, Xu Y et al (2023) GcForest-based compound-protein interaction prediction model and its application in discovering small-molecule drugs targeting CD47. Front Chem 11:1292869. https:\/\/doi.org\/10.3389\/fchem.2023.1292869","journal-title":"Front Chem"},{"key":"862_CR68","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1080\/07391102.2023.2291829","volume":"12","author":"F Palhamkhani","year":"2023","unstructured":"Palhamkhani F, Alipour M, Dehnad A, Abbasi K, Razzaghi P, Ghasemi JB (2023) DeepCompoundNet: enhancing compound-protein interaction prediction with multimodal convolutional neural networks. J Biomol Struct Dyn 12:1\u201310. https:\/\/doi.org\/10.1080\/07391102.2023.2291829","journal-title":"J Biomol Struct Dyn"},{"key":"862_CR69","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1186\/s12859-024-05671-3","volume":"25","author":"A Dehghan","year":"2024","unstructured":"Dehghan A, Abbasi K, Razzaghi P (2024) CCL-DTI: contributing the contrastive loss in drug\u2013target interaction prediction. BMC Bioinform 25:48. 
https:\/\/doi.org\/10.1186\/s12859-024-05671-3","journal-title":"BMC Bioinform"},{"issue":"2","key":"862_CR70","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1016\/S0960-894X(01)00710-7","volume":"12","author":"LL Chang","year":"2002","unstructured":"Chang LL, Truong Q, Mumford RA, Egger LA, Kidambi U, Lyons K et al (2002) The discovery of small molecule carbamates as potent dual \u03b14\u03b21\/\u03b14\u03b27 integrin antagonists. Bioorg Med Chem Lett 12(2):159\u2013163","journal-title":"Bioorg Med Chem Lett"},{"issue":"11","key":"862_CR71","doi-asserted-by":"publisher","first-page":"4720","DOI":"10.1021\/jm500261q","volume":"57","author":"TW Johnson","year":"2014","unstructured":"Johnson TW, Richardson PF, Bailey S, Brooun A, Burke BJ, Collins MR et al (2014) Discovery of (10 R)-7-Amino-12-fluoro-2, 10, 16-trimethyl-15-oxo-10, 15, 16, 17-tetrahydro-2H-8, 4-(metheno) pyrazolo [4, 3-h][2,5,11]-benzoxadiazacyclotetradecine-3-carbonitrile (PF-06463922), a macrocyclic inhibitor of anaplastic lymphoma kinase (ALK) and c-ros oncogene 1 (ROS1) with preclinical brain exposure and broad-spectrum potency against ALK-resistant mutations. J Med Chem 57(11):4720\u20134744","journal-title":"J Med Chem"},{"issue":"18","key":"862_CR72","first-page":"6043","volume":"15","author":"IE Kopka","year":"2002","unstructured":"Kopka IE, Young DN, Lin LS, Mumford RA, Magriotis PA, MacCoss M et al (2002) Substituted N-(3, 5-dichlorobenzenesulfonyl)-L-prolyl-phenylalanine analogues as potent VLA-4 antagonists. Bioorg Med Chem Lett 15(18):6043\u20136053","journal-title":"Bioorg Med Chem Lett"},{"issue":"14","key":"862_CR73","doi-asserted-by":"publisher","first-page":"6328","DOI":"10.1021\/jm300238h","volume":"55","author":"MK Parai","year":"2012","unstructured":"Parai MK, Huggins DJ, Cao H, Nalam MN, Ali A, Schiffer CA et al (2012) Design, synthesis, and biological and structural evaluations of novel HIV-1 protease inhibitors to combat drug resistance. 
J Med Chem 55(14):6328\u20136341","journal-title":"J Med Chem"},{"issue":"11","key":"862_CR74","doi-asserted-by":"publisher","first-page":"3295","DOI":"10.1016\/j.bmcl.2010.04.045","volume":"20","author":"H Liu","year":"2010","unstructured":"Liu H, Altenbach RJ, Diaz GJ, Manelli AM, Martin RL, Miller TR et al (2010) In vitro studies on a class of quinoline containing histamine H3 antagonists. Bioorg Med Chem Lett 20(11):3295\u20133300","journal-title":"Bioorg Med Chem Lett"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-024-00862-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-024-00862-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-024-00862-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,7]],"date-time":"2024-06-07T17:04:05Z","timestamp":1717779845000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-024-00862-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,7]]},"references-count":74,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["862"],"URL":"https:\/\/doi.org\/10.1186\/s13321-024-00862-9","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,7]]},"assertion":[{"value":"29 November 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 May 
2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The author declares no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"67"}}