{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T11:43:00Z","timestamp":1769946180541,"version":"3.49.0"},"reference-count":73,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2021,2,26]],"date-time":"2021-02-26T00:00:00Z","timestamp":1614297600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>A global effort is underway to identify compounds for the treatment of COVID-19. Since de novo compound design is an extremely long, time-consuming and expensive process, efforts are underway to discover existing compounds that can be repurposed for COVID-19 and new viral diseases.<\/jats:p>\n                  <jats:p>We propose a machine learning representation framework that uses deep learning induced vector embeddings of compounds and viral proteins as features to predict compound-viral protein activity. The prediction model in-turn uses a consensus framework to rank approved compounds against viral proteins of interest.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our consensus framework achieves a high mean Pearson correlation of 0.916, mean R2 of 0.840 and a low mean squared error of 0.313 for the task of compound-viral protein activity prediction on an independent test set. As a use case, we identify a ranked list of 47 compounds common to three main proteins of SARS-COV-2 virus (PL-PRO, 3CL-PRO and Spike protein) as potential targets including 21 antivirals, 15 anticancer, 5 antibiotics and 6 other investigational human compounds. We perform additional molecular docking simulations to demonstrate that majority of these compounds have low binding energies and thus high binding affinity with the potential to be effective against the SARS-COV-2 virus.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>All the source code and data is available at: https:\/\/github.com\/raghvendra5688\/Drug-Repurposing and https:\/\/dx.doi.org\/10.17632\/8rrwnbcgmx.3. We also implemented a web-server at: https:\/\/machinelearning-protein.qcri.org\/index.html.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab130","type":"journal-article","created":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T12:23:03Z","timestamp":1614169383000},"page":"2544-2555","source":"Crossref","is-referenced-by-count":12,"title":["A modeling framework for embedding-based predictions for compound\u2013viral protein activity"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1779-3150","authenticated-orcid":false,"given":"Raghvendra","family":"Mall","sequence":"first","affiliation":[{"name":"Digital Health and Precision Medicine Center, Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar"}]},{"given":"Abdurrahman","family":"Elbasir","sequence":"additional","affiliation":[{"name":"ICT Division, College of Science and Engineering, Hamad Bin Khalifa University, Doha 34110, Qatar"}]},{"given":"Hossam","family":"Almeer","sequence":"additional","affiliation":[{"name":"Digital Health and Precision Medicine Center, Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar"}]},{"given":"Zeyaul","family":"Islam","sequence":"additional","affiliation":[{"name":"Diabetes Research Center, Qatar Biomedical Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar"}]},{"given":"Prasanna R.","family":"Kolatkar","sequence":"additional","affiliation":[{"name":"Diabetes Research Center, Qatar Biomedical Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar"}]},{"given":"Sanjay","family":"Chawla","sequence":"additional","affiliation":[{"name":"Digital Health and Precision Medicine Center, Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar"}]},{"given":"Ehsan","family":"Ullah","sequence":"additional","affiliation":[{"name":"Digital Health and Precision Medicine Center, Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar"}]}],"member":"286","published-online":{"date-parts":[[2021,2,26]]},"reference":[{"key":"2023051609205258600_btab130-B1","volume-title":"Foundations of Linear and Generalized Linear Models","author":"Agresti","year":"2015"},{"key":"2023051609205258600_btab130-B2","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1016\/j.ijid.2020.02.018","article-title":"Discovery and development of safe-in-man broad-spectrum antiviral agents","volume":"93","author":"Andersen","year":"2020","journal-title":"Int. J. Infectious Dis"},{"key":"2023051609205258600_btab130-B3","article-title":"Searching for target-specific and multi-targeting organics for Covid-19 in the drugbank database with a double scoring approach","author":"Arul","year":"2020","journal-title":"Scientific reports 10, 1\u201316"},{"key":"2023051609205258600_btab130-B4","volume-title":"Assay Guidance Manual [Internet]","author":"Beck","year":"2017"},{"key":"2023051609205258600_btab130-B5","doi-asserted-by":"crossref","first-page":"784","DOI":"10.1016\/j.csbj.2020.03.025","article-title":"Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-COV-2) through a drug-target interaction deep learning model","volume":"18","author":"Beck","year":"2020","journal-title":"Comput. Struct. Biotechnol. J"},{"key":"2023051609205258600_btab130-B6","doi-asserted-by":"crossref","first-page":"1813","DOI":"10.1056\/NEJMoa2007764","article-title":"Remdesivir for the treatment of Covid-19\u2014preliminary report","volume":"383","author":"Beigel","year":"2020","journal-title":"N. Engl. J. Med"},{"key":"2023051609205258600_btab130-B7","doi-asserted-by":"crossref","first-page":"e0171355","DOI":"10.1371\/journal.pone.0171355","article-title":"Impact of genetic variation on three dimensional structure and function of proteins","volume":"12","author":"Bhattacharya","year":"2017","journal-title":"PLoS One"},{"key":"2023051609205258600_btab130-B8","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The swiss-prot protein knowledgebase and its supplement trembl in 2003","volume":"31","author":"Boeckmann","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B9","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023051609205258600_btab130-B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13321-020-00445-4","article-title":"One molecular fingerprint to rule them all: drugs, biomolecules, and the metabolome","volume":"12","author":"Capecchi","year":"2020","journal-title":"J. Cheminformatics"},{"key":"2023051609205258600_btab130-B11","article-title":"Drug repurposing approach targeted against main protease of sars-cov-2 exploiting \u2018neighbourhood behaviour\u2019in 3d protein structural space and 2d chemical space of small molecules","author":"Chakraborti","year":"2020"},{"key":"2023051609205258600_btab130-B12","first-page":"785","author":"Chen","year":"2016"},{"key":"2023051609205258600_btab130-B13","author":"Connor","year":"2020"},{"key":"2023051609205258600_btab130-B14","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1016\/S1473-3099(20)30120-1","article-title":"An interactive web-based dashboard to track Covid-19 in real time","volume":"20","author":"Dong","year":"2020","journal-title":"Lancet Infect. Dis"},{"key":"2023051609205258600_btab130-B15","first-page":"155","volume-title":"Advances in Neural Information Processing Systems","author":"Drucker","year":"1997"},{"key":"2023051609205258600_btab130-B16","article-title":"Repurposing FDA-approved drugs for Covid-19 using a data-driven approach","author":"Duarte","year":"2020","journal-title":"ChemRxiv"},{"key":"2023051609205258600_btab130-B17","doi-asserted-by":"crossref","first-page":"2216","DOI":"10.1093\/bioinformatics\/bty953","article-title":"Deepcrystal: a deep learning framework for sequence-based protein crystallization prediction","volume":"35","author":"Elbasir","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051609205258600_btab130-B18","doi-asserted-by":"crossref","first-page":"1429","DOI":"10.1093\/bioinformatics\/btz762","article-title":"Bcrystal: an interpretable sequence-based protein crystallization predictor","volume":"36","author":"Elbasir","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051609205258600_btab130-B19","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1016\/j.pharmthera.2006.09.001","article-title":"Protease inhibitors and their peptidomimetic derivatives as potential drugs","volume":"113","author":"Fear","year":"2007","journal-title":"Pharmacol. Ther"},{"key":"2023051609205258600_btab130-B20","volume-title":"Medical Microbiology","author":"Fleischmann","year":"1996"},{"key":"2023051609205258600_btab130-B21","article-title":"Coronavirus (Covid-19) update: FDA issues emergency use authorization for potential covid-19 treatment","volume":"1","year":"2020","journal-title":"FDA News Release"},{"key":"2023051609205258600_btab130-B22","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: a gradient boosting machine","volume":"29","author":"Friedman","year":"2001","journal-title":"Ann. Stat"},{"key":"2023051609205258600_btab130-B23","first-page":"3371","article-title":"Interpretable drug target prediction using deep neural representation","author":"Gao","year":"2018"},{"key":"2023051609205258600_btab130-B24","doi-asserted-by":"crossref","first-page":"D945","DOI":"10.1093\/nar\/gkw1074","article-title":"The chembl database in 2017","volume":"45","author":"Gaulton","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B25","volume-title":"Schmidhuber J.A. Cummins","author":"Gers","year":"1999"},{"key":"2023051609205258600_btab130-B26","volume-title":"Deep Learning","author":"Goodfellow","year":"2016"},{"key":"2023051609205258600_btab130-B27","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1038\/s41586-020-2286-9","article-title":"A SARS-COV-2 protein interaction map reveals targets for drug repurposing","volume":"583","author":"Gordon","year":"2020","journal-title":"Nature"},{"key":"2023051609205258600_btab130-B28","doi-asserted-by":"crossref","first-page":"1700111","DOI":"10.1002\/minf.201700111","article-title":"Generative recurrent networks for de novo drug design","volume":"37","author":"Gupta","year":"2018","journal-title":"Mol. Informatics"},{"key":"2023051609205258600_btab130-B29","article-title":"Network medicine framework for identifying drug repurposing opportunities for Covid-19","author":"Gysi","year":"2020","journal-title":"Proceedings of the National Academy of Sciences 118"},{"key":"2023051609205258600_btab130-B30","volume-title":"Assay Guidance Manual [Internet]","author":"Haas","year":"2017"},{"key":"2023051609205258600_btab130-B31","volume-title":"Digital Design and Computer Architecture","author":"Harris","year":"2010"},{"key":"2023051609205258600_btab130-B32","doi-asserted-by":"crossref","first-page":"2605","DOI":"10.1093\/bioinformatics\/bty166","article-title":"Deepsol: a deep learning framework for sequence-based protein solubility prediction","volume":"34","author":"Khurana","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051609205258600_btab130-B33","doi-asserted-by":"crossref","first-page":"D1202","DOI":"10.1093\/nar\/gkv951","article-title":"Pubchem substance and compound databases","volume":"44","author":"Kim","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B34","article-title":"Semi-supervised classification with graph convolutional networks","author":"Kipf","year":"2016","journal-title":"International Conference on Learning Representations, 1\u201314"},{"key":"2023051609205258600_btab130-B35","doi-asserted-by":"crossref","first-page":"935","DOI":"10.1038\/nrd1549","article-title":"Docking and scoring in virtual screening for drug discovery: methods and applications","volume":"3","author":"Kitchen","year":"2004","journal-title":"Nat. Rev. Drug Discov"},{"key":"2023051609205258600_btab130-B36","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1002\/aic.690370209","article-title":"Nonlinear principal component analysis using autoassociative neural networks","volume":"37","author":"Kramer","year":"1991","journal-title":"AIChE J"},{"key":"2023051609205258600_btab130-B37","first-page":"4601","author":"Lamb","year":"2016"},{"key":"2023051609205258600_btab130-B38","first-page":"1","article-title":"Structure of the SARS-COV-2 spike receptor-binding domain bound to the ACE2 receptor","author":"Lan","year":"2020","journal-title":"Nature"},{"key":"2023051609205258600_btab130-B39","first-page":"1","article-title":"Rdkit documentation","volume":"1","author":"Landrum","year":"2013","journal-title":"Release"},{"key":"2023051609205258600_btab130-B40","first-page":"1995","article-title":"Convolutional networks for images, speech, and time series","volume":"3361","author":"LeCun","year":"1995","journal-title":"The Handbook of Brain Theory and Neural Networks"},{"key":"2023051609205258600_btab130-B41","doi-asserted-by":"crossref","first-page":"D198","DOI":"10.1093\/nar\/gkl999","article-title":"Bindingdb: a web-accessible database of experimentally determined protein\u2013ligand binding affinities","volume":"35","author":"Liu","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B42","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/s12918-017-0412-6","article-title":"Detection of statistically significant network changes in complex biological networks","volume":"11","author":"Mall","year":"2017","journal-title":"BMC Syst. Biol"},{"key":"2023051609205258600_btab130-B43","doi-asserted-by":"crossref","first-page":"e39\u2013e39","DOI":"10.1093\/nar\/gky015","article-title":"RGBM: regularized gradient boosting machines for identification of the transcriptional regulators of discrete glioma subtypes","volume":"46","author":"Mall","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B44","doi-asserted-by":"crossref","first-page":"1086","DOI":"10.1109\/TNNLS.2014.2333879","article-title":"Very sparse LSSVM reductions for large-scale data","volume":"26","author":"Mall","year":"2015","journal-title":"IEEE Trans. Neural Netw. Learn. Syst"},{"key":"2023051609205258600_btab130-B45","author":"Martin","year":"2020"},{"key":"2023051609205258600_btab130-B46","first-page":"1","article-title":"Benchmark on a large cohort for sleep-wake classification with machine learning techniques","volume":"2","author":"Palotti","year":"2019","journal-title":"NPJ Dig. Med"},{"key":"2023051609205258600_btab130-B47","article-title":"Repurposed antiviral drugs for Covid-19; interim who solidarity trial results","author":"Pan","year":"2020","journal-title":"New England journal of medicine 384, 497\u2013511"},{"key":"2023051609205258600_btab130-B48","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023051609205258600_btab130-B49","article-title":"Molecular sets (MOSES): a benchmarking platform for molecular generation models","author":"Polykovskiy","year":"2018","journal-title":"arXiv Preprint arXiv:1811.12823"},{"key":"2023051609205258600_btab130-B50","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1038\/newbio233223b0","article-title":"Protein data bank","volume":"233","year":"1971","journal-title":"Nat. New Biol"},{"key":"2023051609205258600_btab130-B51","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1038\/nrd.2018.168","article-title":"Drug repurposing: progress, challenges and recommendations","volume":"18","author":"Pushpakom","year":"2019","journal-title":"Nat. Rev. Drug Discov"},{"key":"2023051609205258600_btab130-B52","first-page":"7647","author":"Rao","year":"2019"},{"key":"2023051609205258600_btab130-B53","doi-asserted-by":"crossref","first-page":"1092","DOI":"10.1093\/bioinformatics\/btx662","article-title":"Parsnip: sequence-based protein solubility prediction using gradient boosting machine","volume":"34","author":"Rawi","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051609205258600_btab130-B54","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1038\/s41586-020-2577-1","article-title":"Discovery of SARS-COV-2 antiviral drugs through large-scale compound repurposing","volume":"586","author":"Riva","year":"2020","journal-title":"Nature"},{"key":"2023051609205258600_btab130-B55","volume-title":"Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment","author":"Roy","year":"2015"},{"key":"2023051609205258600_btab130-B56","doi-asserted-by":"crossref","DOI":"10.1038\/s41467-020-17189-2","article-title":"Exploring the SARS-COV-2 virus-host-drug interactome for drug repurposing","author":"Sadegh","year":"2020","journal-title":"Nature communications 11, 1\u20139."},{"key":"2023051609205258600_btab130-B57","first-page":"1824","article-title":"Pharmacologic treatments for coronavirus disease 2019 (Covid-19): a review","volume":"323","author":"Sanders","year":"2020","journal-title":"JAMA"},{"key":"2023051609205258600_btab130-B58","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1023\/A:1018628609742","article-title":"Least squares support vector machine classifiers","volume":"9","author":"Suykens","year":"1999","journal-title":"Neural Process Lett"},{"key":"2023051609205258600_btab130-B59","doi-asserted-by":"crossref","first-page":"782","DOI":"10.3389\/fchem.2019.00782","article-title":"Comparison study of computational prediction tools for drug-target binding affinities","volume":"7","author":"Thafar","year":"2019","journal-title":"Front. Chem"},{"key":"2023051609205258600_btab130-B60","doi-asserted-by":"crossref","first-page":"D158","DOI":"10.1093\/nar\/gkw1099","article-title":"Uniprot: the universal protein knowledgebase","volume":"45","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B61","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1186\/s12967-018-1472-0","article-title":"Harnessing Qatar biobank to understand type 2 diabetes and obesity in adult Qataris from the first qatar biobank project","volume":"16","author":"Ullah","year":"2018","journal-title":"J. Transl. Med"},{"key":"2023051609205258600_btab130-B62","first-page":"2322","author":"Ullah","year":"2017"},{"key":"2023051609205258600_btab130-B63","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani","year":"2017"},{"key":"2023051609205258600_btab130-B64","article-title":"Graph attention networks","author":"Veli\u010dkovi\u0107","year":"2017","journal-title":"International Conference on Learning Representations, PP. 1\u201312"},{"key":"2023051609205258600_btab130-B65","author":"Verma","year":"2020"},{"key":"2023051609205258600_btab130-B66","article-title":"Atomnet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery","author":"Wallach","year":"2015","journal-title":"CoRR, abs\/1510.02855"},{"key":"2023051609205258600_btab130-B67","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1038\/nature17180","article-title":"Therapeutic efficacy of the small molecule GS-5734 against ebola virus in rhesus monkeys","volume":"531","author":"Warren","year":"2016","journal-title":"Nature"},{"key":"2023051609205258600_btab130-B68","doi-asserted-by":"crossref","first-page":"D13","DOI":"10.1093\/nar\/gkm1000","article-title":"Database resources of the national center for biotechnology information","volume":"36","author":"Wheeler","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B69","doi-asserted-by":"crossref","first-page":"D1074","DOI":"10.1093\/nar\/gkx1037","article-title":"Drugbank 5.0: a major update to the drugbank database for 2018","volume":"46","author":"Wishart","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023051609205258600_btab130-B70","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1089\/phage.2020.0014","article-title":"Bacteriophages could be a potential game changer in the trajectory of coronavirus disease (Covid-19","volume":"1","author":"Wojewodzic","year":"2020","journal-title":"PHAGE"},{"key":"2023051609205258600_btab130-B71","year":"2020"},{"key":"2023051609205258600_btab130-B72","doi-asserted-by":"crossref","first-page":"4624","DOI":"10.1021\/acs.jproteome.0c00316","article-title":"Repurpose open data to discover therapeutics for Covid-19 using deep learning","volume":"19","author":"Zeng","year":"2020","journal-title":"J. Proteome Res"},{"key":"2023051609205258600_btab130-B73","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1038\/s41421-020-0153-3","article-title":"Network-based drug repurposing for novel coronavirus 2019-NCOV\/SARS-COV-2","volume":"6","author":"Zhou","year":"2020","journal-title":"Cell Discov"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab130\/38462081\/btab130.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/17\/2544\/50339098\/btab130.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/17\/2544\/50339098\/btab130.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T09:24:14Z","timestamp":1684229054000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/17\/2544\/6151691"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,2,26]]},"references-count":73,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2021,9,9]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab130","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,9,1]]},"published":{"date-parts":[[2021,2,26]]}}}