{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T02:46:15Z","timestamp":1768272375752,"version":"3.49.0"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2017,8,26]],"date-time":"2017-08-26T00:00:00Z","timestamp":1503705600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>In silico approaches often fail to utilize bioactivity data available for orthologous targets due to insufficient evidence highlighting the benefit for such an approach. Deeper investigation into orthologue chemical space and its influence toward expanding compound and target coverage is necessary to improve the confidence in this practice.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here we present analysis of the orthologue chemical space in ChEMBL and PubChem and its impact on target prediction. We highlight the number of conflicting bioactivities between human and orthologues is low and annotations are overall compatible. Chemical space analysis shows orthologues are chemically dissimilar to human with high intra-group similarity, suggesting they could effectively extend the chemical space modelled. Based on these observations, we show the benefit of orthologue inclusion in terms of novel target coverage. We also benchmarked predictive models using a time-series split and also using bioactivities from Chemistry Connect and HTS data available at AstraZeneca, showing that orthologue bioactivity inclusion statistically improved performance.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Orthologue-based bioactivity prediction and the compound training set are available at www.github.com\/lhm30\/PIDGINv2.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx525","type":"journal-article","created":{"date-parts":[[2017,8,25]],"date-time":"2017-08-25T11:09:17Z","timestamp":1503659357000},"page":"72-79","source":"Crossref","is-referenced-by-count":31,"title":["Orthologue chemical space and its influence on target prediction"],"prefix":"10.1093","volume":"34","author":[{"given":"Lewis H","family":"Mervin","sequence":"first","affiliation":[{"name":"Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK"}]},{"given":"Krishna C","family":"Bulusu","sequence":"additional","affiliation":[{"name":"Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK"},{"name":"Oncology Innovative Medicines and Early Development, AstraZeneca, Cambridge, UK"}]},{"given":"Leen","family":"Kalash","sequence":"additional","affiliation":[{"name":"Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK"}]},{"given":"Avid M","family":"Afzal","sequence":"additional","affiliation":[{"name":"Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK"}]},{"given":"Fredrik","family":"Svensson","sequence":"additional","affiliation":[{"name":"Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK"}]},{"given":"Mike A","family":"Firth","sequence":"additional","affiliation":[{"name":"Discovery Sciences, AstraZeneca R&D, Cambridge Science Park, Cambridge, UK"}]},{"given":"Ian","family":"Barrett","sequence":"additional","affiliation":[{"name":"Discovery Sciences, AstraZeneca R&D, Cambridge Science Park, Cambridge, UK"}]},{"given":"Ola","family":"Engkvist","sequence":"additional","affiliation":[{"name":"Discovery Sciences, AstraZeneca R&D Gothenburg, M\u00f6lndal, Sweden"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7271-0824","authenticated-orcid":false,"given":"Andreas","family":"Bender","sequence":"additional","affiliation":[{"name":"Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK"}]}],"member":"286","published-online":{"date-parts":[[2017,8,26]]},"reference":[{"key":"2023020208412350400_btx525-B1","doi-asserted-by":"crossref","first-page":"D1083","DOI":"10.1093\/nar\/gkt1031","article-title":"The ChEMBL bioactivity database: an update","volume":"42","author":"Bento","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020208412350400_btx525-B2","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1016\/j.ymeth.2014.09.006","article-title":"Tools for in silico target fishing","volume":"71","author":"Cereto-Massagu\u00e9","year":"2015","journal-title":"Methods"},{"key":"2023020208412350400_btx525-B3","article-title":"Standardizer was used for structure canonicalization and transformation","author":"ChemAxon","year":"2015"},{"key":"2023020208412350400_btx525-B4","doi-asserted-by":"crossref","first-page":"D8","DOI":"10.1093\/nar\/gks1189","article-title":"Database resources of the National Center for Biotechnology Information","volume":"41","author":"Coordinators","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023020208412350400_btx525-B5","doi-asserted-by":"crossref","first-page":"1105","DOI":"10.1111\/cbdd.12578","article-title":"Identification of orthologous target pairs with shared active compounds and comparison of organism-specific activity patterns","volume":"86","author":"Dimova","year":"2015","journal-title":"Chem. Biol. Drug Des"},{"key":"2023020208412350400_btx525-B6","doi-asserted-by":"crossref","first-page":"2721","DOI":"10.1093\/bioinformatics\/btv214","article-title":"Protein homology reveals new targets for bioactive small molecules","volume":"31","author":"Gfeller","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020208412350400_btx525-B7","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.drudis.2015.07.018","article-title":"In silico assessment of adverse drug reactions and associated mechanisms","volume":"21","author":"Ivanov","year":"2016","journal-title":"Drug Discov. Today"},{"key":"2023020208412350400_btx525-B8","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1038\/sj.bjp.0707308","article-title":"Chemogenomic approaches to drug discovery: similar receptors bind similar ligands","volume":"152","author":"Klabunde","year":"2007","journal-title":"Br. J. Pharmacol"},{"key":"2023020208412350400_btx525-B9","doi-asserted-by":"crossref","first-page":"1957","DOI":"10.1021\/ci300435j","article-title":"In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Na\u00efve Bayes and Parzen-Rosenblatt window","volume":"53","author":"Koutsoukas","year":"2013","journal-title":"J. Chem. Inf. Model"},{"key":"2023020208412350400_btx525-B10","doi-asserted-by":"crossref","first-page":"e1002333.","DOI":"10.1371\/journal.pcbi.1002333","article-title":"Global analysis of small molecule binding to related protein targets","volume":"8","author":"Kruger","year":"2012","journal-title":"PLoS Comput. Biol"},{"key":"2023020208412350400_btx525-B11","author":"Landrum","year":"2006"},{"key":"2023020208412350400_btx525-B12","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1016\/j.drudis.2015.12.007","article-title":"In silico methods to address polypharmacology: current status, applications and future perspectives","volume":"21","author":"Lavecchia","year":"2016","journal-title":"Drug Discov. Today"},{"key":"2023020208412350400_btx525-B13","doi-asserted-by":"crossref","first-page":"e1003253","DOI":"10.1371\/journal.pcbi.1003253","article-title":"Target prediction for an open access set of compounds active against Mycobacterium tuberculosis","volume":"9","author":"Mart\u00ednez-Jim\u00e9nez","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023020208412350400_btx525-B14","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1186\/s13321-015-0098-y","article-title":"Target prediction utilising negative bioactivity data covering large chemical space","volume":"7","author":"Mervin","year":"2015","journal-title":"J. Cheminform"},{"key":"2023020208412350400_btx525-B15","doi-asserted-by":"crossref","first-page":"3007","DOI":"10.1021\/acschembio.6b00538","article-title":"Understanding cytotoxicity and cytostaticity in a high-throughput screening collection","volume":"11","author":"Mervin","year":"2016","journal-title":"ACS Chem. Biol"},{"key":"2023020208412350400_btx525-B16","doi-asserted-by":"crossref","first-page":"2106465","DOI":"10.1155\/2016\/2106465","article-title":"Global mapping of traditional chinese medicine into bioactivity space and pathways annotation improves mechanistic understanding and discovers relationships between therapeutic action (sub)classes","volume":"2016","author":"Mohamad Zobir","year":"2016","journal-title":"Evid. Based Complement Alternat. Med"},{"key":"2023020208412350400_btx525-B17","doi-asserted-by":"crossref","first-page":"1019","DOI":"10.1016\/j.drudis.2011.10.005","article-title":"Making every SAR point count: the development of Chemistry Connect for the large-scale integration of structure and bioactivity data","volume":"16","author":"Muresan","year":"2011","journal-title":"Drug Discov. Today"},{"key":"2023020208412350400_btx525-B18","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s13321-015-0063-9","article-title":"Proteochemometric modelling coupled to in silico target prediction: an integrated approach for the simultaneous prediction of polypharmacology and binding affinity\/potency of small molecules","volume":"7","author":"Paricharak","year":"2015","journal-title":"J. Cheminform"},{"key":"2023020208412350400_btx525-B19","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1186\/1758-2946-5-49","article-title":"Are phylogenetic trees suitable for chemogenomics analyses of bioactivity data sets: the importance of shared active compounds and choosing a suitable data embedding method, as exemplified on Kinases","volume":"5","author":"Paricharak","year":"2013","journal-title":"J. Cheminform"},{"key":"2023020208412350400_btx525-B20","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023020208412350400_btx525-B21","doi-asserted-by":"crossref","first-page":"D833","DOI":"10.1093\/nar\/gkw943","article-title":"DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants","volume":"45","author":"Pi\u00f1ero","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023020208412350400_btx525-B22","first-page":"61","article-title":"Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods","volume":"10","author":"Platt","year":"1999","journal-title":"Adv. Large Margin Class"},{"key":"2023020208412350400_btx525-B23","author":"Ramsundar","year":"2015"},{"key":"2023020208412350400_btx525-B24","doi-asserted-by":"crossref","first-page":"1466","DOI":"10.1021\/jm0108202","article-title":"Carbonic anhydrase inhibitors. A general approach for the preparation of water-soluble sulfonamides incorporating polyamino-polycarboxylate tails and of their metal complexes possessing long-lasting, topical intraocular pressure-lowering properties","volume":"45","author":"Scozzafava","year":"2002","journal-title":"J. Med. Chem"},{"key":"2023020208412350400_btx525-B25","doi-asserted-by":"crossref","first-page":"2499","DOI":"10.1021\/ci400099q","article-title":"Estimating error rates in bioactivity databases","volume":"53","author":"Tiikkainen","year":"2013","journal-title":"J. Chem. Inf. Model"},{"key":"2023020208412350400_btx525-B26","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1021\/ci600426e","article-title":"Evaluating virtual screening methods: good and bad metrics for the \u201cearly recognition\u201d problem","volume":"47","author":"Truchon","year":"2007","journal-title":"J. Chem. Inf. Model"},{"key":"2023020208412350400_btx525-B27","doi-asserted-by":"crossref","first-page":"7010","DOI":"10.1021\/jm3003069","article-title":"Identifying novel adenosine receptor ligands by simultaneous proteochemometric modeling of rat and human bioactivity data","volume":"55","author":"van Westen","year":"2012","journal-title":"J. Med. Chem"},{"key":"2023020208412350400_btx525-B28","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1208\/s12248-012-9449-z","article-title":"TargetHunter: an in silico target identification tool for predicting therapeutic potential of small organic molecules based on chemogenomic database","volume":"15","author":"Wang","year":"2013","journal-title":"AAPS J"},{"key":"2023020208412350400_btx525-B29","doi-asserted-by":"crossref","first-page":"W623","DOI":"10.1093\/nar\/gkp456","article-title":"PubChem: a public information system for analyzing bioactivities of small molecules","volume":"37","author":"Wang","year":"2009","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/1\/72\/49043532\/bioinformatics_34_1_72.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/1\/72\/49043532\/bioinformatics_34_1_72.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T08:56:30Z","timestamp":1675328190000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/1\/72\/4095638"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,8,26]]},"references-count":29,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx525","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,1,1]]},"published":{"date-parts":[[2017,8,26]]}}}