{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T00:40:22Z","timestamp":1774572022100,"version":"3.50.1"},"reference-count":77,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T00:00:00Z","timestamp":1686268800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T00:00:00Z","timestamp":1686268800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Off-target drug interactions are a major reason for candidate failure in the drug discovery process. Anticipating potential drug\u2019s adverse effects in the early stages is necessary to minimize health risks to patients, animal testing, and economical costs. With the constantly increasing size of virtual screening libraries, AI-driven methods can be exploited as first-tier screening tools to provide liability estimation for drug candidates. In this work we present ProfhEX, an AI-driven suite of 46 OECD-compliant machine learning models that can profile small molecules on 7 relevant liability groups: cardiovascular, central nervous system, gastrointestinal, endocrine, renal, pulmonary and immune system toxicities. Experimental affinity data was collected from public and commercial data sources. The entire chemical space comprised 289\u2032202 activity data for a total of 210\u2032116 unique compounds, spanning over 46 targets with dataset sizes ranging from 819 to 18896. Gradient boosting and random forest algorithms were initially employed and ensembled for the selection of a champion model. Models were validated according to the OECD principles, including robust internal (cross validation, bootstrap, y-scrambling) and external validation. Champion models achieved an average Pearson correlation coefficient of 0.84 (SD of 0.05), an R\n                    <jats:sup>2<\/jats:sup>\n                    determination coefficient of 0.68 (SD\u2009=\u20090.1) and a root mean squared error of 0.69 (SD of 0.08). All liability groups showed good hit-detection power with an average enrichment factor at 5% of 13.1 (SD of 4.5) and AUC of 0.92 (SD of 0.05). Benchmarking against already existing tools demonstrated the predictive power of ProfhEX models for large-scale liability profiling. This platform will be further expanded with the inclusion of new targets and through complementary modelling approaches, such as structure and pharmacophore-based models. ProfhEX is freely accessible at the following address:\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/profhex.exscalate.eu\/\">https:\/\/profhex.exscalate.eu\/<\/jats:ext-link>\n                    .\n                  <\/jats:p>","DOI":"10.1186\/s13321-023-00728-6","type":"journal-article","created":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T06:02:38Z","timestamp":1686290558000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["ProfhEX: AI-based platform for small molecules liability profiling"],"prefix":"10.1186","volume":"15","author":[{"given":"Filippo","family":"Lunghini","sequence":"first","affiliation":[]},{"given":"Anna","family":"Fava","sequence":"additional","affiliation":[]},{"given":"Vincenzo","family":"Pisapia","sequence":"additional","affiliation":[]},{"given":"Francesco","family":"Sacco","sequence":"additional","affiliation":[]},{"given":"Daniela","family":"Iaconis","sequence":"additional","affiliation":[]},{"given":"Andrea Rosario","family":"Beccari","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,9]]},"reference":[{"key":"728_CR1","doi-asserted-by":"publisher","first-page":"961","DOI":"10.4155\/fmc.11.62","volume":"3","author":"J Achenbach","year":"2011","unstructured":"Achenbach J, Tiikkainen P, Franke L, Proschak E (2011) Computational tools for polypharmacology and repurposing. Future Med Chem 3:961\u2013968. https:\/\/doi.org\/10.4155\/fmc.11.62","journal-title":"Future Med Chem"},{"key":"728_CR2","doi-asserted-by":"publisher","first-page":"420","DOI":"10.1021\/acs.jmedchem.8b00760","volume":"62","author":"E Proschak","year":"2019","unstructured":"Proschak E, Stark H, Merk D (2019) Polypharmacology by design: a medicinal chemist\u2019s perspective on multitargeting compounds. J Med Chem 62:420\u2013444. https:\/\/doi.org\/10.1021\/acs.jmedchem.8b00760","journal-title":"J Med Chem"},{"key":"728_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3389\/fphar.2015.00157","volume":"6","author":"G Rastelli","year":"2015","unstructured":"Rastelli G, Pinzi L (2015) Computational polypharmacology comes of age. Front Pharmacol 6:1\u20134. https:\/\/doi.org\/10.3389\/fphar.2015.00157","journal-title":"Front Pharmacol"},{"key":"728_CR4","doi-asserted-by":"publisher","first-page":"7874","DOI":"10.1021\/jm5006463","volume":"57","author":"A Anighoro","year":"2014","unstructured":"Anighoro A, Bajorath J, Rastelli G (2014) Polypharmacology: challenges and opportunities in drug discovery. J Med Chem 57:7874\u20137887","journal-title":"J Med Chem"},{"key":"728_CR5","doi-asserted-by":"publisher","DOI":"10.1002\/cmdc.201600067","author":"Z Tan","year":"2016","unstructured":"Tan Z, Chaudhai R, Zhang S (2016) Polypharmacology in drug development: a minireview of current technologies. ChemMedChem. https:\/\/doi.org\/10.1002\/cmdc.201600067","journal-title":"ChemMedChem"},{"key":"728_CR6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3389\/fdata.2019.00025","volume":"2","author":"MS Rao","year":"2019","unstructured":"Rao MS, Gupta R, Liguori MJ et al (2019) Novel computational approach to predict off-target interactions for small molecules. Front Big Data 2:1\u201317. https:\/\/doi.org\/10.3389\/fdata.2019.00025","journal-title":"Front Big Data"},{"key":"728_CR7","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1021\/acs.chemrestox.9b00227","volume":"33","author":"AH Vo","year":"2020","unstructured":"Vo AH, Van Vleet TR, Gupta RR et al (2020) An overview of machine learning and big data for drug toxicity evaluation. Chem Res Toxicol 33:20\u201337. https:\/\/doi.org\/10.1021\/acs.chemrestox.9b00227","journal-title":"Chem Res Toxicol"},{"key":"728_CR8","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1038\/nature11159","volume":"486","author":"E Lounkine","year":"2012","unstructured":"Lounkine E, Keiser MJ, Whitebread S et al (2012) Large-scale prediction and testing of drug activity on side-effect targets. Nat 486:361\u2013367. https:\/\/doi.org\/10.1038\/nature11159","journal-title":"Nat"},{"key":"728_CR9","doi-asserted-by":"publisher","first-page":"D1080","DOI":"10.1093\/NAR\/GKV1192","volume":"44","author":"VB Siramshetty","year":"2016","unstructured":"Siramshetty VB, Nickel J, Omieczynski C et al (2016) WITHDRAWN\u2014a resource for withdrawn and discontinued drugs. Nucleic Acids Res 44:D1080\u2013D1086. https:\/\/doi.org\/10.1093\/NAR\/GKV1192","journal-title":"Nucleic Acids Res"},{"issue":"13","key":"728_CR10","doi-asserted-by":"publisher","first-page":"419","DOI":"10.1038\/nrd4309","volume":"136","author":"D Cook","year":"2014","unstructured":"Cook D, Brown D, Alexander R et al (2014) (2014) Lessons learned from the fate of AstraZeneca\u2019s drug pipeline: a five-dimensional framework. Nat Rev Drug Discov 136(13):419\u2013431. https:\/\/doi.org\/10.1038\/nrd4309","journal-title":"Nat Rev Drug Discov"},{"key":"728_CR11","doi-asserted-by":"publisher","first-page":"909","DOI":"10.1038\/nrd3845","volume":"11","author":"J Bowes","year":"2012","unstructured":"Bowes J, Brown AJ, Hamon J et al (2012) Reducing safety-related drug attrition: the use of in vitro pharmacological profiling. Nat Rev Drug Discov 11:909\u2013922. https:\/\/doi.org\/10.1038\/nrd3845","journal-title":"Nat Rev Drug Discov"},{"key":"728_CR12","doi-asserted-by":"publisher","first-page":"1624","DOI":"10.1016\/j.drudis.2020.07.005","volume":"25","author":"L Zhao","year":"2020","unstructured":"Zhao L, Ciallella HL, Aleksunes LM, Zhu H (2020) Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling. Drug Discov Today 25:1624\u20131638. https:\/\/doi.org\/10.1016\/j.drudis.2020.07.005","journal-title":"Drug Discov Today"},{"key":"728_CR13","doi-asserted-by":"publisher","first-page":"1315","DOI":"10.1007\/s11030-021-10217-3","volume":"25","author":"R Gupta","year":"2021","unstructured":"Gupta R, Srivastava D, Sahu M et al (2021) Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 25:1315\u20131360. https:\/\/doi.org\/10.1007\/s11030-021-10217-3","journal-title":"Mol Divers"},{"key":"728_CR14","doi-asserted-by":"publisher","first-page":"104662","DOI":"10.1016\/j.yrtph.2020.104662","volume":"114","author":"AM Avila","year":"2020","unstructured":"Avila AM, Bebenek I, Bonzo JA et al (2020) An FDA\/CDER perspective on nonclinical testing strategies: classical toxicology approaches and new approach methodologies (NAMs). Regul Toxicol Pharmacol 114:104662","journal-title":"Regul Toxicol Pharmacol"},{"key":"728_CR15","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1016\/J.VASCN.2017.02.020","volume":"87","author":"JJ Lynch","year":"2017","unstructured":"Lynch JJ, Van Vleet TR, Mittelstadt SW, Blomme EAG (2017) Potential functional and pathological side effects related to off-target pharmacological activity. J Pharmacol Toxicol Methods 87:108\u2013126. https:\/\/doi.org\/10.1016\/J.VASCN.2017.02.020","journal-title":"J Pharmacol Toxicol Methods"},{"key":"728_CR16","doi-asserted-by":"publisher","first-page":"100188","DOI":"10.1016\/J.COMTOX.2021.100188","volume":"20","author":"A Bassan","year":"2021","unstructured":"Bassan A, Alves VM, Amberg A et al (2021) In silico approaches in organ toxicity hazard assessment: Current status and future needs for predicting heart, kidney and lung toxicities. Comput Toxicol 20:100188. https:\/\/doi.org\/10.1016\/J.COMTOX.2021.100188","journal-title":"Comput Toxicol"},{"key":"728_CR17","doi-asserted-by":"publisher","first-page":"100223","DOI":"10.1016\/J.COMTOX.2022.100223","volume":"22","author":"KM Crofton","year":"2022","unstructured":"Crofton KM, Bassan A, Behl M et al (2022) Current status and future directions for a neurotoxicity hazard assessment framework that integrates in silico approaches. Comput Toxicol 22:100223. https:\/\/doi.org\/10.1016\/J.COMTOX.2022.100223","journal-title":"Comput Toxicol"},{"key":"728_CR18","doi-asserted-by":"publisher","first-page":"1427","DOI":"10.1002\/MED.21764","volume":"41","author":"S Vatansever","year":"2021","unstructured":"Vatansever S, Schlessinger A, Wacker D et al (2021) Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: state-of-the-arts and future directions. Med Res Rev 41:1427. https:\/\/doi.org\/10.1002\/MED.21764","journal-title":"Med Res Rev"},{"key":"728_CR19","doi-asserted-by":"publisher","first-page":"1409","DOI":"10.1007\/s11030-021-10239-x","volume":"25","author":"A R\u00e1cz","year":"2021","unstructured":"R\u00e1cz A, Bajusz D, Miranda-Quintana RA, H\u00e9berger K (2021) Machine learning models for classification tasks related to drug safety. Mol Divers 25:1409\u20131424. https:\/\/doi.org\/10.1007\/s11030-021-10239-x","journal-title":"Mol Divers"},{"key":"728_CR20","doi-asserted-by":"publisher","first-page":"4538","DOI":"10.1016\/J.CSBJ.2021.08.011","volume":"19","author":"P Carracedo-Reboredo","year":"2021","unstructured":"Carracedo-Reboredo P, Li\u00f1ares-Blanco J, Rodr\u00edguez-Fern\u00e1ndez N et al (2021) A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 19:4538\u20134558. https:\/\/doi.org\/10.1016\/J.CSBJ.2021.08.011","journal-title":"Comput Struct Biotechnol J"},{"key":"728_CR21","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1208\/S12248-012-9449-Z","volume":"15","author":"L Wang","year":"2013","unstructured":"Wang L, Ma C, Wipf P et al (2013) TargetHunter: an in silico target identification tool for predicting therapeutic potential of small organic molecules based on chemogenomic database. AAPS J 15:395. https:\/\/doi.org\/10.1208\/S12248-012-9449-Z","journal-title":"AAPS J"},{"key":"728_CR22","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1007\/S10822-016-9915-2","volume":"30","author":"ZJ Yao","year":"2016","unstructured":"Yao ZJ, Dong J, Che YJ et al (2016) TargetNet: a web service for predicting potential drug-target interaction profiling via multi-target SAR models. J Comput Aided Mol Des 30:413\u2013424. https:\/\/doi.org\/10.1007\/S10822-016-9915-2","journal-title":"J Comput Aided Mol Des"},{"key":"728_CR23","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1021\/acs.jcim.8b00524","volume":"59","author":"M Awale","year":"2019","unstructured":"Awale M, Reymond JL (2019) Polypharmacology browser PPB2: target prediction combining nearest neighbors with machine learning. J Chem Inf Model 59:10\u201317. https:\/\/doi.org\/10.1021\/acs.jcim.8b00524","journal-title":"J Chem Inf Model"},{"key":"728_CR24","doi-asserted-by":"publisher","first-page":"D930","DOI":"10.1093\/NAR\/GKY1075","volume":"47","author":"D Mendez","year":"2019","unstructured":"Mendez D, Gaulton A, Bento AP et al (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 47:D930\u2013D940. https:\/\/doi.org\/10.1093\/NAR\/GKY1075","journal-title":"Nucleic Acids Res"},{"key":"728_CR25","doi-asserted-by":"publisher","first-page":"D1388","DOI":"10.1093\/NAR\/GKAA971","volume":"49","author":"S Kim","year":"2021","unstructured":"Kim S, Chen J, Cheng T et al (2021) PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 49:D1388\u2013D1395. https:\/\/doi.org\/10.1093\/NAR\/GKAA971","journal-title":"Nucleic Acids Res"},{"key":"728_CR26","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1093\/TOXSCI\/KFL103","volume":"95","author":"DJ Dix","year":"2007","unstructured":"Dix DJ, Houck KA, Martin MT et al (2007) The ToxCast program for prioritizing toxicity testing of environmental chemicals. Toxicol Sci 95:5\u201312. https:\/\/doi.org\/10.1093\/TOXSCI\/KFL103","journal-title":"Toxicol Sci"},{"key":"728_CR27","doi-asserted-by":"publisher","first-page":"163","DOI":"10.14573\/ALTEX.1803011","volume":"35","author":"RS Thomas","year":"2018","unstructured":"Thomas RS, Paules RS, Simeonov A et al (2018) The US Federal Tox21 Program: a strategic and operational plan for continued leadership. ALTEX 35:163\u2013168. https:\/\/doi.org\/10.14573\/ALTEX.1803011","journal-title":"ALTEX"},{"key":"728_CR28","doi-asserted-by":"publisher","first-page":"1023","DOI":"10.1289\/EHP.1510267","volume":"124","author":"K Mansouri","year":"2016","unstructured":"Mansouri K, Abdelaziz A, Rybacka A et al (2016) CERAPP: collaborative estrogen receptor activity prediction project. Environ Health Perspect 124:1023\u20131033. https:\/\/doi.org\/10.1289\/EHP.1510267","journal-title":"Environ Health Perspect"},{"key":"728_CR29","doi-asserted-by":"publisher","first-page":"27002","DOI":"10.1289\/EHP5580","volume":"128","author":"K Mansouri","year":"2020","unstructured":"Mansouri K, Kleinstreuer N, Abdelaziz AM et al (2020) CoMPARA: Collaborative modeling project for androgen receptor activity. Environ Health Perspect 128:27002. https:\/\/doi.org\/10.1289\/EHP5580","journal-title":"Environ Health Perspect"},{"key":"728_CR30","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1186\/S12859-017-1960-X\/FIGURES\/10","volume":"18","author":"K Lee","year":"2017","unstructured":"Lee K, Lee M, Kim D (2017) Utilizing random Forest QSAR models with optimized parameters for target identification and its application to target-fishing server. BMC Bioinformatics 18:75\u201386. https:\/\/doi.org\/10.1186\/S12859-017-1960-X\/FIGURES\/10","journal-title":"BMC Bioinformatics"},{"key":"728_CR31","doi-asserted-by":"publisher","first-page":"5441","DOI":"10.1039\/c8sc00148k","volume":"9","author":"A Mayr","year":"2018","unstructured":"Mayr A, Klambauer G, Unterthiner T et al (2018) Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chem Sci 9:5441\u20135451. https:\/\/doi.org\/10.1039\/c8sc00148k","journal-title":"Chem Sci"},{"key":"728_CR32","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-022-00590-y","author":"AK Arshadi","year":"2021","unstructured":"Arshadi AK (2021) MolData, a molecular benchmark for disease and target based machine learning. J Cheminform. https:\/\/doi.org\/10.1186\/s13321-022-00590-y","journal-title":"J Cheminform"},{"key":"728_CR33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S13321-019-0334-Y\/FIGURES\/7","volume":"11","author":"T Hanser","year":"2019","unstructured":"Hanser T, Steinmetz FP, Plante J et al (2019) Avoiding hERG-liability in drug design via synergetic combinations of different (Q)SAR methodologies and data sources: a case study in an industrial setting. J Cheminform 11:1\u201313. https:\/\/doi.org\/10.1186\/S13321-019-0334-Y\/FIGURES\/7","journal-title":"J Cheminform"},{"key":"728_CR34","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1186\/S13321-017-0232-0","volume":"9","author":"EB Lenselink","year":"2017","unstructured":"Lenselink EB, Ten Dijke N, Bongers B et al (2017) Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set. J Cheminform 9:45. https:\/\/doi.org\/10.1186\/S13321-017-0232-0","journal-title":"J Cheminform"},{"key":"728_CR35","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S13321-018-0325-4\/TABLES\/6","volume":"11","author":"N Bosc","year":"2019","unstructured":"Bosc N, Atkinson F, Felix E et al (2019) Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery. J Cheminform 11:1\u201316. https:\/\/doi.org\/10.1186\/S13321-018-0325-4\/TABLES\/6","journal-title":"J Cheminform"},{"key":"728_CR36","doi-asserted-by":"publisher","first-page":"100089","DOI":"10.1016\/J.COMTOX.2019.100089","volume":"12","author":"LSK Konda","year":"2019","unstructured":"Konda LSK, Keerthi Praba S, Kristam R (2019) hERG liability classification models using machine learning techniques. Comput Toxicol 12:100089. https:\/\/doi.org\/10.1016\/J.COMTOX.2019.100089","journal-title":"Comput Toxicol"},{"key":"728_CR37","doi-asserted-by":"publisher","first-page":"2200133","DOI":"10.1002\/MINF.202200133","volume":"41","author":"M Tullius Scotti","year":"2022","unstructured":"Tullius Scotti M, Herrera-Acevedo C, Barros de Menezes RP et al (2022) MolPredictX: online biological activity predictions by machine learning models. Mol Inform 41:2200133. https:\/\/doi.org\/10.1002\/MINF.202200133","journal-title":"Mol Inform"},{"key":"728_CR38","unstructured":"OECD Guidance document on the validation of (Quantitative) structure activity relationship [(Q)SAR] Models. Tech. Rep. ENV\/JM\/MONO(2007)2, Paris, FR, 2007."},{"key":"728_CR39","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1145\/1656274.1656280","volume":"11","author":"MR Berthold","year":"2006","unstructured":"Berthold MR, Cebron N, Dill F et al (2006) KNIME: the konstanz information miner. Data Anal Mach Learn Appl 11:319\u2013326. https:\/\/doi.org\/10.1145\/1656274.1656280","journal-title":"Data Anal Mach Learn Appl"},{"key":"728_CR40","doi-asserted-by":"publisher","first-page":"D480","DOI":"10.1093\/NAR\/GKAA1100","volume":"49","author":"A Bateman","year":"2021","unstructured":"Bateman A, Martin MJ, Orchard S et al (2021) UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res 49:D480\u2013D489. https:\/\/doi.org\/10.1093\/NAR\/GKAA1100","journal-title":"Nucleic Acids Res"},{"key":"728_CR41","doi-asserted-by":"publisher","first-page":"1957","DOI":"10.1021\/CI300435J\/SUPPL_FILE\/CI300435J_SI_008.PDF","volume":"53","author":"A Koutsoukas","year":"2013","unstructured":"Koutsoukas A, Lowe R, Kalantarmotamedi Y et al (2013) In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Na\u00efve Bayes and Parzen-Rosenblatt Window. J Chem Inf Model 53:1957\u20131966. https:\/\/doi.org\/10.1021\/CI300435J\/SUPPL_FILE\/CI300435J_SI_008.PDF","journal-title":"J Chem Inf Model"},{"key":"728_CR42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-015-0098-y","volume":"7","author":"LH Mervin","year":"2015","unstructured":"Mervin LH, Afzal AM, Drakakis G et al (2015) Target prediction utilising negative bioactivity data covering large chemical space. J Cheminform 7:1\u201316. https:\/\/doi.org\/10.1186\/s13321-015-0098-y","journal-title":"J Cheminform"},{"key":"728_CR43","volume-title":"Pipeline pilot version 2018","author":"BIOVIA, Dassault Systemes","year":"2011","unstructured":"BIOVIA, Dassault Systemes (2011) Pipeline pilot version 2018. Dassault Syst\u00e8mes, San Diego"},{"key":"728_CR44","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1021\/ci100176x","volume":"50","author":"D Fourches","year":"2010","unstructured":"Fourches D, Muratov E, Tropsha A (2010) Trust, but verify: on the importance of chemical structure curation in cheminformatics and QSAR modeling research. J Chem Inf Model 50:1189\u20131204","journal-title":"J Chem Inf Model"},{"key":"728_CR45","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S13321-020-00468-X\/","volume":"12","author":"G Idakwo","year":"2020","unstructured":"Idakwo G, Thangapandian S, Luttrell J et al (2020) Structure\u2013activity relationship-based chemical classification of highly imbalanced Tox21 datasets. J Cheminform 12:1\u201319. https:\/\/doi.org\/10.1186\/S13321-020-00468-X\/","journal-title":"J Cheminform"},{"key":"728_CR46","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S13321-019-0356-5\/TABLES\/5","volume":"11","author":"CHG Allen","year":"2019","unstructured":"Allen CHG, Mervin LH, Mahmoud SY, Bender A (2019) Leveraging heterogeneous data from GHS toxicity annotations, molecular and protein target descriptors and Tox21 assay readouts to predict and rationalise acute toxicity. J Cheminform 11:1\u201319. https:\/\/doi.org\/10.1186\/S13321-019-0356-5\/TABLES\/5","journal-title":"J Cheminform"},{"key":"728_CR47","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1007\/978-1-4939-2269-7_18","volume":"1263","author":"TA Wenderski","year":"2015","unstructured":"Wenderski TA, Stratton CF, Bauer RA et al (2015) Principal component analysis as a tool for library design: a case study investigating natural products, brand-name drugs, natural product-like libraries, and drug-like libraries. Methods Mol Biol 1263:225. https:\/\/doi.org\/10.1007\/978-1-4939-2269-7_18","journal-title":"Methods Mol Biol"},{"key":"728_CR48","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1186\/s13321-021-00526-y","volume":"13","author":"C Manelfi","year":"2021","unstructured":"Manelfi C, Gemei M, Talarico C et al (2021) \u201cMolecular Anatomy\u201d: a new multi-dimensional hierarchical scaffold analysis tool. J Cheminform 13:13\u201354","journal-title":"J Cheminform"},{"key":"728_CR49","unstructured":"SAS Institute Inc. SAS\/VIYA\u00ae 3.5 of the SAS System for Unix. https:\/\/www.sas.com\/en\/software\/viya.html. Accessed 01 Mar 2022"},{"key":"728_CR50","doi-asserted-by":"publisher","first-page":"1189","DOI":"10.1214\/aos\/1013203451","volume":"29","author":"JH Friedman","year":"2001","unstructured":"Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189\u20131232. https:\/\/doi.org\/10.1214\/aos\/1013203451","journal-title":"Ann Stat"},{"issue":"45","key":"728_CR51","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"451","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 451(45):5\u201332. https:\/\/doi.org\/10.1023\/A:1010933404324","journal-title":"Mach Learn"},{"key":"728_CR52","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1080\/00224065.1981.11978748","volume":"13","author":"RL Iman","year":"1981","unstructured":"Iman RL, Helton JC, Campbell JE (1981) An approach to sensitivity analysis of computer models: part i\u2014introduction, input variable selection and preliminary variable assessment. J Qual Technol 13:174\u2013183. https:\/\/doi.org\/10.1080\/00224065.1981.11978748","journal-title":"J Qual Technol"},{"key":"728_CR53","unstructured":"Sastry K, Goldberg D, Kendall G (2005) Genetic Algorithms. In: Burke EK, Kendall G (eds) Search Methodologies. Springer, Boston, MA. https:\/\/link.springer.com\/chapter\/10.1007\/0-387-28356-0_4#citeas"},{"key":"728_CR54","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1002\/QSAR.200390007","volume":"22","author":"A Tropsha","year":"2003","unstructured":"Tropsha A, Gramatica P, Gombar VK (2003) The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb Sci 22:69\u201377. https:\/\/doi.org\/10.1002\/QSAR.200390007","journal-title":"QSAR Comb Sci"},{"key":"728_CR55","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1007\/978-1-62703-059-5_21","volume":"930","author":"P Gramatica","year":"2013","unstructured":"Gramatica P (2013) On the development and validation of QSAR models. Methods Mol Biol 930:499\u2013526. https:\/\/doi.org\/10.1007\/978-1-62703-059-5_21","journal-title":"Methods Mol Biol"},{"key":"728_CR56","doi-asserted-by":"publisher","first-page":"4791","DOI":"10.3390\/molecules17054791","volume":"17","author":"F Sahigara","year":"2012","unstructured":"Sahigara F, Mansouri K, Ballabio D et al (2012) Comparison of different approaches to define the applicability domain of QSAR models. Molecules 17:4791\u20134810. https:\/\/doi.org\/10.3390\/molecules17054791","journal-title":"Molecules"},{"key":"728_CR57","doi-asserted-by":"publisher","first-page":"6582","DOI":"10.1021\/jm300687e","volume":"55","author":"MM Mysinger","year":"2012","unstructured":"Mysinger MM, Carchia M, Irwin JJ, Shoichet BK (2012) Directory of useful decoys, enhanced (DUD-E): Better ligands and decoys for better benchmarking. J Med Chem 55:6582\u20136594. https:\/\/doi.org\/10.1021\/jm300687e","journal-title":"J Med Chem"},{"key":"728_CR58","volume-title":"The concise encyclopedia of statistics","author":"Y Dodge","year":"2008","unstructured":"Dodge Y (2008) The concise encyclopedia of statistics. Springer, New York NY"},{"key":"728_CR59","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-6-27","volume":"6","author":"L Ruddigkeit","year":"2014","unstructured":"Ruddigkeit L, Awale M, Reymond JL (2014) Expanding the fragrance chemical space for virtual screening. J Cheminform 6:1\u201312. https:\/\/doi.org\/10.1186\/1758-2946-6-27","journal-title":"J Cheminform"},{"key":"728_CR60","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1021\/ci600423u","volume":"47","author":"T Fink","year":"2007","unstructured":"Fink T, Raymond JL (2007) Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discove. J Chem Inf Model 47:342\u2013353. https:\/\/doi.org\/10.1021\/ci600423u","journal-title":"J Chem Inf Model"},{"key":"728_CR61","doi-asserted-by":"publisher","first-page":"4294","DOI":"10.1016\/J.BMCL.2014.07.018","volume":"24","author":"D Sampson","year":"2014","unstructured":"Sampson D, Bricker B, Zhu XY et al (2014) Further evaluation of the tropane analogs of haloperidol. Bioorg Med Chem Lett 24:4294\u20134297. https:\/\/doi.org\/10.1016\/J.BMCL.2014.07.018","journal-title":"Bioorg Med Chem Lett"},{"key":"728_CR62","unstructured":"Saito DR, Long DD, Jacobsen JR. Theravance, Inc. Disubstituted alkyl-8-azabicyclo [3.2.1.] octane compounds as mu opioid receptor antagonists. WO2009029257A1, 27 Aug 2007. https:\/\/patents.google.com\/patent\/WO2009029257A1\/en"},{"key":"728_CR63","doi-asserted-by":"publisher","first-page":"2926","DOI":"10.1016\/J.BMCL.2017.04.092","volume":"27","author":"L Jiang","year":"2017","unstructured":"Jiang L, Beattie DT, Jacobsen JR et al (2017) Discovery of N-substituted-endo-3-(8-aza-bicyclo[3.2.1]oct-3-yl)-phenol and -phenyl carboxamide series of \u03bc-opioid receptor antagonists. Bioorg Med Chem Lett 27:2926\u20132930. https:\/\/doi.org\/10.1016\/J.BMCL.2017.04.092","journal-title":"Bioorg Med Chem Lett"},{"key":"728_CR64","doi-asserted-by":"publisher","first-page":"4521","DOI":"10.1016\/J.BMCL.2010.06.026","volume":"20","author":"A Alker","year":"2010","unstructured":"Alker A, Binggeli A, Christ AD et al (2010) Piperidinyl-nicotinamides as potent and selective somatostatin receptor subtype 5 antagonists. Bioorg Med Chem Lett 20:4521\u20134525. https:\/\/doi.org\/10.1016\/J.BMCL.2010.06.026","journal-title":"Bioorg Med Chem Lett"},{"key":"728_CR65","doi-asserted-by":"publisher","first-page":"2887","DOI":"10.1016\/J.BMC.2005.12.010","volume":"14","author":"L Dosen-Micovic","year":"2006","unstructured":"Dosen-Micovic L, Ivanovic M, Micovic V (2006) Steric interactions and the activity of fentanyl analogs at the \u03bc-opioid receptor. Bioorg Med Chem 14:2887\u20132895. https:\/\/doi.org\/10.1016\/J.BMC.2005.12.010","journal-title":"Bioorg Med Chem"},{"key":"728_CR66","doi-asserted-by":"publisher","first-page":"1711","DOI":"10.1016\/J.BMCL.2014.02.049","volume":"24","author":"SF McHardy","year":"2014","unstructured":"McHardy SF, Bohmann JA, Corbett MR et al (2014) Design, synthesis, and characterization of novel, nonquaternary reactivators of GF-inhibited human acetylcholinesterase. Bioorg Med Chem Lett 24:1711\u20131714. https:\/\/doi.org\/10.1016\/J.BMCL.2014.02.049","journal-title":"Bioorg Med Chem Lett"},{"key":"728_CR67","unstructured":"Becker C, Rubens C, Adams J et al. ARYx Therapeutics Inc. DIBENZO[b,f][1,4]OXAZAPINE COMPOUNDS. US20080255088A1, 15 March 2007. https:\/\/patents.google.com\/patent\/US20080255088"},{"key":"728_CR68","doi-asserted-by":"publisher","first-page":"4150","DOI":"10.1021\/ACS.JCIM.9B00633\/ASSET\/IMAGES\/LARGE\/CI9B00633_0005.JPEG","volume":"59","author":"J Zhang","year":"2019","unstructured":"Zhang J, Mucs D, Norinder U, Svensson F (2019) LightGBM: an effective and scalable algorithm for prediction of chemical toxicity-application to the Tox21 and mutagenicity data sets. J Chem Inf Model 59:4150\u20134158. https:\/\/doi.org\/10.1021\/ACS.JCIM.9B00633\/ASSET\/IMAGES\/LARGE\/CI9B00633_0005.JPEG","journal-title":"J Chem Inf Model"},{"key":"728_CR69","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-021-00571-7","volume":"13","author":"SS Kolmar","year":"2021","unstructured":"Kolmar SS, Grulke CM (2021) The effect of noise on the predictive limit of QSAR models. J Cheminform 13:1\u201319. https:\/\/doi.org\/10.1186\/s13321-021-00571-7","journal-title":"J Cheminform"},{"key":"728_CR70","volume-title":"Neglected factors in pharmacology and neuroscience research: biopharmaceutics, animal characteristics, maintenance, testing conditions","author":"V Claassen","year":"2013","unstructured":"Claassen V (2013) Neglected factors in pharmacology and neuroscience research: biopharmaceutics, animal characteristics, maintenance, testing conditions, vol 12. Elsevier, Amsterdam"},{"key":"728_CR71","doi-asserted-by":"publisher","DOI":"10.1016\/j.comtox.2020.100126","author":"LL Pham","year":"2020","unstructured":"Pham LL, Watford SM, Pradeep P et al (2020) Variability in in vivo studies: defining the upper limit of performance for predictions of systemic effect levels. Comput Toxicol. https:\/\/doi.org\/10.1016\/j.comtox.2020.100126","journal-title":"Comput Toxicol"},{"key":"728_CR72","doi-asserted-by":"publisher","first-page":"1949","DOI":"10.1021\/CI8001974","volume":"48","author":"P Mazzatorta","year":"2008","unstructured":"Mazzatorta P, Estevez MD, Coulet M, Schilter B (2008) Modeling oral rat chronic toxicity. J Chem Inf Model 48:1949\u20131954. https:\/\/doi.org\/10.1021\/CI8001974","journal-title":"J Chem Inf Model"},{"key":"728_CR73","doi-asserted-by":"publisher","first-page":"587","DOI":"10.1007\/S00204-017-2067-X","volume":"92","author":"L Truong","year":"2018","unstructured":"Truong L, Ouedraogo G, Pham LL et al (2018) Predicting in vivo effect levels for repeat-dose systemic toxicity using chemical, biological, kinetic and study covariates. Arch Toxicol 92:587\u2013600. https:\/\/doi.org\/10.1007\/S00204-017-2067-X","journal-title":"Arch Toxicol"},{"key":"728_CR74","doi-asserted-by":"publisher","first-page":"444","DOI":"10.1016\/J.DRUDIS.2010.03.013","volume":"15","author":"SY Yang","year":"2010","unstructured":"Yang SY (2010) Pharmacophore modeling and applications in drug discovery: challenges and recent advances. Drug Discov Today 15:444\u2013450. https:\/\/doi.org\/10.1016\/J.DRUDIS.2010.03.013","journal-title":"Drug Discov Today"},{"key":"728_CR75","doi-asserted-by":"publisher","DOI":"10.1002\/WCMS.1468","author":"D Schaller","year":"2020","unstructured":"Schaller D, \u0160ribar D, Noonan T et al (2020) Next generation 3D pharmacophore modeling. Wiley Interdiscip Rev Comput Mol Sci. https:\/\/doi.org\/10.1002\/WCMS.1468","journal-title":"Wiley Interdiscip Rev Comput Mol Sci"},{"key":"728_CR76","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/S13321-020-00444-5\/TABLES\/3","volume":"12","author":"I Cort\u00e9s-Ciriano","year":"2020","unstructured":"Cort\u00e9s-Ciriano I, \u0160kuta C, Bender A, Svozil D (2020) QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction. J Cheminform 12:1\u201317. https:\/\/doi.org\/10.1186\/S13321-020-00444-5\/TABLES\/3","journal-title":"J Cheminform"},{"key":"728_CR77","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1007\/S10822-016-9915-2\/FIGURES\/7","volume":"30","author":"ZJ Yao","year":"2016","unstructured":"Yao ZJ, Dong J, Che YJ et al (2016) TargetNet: a web service for predicting potential drug\u2013target interaction profiling via multi-target SAR models. J Comput Aided Mol Des 30:413\u2013424. https:\/\/doi.org\/10.1007\/S10822-016-9915-2\/FIGURES\/7","journal-title":"J Comput Aided Mol Des"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00728-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00728-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00728-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,9]],"date-time":"2023-06-09T06:04:25Z","timestamp":1686290665000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00728-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,9]]},"references-count":77,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["728"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00728-6","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-2073134\/v1","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,9]]},"assertion":[{"value":"16 September 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 June 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"60"}}