{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:28:31Z","timestamp":1772166511775,"version":"3.50.1"},"reference-count":56,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T00:00:00Z","timestamp":1685664000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T00:00:00Z","timestamp":1685664000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Cambridge Centre for Data Driven Discovery and Accelerate Programme for Scientific Discovery","award":["Theoretical Scientific and Philosophical Perspectives on Biological Understanding in the Age of Artificial Intelligence"],"award-info":[{"award-number":["Theoretical Scientific and Philosophical Perspectives on Biological Understanding in the Age of Artificial Intelligence"]}]},{"DOI":"10.13039\/501100004359","name":"Swedish Research Council","doi-asserted-by":"crossref","award":["2020-03731"],"award-info":[{"award-number":["2020-03731"]}],"id":[{"id":"10.13039\/501100004359","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001862","name":"FORMAS","doi-asserted-by":"crossref","award":["2018-00924"],"award-info":[{"award-number":["2018-00924"]}],"id":[{"id":"10.13039\/501100001862","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007051","name":"Uppsala University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007051","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The applicability domain of machine learning models trained on structural fingerprints for the prediction of biological endpoints is often limited by the lack of diversity of chemical space of the training data. In this work, we developed similarity-based merger models which combined the outputs of individual models trained on cell morphology (based on Cell Painting) and chemical structure (based on chemical fingerprints) and the structural and morphological similarities of the compounds in the test dataset to compounds in the training dataset. We applied these similarity-based merger models using logistic regression models on the predictions and similarities as features and predicted assay hit calls of 177 assays from ChEMBL, PubChem and the Broad Institute (where the required Cell Painting annotations were available). We found that the similarity-based merger models outperformed other models with an additional 20% assays (79 out of 177 assays) with an AUC\u2009&gt;\u20090.70 compared with 65 out of 177 assays using structural models and 50 out of 177 assays using Cell Painting models. Our results demonstrated that similarity-based merger models combining structure and cell morphology models can more accurately predict a wide range of biological assay outcomes and further expanded the applicability domain by better extrapolating to new structural and morphology spaces.<\/jats:p>\n                  <jats:p>\n                    <jats:bold>Graphical Abstract<\/jats:bold>\n                  <\/jats:p>","DOI":"10.1186\/s13321-023-00723-x","type":"journal-article","created":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T05:42:30Z","timestamp":1685684550000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Merging bioactivity predictions from cell morphology and chemical fingerprint models using similarity to training data"],"prefix":"10.1186","volume":"15","author":[{"given":"Srijit","family":"Seal","sequence":"first","affiliation":[]},{"given":"Hongbin","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Maria-Anna","family":"Trapotsi","sequence":"additional","affiliation":[]},{"given":"Satvik","family":"Singh","sequence":"additional","affiliation":[]},{"given":"Jordi","family":"Carreras-Puigvert","sequence":"additional","affiliation":[]},{"given":"Ola","family":"Spjuth","sequence":"additional","affiliation":[]},{"given":"Andreas","family":"Bender","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,2]]},"reference":[{"key":"723_CR1","doi-asserted-by":"publisher","DOI":"10.1039\/D1CB00069A","author":"M-A Trapotsi","year":"2022","unstructured":"Trapotsi M-A, Hosseini-Gerami L, Bender A (2022) Computational analyses of mechanism of action (MoA): data, methods and integration. RSC Chem Biol. https:\/\/doi.org\/10.1039\/D1CB00069A","journal-title":"RSC Chem Biol"},{"key":"723_CR2","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1080\/10629360903568671","volume":"21","author":"A Sazonovas","year":"2010","unstructured":"Sazonovas A, Japertas P, Didziapetris R (2010) Estimation of reliability of predictions and model applicability domain evaluation in the analysis of acute toxicity (LD50). SAR QSAR Environ Res 21:127\u2013148. https:\/\/doi.org\/10.1080\/10629360903568671","journal-title":"SAR QSAR Environ Res"},{"key":"723_CR3","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1007\/978-1-4939-7899-1_6","volume":"1800","author":"S Kar","year":"2018","unstructured":"Kar S, Roy K, Leszczynski J (2018) Applicability domain: a step toward confident predictions and decidability for QSAR modeling. Methods Mol Biol 1800:141\u2013169. https:\/\/doi.org\/10.1007\/978-1-4939-7899-1_6","journal-title":"Methods Mol Biol"},{"key":"723_CR4","doi-asserted-by":"publisher","first-page":"839","DOI":"10.1021\/ci0500381","volume":"45","author":"S Dimitrov","year":"2005","unstructured":"Dimitrov S, Dimitrova G, Pavlov T et al (2005) A stepwise approach for defining the applicability domain of SAR and QSAR models. J Chem Inf Model 45:839\u2013849. https:\/\/doi.org\/10.1021\/ci0500381","journal-title":"J Chem Inf Model"},{"key":"723_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-015-0069-3","volume":"7","author":"D Bajusz","year":"2015","unstructured":"Bajusz D, R\u00e1cz A, H\u00e9berger K (2015) Why is tanimoto index an appropriate choice for fingerprint-based similarity calculations? J Cheminform 7:1\u201313. https:\/\/doi.org\/10.1186\/s13321-015-0069-3","journal-title":"J Cheminform"},{"key":"723_CR6","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1038\/s41573-020-00117-w","volume":"20","author":"SN Chandrasekaran","year":"2021","unstructured":"Chandrasekaran SN, Ceulemans H, Boyd JD, Carpenter AE (2021) Image-based profiling for drug discovery: due for a machine-learning upgrade? Nat Rev Drug Discov 20:145\u2013159. https:\/\/doi.org\/10.1038\/s41573-020-00117-w","journal-title":"Nat Rev Drug Discov"},{"key":"723_CR7","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1016\/1074-5521(95)90283-X","volume":"2","author":"LM Kauvar","year":"1995","unstructured":"Kauvar LM, Higgins DL, Villar HO et al (1995) Predicting ligand binding to proteins by affinity fingerprinting. Chem Biol 2:107\u2013118. https:\/\/doi.org\/10.1016\/1074-5521(95)90283-X","journal-title":"Chem Biol"},{"key":"723_CR8","doi-asserted-by":"publisher","first-page":"2830","DOI":"10.1021\/acs.jcim.0c00250","volume":"60","author":"U Norinder","year":"2020","unstructured":"Norinder U, Spjuth O, Svensson F (2020) Using predicted bioactivity profiles to improve predictive modeling. J Chem Inf Model 60:2830\u20132837. https:\/\/doi.org\/10.1021\/acs.jcim.0c00250","journal-title":"J Chem Inf Model"},{"key":"723_CR9","doi-asserted-by":"publisher","first-page":"2445","DOI":"10.1021\/ci600197y","volume":"46","author":"A Bender","year":"2006","unstructured":"Bender A, Jenkins JL, Glick M et al (2006) \u201cBayes affinity fingerprints\u201d Improve retrieval rates in virtual screening and define orthogonal bioactivity space: when are multitarget drugs a feasible concept? J Chem Inf Model 46:2445\u20132456. https:\/\/doi.org\/10.1021\/ci600197y","journal-title":"J Chem Inf Model"},{"key":"723_CR10","doi-asserted-by":"publisher","DOI":"10.1016\/J.SLASD.2022.12.003","author":"A Liu","year":"2023","unstructured":"Liu A, Seal S, Yang H, Bender A (2023) Using chemical and biological data to predict drug toxicity. SLAS Discov. https:\/\/doi.org\/10.1016\/J.SLASD.2022.12.003","journal-title":"SLAS Discov"},{"key":"723_CR11","doi-asserted-by":"publisher","first-page":"1399","DOI":"10.1021\/cb3001028","volume":"7","author":"PM Petrone","year":"2012","unstructured":"Petrone PM, Simms B, Nigsch F et al (2012) Rethinking molecular similarity: comparing compounds on the basis of biological activity. ACS Chem Biol 7:1399\u20131409. https:\/\/doi.org\/10.1021\/cb3001028","journal-title":"ACS Chem Biol"},{"key":"723_CR12","doi-asserted-by":"publisher","first-page":"1087","DOI":"10.1038\/s41587-020-0502-7","volume":"38","author":"M Duran-Frigola","year":"2020","unstructured":"Duran-Frigola M, Pauls E, Guitart-Pla O et al (2020) Extending the small-molecule similarity principle to all levels of biology with the chemical checker. Nat Biotechnol 38:1087\u20131096. https:\/\/doi.org\/10.1038\/s41587-020-0502-7","journal-title":"Nat Biotechnol"},{"key":"723_CR13","doi-asserted-by":"publisher","first-page":"1757","DOI":"10.1038\/nprot.2016.105","volume":"11","author":"MA Bray","year":"2016","unstructured":"Bray MA, Singh S, Han H et al (2016) Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat Protoc 11:1757\u20131774. https:\/\/doi.org\/10.1038\/nprot.2016.105","journal-title":"Nat Protoc"},{"key":"723_CR14","doi-asserted-by":"publisher","first-page":"e2005970","DOI":"10.1371\/journal.pbio.2005970","volume":"16","author":"C McQuin","year":"2018","unstructured":"McQuin C, Goodman A, Chernyshev V et al (2018) Cell profiler 30: next-generation image processing for biology. PLoS Biol 16:e2005970. https:\/\/doi.org\/10.1371\/journal.pbio.2005970","journal-title":"PLoS Biol"},{"key":"723_CR15","doi-asserted-by":"crossref","unstructured":"Lapins M, Spjuth O (2019) Evaluation of Gene Expression and Phenotypic Profiling Data as Quantitative Descriptors for Predicting Drug Targets and Mechanisms of Action. bioRxiv 580654","DOI":"10.1101\/580654"},{"key":"723_CR16","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1021\/acs.chemrestox.0c00303","volume":"34","author":"S Seal","year":"2021","unstructured":"Seal S, Yang H, Vollmers L, Bender A (2021) Comparison of cellular morphological descriptors and molecular fingerprints for the prediction of cytotoxicity- and proliferation-related assays. Chem Res Toxicol 34:422\u2013437. https:\/\/doi.org\/10.1021\/acs.chemrestox.0c00303","journal-title":"Chem Res Toxicol"},{"key":"723_CR17","doi-asserted-by":"publisher","first-page":"1053","DOI":"10.1016\/j.chembiol.2021.12.009","volume":"29","author":"M Akbarzadeh","year":"2022","unstructured":"Akbarzadeh M, Deipenwisch I, Schoelermann B et al (2022) Morphological profiling by means of the cell painting assay enables identification of tubulin-targeting compounds. Cell Chem Biol 29:1053-1064.e3. https:\/\/doi.org\/10.1016\/j.chembiol.2021.12.009","journal-title":"Cell Chem Biol"},{"key":"723_CR18","doi-asserted-by":"publisher","first-page":"858","DOI":"10.1038\/s42003-022-03763-5","volume":"5","author":"S Seal","year":"2022","unstructured":"Seal S, Carreras-Puigvert J, Trapotsi MA et al (2022) Integrating cell morphology with gene expression and chemical structure to aid mitochondrial toxicity detection. Commun Biol 5:858. https:\/\/doi.org\/10.1038\/s42003-022-03763-5","journal-title":"Commun Biol"},{"key":"723_CR19","doi-asserted-by":"publisher","first-page":"1733","DOI":"10.1021\/acschembio.2c00076","volume":"17","author":"MA Trapotsi","year":"2022","unstructured":"Trapotsi MA, Mouchet E, Williams G et al (2022) Cell morphological profiling enables high-throughput screening for PROteolysis TArgeting chimera (PROTAC) phenotypic signature. ACS Chem Biol 17:1733\u20131744. https:\/\/doi.org\/10.1021\/acschembio.2c00076","journal-title":"ACS Chem Biol"},{"key":"723_CR20","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1091\/mbc.E21-11-0538","volume":"33","author":"JC Caicedo","year":"2022","unstructured":"Caicedo JC, Arevalo J, Piccioni F et al (2022) Cell painting predicts impact of lung cancer variants. Mol Biol Cell 33:49. https:\/\/doi.org\/10.1091\/mbc.E21-11-0538","journal-title":"Mol Biol Cell"},{"key":"723_CR21","doi-asserted-by":"publisher","DOI":"10.1007\/3-540-45014-9_1","volume-title":"Ensemble methods in machine learning lect Notes Comput Sci (including subser lect notes artif intell lect notes bioinformatics)","author":"TG Dietterich","year":"2000","unstructured":"Dietterich TG (2000) Ensemble methods in machine learning lect Notes Comput Sci (including subser lect notes artif intell lect notes bioinformatics). Springer, Berlin. https:\/\/doi.org\/10.1007\/3-540-45014-9_1"},{"key":"723_CR22","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1021\/acs.chemrestox.9b00259","volume":"33","author":"X Li","year":"2020","unstructured":"Li X, Kleinstreuer NC, Fourches D (2020) Hierarchical quantitative structure-activity relationship modeling approach for integrating binary, multiclass, and regression models of acute oral systemic toxicity. Chem Res Toxicol 33:353\u2013366. https:\/\/doi.org\/10.1021\/acs.chemrestox.9b00259","journal-title":"Chem Res Toxicol"},{"key":"723_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-017-0230-2","volume":"9","author":"W Klingspohn","year":"2017","unstructured":"Klingspohn W, Mathea M, Ter Laak A et al (2017) Efficiency of different measures for defining the applicability domain of classification models. J Cheminform 9:1\u201317. https:\/\/doi.org\/10.1186\/s13321-017-0230-2","journal-title":"J Cheminform"},{"key":"723_CR24","doi-asserted-by":"publisher","first-page":"1912","DOI":"10.1021\/ci049782w","volume":"44","author":"RP Sheridan","year":"2004","unstructured":"Sheridan RP, Feuston BP, Maiorov VN, Kearsley SK (2004) Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR. J Chem Inf Comput Sci 44:1912\u20131928. https:\/\/doi.org\/10.1021\/ci049782w","journal-title":"J Chem Inf Comput Sci"},{"issue":"11","key":"723_CR25","doi-asserted-by":"publisher","first-page":"911","DOI":"10.1016\/j.cels.2022.10.001","volume":"13","author":"GP Way","year":"2022","unstructured":"Way GP, Natoli T, Adeboye A, Litichevskiy L et al (2022) Morphology and gene expression profiling provide complementary information for mapping cell state. Cell Syst 13(11):911-923.e9. https:\/\/doi.org\/10.1016\/j.cels.2022.10.001","journal-title":"Cell Syst"},{"issue":"12","key":"723_CR26","doi-asserted-by":"publisher","first-page":"1550","DOI":"10.1038\/s41592-022-01667-0","volume":"19","author":"M Haghighi","year":"2022","unstructured":"Haghighi M, Caicedo JC, Cimini B et al (2022) High-dimensional gene expression and morphology profiles of cells across 28,000 genetic and chemical perturbations. Nat Methods 19(12):1550\u20131557. https:\/\/doi.org\/10.1038\/s41592-022-01667-0","journal-title":"Nat Methods"},{"issue":"1","key":"723_CR27","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-023-37570-1","volume":"14","author":"N Moshkov","year":"2023","unstructured":"Moshkov N, Becker T, Yang K et al (2023) Predicting compound activity from phenotypic profiles and chemical structures. Nat Commun 14(1):1\u201311. https:\/\/doi.org\/10.1038\/s41467-023-37570-1","journal-title":"Nat Commun"},{"key":"723_CR28","doi-asserted-by":"publisher","first-page":"848","DOI":"10.1016\/j.chembiol.2021.01.009","volume":"28","author":"J Wilke","year":"2021","unstructured":"Wilke J, Kawamura T, Xu H et al (2021) Discovery of a \u03c31 receptor antagonist by combination of unbiased cell painting and thermal proteome profiling. Cell Chem Biol 28:848-854.e5. https:\/\/doi.org\/10.1016\/j.chembiol.2021.01.009","journal-title":"Cell Chem Biol"},{"key":"723_CR29","doi-asserted-by":"publisher","first-page":"883","DOI":"10.1039\/c5tx00406c","volume":"5","author":"CHG Allen","year":"2016","unstructured":"Allen CHG, Koutsoukas A, Cort\u00e9s-Ciriano I et al (2016) Improving the prediction of organism-level toxicity through integration of chemical, protein target and cytotoxicity qHTS data. Toxicol Res 5:883\u2013894. https:\/\/doi.org\/10.1039\/c5tx00406c","journal-title":"Toxicol Res"},{"key":"723_CR30","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1021\/ci500016v","volume":"54","author":"R Liu","year":"2014","unstructured":"Liu R, Wallqvist A (2014) Merging applicability domains for in silico assessment of chemical mutagenicity. J Chem Inf Model 54:793\u2013800. https:\/\/doi.org\/10.1021\/ci500016v","journal-title":"J Chem Inf Model"},{"key":"723_CR31","doi-asserted-by":"publisher","first-page":"e1009888","DOI":"10.1371\/journal.pcbi.1009888","volume":"18","author":"YL Chow","year":"2022","unstructured":"Chow YL, Singh S, Carpenter AE, Way GP (2022) Predicting drug polypharmacology from cell morphology readouts using variational autoencoder latent space arithmetic. PLoS Comput Biol 18:e1009888. https:\/\/doi.org\/10.1371\/journal.pcbi.1009888","journal-title":"PLoS Comput Biol"},{"key":"723_CR32","first-page":"63","volume":"5","author":"K Niforou","year":"2008","unstructured":"Niforou K, Anagnostopoulos A, Vougas K et al (2008) The proteome profile of the human osteosarcoma U2OS cell line. Cancer Genom Proteom 5:63\u201378","journal-title":"Cancer Genom Proteom"},{"key":"723_CR33","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1089\/adt.2006.053","volume":"5","author":"F Fan","year":"2007","unstructured":"Fan F, Wood KV (2007) Bioluminescent assays for high-throughput screening. Assay Drug Dev Technol 5:127\u2013136. https:\/\/doi.org\/10.1089\/adt.2006.053","journal-title":"Assay Drug Dev Technol"},{"key":"723_CR34","doi-asserted-by":"publisher","DOI":"10.1268\/f1000research.52676.1","author":"JL Medina-Franco","year":"2021","unstructured":"Medina-Franco JL, Martinez-Mayorga K, Fern\u00e1ndez-de Gortari E et al (2021) Rationality over fashion and hype in drug design. F1000Res. https:\/\/doi.org\/10.1268\/f1000research.52676.1","journal-title":"F1000Res"},{"key":"723_CR35","doi-asserted-by":"publisher","first-page":"1040","DOI":"10.1016\/j.drudis.2020.11.037","volume":"26","author":"A Bender","year":"2021","unstructured":"Bender A, Cortes-Ciriano I (2021) Artificial intelligence in drug discovery: what is realistic, what are illusions? part 2: a discussion of chemical and biological data. Drug Discov Today 26:1040\u20131052. https:\/\/doi.org\/10.1016\/j.drudis.2020.11.037","journal-title":"Drug Discov Today"},{"key":"723_CR36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s43586-020-00001-2","volume":"1","author":"R van de Schoot","year":"2021","unstructured":"van de Schoot R, Depaoli S, King R et al (2021) Bayesian statistics and modelling. Nat Rev Methods Prim 1:1\u201326. https:\/\/doi.org\/10.1038\/s43586-020-00001-2","journal-title":"Nat Rev Methods Prim"},{"key":"723_CR37","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1021\/acs.jcim.9b00587","volume":"60","author":"V Korolev","year":"2020","unstructured":"Korolev V, Mitrofanov A, Korotcov A, Tkachenko V (2020) Graph convolutional neural networks as \u2018general-purpose\u2019 property predictors: the universality and limits of applicability. J Chem Inf Model 60:22\u201328","journal-title":"J Chem Inf Model"},{"key":"723_CR38","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-020-69354-8","volume":"10","author":"MJ Cox","year":"2020","unstructured":"Cox MJ, Jaensch S, Van de Waeter J et al (2020) Tales of 1008 small molecules: phenomic profiling through live-cell imaging in a panel of reporter cell lines. Sci Rep 10:1\u201314. https:\/\/doi.org\/10.1038\/s41598-020-69354-8","journal-title":"Sci Rep"},{"key":"723_CR39","unstructured":"JUMP-Cell Painting Consortium. https:\/\/jump-cellpainting.broadinstitute.org\/. Accessed 2 May 2022"},{"key":"723_CR40","doi-asserted-by":"publisher","first-page":"1163","DOI":"10.1021\/acs.jcim.8b00670","volume":"59","author":"M Hofmarcher","year":"2019","unstructured":"Hofmarcher M, Rumetshofer E, Clevert DA et al (2019) Accurate prediction of biological assays with high-throughput microscopy images and convolutional networks. J Chem Inf Model 59:1163\u20131171. https:\/\/doi.org\/10.1021\/acs.jcim.8b00670","journal-title":"J Chem Inf Model"},{"issue":"D1","key":"723_CR41","doi-asserted-by":"publisher","first-page":"D930","DOI":"10.1093\/nar\/gky1075","volume":"47","author":"D Mendez","year":"2019","unstructured":"Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Felix E et al (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 47(D1):D930\u2013D940","journal-title":"Nucleic Acids Res"},{"key":"723_CR42","unstructured":"Luis V (2021) Prediction of Cytotoxicity Related PubChem Assays Using High-Content-Imaging Descriptors derived from Cell-Painting [Unpublished master's thesis], TU Darmstadt."},{"key":"723_CR43","unstructured":"PubChem. https:\/\/pubchem.ncbi.nlm.nih.gov\/. Accessed 4 Jun 2022"},{"key":"723_CR44","doi-asserted-by":"publisher","first-page":"D607","DOI":"10.1093\/nar\/gky1131","volume":"47","author":"D Szklarczyk","year":"2019","unstructured":"Szklarczyk D, Gable AL, Lyon D et al (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 47:D607\u2013D613. https:\/\/doi.org\/10.1093\/nar\/gky1131","journal-title":"Nucleic Acids Res"},{"key":"723_CR45","doi-asserted-by":"publisher","first-page":"2498","DOI":"10.1101\/gr.1239303","volume":"13","author":"P Shannon","year":"2003","unstructured":"Shannon P, Markiel A, Ozier O et al (2003) Cytoscape: a software Environment for integrated models of biomolecular interaction networks. Genome Res 13:2498\u20132504. https:\/\/doi.org\/10.1101\/gr.1239303","journal-title":"Genome Res"},{"key":"723_CR46","doi-asserted-by":"publisher","first-page":"1091","DOI":"10.1093\/bioinformatics\/btp101","volume":"25","author":"G Bindea","year":"2009","unstructured":"Bindea G, Mlecnik B, Hackl H et al (2009) ClueGO: a cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25:1091\u20131093. https:\/\/doi.org\/10.1093\/bioinformatics\/btp101","journal-title":"Bioinformatics"},{"key":"723_CR47","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/gigascience\/giw014","volume":"6","author":"MA Bray","year":"2017","unstructured":"Bray MA, Gustafsdottir SM, Rohban MH et al (2017) A dataset of images and morphological profiles of 30 000 small-molecule treatments using the Cell Painting assay. Gigascience 6:1\u20135","journal-title":"Gigascience"},{"key":"723_CR48","doi-asserted-by":"publisher","unstructured":"GigaDB Dataset - DOI https:\/\/doi.org\/10.5524\/100351 - Supporting data for \"A dataset of images and morphological profiles of 30,000 small-molecule treatments using the Cell Painting assay. http:\/\/gigadb.org\/dataset\/100351. Accessed 5 Oct 2022","DOI":"10.5524\/100351"},{"key":"723_CR49","unstructured":"Swain M (2019) MolVS: Molecule Validation and Standardization. In: MolVS. https:\/\/molvs.readthedocs.io\/en\/latest\/. Accessed 15 Apr 2021"},{"key":"723_CR50","unstructured":"Landrum G (2006) RDKit: Open-source Cheminformatics. In: http:\/\/www.rdkit.org. Accessed 2 Mar 2022"},{"key":"723_CR51","unstructured":"Blocklist Features - Cell Profiler. https:\/\/figshare.com\/articles\/dataset\/Blacklist_Features_-_Cell_Profiler\/10255811. Accessed 11 Apr 2021"},{"key":"723_CR52","first-page":"2825","volume":"12","author":"F Pedregosa Fabianpedregosa","year":"2011","unstructured":"Pedregosa Fabianpedregosa F, Michel V, Grisel Oliviergrisel O et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"723_CR53","unstructured":"Cytomining\/Pycytominer: Cytominer Python Package. https:\/\/github.com\/cytomining\/pycytominer. Accessed 4 Jun 2022"},{"key":"723_CR54","first-page":"281","volume":"13","author":"J Bergstra","year":"2012","unstructured":"Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13:281\u2013305","journal-title":"J Mach Learn Res"},{"key":"723_CR55","doi-asserted-by":"publisher","first-page":"458","DOI":"10.1002\/bimj.200410135","volume":"47","author":"R Fluss","year":"2005","unstructured":"Fluss R, Faraggi D, Reiser B (2005) Estimation of the youden index and its associated cutoff point. Biometrical J 47:458\u2013472. https:\/\/doi.org\/10.1002\/bimj.200410135","journal-title":"Biometrical J"},{"key":"723_CR56","unstructured":"API reference \u2014 pandas 1.3.1 documentation. https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/index.html. Accessed 29 Jul 2021"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00723-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00723-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00723-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T05:48:11Z","timestamp":1685684891000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00723-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,2]]},"references-count":56,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["723"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00723-x","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.08.11.503624","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,2]]},"assertion":[{"value":"5 February 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 April 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 June 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval and consent to participate"}},{"value":"The authors declare no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"56"}}