{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T12:39:30Z","timestamp":1777639170367,"version":"3.51.4"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T00:00:00Z","timestamp":1614988800000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T00:00:00Z","timestamp":1614988800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100009104","name":"Universit\u00e0 Degli Studi di Modena e Reggio Emila","doi-asserted-by":"publisher","award":["FAR2019, DR496\/2019"],"award-info":[{"award-number":["FAR2019, DR496\/2019"]}],"id":[{"id":"10.13039\/100009104","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Banca Popolare dell'Emilia Romagna"},{"DOI":"10.13039\/501100009879","name":"Regione Emilia-Romagna","doi-asserted-by":"publisher","award":["PhD fellowship grant"],"award-info":[{"award-number":["PhD fellowship grant"]}],"id":[{"id":"10.13039\/501100009879","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The development of selective inhibitors of the clinically relevant human Carbonic Anhydrase (hCA) isoforms IX and XII has become a major topic in drug research, due to their deregulation in several types of cancer. Indeed, the selective inhibition of these two isoforms, especially with respect to the homeostatic isoform II, holds great promise to develop anticancer drugs with limited side effects. Therefore, the development of in silico models able to predict the activity and selectivity against the desired isoform(s) is of central interest. In this work, we have developed a series of machine learning classification models, trained on high confidence data extracted from ChEMBL, able to predict the activity and selectivity profiles of ligands for human Carbonic Anhydrase isoforms II, IX and XII. The training datasets were built with a procedure that made use of flexible bioactivity thresholds to obtain well-balanced active and inactive classes. We used multiple algorithms and sampling sizes to finally select activity models able to classify active or inactive molecules with excellent performances. Remarkably, the results herein reported turned out to be better than those obtained by models built with the classic approach of selecting an a priori activity threshold. The sequential application of such validated models enables virtual screening to be performed in a fast and more reliable way to predict the activity and selectivity profiles against the investigated isoforms.<\/jats:p>","DOI":"10.1186\/s13321-021-00499-y","type":"journal-article","created":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T18:03:05Z","timestamp":1615053785000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["Prediction of activity and selectivity profiles of human Carbonic Anhydrase inhibitors using machine learning classification models"],"prefix":"10.1186","volume":"13","author":[{"given":"Annachiara","family":"Tinivella","sequence":"first","affiliation":[]},{"given":"Luca","family":"Pinzi","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2474-0607","authenticated-orcid":false,"given":"Giulio","family":"Rastelli","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,3,6]]},"reference":[{"key":"499_CR1","doi-asserted-by":"publisher","first-page":"267","DOI":"10.3109\/14756366.2012.737323","volume":"28","author":"M Aggarwal","year":"2013","unstructured":"Aggarwal M, Boone CD, Kondeti B, McKenna R (2013) Structural annotation of human carbonic anhydrases. J Enzyme Inhib Med Chem 28:267\u2013277. https:\/\/doi.org\/10.3109\/14756366.2012.737323","journal-title":"J Enzyme Inhib Med Chem"},{"key":"499_CR2","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1002\/med.10025","volume":"23","author":"CT Supuran","year":"2003","unstructured":"Supuran CT, Scozzafava A, Casini A (2003) Carbonic anhydrase inhibitors. Med Res Rev 23:146\u2013189. https:\/\/doi.org\/10.1002\/med.10025","journal-title":"Med Res Rev"},{"key":"499_CR3","doi-asserted-by":"publisher","first-page":"84","DOI":"10.18632\/oncotarget.422","volume":"3","author":"PC McDonald","year":"2012","unstructured":"McDonald PC, Winum J-Y, Supuran CT, Dedhar S (2012) Recent developments in targeting carbonic anhydrase IX for cancer therapeutics. Oncotarget 3:84\u201397","journal-title":"Oncotarget"},{"key":"499_CR4","doi-asserted-by":"publisher","first-page":"14212","DOI":"10.1073\/pnas.97.26.14212","volume":"97","author":"B Ulmasov","year":"2000","unstructured":"Ulmasov B, Waheed A, Shah GN et al (2000) Purification and kinetic analysis of recombinant CA XII, a membrane carbonic anhydrase overexpressed in certain cancers. Proc Natl Acad Sci 97:14212\u201314217. https:\/\/doi.org\/10.1073\/pnas.97.26.14212","journal-title":"Proc Natl Acad Sci"},{"key":"499_CR5","doi-asserted-by":"publisher","first-page":"767","DOI":"10.1038\/nrd3554","volume":"10","author":"D Neri","year":"2011","unstructured":"Neri D, Supuran CT (2011) Interfering with pH regulation in tumours as a therapeutic strategy. Nat Rev Drug Discov 10:767\u2013777. https:\/\/doi.org\/10.1038\/nrd3554","journal-title":"Nat Rev Drug Discov"},{"key":"499_CR6","first-page":"3","volume":"5","author":"MY Mboge","year":"2015","unstructured":"Mboge MY, McKenna R, Frost SC (2015) Advances in anti-cancer drug development targeting carbonic anhydrase IX and XII. Top anti-cancer Res 5:3\u201342","journal-title":"Top anti-cancer Res"},{"key":"499_CR7","doi-asserted-by":"publisher","first-page":"3588","DOI":"10.1021\/jm011112j","volume":"45","author":"S Gr\u00fcneberg","year":"2002","unstructured":"Gr\u00fcneberg S, Stubbs MT, Klebe G (2002) Successful virtual screening for novel inhibitors of human carbonic anhydrase: strategy and experimental confirmation. J Med Chem 45:3588\u20133602. https:\/\/doi.org\/10.1021\/jm011112j","journal-title":"J Med Chem"},{"key":"499_CR8","doi-asserted-by":"publisher","first-page":"515","DOI":"10.1021\/ci600469w","volume":"47","author":"T Tuccinardi","year":"2007","unstructured":"Tuccinardi T, Nuti E, Ortore G et al (2007) Analysis of human carbonic anhydrase II: docking reliability and receptor-based 3D-QSAR study. J Chem Inf Model 47:515\u2013525. https:\/\/doi.org\/10.1021\/ci600469w","journal-title":"J Chem Inf Model"},{"key":"499_CR9","doi-asserted-by":"publisher","first-page":"1851","DOI":"10.3390\/ijms19071851","volume":"19","author":"G Poli","year":"2018","unstructured":"Poli G, Jha V, Martinelli A et al (2018) Development of a fingerprint-based scoring function for the prediction of the binding mode of carbonic anhydrase II inhibitors. Int J Mol Sci 19:1851. https:\/\/doi.org\/10.3390\/ijms19071851","journal-title":"Int J Mol Sci"},{"key":"499_CR10","doi-asserted-by":"publisher","first-page":"2694","DOI":"10.3762\/bjoc.12.267","volume":"12","author":"SP Leelananda","year":"2016","unstructured":"Leelananda SP, Lindert S (2016) Computational methods in drug discovery. Beilstein J Org Chem 12:2694\u20132718","journal-title":"Beilstein J Org Chem"},{"key":"499_CR11","doi-asserted-by":"publisher","first-page":"4331","DOI":"10.3390\/ijms20184331","volume":"20","author":"L Pinzi","year":"2019","unstructured":"Pinzi L, Rastelli G (2019) Molecular docking: shifting paradigms in drug discovery. Int J Mol Sci 20:4331. https:\/\/doi.org\/10.3390\/ijms20184331","journal-title":"Int J Mol Sci"},{"key":"499_CR12","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1021\/ci9800211","volume":"38","author":"P Willett","year":"1998","unstructured":"Willett P, Barnard JM, Downs GM (1998) Chemical similarity searching. J Chem Inf Comput Sci 38:983\u2013996. https:\/\/doi.org\/10.1021\/ci9800211","journal-title":"J Chem Inf Comput Sci"},{"key":"499_CR13","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1080\/14756366.2019.1705291","volume":"35","author":"G Poli","year":"2020","unstructured":"Poli G, Galati S, Martinelli A et al (2020) Development of a cheminformatics platform for selectivity analyses of carbonic anhydrase inhibitors. J Enzyme Inhib Med Chem 35:365\u2013371. https:\/\/doi.org\/10.1080\/14756366.2019.1705291","journal-title":"J Enzyme Inhib Med Chem"},{"key":"499_CR14","doi-asserted-by":"publisher","first-page":"462","DOI":"10.1021\/ci050348j","volume":"46","author":"J Hert","year":"2006","unstructured":"Hert J, Willett P, Wilton DJ et al (2006) New methods for ligand-based virtual screening: use of data fusion and machine learning to enhance the effectiveness of similarity searching. J Chem Inf Model 46:462\u2013470. https:\/\/doi.org\/10.1021\/ci050348j","journal-title":"J Chem Inf Model"},{"key":"499_CR15","doi-asserted-by":"publisher","first-page":"318","DOI":"10.1016\/j.drudis.2014.10.012","volume":"20","author":"A Lavecchia","year":"2015","unstructured":"Lavecchia A (2015) Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 20:318\u2013331. https:\/\/doi.org\/10.1016\/j.drudis.2014.10.012","journal-title":"Drug Discov Today"},{"key":"499_CR16","doi-asserted-by":"publisher","first-page":"1538","DOI":"10.1016\/j.drudis.2018.05.010","volume":"23","author":"Y-C Lo","year":"2018","unstructured":"Lo Y-C, Rensi SE, Torng W, Altman RB (2018) Machine learning in chemoinformatics and drug discovery. Drug Discov Today 23:1538\u20131546. https:\/\/doi.org\/10.1016\/j.drudis.2018.05.010","journal-title":"Drug Discov Today"},{"key":"499_CR17","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1007\/s10822-016-9938-8","volume":"30","author":"S Kearnes","year":"2016","unstructured":"Kearnes S, McCloskey K, Berndl M et al (2016) Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 30:595\u2013608. https:\/\/doi.org\/10.1007\/s10822-016-9938-8","journal-title":"J Comput Aided Mol Des"},{"key":"499_CR18","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1021\/acscentsci.6b00367","volume":"3","author":"H Altae-Tran","year":"2017","unstructured":"Altae-Tran H, Ramsundar B, Pappu AS, Pande V (2017) Low data drug discovery with one-shot learning. ACS Cent Sci 3:283\u2013293. https:\/\/doi.org\/10.1021\/acscentsci.6b00367","journal-title":"ACS Cent Sci"},{"key":"499_CR19","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1021\/ci500747n","volume":"55","author":"J Ma","year":"2015","unstructured":"Ma J, Sheridan RP, Liaw A et al (2015) Deep neural nets as a method for quantitative structure\u2013activity relationships. J Chem Inf Model 55:263\u2013274. https:\/\/doi.org\/10.1021\/ci500747n","journal-title":"J Chem Inf Model"},{"key":"499_CR20","doi-asserted-by":"publisher","first-page":"518","DOI":"10.1111\/j.1747-0285.2008.00670.x","volume":"71","author":"D Stumpfe","year":"2008","unstructured":"Stumpfe D, Geppert H, Bajorath J (2008) Methods for computer-aided chemical biology. Part 3: analysis of structure\u2013selectivity relationships through single- or dual-step selectivity searching and bayesian classification. Chem Biol Drug Des 71:518\u2013528. https:\/\/doi.org\/10.1111\/j.1747-0285.2008.00670.x","journal-title":"Chem Biol Drug Des"},{"key":"499_CR21","doi-asserted-by":"publisher","first-page":"517","DOI":"10.1007\/978-1-60761-839-3_21","volume":"672","author":"AM Wassermann","year":"2011","unstructured":"Wassermann AM, Geppert H, Bajorath J (2011) Application of support vector machine-based ranking strategies to search for target-selective compounds. Methods Mol Biol 672:517\u2013530. https:\/\/doi.org\/10.1007\/978-1-60761-839-3_21","journal-title":"Methods Mol Biol"},{"key":"499_CR22","doi-asserted-by":"publisher","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","volume":"40","author":"A Gaulton","year":"2012","unstructured":"Gaulton A, Bellis LJ, Bento AP et al (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100\u2013D1107. https:\/\/doi.org\/10.1093\/nar\/gkr777","journal-title":"Nucleic Acids Res"},{"key":"499_CR23","doi-asserted-by":"publisher","first-page":"1680","DOI":"10.1016\/j.drudis.2017.08.010","volume":"22","author":"L Zhang","year":"2017","unstructured":"Zhang L, Tan J, Han D, Zhu H (2017) From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discov Today 22:1680\u20131685. https:\/\/doi.org\/10.1016\/j.drudis.2017.08.010","journal-title":"Drug Discov Today"},{"key":"499_CR24","doi-asserted-by":"publisher","first-page":"791","DOI":"10.1080\/17460441.2019.1615435","volume":"14","author":"T Fischer","year":"2019","unstructured":"Fischer T, Gazzola S, Riedl R (2019) Approaching target selectivity by de novo drug design. Expert Opin Drug Discov 14:791\u2013803. https:\/\/doi.org\/10.1080\/17460441.2019.1615435","journal-title":"Expert Opin Drug Discov"},{"key":"499_CR25","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1517\/17425255.4.5.513","volume":"4","author":"E Stjernschantz","year":"2008","unstructured":"Stjernschantz E, Vermeulen NPE, Oostenbrink C (2008) Computational prediction of drug binding and rationalisation of selectivity towards cytochromes P450. Expert Opin Drug Metab Toxicol 4:513\u2013527. https:\/\/doi.org\/10.1517\/17425255.4.5.513","journal-title":"Expert Opin Drug Metab Toxicol"},{"key":"499_CR26","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1186\/s13321-016-0121-y","volume":"8","author":"F Montanari","year":"2016","unstructured":"Montanari F, Zdrazil B, Digles D, Ecker GF (2016) Selectivity profiling of BCRP versus P-gp inhibition: from automated collection of polypharmacology data to multi-label learning. J Cheminform 8:7. https:\/\/doi.org\/10.1186\/s13321-016-0121-y","journal-title":"J Cheminform"},{"key":"499_CR27","doi-asserted-by":"crossref","unstructured":"Zhang J, Bloedorn E, Rosen L, Venese D (2004) Learning rules from highly unbalanced data sets. In: Fourth IEEE International Conference on Data Mining (ICDM\u201904). pp 571\u2013574","DOI":"10.1109\/ICDM.2004.10015"},{"key":"499_CR28","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-Learn: Machine learning in Python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"499_CR29","unstructured":"RDKit: Open-source cheminformatics. http:\/\/www.rdkit.org. Accessed 1 June 2020"},{"key":"499_CR30","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1021\/ci00046a002","volume":"25","author":"RE Carhart","year":"1985","unstructured":"Carhart RE, Smith DH, Venkataraghavan R (1985) Atom pairs as molecular features in structure-activity studies: definition and applications. J Chem Inf Comput Sci 25:64\u201373. https:\/\/doi.org\/10.1021\/ci00046a002","journal-title":"J Chem Inf Comput Sci"},{"key":"499_CR31","unstructured":"Thresholds for \u201crandom\u201d in fingerprints the RDKit supports. http:\/\/rdkit.blogspot.com\/2013\/10\/fingerprint-thresholds.html. Accessed 1 Oct 2020"},{"key":"499_CR32","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1186\/s13321-016-0148-0","volume":"8","author":"NM O\u2019Boyle","year":"2016","unstructured":"O\u2019Boyle NM, Sayle RA (2016) Comparing structural fingerprints using a literature-based similarity benchmark. J Cheminform 8:36. https:\/\/doi.org\/10.1186\/s13321-016-0148-0","journal-title":"J Cheminform"},{"key":"499_CR33","doi-asserted-by":"publisher","first-page":"e39076","DOI":"10.1371\/journal.pone.0039076","volume":"7","author":"J Zhang","year":"2012","unstructured":"Zhang J, Han B, Wei X et al (2012) A two-step target binding and selectivity support vector machines approach for virtual screening of dopamine receptor subtype-selective ligands. PLoS ONE 7:e39076","journal-title":"PLoS ONE"},{"key":"499_CR34","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1145\/1656274.1656280","volume":"11","author":"MR Berthold","year":"2009","unstructured":"Berthold MR, Cebron N, Dill F et al (2009) KNIME - the Konstanz Information Miner: version 2.0 and beyond. SIGKDD Explor Newsl 11:26\u201331. https:\/\/doi.org\/10.1145\/1656274.1656280","journal-title":"SIGKDD Explor Newsl"},{"key":"499_CR35","unstructured":"RDKit: List of available descriptors. https:\/\/www.rdkit.org\/docs\/GettingStartedInPython.html#list-of-available-descriptors. Accessed 1 June 2020"},{"key":"499_CR36","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW (1975) Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta - Protein Struct 405:442\u2013451. https:\/\/doi.org\/10.1016\/0005-2795(75)90109-9","journal-title":"Biochim Biophys Acta - Protein Struct"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00499-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s13321-021-00499-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00499-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,6]],"date-time":"2021-03-06T18:16:10Z","timestamp":1615054570000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-021-00499-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,6]]},"references-count":36,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["499"],"URL":"https:\/\/doi.org\/10.1186\/s13321-021-00499-y","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,6]]},"assertion":[{"value":"14 October 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 March 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declaration"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"18"}}