{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T04:13:49Z","timestamp":1776140029855,"version":"3.50.1"},"reference-count":116,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,5,29]],"date-time":"2020-05-29T00:00:00Z","timestamp":1590710400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,5,29]],"date-time":"2020-05-29T00:00:00Z","timestamp":1590710400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Ministry of Education, Youth and Sports of the Czech Republic","award":["LM2018130"],"award-info":[{"award-number":["LM2018130"]}]},{"name":"Ministry of Education, Youth and Sports of the Czech Republic","award":["RVO 68378050-KAV-NPUI"],"award-info":[{"award-number":["RVO 68378050-KAV-NPUI"]}]},{"DOI":"10.13039\/100010665","name":"H2020 Marie Sk\u0142odowska-Curie Actions","doi-asserted-by":"publisher","award":["703543"],"award-info":[{"award-number":["703543"]}],"id":[{"id":"10.13039\/100010665","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100011264","name":"FP7 People: Marie-Curie Actions","doi-asserted-by":"publisher","award":["238701"],"award-info":[{"award-number":["238701"]}],"id":[{"id":"10.13039\/100011264","id-type":"DOI","asserted-by":"publisher"}]}],
"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>An affinity fingerprint is the vector consisting of compound\u2019s affinity or potency against the reference panel of protein targets. Here, we present the QAFFP fingerprint, 440 elements long in silico QSAR-based affinity fingerprint, components of which are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared to that of the 1024 bits long Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~\u20090.65 and ~\u20090.70 for similarity searching depending on data sets, and ~\u20090.85 for classification) and EF5 (~\u20094.67 and ~\u20095.82 for similarity searching depending on data sets, and ~\u20092.10 for classification), comparable to that of the Morgan2 fingerprint (similarity searching AUC of ~\u20090.57 and ~\u20090.66, and EF5 of ~\u20094.09 and ~\u20096.41, depending on data sets, classification AUC of ~\u20090.87, and EF5 of ~\u20092.16). 
However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping as it is able to retrieve 1146 out of existing 1749 scaffolds, while the Morgan2 fingerprint reveals only 864 scaffolds.<\/jats:p>","DOI":"10.1186\/s13321-020-00443-6","type":"journal-article","created":{"date-parts":[[2020,5,29]],"date-time":"2020-05-29T12:03:08Z","timestamp":1590753788000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":36,"title":["QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping"],"prefix":"10.1186","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5325-4934","authenticated-orcid":false,"given":"C.","family":"\u0160kuta","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2036-494X","authenticated-orcid":false,"given":"I.","family":"Cort\u00e9s-Ciriano","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9597-0629","authenticated-orcid":false,"given":"W.","family":"Dehaen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2473-1919","authenticated-orcid":false,"given":"P.","family":"K\u0159\u00ed\u017e","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0717-1817","authenticated-orcid":false,"given":"G. J. P.","family":"van Westen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6855-0012","authenticated-orcid":false,"given":"I. 
V.","family":"Tetko","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6683-7546","authenticated-orcid":false,"given":"A.","family":"Bender","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2577-5163","authenticated-orcid":false,"given":"D.","family":"Svozil","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,5,29]]},"reference":[{"issue":"7\u20138","key":"443_CR1","doi-asserted-by":"crossref","first-page":"358","DOI":"10.1016\/j.drudis.2013.01.007","volume":"18","author":"Y Tanrikulu","year":"2013","unstructured":"Tanrikulu Y, Kruger B, Proschak E (2013) The holistic integration of virtual screening in drug discovery. Drug Discov Today 18(7\u20138):358\u2013364","journal-title":"Drug Discov Today"},{"issue":"5","key":"443_CR2","doi-asserted-by":"crossref","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50(5):742\u2013754","journal-title":"J Chem Inf Model"},{"key":"443_CR3","volume-title":"Handbook of molecular descriptors","author":"V Consonni","year":"2000","unstructured":"Consonni V, Todeschini R (2000) Handbook of molecular descriptors. Wiley-VCH, New York"},{"issue":"4","key":"443_CR4","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1016\/j.drudis.2014.11.004","volume":"20","author":"AM Wassermann","year":"2015","unstructured":"Wassermann AM, Lounkine E, Davies JW, Glick M, Camargo LM (2015) The opportunities of mining historical and collective data in drug discovery. 
Drug Discov Today 20(4):422\u2013434","journal-title":"Drug Discov Today"},{"issue":"2","key":"443_CR5","first-page":"277","volume":"19","author":"S Paricharak","year":"2016","unstructured":"Paricharak S, Mendez-Lucio O, Chavan Ravindranath A, Bender A, Ijzerman AP, van Westen GJ (2016) Data-driven approaches used for compound library design, hit triage and bioactivity modeling in high-throughput screening. Brief Bioinform 19(2):277\u2013285","journal-title":"Brief Bioinform"},{"issue":"10","key":"443_CR6","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1038\/nrc1951","volume":"6","author":"RH Shoemaker","year":"2006","unstructured":"Shoemaker RH (2006) The NCI60 human tumour cell line anticancer drug screen. Nat Rev Cancer 6(10):813\u2013823","journal-title":"Nat Rev Cancer"},{"issue":"14","key":"443_CR7","doi-asserted-by":"crossref","first-page":"1088","DOI":"10.1093\/jnci\/81.14.1088","volume":"81","author":"KD Paull","year":"1989","unstructured":"Paull KD, Shoemaker RH, Hodes L, Monks A, Scudiero DA, Rubinstein L, Plowman J, Boyd MR (1989) Display and analysis of patterns of differential activity of drugs against human tumor cell lines: development of mean graph and COMPARE algorithm. J Natl Cancer Inst 81(14):1088\u20131092","journal-title":"J Natl Cancer Inst"},{"issue":"4","key":"443_CR8","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1016\/S1093-3263(01)00126-7","volume":"20","author":"DW Zaharevitz","year":"2002","unstructured":"Zaharevitz DW, Holbeck SL, Bowerman C, Svetlik PA (2002) COMPARE: a web accessible tool for investigating mechanisms of cell growth inhibition. 
J Mol Graph Model 20(4):297\u2013303","journal-title":"J Mol Graph Model"},{"issue":"5081","key":"443_CR9","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1126\/science.1411538","volume":"258","author":"JN Weinstein","year":"1992","unstructured":"Weinstein JN, Kohn KW, Grever MR, Viswanadhan VN, Rubinstein LV, Monks AP, Scudiero DA, Welch L, Koutsoukos AD, Chiausa AJ et al (1992) Neural computing in cancer drug development: predicting mechanism of action. Science 258(5081):447\u2013451","journal-title":"Science"},{"issue":"5298","key":"443_CR10","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1126\/science.275.5298.343","volume":"275","author":"JN Weinstein","year":"1997","unstructured":"Weinstein JN, Myers TG, O\u2019Connor PM, Friend SH, Fornace AJ Jr, Kohn KW, Fojo T, Bates SE, Rubinstein LV, Anderson NL et al (1997) An information-intensive approach to the molecular pharmacology of cancer. Science 275(5298):343\u2013349","journal-title":"Science"},{"issue":"2","key":"443_CR11","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/1074-5521(95)90283-X","volume":"2","author":"LM Kauvar","year":"1995","unstructured":"Kauvar LM, Higgins DL, Villar HO, Sportsman JR, Engqvist-Goldstein A, Bukar R, Bauer KE, Dilley H, Rocke DM (1995) Predicting ligand binding to proteins by affinity fingerprinting. Chem Biol 2(2):107\u2013118","journal-title":"Chem Biol"},{"issue":"2","key":"443_CR12","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1073\/pnas.0407790101","volume":"102","author":"AF Fliri","year":"2005","unstructured":"Fliri AF, Loging WT, Thadeio PF, Volkmann RA (2005) Biological spectra analysis: linking biological activity profiles to molecular structure. 
Proc Natl Acad Sci USA 102(2):261\u2013266","journal-title":"Proc Natl Acad Sci USA"},{"issue":"22","key":"443_CR13","doi-asserted-by":"crossref","first-page":"6918","DOI":"10.1021\/jm050494g","volume":"48","author":"AF Fliri","year":"2005","unstructured":"Fliri AF, Loging WT, Thadeio PF, Volkmann RA (2005) Biospectra analysis: model proteome characterizations for linking molecular structure and biological response. J Med Chem 48(22):6918\u20136925","journal-title":"J Med Chem"},{"issue":"35","key":"443_CR14","doi-asserted-by":"crossref","first-page":"10543","DOI":"10.1021\/ja035413p","volume":"125","author":"SJ Haggarty","year":"2003","unstructured":"Haggarty SJ, Clemons PA, Schreiber SL (2003) Chemical genomic profiling of biological networks using graph theory and combinations of small molecule perturbations. J Am Chem Soc 125(35):10543\u201310545","journal-title":"J Am Chem Soc"},{"issue":"45","key":"443_CR15","doi-asserted-by":"crossref","first-page":"14740","DOI":"10.1021\/ja048170p","volume":"126","author":"YK Kim","year":"2004","unstructured":"Kim YK, Arai MA, Arai T, Lamenzo JO, Dean EF 3rd, Patterson N, Clemons PA, Schreiber SL (2004) Relationship of stereochemical and skeletal diversity of small molecules to cellular measurement space. J Am Chem Soc 126(45):14740\u201314745","journal-title":"J Am Chem Soc"},{"issue":"15","key":"443_CR16","doi-asserted-by":"crossref","first-page":"2432","DOI":"10.1021\/jm0010670","volume":"44","author":"S Anzali","year":"2001","unstructured":"Anzali S, Barnickel G, Cezanne B, Krug M, Filimonov D, Poroikov V (2001) Discriminating between drugs and nondrugs by prediction of activity spectra for substances (PASS). 
J Med Chem 44(15):2432\u20132437","journal-title":"J Med Chem"},{"issue":"1\u20132","key":"443_CR17","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1080\/10629360601054032","volume":"18","author":"V Poroikov","year":"2007","unstructured":"Poroikov V, Filimonov D, Lagunin A, Gloriozova T, Zakharov A (2007) PASS: identification of probable targets and mechanisms of toxicity. SAR QSAR Environ Res 18(1\u20132):101\u2013110","journal-title":"SAR QSAR Environ Res"},{"issue":"4","key":"443_CR18","doi-asserted-by":"crossref","first-page":"371","DOI":"10.2174\/1568026053828394","volume":"5","author":"P Beroza","year":"2005","unstructured":"Beroza P, Damodaran K, Lum RT (2005) Target-related affinity profiling: Telik\u2019s lead discovery technology. Curr Top Med Chem 5(4):371\u2013381","journal-title":"Curr Top Med Chem"},{"issue":"20","key":"443_CR19","doi-asserted-by":"crossref","first-page":"4875","DOI":"10.1021\/jm049950b","volume":"47","author":"N Hsu","year":"2004","unstructured":"Hsu N, Cai D, Damodaran K, Gomez RF, Keck JG, Laborde E, Lum RT, Macke TJ, Martin G, Schow SR et al (2004) Novel cyclooxygenase-1 inhibitors discovered using affinity fingerprints. J Med Chem 47(20):4875\u20134880","journal-title":"J Med Chem"},{"issue":"6","key":"443_CR20","doi-asserted-by":"crossref","first-page":"1336","DOI":"10.1124\/mol.65.6.1336","volume":"65","author":"RM Wadkins","year":"2004","unstructured":"Wadkins RM, Hyatt JL, Yoon KJ, Morton CL, Lee RE, Damodaran K, Beroza P, Danks MK, Potter PM (2004) Discovery of novel selective inhibitors of human intestinal carboxylesterase for the amelioration of irinotecan-induced diarrhea: synthesis, quantitative structure-activity relationship analysis, and biological activity. 
Mol Pharmacol 65(6):1336\u20131343","journal-title":"Mol Pharmacol"},{"issue":"26","key":"443_CR21","doi-asserted-by":"crossref","first-page":"9059","DOI":"10.1073\/pnas.0802982105","volume":"105","author":"D Plouffe","year":"2008","unstructured":"Plouffe D, Brinker A, McNamara C, Henson K, Kato N, Kuhen K, Nagle A, Adrian F, Matzen JT, Anderson P et al (2008) In silico activity profiling reveals the mechanism of action of antimalarials discovered in a high-throughput screen. Proc Natl Acad Sci USA 105(26):9059\u20139064","journal-title":"Proc Natl Acad Sci USA"},{"issue":"8","key":"443_CR22","doi-asserted-by":"crossref","first-page":"1399","DOI":"10.1021\/cb3001028","volume":"7","author":"PM Petrone","year":"2012","unstructured":"Petrone PM, Simms B, Nigsch F, Lounkine E, Kutchukian P, Cornett A, Deng Z, Davies JW, Jenkins JL, Glick M (2012) Rethinking molecular similarity: comparing compounds on the basis of biological activity. ACS Chem Biol 7(8):1399\u20131409","journal-title":"ACS Chem Biol"},{"issue":"5","key":"443_CR23","doi-asserted-by":"crossref","first-page":"771","DOI":"10.1177\/1087057113520226","volume":"19","author":"V Dancik","year":"2014","unstructured":"Dancik V, Carrel H, Bodycombe NE, Seiler KP, Fomina-Yadlin D, Kubicek ST, Hartwell K, Shamji AF, Wagner BK, Clemons PA (2014) Connecting small molecules with similar assay performance profiles leads to new biological hypotheses. J Biomol Screen 19(5):771\u2013781","journal-title":"J Biomol Screen"},{"issue":"13\u201314","key":"443_CR24","doi-asserted-by":"crossref","first-page":"674","DOI":"10.1016\/j.drudis.2013.02.005","volume":"18","author":"PM Petrone","year":"2013","unstructured":"Petrone PM, Wassermann AM, Lounkine E, Kutchukian P, Simms B, Jenkins J, Selzer P, Glick M (2013) Biodiversity of small molecules\u2013a new perspective in screening set selection. Drug Discov Today. 
18(13\u201314):674\u2013680","journal-title":"Drug Discov Today"},{"issue":"7","key":"443_CR25","doi-asserted-by":"crossref","first-page":"1622","DOI":"10.1021\/cb5001839","volume":"9","author":"AM Wassermann","year":"2014","unstructured":"Wassermann AM, Lounkine E, Urban L, Whitebread S, Chen S, Hughes K, Guo H, Kutlina E, Fekete A, Klumpp M et al (2014) A screening pattern recognition method finds new and divergent targets for drugs and natural products. ACS Chem Biol 9(7):1622\u20131631","journal-title":"ACS Chem Biol"},{"issue":"11","key":"443_CR26","doi-asserted-by":"crossref","first-page":"3024","DOI":"10.1021\/acschembio.6b00358","volume":"11","author":"A Cortes Cabrera","year":"2016","unstructured":"Cortes Cabrera A, Lucena-Agell D, Redondo-Horcajo M, Barasoain I, Diaz JF, Fasching B, Petrone PM (2016) Aggregated compound biological signatures facilitate phenotypic drug discovery and target elucidation. ACS Chem Biol 11(11):3024\u20133034","journal-title":"ACS Chem Biol"},{"issue":"5","key":"443_CR27","doi-asserted-by":"crossref","first-page":"956","DOI":"10.1021\/acs.jcim.5b00054","volume":"55","author":"M Maciejewski","year":"2015","unstructured":"Maciejewski M, Wassermann AM, Glick M, Lounkine E (2015) Experimental design strategy: weak reinforcement leads to increased hit rates and enhanced chemical diversity. J Chem Inf Model 55(5):956\u2013962","journal-title":"J Chem Inf Model"},{"issue":"5","key":"443_CR28","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1021\/acschembio.6b00029","volume":"11","author":"S Paricharak","year":"2016","unstructured":"Paricharak S, Ijzerman AP, Bender A, Nigsch F (2016) Analysis of iterative screening with stepwise compound selection based on Novartis in-house HTS data. 
ACS Chem Biol 11(5):1255\u20131264","journal-title":"ACS Chem Biol"},{"issue":"7","key":"443_CR29","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.1021\/ci500190p","volume":"54","author":"S Riniker","year":"2014","unstructured":"Riniker S, Wang Y, Jenkins JL, Landrum GA (2014) Using information from historical high-throughput screens to predict active compounds. J Chem Inf Model 54(7):1880\u20131891","journal-title":"J Chem Inf Model"},{"issue":"D1","key":"443_CR30","doi-asserted-by":"crossref","first-page":"D955","DOI":"10.1093\/nar\/gkw1118","volume":"45","author":"Y Wang","year":"2017","unstructured":"Wang Y, Bryant SH, Cheng T, Wang J, Gindulyte A, Shoemaker BA, Thiessen PA, He S, Zhang J (2017) PubChem BioAssay: 2017 update. Nucleic Acids Res 45(D1):D955\u2013D963","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"443_CR31","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1021\/acs.jcim.5b00498","volume":"56","author":"KY Helal","year":"2016","unstructured":"Helal KY, Maciejewski M, Gregori-Puigjane E, Glick M, Wassermann AM (2016) Public domain HTS fingerprints: design and evaluation of compound bioactivity profiles from PubChem\u2019s Bioassay Repository. J Chem Inf Model 56(2):390\u2013398","journal-title":"J Chem Inf Model"},{"issue":"17","key":"443_CR32","doi-asserted-by":"crossref","first-page":"3401","DOI":"10.1021\/jm950800y","volume":"39","author":"H Briem","year":"1996","unstructured":"Briem H, Kuntz ID (1996) Molecular similarity based on DOCK-generated fingerprints. J Med Chem 39(17):3401\u20133408","journal-title":"J Med Chem"},{"issue":"10","key":"443_CR33","doi-asserted-by":"crossref","first-page":"e75992","DOI":"10.1371\/journal.pone.0075992","volume":"8","author":"RG Coleman","year":"2013","unstructured":"Coleman RG, Carchia M, Sterling T, Irwin JJ, Shoichet BK (2013) Ligand pose and orientational sampling in molecular docking. 
PLoS ONE 8(10):e75992","journal-title":"PLoS ONE"},{"issue":"2","key":"443_CR34","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1021\/ci990439e","volume":"40","author":"UF Lessel","year":"2000","unstructured":"Lessel UF, Briem H (2000) Flexsim-X: a method for the detection of molecules with similar biological activity. J Chem Inf Comput Sci 40(2):246\u2013253","journal-title":"J Chem Inf Comput Sci"},{"issue":"3","key":"443_CR35","doi-asserted-by":"crossref","first-page":"470","DOI":"10.1006\/jmbi.1996.0477","volume":"261","author":"M Rarey","year":"1996","unstructured":"Rarey M, Kramer B, Lengauer T, Klebe G (1996) A fast flexible docking method using an incremental construction algorithm. J Mol Biol 261(3):470\u2013489","journal-title":"J Mol Biol"},{"key":"443_CR36","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/1472-6807-10-32","volume":"10","author":"Z Simon","year":"2010","unstructured":"Simon Z, Vigh-Smeller M, Peragovics A, Csukly G, Zahoranszky-Kohalmi G, Rauscher AA, Jelinek B, Hari P, Bitter I, Malnasi-Csizmadia A et al (2010) Relating the shape of protein binding sites to binding affinity profiles: is there an association? BMC Struct Biol 10:32","journal-title":"BMC Struct Biol"},{"issue":"21","key":"443_CR37","doi-asserted-by":"crossref","first-page":"8377","DOI":"10.1021\/jm400813y","volume":"56","author":"L Vegner","year":"2013","unstructured":"Vegner L, Peragovics A, Tombor L, Jelinek B, Czobor P, Bender A, Simon Z, Malnasi-Csizmadia A (2013) Experimental confirmation of new drug-target interactions predicted by Drug Profile Matching. J Med Chem 56(21):8377\u20138388","journal-title":"J Med Chem"},{"issue":"46","key":"443_CR38","doi-asserted-by":"crossref","first-page":"6885","DOI":"10.2174\/1381612822666160831104718","volume":"22","author":"A Peragovics","year":"2016","unstructured":"Peragovics A, Simon Z, Malnasi-Csizmadia A, Bender A (2016) Modeling polypharmacological profiles by affinity fingerprinting. 
Curr Pharm Des 22(46):6885\u20136894","journal-title":"Curr Pharm Des"},{"issue":"7","key":"443_CR39","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1016\/j.ejmech.2006.12.028","volume":"42","author":"S Murali","year":"2007","unstructured":"Murali S, Hojo S, Tsujishita H, Nakamura H, Fukunishi Y (2007) In-silico drug screening method based on the protein-compound affinity matrix using the factor selection technique. Eur J Med Chem 42(7):966\u2013976","journal-title":"Eur J Med Chem"},{"issue":"6","key":"443_CR40","doi-asserted-by":"crossref","first-page":"2610","DOI":"10.1021\/ci600334u","volume":"46","author":"Y Fukunishi","year":"2006","unstructured":"Fukunishi Y, Hojo S, Nakamura H (2006) An efficient in silico screening method based on the protein-compound affinity matrix and its application to the design of a focused library for cytochrome P450 (CYP) ligands. J Chem Inf Model 46(6):2610\u20132622","journal-title":"J Chem Inf Model"},{"issue":"6","key":"443_CR41","doi-asserted-by":"crossref","first-page":"2445","DOI":"10.1021\/ci600197y","volume":"46","author":"A Bender","year":"2006","unstructured":"Bender A, Jenkins JL, Glick M, Deng Z, Nettles JH, Davies JW (2006) \u201cBayes affinity fingerprints\u201d improve retrieval rates in virtual screening and define orthogonal bioactivity space: when are multitarget drugs a feasible concept? J Chem Inf Model 46(6):2445\u20132456","journal-title":"J Chem Inf Model"},{"issue":"12","key":"443_CR42","doi-asserted-by":"crossref","first-page":"4977","DOI":"10.1021\/jm4004285","volume":"57","author":"A Cherkasov","year":"2014","unstructured":"Cherkasov A, Muratov EN, Fourches D, Varnek A, Baskin II, Cronin M, Dearden J, Gramatica P, Martin YC, Todeschini R et al (2014) QSAR modeling: where have you been? Where are you going to? 
J Med Chem 57(12):4977\u20135010","journal-title":"J Med Chem"},{"issue":"12","key":"443_CR43","doi-asserted-by":"crossref","first-page":"1283","DOI":"10.1517\/17460441.2015.1083006","volume":"10","author":"T Wang","year":"2015","unstructured":"Wang T, Wu MB, Lin JP, Yang LR (2015) Quantitative structure-activity relationship: promising advances in drug discovery platforms. Expert Opin Drug Discov 10(12):1283\u20131300","journal-title":"Expert Opin Drug Discov"},{"issue":"3","key":"443_CR44","doi-asserted-by":"crossref","first-page":"1600082","DOI":"10.1002\/minf.201600082","volume":"36","author":"IV Tetko","year":"2017","unstructured":"Tetko IV, Maran U, Tropsha A (2017) Public (Q)SAR Services, integrated modeling environments, and model repositories on the web: state of the art and perspectives for future development. Mol Inform 36(3):1600082","journal-title":"Mol Inform"},{"issue":"6","key":"443_CR45","doi-asserted-by":"crossref","first-page":"475","DOI":"10.2174\/138620711795767866","volume":"14","author":"F Lopez-Vallejo","year":"2011","unstructured":"Lopez-Vallejo F, Caulfield T, Martinez-Mayorga K, Giulianotti MA, Nefzi A, Houghten RA, Medina-Franco JL (2011) Integrating virtual screening and combinatorial chemistry for accelerated drug discovery. Comb Chem High Throughput Screen 14(6):475\u2013487","journal-title":"Comb Chem High Throughput Screen"},{"issue":"8","key":"443_CR46","doi-asserted-by":"crossref","first-page":"2077","DOI":"10.1021\/acs.jcim.7b00166","volume":"57","author":"EJ Martin","year":"2017","unstructured":"Martin EJ, Polyakov VR, Tian L, Perez RC (2017) Profile-QSAR 2.0: kinase virtual screening accuracy comparable to four-concentration IC50s for realistically novel compounds. 
J Chem Inf Model 57(8):2077\u20132088","journal-title":"J Chem Inf Model"},{"issue":"1","key":"443_CR47","doi-asserted-by":"crossref","first-page":"474","DOI":"10.1021\/acs.jmedchem.6b01611","volume":"60","author":"B Merget","year":"2017","unstructured":"Merget B, Turk S, Eid S, Rippmann F, Fulle S (2017) Profiling prediction of kinase inhibitors: toward the virtual assay. J Med Chem 60(1):474\u2013485","journal-title":"J Med Chem"},{"issue":"1","key":"443_CR48","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1111\/cbdd.12294","volume":"84","author":"J Balfer","year":"2014","unstructured":"Balfer J, Heikamp K, Laufer S, Bajorath J (2014) Modeling of compound profiling experiments using support vector machines. Chem Biol Drug Des 84(1):75\u201385","journal-title":"Chem Biol Drug Des"},{"issue":"24","key":"443_CR49","doi-asserted-by":"crossref","first-page":"11067","DOI":"10.1021\/jm3014508","volume":"55","author":"D Dimova","year":"2012","unstructured":"Dimova D, Iyer P, Vogt M, Totzke F, Kubbutat MH, Schachtele C, Laufer S, Bajorath J (2012) Assessing the target differentiation potential of imidazole-based protein kinase inhibitors. J Med Chem 55(24):11067\u201311071","journal-title":"J Med Chem"},{"issue":"Database issue","key":"443_CR50","doi-asserted-by":"crossref","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","volume":"40","author":"A Gaulton","year":"2012","unstructured":"Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B et al (2012) ChEMBL: a large-scale bioactivity database for drug discovery. 
Nucleic Acids Res 40(Database issue):D1100\u2013D1107","journal-title":"Nucleic Acids Res"},{"issue":"Database issue","key":"443_CR51","doi-asserted-by":"crossref","first-page":"D1083","DOI":"10.1093\/nar\/gkt1031","volume":"42","author":"AP Bento","year":"2014","unstructured":"Bento AP, Gaulton A, Hersey A, Bellis LJ, Chambers J, Davies M, Kruger FA, Light Y, Mak L, McGlinchey S et al (2014) The ChEMBL bioactivity database: an update. Nucleic Acids Res 42(Database issue):D1083\u2013D1090","journal-title":"Nucleic Acids Res"},{"key":"443_CR52","unstructured":"Landrum GA (2006) RDKit: Open-Source Cheminformatics Software. In"},{"key":"443_CR53","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-020-00444-5","author":"I Cort\u00e9s-Ciriano","year":"2020","unstructured":"Cort\u00e9s-Ciriano I, \u0160kuta C, Bender A, Svozil D (2020) QSAR-derived affinity fingerprints (part 2): modeling performance for potency prediction. J Cheminform. https:\/\/doi.org\/10.1186\/s13321-020-00444-5","journal-title":"J Cheminform"},{"issue":"6","key":"443_CR54","doi-asserted-by":"crossref","first-page":"1596","DOI":"10.1021\/ci5001168","volume":"54","author":"U Norinder","year":"2014","unstructured":"Norinder U, Carlsson L, Boyer S, Eklund M (2014) Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination. J Chem Inf Model 54(6):1596\u20131603","journal-title":"J Chem Inf Model"},{"key":"443_CR55","first-page":"371","volume":"9","author":"G Shafer","year":"2008","unstructured":"Shafer G, Vovk V (2008) A tutorial on conformal prediction. J Mach Learn Res. 
9:371\u2013421","journal-title":"J Mach Learn Res"},{"issue":"6\u20137","key":"443_CR56","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1002\/minf.201400165","volume":"34","author":"I Cortes-Ciriano","year":"2015","unstructured":"Cortes-Ciriano I, Bender A, Malliavin T (2015) Prediction of PARP inhibition with proteochemometric modelling and conformal prediction. Mol Inform 34(6\u20137):357\u2013366","journal-title":"Mol Inform"},{"issue":"5","key":"443_CR57","doi-asserted-by":"crossref","first-page":"1132","DOI":"10.1021\/acs.jcim.8b00054","volume":"58","author":"F Svensson","year":"2018","unstructured":"Svensson F, Aniceto N, Norinder U, Cortes-Ciriano I, Spjuth O, Carlsson L, Bender A (2018) Conformal regression for quantitative structure-activity relationship modeling-quantifying prediction uncertainty. J Chem Inf Model 58(5):1132\u20131140","journal-title":"J Chem Inf Model"},{"key":"443_CR58","doi-asserted-by":"crossref","first-page":"150032","DOI":"10.1038\/sdata.2015.32","volume":"2","author":"A Gaulton","year":"2015","unstructured":"Gaulton A, Kale N, van Westen GJ, Bellis LJ, Bento AP, Davies M, Hersey A, Papadatos G, Forster M, Wege P et al (2015) A large-scale crop protection bioassay data set. Sci Data 2:150032","journal-title":"Sci Data"},{"issue":"9","key":"443_CR59","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1007\/s10822-015-9860-5","volume":"29","author":"G Papadatos","year":"2015","unstructured":"Papadatos G, Gaulton A, Hersey A, Overington JP (2015) Activity, assay and target data curation and quality in the ChEMBL database. 
J Comput Aided Mol Des 29(9):885\u2013896","journal-title":"J Comput Aided Mol Des"},{"issue":"D1","key":"443_CR60","doi-asserted-by":"crossref","first-page":"D930","DOI":"10.1093\/nar\/gky1075","volume":"47","author":"D Mendez","year":"2019","unstructured":"Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Felix E, Magarinos MP, Mosquera JF, Mutowo P, Nowotka M et al (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 47(D1):D930\u2013D940","journal-title":"Nucleic Acids Res"},{"key":"443_CR61","unstructured":"IMI eTOX standardiser. https:\/\/pypi.org\/project\/standardiser\/"},{"issue":"1","key":"443_CR62","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45(1):5\u201332","journal-title":"Mach Learn"},{"key":"443_CR63","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"issue":"11","key":"443_CR64","doi-asserted-by":"crossref","first-page":"2837","DOI":"10.1021\/ci400482e","volume":"53","author":"RP Sheridan","year":"2013","unstructured":"Sheridan RP (2013) Using random forest to model the domain applicability of another random forest model. J Chem Inf Model 53(11):2837\u20132850","journal-title":"J Chem Inf Model"},{"issue":"1","key":"443_CR65","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/bioinformatics\/btv529","volume":"32","author":"I Cortes-Ciriano","year":"2016","unstructured":"Cortes-Ciriano I, van Westen GJ, Bouvier G, Nilges M, Overington JP, Bender A, Malliavin TE (2016) Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel. 
Bioinformatics 32(1):85\u201395","journal-title":"Bioinformatics"},{"issue":"4","key":"443_CR66","doi-asserted-by":"crossref","first-page":"269","DOI":"10.1016\/S1093-3263(01)00123-1","volume":"20","author":"A Golbraikh","year":"2002","unstructured":"Golbraikh A, Tropsha A (2002) Beware of q2! J Mol Graph Model 20(4):269\u2013276","journal-title":"J Mol Graph Model"},{"issue":"1","key":"443_CR67","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1002\/qsar.200390007","volume":"22","author":"A Tropsha","year":"2003","unstructured":"Tropsha A, Gramatica P, Gombar VK (2003) The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb Sci 22(1):69\u201377","journal-title":"QSAR Comb Sci"},{"key":"443_CR68","doi-asserted-by":"crossref","unstructured":"Tropsha A, Golbraikh A (2010) Predictive quantitative structure-activity relationships modeling development and validation of QSAR Models. In: Handbook of chemoinformatics algorithms, pp 211\u2013232","DOI":"10.1201\/9781420082999-c7"},{"issue":"7","key":"443_CR69","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1021\/ci100176x","volume":"50","author":"D Fourches","year":"2010","unstructured":"Fourches D, Muratov E, Tropsha A (2010) Trust, but verify: on the importance of chemical structure curation in cheminformatics and QSAR modeling research. J Chem Inf Model 50(7):1189\u20131204","journal-title":"J Chem Inf Model"},{"issue":"7","key":"443_CR70","doi-asserted-by":"crossref","first-page":"1316","DOI":"10.1021\/acs.jcim.5b00206","volume":"55","author":"DL Alexander","year":"2015","unstructured":"Alexander DL, Tropsha A, Winkler DA (2015) Beware of R(2): simple, unambiguous assessment of the prediction accuracy of QSAR and QSPR models. 
J Chem Inf Model 55(7):1316\u20131322","journal-title":"J Chem Inf Model"},{"issue":"15\u201316","key":"443_CR71","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1016\/j.drudis.2006.06.013","volume":"11","author":"IV Tetko","year":"2006","unstructured":"Tetko IV, Bruneau P, Mewes HW, Rohrer DC, Poda GI (2006) Can we estimate the accuracy of ADME-Tox predictions? Drug Discov Today 11(15\u201316):700\u2013707","journal-title":"Drug Discov Today"},{"issue":"5","key":"443_CR72","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1002\/minf.201501019","volume":"35","author":"M Mathea","year":"2016","unstructured":"Mathea M, Klingspohn W, Baumann K (2016) Chemoinformatic classification methods and their applicability domain. Mol Inform. 35(5):160\u2013180","journal-title":"Mol Inform."},{"issue":"2430","key":"443_CR73","first-page":"345","volume":"2002","author":"H Papadopoulos","year":"2002","unstructured":"Papadopoulos H, Proedrou K, Vovk V, Gammerman A (2002) Inductive confidence machines for regression. Mach Learn Ecml 2002(2430):345\u2013356","journal-title":"Mach Learn Ecml"},{"issue":"1\u20132","key":"443_CR74","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1007\/s10472-013-9368-4","volume":"74","author":"V Vovk","year":"2015","unstructured":"Vovk V (2015) Cross-conformal predictors. Ann Math Artif Intell 74(1\u20132):9\u201328","journal-title":"Ann Math Artif Intell"},{"key":"443_CR75","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.ymeth.2014.08.005","volume":"71","author":"A Cereto-Massague","year":"2015","unstructured":"Cereto-Massague A, Ojeda MJ, Valls C, Mulero M, Garcia-Vallve S, Pujadas G (2015) Molecular fingerprint similarity search in virtual screening. 
Methods 71:58\u201363","journal-title":"Methods"},{"issue":"22","key":"443_CR76","doi-asserted-by":"crossref","first-page":"3256","DOI":"10.1039\/b409865j","volume":"2","author":"J Hert","year":"2004","unstructured":"Hert J, Willett P, Wilton DJ, Acklin P, Azzaoui K, Jacoby E, Schuffenhauer A (2004) Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org Biomol Chem 2(22):3256\u20133266","journal-title":"Org Biomol Chem"},{"issue":"1","key":"443_CR77","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1021\/ci800249s","volume":"49","author":"A Bender","year":"2009","unstructured":"Bender A, Jenkins JL, Scheiber J, Sukuru SC, Glick M, Davies JW (2009) How similar are similarity searching methods? A principal component analysis of molecular descriptor space. J Chem Inf Model 49(1):108\u2013119","journal-title":"J Chem Inf Model"},{"issue":"2","key":"443_CR78","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1021\/ci800302g","volume":"49","author":"G Papadatos","year":"2009","unstructured":"Papadatos G, Cooper AW, Kadirkamanathan V, Macdonald SJ, McLay IM, Pickett SD, Pritchard JM, Willett P, Gillet VJ (2009) Analysis of neighborhood behavior in lead optimization and array design. J Chem Inf Model 49(2):195\u2013208","journal-title":"J Chem Inf Model"},{"issue":"3","key":"443_CR79","doi-asserted-by":"crossref","first-page":"962","DOI":"10.1021\/acs.jcim.8b00550","volume":"59","author":"N Sturm","year":"2018","unstructured":"Sturm N, Sun J, Vandriessche Y, Mayr A, Klambauer G, Carlsson L, Engkvist O, Chen H (2018) Application of bioactivity profile-based fingerprints for building machine learning models. 
J Chem Inf Model 59(3):962\u2013972","journal-title":"J Chem Inf Model"},{"issue":"22","key":"443_CR80","doi-asserted-by":"crossref","first-page":"3204","DOI":"10.1039\/b409813g","volume":"2","author":"A Bender","year":"2004","unstructured":"Bender A, Glen RC (2004) Molecular similarity: a key technique in molecular informatics. Org Biomol Chem 2(22):3204\u20133218","journal-title":"Org Biomol Chem"},{"issue":"17","key":"443_CR81","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1016\/S1359-6446(02)02411-X","volume":"7","author":"RP Sheridan","year":"2002","unstructured":"Sheridan RP, Kearsley SK (2002) Why do we need so many chemical similarity search methods? Drug Discov Today 7(17):903\u2013911","journal-title":"Drug Discov Today"},{"key":"443_CR82","volume-title":"Concepts and applications of molecular similarity","author":"AM Johnson","year":"1990","unstructured":"Johnson AM, Maggiora GM (1990) Concepts and applications of molecular similarity. Wiley, New York"},{"issue":"4","key":"443_CR83","doi-asserted-by":"crossref","first-page":"332","DOI":"10.2174\/138620709788167980","volume":"12","author":"JL Melville","year":"2009","unstructured":"Melville JL, Burke EK, Hirst JD (2009) Machine learning in virtual screening. Comb Chem High Throughput Screen 12(4):332\u2013343","journal-title":"Comb Chem High Throughput Screen"},{"issue":"3","key":"443_CR84","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1016\/j.drudis.2014.10.012","volume":"20","author":"A Lavecchia","year":"2015","unstructured":"Lavecchia A (2015) Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 20(3):318\u2013331","journal-title":"Drug Discov Today"},{"issue":"7\u20138","key":"443_CR85","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1016\/j.drudis.2011.10.024","volume":"17","author":"H Sun","year":"2012","unstructured":"Sun H, Tawa G, Wallqvist A (2012) Classification of scaffold-hopping approaches. 
Drug Discov Today 17(7\u20138):310\u2013324","journal-title":"Drug Discov Today"},{"issue":"11","key":"443_CR86","doi-asserted-by":"crossref","first-page":"1217","DOI":"10.2174\/138955706778742768","volume":"6","author":"N Brown","year":"2006","unstructured":"Brown N, Jacoby E (2006) On scaffolds and hopping in medicinal chemistry. Mini Rev Med Chem 6(11):1217\u20131229","journal-title":"Mini Rev Med Chem"},{"issue":"15","key":"443_CR87","doi-asserted-by":"crossref","first-page":"5707","DOI":"10.1021\/jm100492z","volume":"53","author":"M Vogt","year":"2010","unstructured":"Vogt M, Stumpfe D, Geppert H, Bajorath J (2010) Scaffold hopping using two-dimensional fingerprints: true potential, black magic, or a hopeless endeavor? Guidelines for virtual screening. J Med Chem 53(15):5707\u20135715","journal-title":"J Med Chem"},{"issue":"1","key":"443_CR88","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1186\/s13321-016-0158-y","volume":"8","author":"S Latti","year":"2016","unstructured":"Latti S, Niinivehmas S, Pentikainen OT (2016) Rocker: open source, easy-to-use tool for AUC and enrichment calculations and ROC visualization. J Cheminform 8(1):45","journal-title":"J Cheminform"},{"issue":"1","key":"443_CR89","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1148\/radiology.143.1.7063747","volume":"143","author":"JA Hanley","year":"1982","unstructured":"Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1):29\u201336","journal-title":"Radiology"},{"issue":"2","key":"443_CR90","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1021\/ci600426e","volume":"47","author":"JF Truchon","year":"2007","unstructured":"Truchon JF, Bayly CI (2007) Evaluating virtual screening methods: good and bad metrics for the \u201cearly recognition\u201d problem. 
J Chem Inf Model 47(2):488\u2013508","journal-title":"J Chem Inf Model"},{"issue":"1","key":"443_CR91","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1186\/1758-2946-5-26","volume":"5","author":"S Riniker","year":"2013","unstructured":"Riniker S, Landrum GA (2013) Open-source platform to benchmark fingerprints for ligand-based virtual screening. J Cheminform 5(1):26","journal-title":"J Cheminform"},{"issue":"4","key":"443_CR92","doi-asserted-by":"crossref","first-page":"502","DOI":"10.1021\/jm000375v","volume":"44","author":"DA Pearlman","year":"2001","unstructured":"Pearlman DA, Charifson PS (2001) Improved scoring of ligand-protein interactions using OWFEG free energy grids. J Med Chem 44(4):502\u2013511","journal-title":"J Med Chem"},{"issue":"8","key":"443_CR93","doi-asserted-by":"crossref","first-page":"1957","DOI":"10.1021\/ci300435j","volume":"53","author":"A Koutsoukas","year":"2013","unstructured":"Koutsoukas A, Lowe R, Kalantarmotamedi Y, Mussa HY, Klaffke W, Mitchell JB, Glen RC, Bender A (2013) In silico target predictions: defining a benchmarking data set and comparison of performance of the multiclass Naive Bayes and Parzen-Rosenblatt window. J Chem Inf Model 53(8):1957\u20131966","journal-title":"J Chem Inf Model"},{"issue":"3","key":"443_CR94","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1198\/000313006X118430","volume":"60","author":"T Hothorn","year":"2006","unstructured":"Hothorn T, Hornik K, Van de Wiel MA, Zeileis A (2006) A Lego system for conditional inference. Am Stat 60(3):257\u2013263","journal-title":"Am Stat"},{"issue":"8","key":"443_CR95","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v028.i08","volume":"28","author":"T Hothorn","year":"2008","unstructured":"Hothorn T, Hornik K, van de Wiel MAV, Zeileis A (2008) Implementing a class of permutation tests: the coin package. 
J Stat Softw 28(8):1\u201323","journal-title":"J Stat Softw"},{"issue":"11","key":"443_CR96","doi-asserted-by":"crossref","first-page":"2829","DOI":"10.1021\/ci400466r","volume":"53","author":"S Riniker","year":"2013","unstructured":"Riniker S, Fechner N, Landrum GA (2013) Heterogeneous classifier fusion for ligand-based virtual screening: or, how decision making by committee can be a good thing. J Chem Inf Model 53(11):2829\u20132836","journal-title":"J Chem Inf Model"},{"issue":"3\u20134","key":"443_CR97","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/s10822-008-9189-4","volume":"22","author":"JJ Irwin","year":"2008","unstructured":"Irwin JJ (2008) Community benchmarks for virtual screening. J Comput Aided Mol Des 22(3\u20134):193\u2013199","journal-title":"J Comput Aided Mol Des"},{"issue":"2","key":"443_CR98","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1021\/ci8002649","volume":"49","author":"SG Rohrer","year":"2009","unstructured":"Rohrer SG, Baumann K (2009) Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J Chem Inf Model 49(2):169\u2013184","journal-title":"J Chem Inf Model"},{"issue":"8","key":"443_CR99","doi-asserted-by":"crossref","first-page":"1831","DOI":"10.1021\/ci200199u","volume":"51","author":"K Heikamp","year":"2011","unstructured":"Heikamp K, Bajorath J (2011) Large-scale similarity search profiling of ChEMBL compound data sets. J Chem Inf Model 51(8):1831\u20131839","journal-title":"J Chem Inf Model"},{"issue":"23","key":"443_CR100","doi-asserted-by":"crossref","first-page":"6789","DOI":"10.1021\/jm0608356","volume":"49","author":"N Huang","year":"2006","unstructured":"Huang N, Shoichet BK, Irwin JJ (2006) Benchmarking sets for molecular docking. 
J Med Chem 49(23):6789\u20136801","journal-title":"J Med Chem"},{"key":"443_CR101","doi-asserted-by":"crossref","first-page":"e201302002","DOI":"10.5936\/csbj.201302002","volume":"5","author":"P Willett","year":"2013","unstructured":"Willett P (2013) Fusing similarity rankings in ligand-based virtual screening. Comput Struct Biotechnol J 5:e201302002","journal-title":"Comput Struct Biotechnol J"},{"issue":"9","key":"443_CR102","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1016\/0021-9681(66)90032-4","volume":"19","author":"E Rogot","year":"1966","unstructured":"Rogot E, Goldberg ID (1966) A proposed index for measuring agreement in test-retest studies. J Chronic Dis 19(9):991\u20131006","journal-title":"J Chronic Dis"},{"issue":"11","key":"443_CR103","doi-asserted-by":"crossref","first-page":"2884","DOI":"10.1021\/ci300261r","volume":"52","author":"R Todeschini","year":"2012","unstructured":"Todeschini R, Consonni V, Xiang H, Holliday J, Buscema M, Willett P (2012) Similarity coefficients for binary chemoinformatics data: overview and extended comparison using simulated and real data sets. J Chem Inf Model 52(11):2884\u20132901","journal-title":"J Chem Inf Model"},{"key":"443_CR104","first-page":"12","volume-title":"Using random forest to learn imbalanced data","author":"C Chen","year":"2004","unstructured":"Chen C, Liaw A, Breiman L (2004) Using random forest to learn imbalanced data. Department of Statistics, UC Berkeley, Berkeley, p 12"},{"issue":"1","key":"443_CR105","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1021\/ci0003911","volume":"41","author":"Y Xu","year":"2001","unstructured":"Xu Y, Johnson M (2001) Algorithm for naming molecular equivalence classes represented by labeled pseudographs. 
J Chem Inf Comput Sci 41(1):181\u2013185","journal-title":"J Chem Inf Comput Sci"},{"issue":"15","key":"443_CR106","doi-asserted-by":"crossref","first-page":"2887","DOI":"10.1021\/jm9602928","volume":"39","author":"GW Bemis","year":"1996","unstructured":"Bemis GW, Murcko MA (1996) The properties of known drugs. 1. Molecular frameworks. J Med Chem 39(15):2887\u20132893","journal-title":"J Med Chem"},{"issue":"6\u20137","key":"443_CR107","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1002\/minf.201000061","volume":"29","author":"A Tropsha","year":"2010","unstructured":"Tropsha A (2010) Best practices for QSAR model development, validation, and exploitation. Mol Inform 29(6\u20137):476\u2013488","journal-title":"Mol Inform"},{"key":"443_CR108","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1007\/978-1-62703-059-5_21","volume":"930","author":"P Gramatica","year":"2013","unstructured":"Gramatica P (2013) On the development and validation of QSAR models. Methods Mol Biol 930:499\u2013526","journal-title":"Methods Mol Biol"},{"issue":"11\u201312","key":"443_CR109","doi-asserted-by":"crossref","first-page":"898","DOI":"10.1002\/minf.201300051","volume":"32","author":"T Kalliokoski","year":"2013","unstructured":"Kalliokoski T, Kramer C, Vulpetti A (2013) Quality issues with public domain chemogenomics data. Mol Inform 32(11\u201312):898\u2013905","journal-title":"Mol Inform"},{"issue":"4","key":"443_CR110","doi-asserted-by":"crossref","first-page":"e61007","DOI":"10.1371\/journal.pone.0061007","volume":"8","author":"T Kalliokoski","year":"2013","unstructured":"Kalliokoski T, Kramer C, Vulpetti A, Gedeck P (2013) Comparability of mixed IC(5)(0) data\u2014a statistical analysis. 
PLoS ONE 8(4):e61007","journal-title":"PLoS ONE"},{"issue":"6","key":"443_CR111","doi-asserted-by":"crossref","first-page":"2805","DOI":"10.1021\/acsomega.7b00274","volume":"2","author":"L Zhao","year":"2017","unstructured":"Zhao L, Wang W, Sedykh A, Zhu H (2017) Experimental errors in QSAR modeling sets: what we can do and what we cannot do. ACS Omega 2(6):2805\u20132812","journal-title":"ACS Omega"},{"issue":"7","key":"443_CR112","doi-asserted-by":"crossref","first-page":"1243","DOI":"10.1021\/acs.jcim.6b00129","volume":"56","author":"D Fourches","year":"2016","unstructured":"Fourches D, Muratov E, Tropsha A (2016) Trust, but verify II: a practical guide to chemogenomics data curation. J Chem Inf Model 56(7):1243\u20131252","journal-title":"J Chem Inf Model"},{"issue":"7","key":"443_CR113","doi-asserted-by":"crossref","first-page":"2932","DOI":"10.1021\/jm201706b","volume":"55","author":"D Stumpfe","year":"2012","unstructured":"Stumpfe D, Bajorath J (2012) Exploring activity cliffs in medicinal chemistry. J Med Chem 55(7):2932\u20132942","journal-title":"J Med Chem"},{"issue":"6\u20137","key":"443_CR114","doi-asserted-by":"crossref","first-page":"438","DOI":"10.1002\/minf.201400026","volume":"33","author":"J Bajorath","year":"2014","unstructured":"Bajorath J (2014) Exploring activity cliffs from a chemoinformatics perspective. Mol Inform 33(6\u20137):438\u2013442","journal-title":"Mol Inform"},{"issue":"1","key":"443_CR115","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1186\/s13321-018-0325-4","volume":"11","author":"N Bosc","year":"2019","unstructured":"Bosc N, Atkinson F, Felix E, Gaulton A, Hersey A, Leach AR (2019) Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery. 
J Cheminform 11(1):4","journal-title":"J Cheminform"},{"issue":"3","key":"443_CR116","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1021\/acs.jcim.7b00447","volume":"58","author":"A Cortes Cabrera","year":"2018","unstructured":"Cortes Cabrera A, Petrone PM (2018) Optimal HTS fingerprint definitions by using a desirability function and a genetic algorithm. J Chem Inf Model 58(3):641\u2013646","journal-title":"J Chem Inf Model"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00443-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-020-00443-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00443-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,1]],"date-time":"2023-10-01T17:42:57Z","timestamp":1696182177000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-020-00443-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,5,29]]},"references-count":116,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["443"],"URL":"https:\/\/doi.org\/10.1186\/s13321-020-00443-6","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,5,29]]},"assertion":[{"value":"7 August 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 May 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 May 
2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare that they have no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"39"}}