{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,23]],"date-time":"2025-10-23T16:58:37Z","timestamp":1761238717070,"version":"3.37.3"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"7","license":[{"start":{"date-parts":[[2020,2,28]],"date-time":"2020-02-28T00:00:00Z","timestamp":1582848000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,2,28]],"date-time":"2020-02-28T00:00:00Z","timestamp":1582848000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/R505821\/1"],"award-info":[{"award-number":["BB\/R505821\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Comput Aided Mol Des"],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Reaction-based de novo design refers to the in-silico generation of novel chemical structures by combining reagents using structural transformations derived from known reactions. The driver for using reaction-based transformations is to increase the likelihood of the designed molecules being synthetically accessible. We have previously described a reaction-based de novo design method based on reaction vectors which are transformation rules that are encoded automatically from reaction databases. 
A limitation of reaction vectors is that they account for structural changes that occur at the core of a reaction only, and they do not consider the presence of competing functionalities that can compromise the reaction outcome. Here, we present the development of a Reaction Class Recommender to enhance the reaction vector framework. The recommender is intended to be used as a filter on the reaction vectors that are applied during de novo design to reduce the combinatorial explosion of in-silico molecules produced while limiting the generated structures to those which are most likely to be synthesisable. The recommender has been validated using an external data set extracted from the recent medicinal chemistry literature and in two simulated de novo design experiments. Results suggest that the use of the recommender drastically reduces the number of solutions explored by the algorithm while preserving the chance of finding relevant solutions and increasing the global synthetic accessibility of the designed molecules.<\/jats:p>","DOI":"10.1007\/s10822-020-00300-6","type":"journal-article","created":{"date-parts":[[2020,2,28]],"date-time":"2020-02-28T06:02:48Z","timestamp":1582869768000},"page":"783-803","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Enhancing reaction-based de novo design using a multi-label reaction class recommender"],"prefix":"10.1007","volume":"34","author":[{"given":"Gian Marco","family":"Ghiandoni","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael 
J.","family":"Bodkin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Beining","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dimitar","family":"Hristozov","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James E. A.","family":"Wallace","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James","family":"Webster","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8403-3111","authenticated-orcid":false,"given":"Valerie J.","family":"Gillet","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,2,28]]},"reference":[{"key":"300_CR1","first-page":"165","volume-title":"Lead generation approaches in drug discovery","author":"M Hartenfeller","year":"2010","unstructured":"Hartenfeller M, Schneider G, Hartenfeller M, Proschak E (2010) De novo drug design. In: Bajorath J (ed) Lead generation approaches in drug discovery. Wiley, Hoboken, pp 165\u2013185"},{"key":"300_CR2","doi-asserted-by":"publisher","first-page":"4077","DOI":"10.1021\/acs.jmedchem.5b01849","volume":"59","author":"P Schneider","year":"2016","unstructured":"Schneider P, Schneider G (2016) De novo design at the edge of chaos. J Med Chem 59:4077\u20134086. https:\/\/doi.org\/10.1021\/acs.jmedchem.5b01849","journal-title":"J Med Chem"},{"key":"300_CR3","doi-asserted-by":"publisher","first-page":"2765","DOI":"10.1021\/jm030809x","volume":"46","author":"HM Vinkers","year":"2003","unstructured":"Vinkers HM, de Jonge MR, Daeyaert FFD et al (2003) SYNOPSIS: SYNthesize and OPtimize System in Silico. J Med Chem 46:2765\u20132773. 
https:\/\/doi.org\/10.1021\/jm030809x","journal-title":"J Med Chem"},{"key":"300_CR4","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1002380","author":"M Hartenfeller","year":"2012","unstructured":"Hartenfeller M, Zettl H, Walter M et al (2012) Dogs: reaction-driven de novo design of bioactive compounds. PLoS Comput Biol. https:\/\/doi.org\/10.1371\/journal.pcbi.1002380","journal-title":"PLoS Comput Biol"},{"key":"300_CR5","doi-asserted-by":"publisher","first-page":"1241","DOI":"10.1016\/j.drudis.2018.01.039","volume":"23","author":"H Chen","year":"2018","unstructured":"Chen H, Engkvist O, Wang Y et al (2018) The rise of deep learning in drug discovery. Drug Discov Today 23:1241\u20131250. https:\/\/doi.org\/10.1016\/j.drudis.2018.01.039","journal-title":"Drug Discov Today"},{"key":"300_CR6","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","volume":"4","author":"R G\u00f3mez-Bombarelli","year":"2018","unstructured":"G\u00f3mez-Bombarelli R, Wei JN, Duvenaud D et al (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 4:268\u2013276","journal-title":"ACS Cent Sci"},{"key":"300_CR7","doi-asserted-by":"publisher","first-page":"1700123","DOI":"10.1002\/minf.201700123","volume":"37","author":"T Blaschke","year":"2018","unstructured":"Blaschke T, Olivecrona M, Engkvist O et al (2018) Application of generative autoencoder in de novo molecular design. Mol Inf 37:1700123. https:\/\/doi.org\/10.1002\/minf.201700123","journal-title":"Mol Inf"},{"key":"300_CR8","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1021\/acscentsci.7b00512","volume":"4","author":"MHS Segler","year":"2017","unstructured":"Segler MHS, Kogej T, Tyrchan C, Waller MP (2017) Generating focused molecule libraries for drug discovery with recurrent neural networks. 
ACS Cent Sci 4:120\u2013131","journal-title":"ACS Cent Sci"},{"key":"300_CR9","doi-asserted-by":"publisher","first-page":"875","DOI":"10.1021\/acs.jcim.6b00754","volume":"57","author":"W Yuan","year":"2017","unstructured":"Yuan W, Jiang D, Nambiar DK et al (2017) Chemical space mimicry for drug discovery. J Chem Inf Model 57:875\u2013882","journal-title":"J Chem Inf Model"},{"key":"300_CR10","doi-asserted-by":"publisher","first-page":"1163","DOI":"10.1021\/ci800413m","volume":"49","author":"H Patel","year":"2009","unstructured":"Patel H, Bodkin MJ, Chen B, Gillet VJ (2009) Knowledge-based approach to de Novo design using reaction vectors. J Chem Inf Model 49:1163\u20131184. https:\/\/doi.org\/10.1021\/ci800413m","journal-title":"J Chem Inf Model"},{"key":"300_CR11","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1002\/9783527677016.ch11","volume-title":"De novo molecular design","author":"VJ Gillet","year":"2013","unstructured":"Gillet VJ, Bodkin MJ, Hristozov D (2013) Multiobjective de novo design of synthetically accessible compounds. In: Schneider G (ed) De novo molecular design. Wiley, Hoboken, pp 267\u2013285"},{"key":"300_CR12","doi-asserted-by":"crossref","unstructured":"Hristozov D, Bodkin M, Chen B, et al (2011) Validation of reaction vectors for de novo design. In: Bienstock RJ (ed) Library design, search methods, and applications of fragment-based drug design. pp 29\u201343","DOI":"10.1021\/bk-2011-1076.ch002"},{"key":"300_CR13","doi-asserted-by":"publisher","first-page":"4167","DOI":"10.1021\/acs.jcim.9b00537","volume":"59","author":"GM Ghiandoni","year":"2019","unstructured":"Ghiandoni GM, Bodkin MJ, Chen B et al (2019) Development and application of a data-driven reaction classification model: comparison of an Electronic Lab Notebook and medicinal chemistry literature. 
J Chem Inf Model 59:4167\u20134187","journal-title":"J Chem Inf Model"},{"key":"300_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.4018\/jdwm.2007070101","volume":"3","author":"G Tsoumakas","year":"2007","unstructured":"Tsoumakas G, Katakis I (2007) Multi-label classification. Int J Data Warehous Min 3:1\u201313. https:\/\/doi.org\/10.4018\/jdwm.2007070101","journal-title":"Int J Data Warehous Min"},{"key":"300_CR15","doi-asserted-by":"publisher","first-page":"1152","DOI":"10.1021\/ci7004753","volume":"48","author":"K Kawai","year":"2008","unstructured":"Kawai K, Fujishima S, Takahashi Y (2008) Predictive activity profiling of drugs by topological-fragment-spectra-based support vector machines. J Chem Inform Model 48:1152\u20131160. https:\/\/doi.org\/10.1021\/ci7004753","journal-title":"J Chem Inform Model"},{"key":"300_CR16","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1273\/cbij.9.41","volume":"9","author":"K Kawai","year":"2009","unstructured":"Kawai K, Takahashi Y (2009) Identification of the dual action antihypertensive drugs using TFS-based support vector machines. Chem-Bio Inform J 9:41\u201351. https:\/\/doi.org\/10.1273\/cbij.9.41","journal-title":"Chem-Bio Inform J"},{"key":"300_CR17","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/s13321-015-0071-9","volume":"7","author":"AM Afzal","year":"2015","unstructured":"Afzal AM, Mussa HY, Turner RE et al (2015) A multi-label approach to target prediction taking ligand promiscuity into account. J Cheminform 7:24. https:\/\/doi.org\/10.1186\/s13321-015-0071-9","journal-title":"J Cheminform"},{"key":"300_CR18","doi-asserted-by":"publisher","first-page":"2820","DOI":"10.1021\/ci900311j","volume":"49","author":"L Michielan","year":"2009","unstructured":"Michielan L, Stephanie F, Terfloth L et al (2009) Exploring potency and selectivity receptor antagonist profiles using a multilabel classification approach: the human adenosine receptors as a key study. 
J Chem Inf Model 49:2820\u20132836. https:\/\/doi.org\/10.1021\/ci900311j","journal-title":"J Chem Inf Model"},{"key":"300_CR19","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1002\/minf.201100052","volume":"31","author":"T Zhang","year":"2012","unstructured":"Zhang T, Dai H, Liu LA et al (2012) Classification models for predicting cytochrome P450 enzyme-substrate selectivity. Mol Inform 31:53\u201362. https:\/\/doi.org\/10.1002\/minf.201100052","journal-title":"Mol Inform"},{"key":"300_CR20","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1186\/s12859-015-0774-y","volume":"16","author":"W Zhang","year":"2015","unstructured":"Zhang W, Liu F, Luo L, Zhang J (2015) Predicting drug side effects by multi-label learning and ensemble learning. BMC Bioinform 16:365. https:\/\/doi.org\/10.1186\/s12859-015-0774-y","journal-title":"BMC Bioinform"},{"key":"300_CR21","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1021\/ci700175m","volume":"48","author":"D Hristozov","year":"2007","unstructured":"Hristozov D, Gasteiger J, Da Costa B (2007) Multilabeled classification approach to find a plant source for terpenoids. J Chem Inf Model 48:56\u201367. https:\/\/doi.org\/10.1021\/ci700175m","journal-title":"J Chem Inf Model"},{"key":"300_CR22","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1021\/ci00046a002","volume":"25","author":"RE Carhart","year":"1985","unstructured":"Carhart RE, Smith DH, Venkataraghavan R (1985) Atom pairs as molecular features in structure-activity studies: definition and applications. J Chem Inf Comput Sci 25:64\u201373. https:\/\/doi.org\/10.1021\/ci00046a002","journal-title":"J Chem Inf Comput Sci"},{"key":"300_CR23","unstructured":"Lowe D (2017) Chemical reactions from US patents (1976-Sep2016)"},{"key":"300_CR24","unstructured":"EPAM (2017) Indigo Toolkit. 
lifescience.opensource.epam.com\/indigo"},{"key":"300_CR25","doi-asserted-by":"publisher","first-page":"1924","DOI":"10.1021\/ci050413p","volume":"46","author":"P Gedeck","year":"2006","unstructured":"Gedeck P, Rohde B, Bartels C (2006) QSAR - How good is it in practice? Comparison of descriptor sets on an unbiased cross section of corporate data sets. J Chem Inf Model 46:1924\u20131936. https:\/\/doi.org\/10.1021\/ci050413p","journal-title":"J Chem Inf Model"},{"key":"300_CR26","unstructured":"Laggner C (2005) SMARTS patterns for functional group classification. https:\/\/sourceforge.net\/projects\/openbabel"},{"key":"300_CR27","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1002\/(SICI)1097-0290(199824)61:1<47::AID-BIT9>3.0.CO;2-Z","volume":"61","author":"A Gobbi","year":"1998","unstructured":"Gobbi A, Poppinger D (1998) Genetic optimization of combinatorial libraries. Biotechnol Bioeng 61:47\u201354","journal-title":"Biotechnol Bioeng"},{"key":"300_CR28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3390\/molecules21010001","volume":"21","author":"ES Salmina","year":"2016","unstructured":"Salmina ES, Haider N, Tetko IV (2016) Extended functional groups (EFG): an efficient set for chemical characterization and structure-activity relationship studies of chemical compounds. Molecules 21:1. https:\/\/doi.org\/10.3390\/molecules21010001","journal-title":"Molecules"},{"key":"300_CR29","doi-asserted-by":"publisher","first-page":"7507","DOI":"10.1016\/J.ESWA.2014.06.015","volume":"41","author":"L Rokach","year":"2014","unstructured":"Rokach L, Schclar A, Itach E (2014) Ensemble methods for multi-label classification. Expert Syst Appl 41:7507\u20137523. 
https:\/\/doi.org\/10.1016\/J.ESWA.2014.06.015","journal-title":"Expert Syst Appl"},{"key":"300_CR30","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1109\/TKDE.2010.164","volume":"23","author":"G Tsoumakas","year":"2011","unstructured":"Tsoumakas G, Katakis I, Vlahavas I (2011) Random k-Labelsets for multilabel classification. IEEE Trans Knowl Data Eng 23:1079\u20131089. https:\/\/doi.org\/10.1109\/TKDE.2010.164","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"300_CR31","doi-asserted-by":"publisher","first-page":"667","DOI":"10.1007\/978-0-387-09823-4_34","volume-title":"Data mining and knowledge discovery handbook","author":"G Tsoumakas","year":"2009","unstructured":"Tsoumakas G, Katakis I, Vlahavas I (2009) Mining multi-label data. Data mining and knowledge discovery handbook. Springer, Boston, pp 667\u2013685"},{"key":"300_CR32","unstructured":"Szyma\u0144ski P, Kajdanowicz T (2017) A scikit-based Python environment for performing multi-label classification"},{"key":"300_CR33","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"300_CR34","unstructured":"Read J (2010) Scalable multi-label classification. University of Waikato"},{"key":"300_CR35","unstructured":"Diamond Light Source (2017) Diamond fragment libraries. https:\/\/www.diamond.ac.uk\/Instruments\/Mx\/Fragment-Screening\/Fragment-Libraries.html"},{"key":"300_CR36","doi-asserted-by":"publisher","first-page":"2322","DOI":"10.1039\/C5SC03115J","volume":"7","author":"OB Cox","year":"2016","unstructured":"Cox OB, Krojer T, Collins P et al (2016) A poised fragment library enables rapid synthetic expansion yielding the first reported inhibitors of PHIP(2), an atypical bromodomain. Chem Sci 7:2322\u20132330. 
https:\/\/doi.org\/10.1039\/C5SC03115J","journal-title":"Chem Sci"},{"key":"300_CR37","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/s13321-017-0203-5","volume":"9","author":"J Sun","year":"2017","unstructured":"Sun J, Jeliazkova N, Chupakhin V et al (2017) ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. J Cheminform 9:17. https:\/\/doi.org\/10.1186\/s13321-017-0203-5","journal-title":"J Cheminform"},{"key":"300_CR38","unstructured":"Chemical Computing Group (2019) Molecular Operating Environment (MOE)"},{"key":"300_CR39","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1186\/1758-2946-1-8","volume":"1","author":"P Ertl","year":"2009","unstructured":"Ertl P, Schuffenhauer A (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminform 1:8. https:\/\/doi.org\/10.1186\/1758-2946-1-8","journal-title":"J Cheminform"},{"key":"300_CR40","volume-title":"Statistical power analysis for the behavioral sciences","author":"J Cohen","year":"1988","unstructured":"Cohen J (1988) Statistical power analysis for the behavioral sciences. Routledge, Abingdon"},{"key":"300_CR41","doi-asserted-by":"publisher","first-page":"699","DOI":"10.1021\/ci0503560","volume":"46","author":"U Fechner","year":"2006","unstructured":"Fechner U, Schneider G (2006) Flux (1): a virtual synthesis scheme for fragment-based de novo design. J Chem Inf Model 46:699\u2013707. https:\/\/doi.org\/10.1021\/ci0503560","journal-title":"J Chem Inf Model"},{"key":"300_CR42","doi-asserted-by":"publisher","first-page":"656","DOI":"10.1021\/ci6005307","volume":"47","author":"U Fechner","year":"2007","unstructured":"Fechner U, Schneider G (2007) Flux (2): comparison of molecular mutation and crossover operators for ligand-based de novo design. J Chem Inf Model 47:656\u2013667. 
https:\/\/doi.org\/10.1021\/ci6005307","journal-title":"J Chem Inf Model"},{"key":"300_CR43","unstructured":"Enamine (2018) Building blocks. https:\/\/enamine.net\/building-blocks"}],"container-title":["Journal of Computer-Aided Molecular Design"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10822-020-00300-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10822-020-00300-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10822-020-00300-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T00:46:01Z","timestamp":1614386761000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10822-020-00300-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,28]]},"references-count":43,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["300"],"URL":"https:\/\/doi.org\/10.1007\/s10822-020-00300-6","relation":{},"ISSN":["0920-654X","1573-4951"],"issn-type":[{"type":"print","value":"0920-654X"},{"type":"electronic","value":"1573-4951"}],"subject":[],"published":{"date-parts":[[2020,2,28]]},"assertion":[{"value":"24 July 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 February 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 February 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}