{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T08:44:56Z","timestamp":1778057096315,"version":"3.51.4"},"reference-count":158,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,9,17]],"date-time":"2020-09-17T00:00:00Z","timestamp":1600300800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,9,17]],"date-time":"2020-09-17T00:00:00Z","timestamp":1600300800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010665","name":"H2020 Marie Sk\u0142odowska-Curie Actions","doi-asserted-by":"publisher","award":["676434"],"award-info":[{"award-number":["676434"]}],"id":[{"id":"10.13039\/100010665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The technological advances of the past century, marked by the computer revolution and the advent of high-throughput screening technologies in drug discovery, opened the path to the computational analysis and visualization of bioactive molecules. For this purpose, it became necessary to represent molecules in a syntax that would be readable by computers and understandable by scientists of various fields. A large number of chemical representations have been developed over the years, their numerosity being due to the fast development of computers and the complexity of producing a representation that encompasses all structural and chemical characteristics. 
We present here some of the most popular electronic molecular and macromolecular representations used in drug discovery, many of which are based on graph representations. Furthermore, we describe applications of these representations in AI-driven drug discovery. Our aim is to provide a brief guide on structural representations that are essential to the practice of AI in drug discovery. This review serves as a guide for researchers who have little experience with the handling of chemical representations and plan to work on applications at the interface of these fields.<\/jats:p>","DOI":"10.1186\/s13321-020-00460-5","type":"journal-article","created":{"date-parts":[[2020,9,17]],"date-time":"2020-09-17T15:15:30Z","timestamp":1600355730000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":447,"title":["Molecular representations in AI-driven drug discovery: a review and practical guide"],"prefix":"10.1186","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6455-1958","authenticated-orcid":false,"given":"Laurianne","family":"David","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0403-4067","authenticated-orcid":false,"given":"Amol","family":"Thakkar","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6170-6088","authenticated-orcid":false,"given":"Roc\u00edo","family":"Mercado","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4970-6461","authenticated-orcid":false,"given":"Ola","family":"Engkvist","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,9,17]]},"reference":[{"issue":"2","key":"460_CR1","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1515\/ci-2016-0206","volume":"38","author":"B Lawlor","year":"2016","unstructured":"Lawlor B (2016) The chemical structure association trust. Chem Int. 
38(2):12\u201315","journal-title":"Chem Int."},{"issue":"3","key":"460_CR2","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1021\/c160030a007","volume":"8","author":"WJ Wiswesser","year":"1968","unstructured":"Wiswesser WJ (1968) 107 years of line-formula notations (1861\u20131968). J Chem Doc. 8(3):146\u2013150","journal-title":"J Chem Doc."},{"key":"460_CR3","unstructured":"Zhou P, Shang Z. 2D molecular graphics: a flattened world of chemistry and biology"},{"issue":"3","key":"460_CR4","doi-asserted-by":"publisher","first-page":"1107","DOI":"10.1021\/ci050550m","volume":"46","author":"AM Clark","year":"2006","unstructured":"Clark AM, Labute P, Santavy M (2006) 2D structure depiction. J Chem Inf Model 46(3):1107\u20131123","journal-title":"J Chem Inf Model"},{"key":"460_CR5","unstructured":"RasMol and OpenRasMol. http:\/\/www.openrasmol.org\/. Accessed 27 Apr 2020."},{"issue":"4","key":"460_CR6","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1016\/S0160-9327(02)01468-0","volume":"26","author":"E Francoeur","year":"2002","unstructured":"Francoeur E (2002) Cyrus Levinthal, the Kluge and the origins of interactive molecular graphics. Endeavour 26(4):127\u2013131","journal-title":"Endeavour"},{"issue":"4","key":"460_CR7","doi-asserted-by":"publisher","first-page":"234","DOI":"10.1021\/c160047a009","volume":"12","author":"RJ Feldmann","year":"1972","unstructured":"Feldmann RJ, Heller SR, Bacon CRT (1972) An interactive, versatile, three-dimensional display, manipulation and plotting system for biomedical research. J Chem Doc. 12(4):234\u2013237","journal-title":"J Chem Doc."},{"key":"460_CR8","unstructured":"Gelberg A. Chemical notations. In: Encyclopedia of library and information science. 1970. p. 
510\u201328"},{"issue":"1","key":"460_CR9","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a Chemical Language And Information System: 1: introduction to methodology and encoding rules. J Chem Inf Comput Sci 28(1):31\u201336","journal-title":"J Chem Inf Comput Sci"},{"issue":"1","key":"460_CR10","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1186\/s13321-015-0068-4","volume":"7","author":"SR Heller","year":"2015","unstructured":"Heller SR, McNaught A, Pletnev I, Stein S, Tchekhovskoi D (2015) InChI, the IUPAC International Chemical Identifier. J Cheminform. 7(1):23","journal-title":"J Cheminform."},{"key":"460_CR11","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1016\/j.ymeth.2014.08.005","volume":"71","author":"A Cereto-Massagu\u00e9","year":"2015","unstructured":"Cereto-Massagu\u00e9 A, Ojeda MJ, Valls C, Mulero M, Garcia-Vallv\u00e9 S, Pujadas G (2015) Molecular fingerprint similarity search in virtual screening. Methods 71:58\u201363","journal-title":"Methods"},{"issue":"3","key":"460_CR12","doi-asserted-by":"publisher","first-page":"588","DOI":"10.1021\/ci00019a017","volume":"34","author":"MA Siani","year":"1994","unstructured":"Siani MA, Weininger D, Blaney JM (1994) CHUCKLES: a method for representing and searching peptide and peptoid sequences on both monomer and atomic levels. J Chem Inf Comput Sci 34(3):588\u2013593","journal-title":"J Chem Inf Comput Sci"},{"key":"460_CR13","doi-asserted-by":"publisher","first-page":"1026","DOI":"10.1021\/ci00028a012","volume":"35","author":"MA Siani","year":"1995","unstructured":"Siani MA, Weininger D, James CA, Blaney JM (1995) CHORTLES: a method for representing oligomeric and template-based mixtures. 
J Chem Inf Comput Sci 35:1026\u20131033","journal-title":"J Chem Inf Comput Sci"},{"issue":"10","key":"460_CR14","doi-asserted-by":"publisher","first-page":"2796","DOI":"10.1021\/ci3001925","volume":"52","author":"T Zhang","year":"2012","unstructured":"Zhang T, Li H, Xi H, Stanton RV, Rotstein SH (2012) HELM: a hierarchical notation language for complex biomolecule structure representation. J Chem Inf Model 52(10):2796\u20132806","journal-title":"J Chem Inf Model"},{"issue":"6","key":"460_CR15","doi-asserted-by":"publisher","first-page":"1558","DOI":"10.1021\/ci400571e","volume":"54","author":"K Tanaka","year":"2014","unstructured":"Tanaka K, Aoki-Kinoshita KF, Kotera M, Sawaki H, Tsuchiya S, Fujita N et al (2014) WURCS: the Web3 Unique Representation Of Carbohydrate Structures. J Chem Inf Model 54(6):1558\u20131566","journal-title":"J Chem Inf Model"},{"issue":"12","key":"460_CR16","doi-asserted-by":"publisher","first-page":"2404","DOI":"10.1021\/ci800128b","volume":"48","author":"JH Jensen","year":"2008","unstructured":"Jensen JH, Hoeg-Jensen T, Padkj\u00e6r SB (2008) Building a biochemformatics database. J Chem Inf Model 48(12):2404\u20132413","journal-title":"J Chem Inf Model"},{"issue":"1","key":"460_CR17","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","volume":"28","author":"HM Berman","year":"2000","unstructured":"Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H et al (2000) The protein data bank. Nucleic Acids Res 28(1):235\u2013242","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"460_CR18","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1186\/s13321-018-0277-8","volume":"10","author":"G Grethe","year":"2018","unstructured":"Grethe G, Blanke G, Kraut H, Goodman JM (2018) International chemical identifier for reactions (RInChI). J Cheminform. 
10(1):22","journal-title":"J Cheminform."},{"issue":"910","key":"460_CR19","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1007\/s10822-005-9008-0","volume":"19","author":"A Varnek","year":"2005","unstructured":"Varnek A, Fourches D, Hoonakker F, Solovev VP (2005) Substructural fragments: an universal language to encode reactions, molecular and supramolecular structures. J Comput Aided Mol Des. 19(910):693\u2013703","journal-title":"J Comput Aided Mol Des."},{"key":"460_CR20","doi-asserted-by":"crossref","unstructured":"Dugundji J, Ugi I. An algebraic model of constitutional chemistry as a basis for chemical computer programs. In: Computers in chemistry. Springer; 2006. p. 19\u201364","DOI":"10.1007\/BFb0051317"},{"issue":"1","key":"460_CR21","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1021\/ci00017a010","volume":"34","author":"JR Rose","year":"1994","unstructured":"Rose JR, Gasteiger J (1994) HORACE: an automatic system for the hierarchical classification of chemical reactions. J Chem Inf Comput Sci 34(1):74\u201390","journal-title":"J Chem Inf Comput Sci"},{"issue":"1","key":"460_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-2-1","volume":"2","author":"P Ertl","year":"2010","unstructured":"Ertl P (2010) Molecular structure input on the web. J Cheminform. 2(1):1\u20139","journal-title":"J Cheminform."},{"issue":"12","key":"460_CR23","first-page":"41","volume":"11","author":"R Guha","year":"2011","unstructured":"Guha R, Wiggins GD, Wild DJ, Baik MH, Pierce ME, Fox GC (2011) Improving usability and accessibility of cheminformatics tools for chemists through cyberinfrastructure and education. Silico Biol. 11(12):41\u201360","journal-title":"Silico Biol."},{"issue":"1","key":"460_CR24","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1002\/minf.201000100","volume":"30","author":"A Varnek","year":"2011","unstructured":"Varnek A, Baskin II (2011) Chemoinformatics as a theoretical chemistry discipline. 
Mol Inform. 30(1):20\u201332","journal-title":"Mol Inform."},{"issue":"6\u20137","key":"460_CR25","doi-asserted-by":"publisher","first-page":"506","DOI":"10.1002\/minf.201100005","volume":"30","author":"M Vazquez","year":"2011","unstructured":"Vazquez M, Krallinger M, Leitner F, Valencia A (2011) Text mining for drugs and chemical compounds: methods, tools and applications. Mol Inform. 30(6\u20137):506\u2013519","journal-title":"Mol Inform."},{"key":"460_CR26","doi-asserted-by":"publisher","first-page":"2545","DOI":"10.1021\/acs.jcim.9b00266","volume":"59","author":"AC Mater","year":"2019","unstructured":"Mater AC, Coote ML (2019) Deep learning in chemistry. J Chem Inf Model 59:2545\u20132559","journal-title":"J Chem Inf Model"},{"issue":"4","key":"460_CR27","doi-asserted-by":"publisher","first-page":"557","DOI":"10.1002\/wcms.36","volume":"1","author":"WA Warr","year":"2011","unstructured":"Warr WA (2011) Representation of chemical structures. Wiley Interdiscip Rev Comput Mol Sci. 1(4):557\u2013579","journal-title":"Wiley Interdiscip Rev Comput Mol Sci."},{"key":"460_CR28","unstructured":"National Academy of Sciences UNRC. In: Survey of chemical notations systems. 1964. p. 1\u2013467"},{"key":"460_CR29","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-1-4020-6291-9","volume-title":"An introduction to chemoinformatics","author":"AR Leach","year":"2007","unstructured":"Leach AR, Gillet VJ (2007) An introduction to chemoinformatics. Springer, Netherlands, pp 1\u2013255"},{"key":"460_CR30","unstructured":"ChemDraw. PerkinElmer Informatics."},{"issue":"Pt 1","key":"460_CR31","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1107\/S1600576719014092","volume":"53","author":"CF MacRae","year":"2020","unstructured":"MacRae CF, Sovago I, Cottrell SJ, Galek PTA, McCabe P, Pidcock E et al (2020) Mercury 4.0: from visualization to analysis, design and prediction. J Appl Crystallogr. 
53(Pt 1):226\u2013235","journal-title":"J Appl Crystallogr."},{"key":"460_CR32","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/1758-2946-4-17","volume":"4","author":"DH Marcus","year":"2012","unstructured":"Marcus DH, Donald EC, David CL, Tim EZ, Vandermeersch GRH (2012) Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J Cheminform. 4:17","journal-title":"J Cheminform."},{"issue":"6","key":"460_CR33","doi-asserted-by":"publisher","first-page":"1272","DOI":"10.1107\/S0021889811038970","volume":"44","author":"K Momma","year":"2011","unstructured":"Momma K, Izumi F (2011) VESTA 3 for three-dimensional visualization of crystal, volumetric and morphology data. J Appl Crystallogr 44(6):1272\u20131276","journal-title":"J Appl Crystallogr"},{"key":"460_CR34","unstructured":"Delano WL. PyMOL: An Open-Source Molecular Graphics Tool. https:\/\/www.ccp4.ac.uk\/newsletters\/newsletter40\/11_pymol.pdf. Accessed May 27 2020."},{"issue":"1","key":"460_CR35","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/0263-7855(96)00018-5","volume":"14","author":"W Humphrey","year":"1996","unstructured":"Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph\u00a014(1):33\u201338","journal-title":"J Mol Graph"},{"key":"460_CR36","doi-asserted-by":"crossref","unstructured":"Kay E, Bondy JA, Murty USR. Graph Theory with Applications. Vol. 28, Operational Research Quarterly (1970-1977). 1977. p. 237","DOI":"10.2307\/3008805"},{"issue":"5","key":"460_CR37","doi-asserted-by":"publisher","first-page":"787","DOI":"10.1021\/ci00027a001","volume":"35","author":"A Dietz","year":"1995","unstructured":"Dietz A (1995) Yet another representation of molecular structure. 
J Chem Inf Comput Sci 35(5):787\u2013802","journal-title":"J Chem Inf Comput Sci"},{"key":"460_CR38","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1186\/1758-2946-4-22","volume":"4","author":"NM O\u2019Boyle","year":"2012","unstructured":"O\u2019Boyle NM (2012) Towards a Universal SMILES representation - A standard method to generate canonical SMILES based on the InChI. J Cheminform. 4:9","journal-title":"J Cheminform."},{"issue":"3","key":"460_CR39","doi-asserted-by":"publisher","first-page":"244","DOI":"10.1021\/ci00007a012","volume":"32","author":"A Dalby","year":"1992","unstructured":"Dalby A, Nourse JG, Hounshell WD, Gushurst AKI, Grier DL, Leland BA et al (1992) Description of several chemical structure file formats used by computer programs developed at molecular design limited. J Chem Inf Comput Sci 32(3):244\u2013255","journal-title":"J Chem Inf Comput Sci"},{"key":"460_CR40","doi-asserted-by":"publisher","DOI":"10.1002\/9783527816880","volume-title":"Chemoinformatics: basic concepts and methods","author":"T Engel","year":"2018","unstructured":"Engel T, Gasteiger J (2018) Chemoinformatics: basic concepts and methods. Wiley, New York"},{"key":"460_CR41","unstructured":"Leigh GJ, Favre HA, Metanomski WV. Principles of chemical nomenclature: a guide to IUPAC recommendations. Blackwell Science Ltd, editor. European Journal of Medicinal Chemistry. The Royal Society of Chemistry; 1998"},{"key":"460_CR42","unstructured":"Color Books - IUPAC | International Union of Pure and Applied Chemistry. https:\/\/iupac.org\/what-we-do\/books\/color-books\/. Accessed 15 Dec 2019"},{"issue":"1","key":"460_CR43","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1016\/0020-0271(68)90004-1","volume":"4","author":"GM Dyson","year":"1968","unstructured":"Dyson GM, Lynch MF, Morgan HL (1968) A modified IUPAC-Dyson notation system for chemical structures. 
Inf Storage Retr 4(1):27\u201383","journal-title":"Inf Storage Retr"},{"issue":"2","key":"460_CR44","doi-asserted-by":"publisher","first-page":"88","DOI":"10.1021\/ci00034a005","volume":"22","author":"WJ Wiswesser","year":"1982","unstructured":"Wiswesser WJ (1982) How the WLN began in 1949 and how it might be in 1999. J Chem Inf Comput Sci 22(2):88\u201393","journal-title":"J Chem Inf Comput Sci"},{"issue":"3","key":"460_CR45","doi-asserted-by":"publisher","first-page":"258","DOI":"10.1021\/ci00047a023","volume":"25","author":"WJ Wiswesser","year":"1985","unstructured":"Wiswesser WJ (1985) Historic development of chemical notations. J Chem Inf Comput Sci 25(3):258\u2013263","journal-title":"J Chem Inf Comput Sci"},{"key":"460_CR46","first-page":"16","volume":"6","author":"WJ Wiswesser","year":"1955","unstructured":"Wiswesser WJ (1955) Molecular structure and taste simulation. Va J Sci. 6:16\u201321","journal-title":"Va J Sci."},{"key":"460_CR47","volume-title":"Applications of deep-learning in exploiting large-scale and heterogeneous compound data in industrial pharmaceutical research","author":"L David","year":"2019","unstructured":"David L, Ar\u00fas-Pous J, Karlsson J, Engkvist O, Bjerrum EJ, Kogej T et al (2019) Applications of deep-learning in exploiting large-scale and heterogeneous compound data in industrial pharmaceutical research, vol 10. Frontiers in Pharmacology, Frontiers Media SA, New York"},{"key":"460_CR48","unstructured":"Daylight. https:\/\/www.daylight.com\/. Accessed 23 Apr 2020"},{"key":"460_CR49","unstructured":"RDKit, Open-Source Cheminformatics. http:\/\/www.rdkit.org"},{"issue":"4","key":"460_CR50","doi-asserted-by":"publisher","first-page":"131","DOI":"10.3390\/biom8040131","volume":"8","author":"E Bjerrum","year":"2018","unstructured":"Bjerrum E, Sattarov B (2018) Improving chemical autoencoder latent space and molecular de novo generation diversity with heteroencoders. Biomolecules. 
8(4):131","journal-title":"Biomolecules."},{"key":"460_CR51","unstructured":"Bjerrum EJ. SMILES Enumeration as Data Augmentation for Neural Network Modeling of Molecules. arXiv Prepr. 2017"},{"issue":"10","key":"460_CR52","doi-asserted-by":"publisher","first-page":"2111","DOI":"10.1021\/acs.jcim.5b00543","volume":"55","author":"N Schneider","year":"2015","unstructured":"Schneider N, Sayle RA, Landrum GA (2015) Get your atoms in order-an open-source implementation of a novel and robust molecular canonicalization algorithm. J Chem Inf Model 55(10):2111\u20132120","journal-title":"J Chem Inf Model"},{"issue":"2","key":"460_CR53","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1021\/c160017a018","volume":"5","author":"HL Morgan","year":"1965","unstructured":"Morgan HL (1965) The generation of a unique machine description for chemical structures\u2014a technique developed at chemical abstracts service. J Chem Doc. 5(2):107\u2013113","journal-title":"J Chem Doc."},{"issue":"1","key":"460_CR54","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-018-0279-6","volume":"10","author":"M Quir\u00f3s","year":"2018","unstructured":"Quir\u00f3s M, Gra\u017eulis S, Girdzijauskait\u0117 S, Merkys A, Vaitkus A (2018) Using SMILES strings for the description of chemical connectivity in the Crystallography Open Database. J Cheminform 10(1):1\u201317","journal-title":"J Cheminform"},{"key":"460_CR55","unstructured":"ChemAxon Extended SMILES and SMARTS - CXSMILES and CXSMARTS - Documentation. https:\/\/docs.chemaxon.com\/display\/docs\/ChemAxon_Extended_SMILES_and_SMARTS_-_CXSMILES_and_CXSMARTS.html#src-1806633_ChemAxonExtendedSMILESandSMARTS-CXSMILESandCXSMARTS-Fragmentgrouping. Accessed 8 Apr 2020"},{"key":"460_CR56","unstructured":"OpenSMILES Home Page. http:\/\/opensmiles.org\/. Accessed 23 Apr 2020"},{"key":"460_CR57","unstructured":"Daylight Theory: SMARTS - A Language for Describing Molecular Patterns. 
https:\/\/www.daylight.com\/dayhtml\/doc\/theory\/theory.smarts.html. Accessed 15 Nov 2020"},{"issue":"1","key":"460_CR58","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/1758-2946-5-10","volume":"5","author":"C Southan","year":"2013","unstructured":"Southan C (2013) InChI in the wild: an assessment of InChIKey searching in Google. J Cheminform. 5(1):10","journal-title":"J Cheminform."},{"key":"460_CR59","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1186\/1758-2946-4-39","volume":"4","author":"I Pletnev","year":"2012","unstructured":"Pletnev I, Erin A, McNaught A, Blinov K, Tchekhovskoi D, Heller S (2012) InChIKey collision resistance: an experimental testing. J Cheminform. 4:12","journal-title":"J Cheminform."},{"issue":"8","key":"460_CR60","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1007\/s10822-015-9854-3","volume":"29","author":"WA Warr","year":"2015","unstructured":"Warr WA (2015) Many InChIs and quite some feat. J Comput Aided Mol Des 29(8):681\u2013694","journal-title":"J Comput Aided Mol Des"},{"key":"460_CR61","unstructured":"Kode-Chemoinformatics. https:\/\/chm.kode-solutions.net\/products_dragon.php. Accessed 23 Apr 2020"},{"key":"460_CR62","unstructured":"Dalke A. MACCS key 44. http:\/\/www.dalkescientific.com\/writings\/diary\/archive\/2014\/10\/17\/maccs_key_44.html. Accessed 28 Mar 2020"},{"key":"460_CR63","unstructured":"MDL Information Systems I. MACCS keys"},{"issue":"6","key":"460_CR64","doi-asserted-by":"publisher","first-page":"1273","DOI":"10.1021\/ci010132r","volume":"42","author":"JL Durant","year":"2002","unstructured":"Durant JL, Leland BA, Henry DR, Nourse JG (2002) Reoptimization of MDL keys for use in drug discovery. 
J Chem Inf Comput Sci 42(6):1273\u20131280","journal-title":"J Chem Inf Comput Sci"},{"issue":"19","key":"460_CR65","doi-asserted-by":"publisher","first-page":"2894","DOI":"10.1002\/(SICI)1521-3773(19991004)38:19<2894::AID-ANIE2894>3.0.CO;2-F","volume":"38","author":"G Schneider","year":"1999","unstructured":"Schneider G, Neidhart W, Giller T, Schmid G (1999) \u201cScaffold-Hopping\u201d by topological pharmacophore search: a contribution to virtual screening. Angew Chemie Int Ed. 38(19):2894\u20132896","journal-title":"Angew Chemie Int Ed."},{"issue":"5","key":"460_CR66","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","volume":"50","author":"D Rogers","year":"2010","unstructured":"Rogers D, Hahn M (2010) Extended-Connectivity Fingerprints. J Chem Inf Model 50(5):742\u2013754","journal-title":"J Chem Inf Model"},{"key":"460_CR67","unstructured":"CAS Content | CAS. https:\/\/www.cas.org\/about\/cas-content. Accessed 8 Apr 2020"},{"issue":"6\u20137","key":"460_CR68","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1002\/minf.201400052","volume":"33","author":"WA Warr","year":"2014","unstructured":"Warr WA (2014) A short review of chemical reaction database systems, computer-aided synthesis design, reaction prediction and synthetic feasibility. Mol Inform. 33(6\u20137):469\u2013476","journal-title":"Mol Inform."},{"key":"460_CR69","doi-asserted-by":"crossref","unstructured":"Jensen KF, Coley CW, Eyke NS (2019) Autonomous discovery in the chemical sciences part I: Progress. Angew Chemie Int Ed","DOI":"10.1002\/anie.201909987"},{"issue":"1","key":"460_CR70","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1186\/1758-2946-5-45","volume":"5","author":"G Grethe","year":"2013","unstructured":"Grethe G, Goodman JM, Allen CH (2013) International chemical identifier for reactions (RInChI). J Cheminform. 
5(1):45","journal-title":"J Cheminform."},{"key":"460_CR71","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-017-0210-6","volume":"9","author":"PM Jacob","year":"2017","unstructured":"Jacob PM, Lan T, Goodman JM, Lapkin AA (2017) A possible extension to the RInChI as a means of providing machine readable process data. J Cheminform. 9:1","journal-title":"J Cheminform."},{"issue":"4","key":"460_CR72","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1021\/ci00052a009","volume":"26","author":"S Fujita","year":"1986","unstructured":"Fujita S (1986) Description of organic reactions based on imaginary transition structures.\u00a01. introduction of new concepts. J Chem Inf Comput Sci. 26(4):205\u2013212","journal-title":"J Chem Inf Comput Sci."},{"issue":"6","key":"460_CR73","doi-asserted-by":"publisher","first-page":"2516","DOI":"10.1021\/acs.jcim.9b00102","volume":"59","author":"RI Nugmanov","year":"2019","unstructured":"Nugmanov RI, Mukhametgaleev RN, Akhmetshin T, Gimadiev TR, Afonina VA, Madzhidov TI et al (2019) CGRtools: Python library for molecule, reaction, and condensed graph of reaction processing. J Chem Inf Model 59(6):2516\u20132521","journal-title":"J Chem Inf Model"},{"key":"460_CR74","doi-asserted-by":"crossref","unstructured":"Gasteiger J, Jochum C (2006) EROS A computer program for generating sequences of reactions. In: Organic Compounds. Springer, pp 93\u2013126","DOI":"10.1007\/BFb0050147"},{"issue":"11","key":"460_CR75","doi-asserted-by":"publisher","first-page":"2884","DOI":"10.1021\/ci400442f","volume":"53","author":"H Kraut","year":"2013","unstructured":"Kraut H, Eiblmaier J, Grethe G, L\u00f6w P, Matuszczyk H, Saller H (2013) Algorithm for reaction classification. 
J Chem Inf Model 53(11):2884\u20132895","journal-title":"J Chem Inf Model"},{"issue":"2","key":"460_CR76","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1021\/op500373e","volume":"19","author":"A B\u00f8gevig","year":"2015","unstructured":"B\u00f8gevig A, Federsel HJ, Huerta F, Hutchings MG, Kraut H, Langer T et al (2015) Route design in the 21st century: the IC SYNTH software tool as an idea generator for synthesis prediction. Org Process Res Dev 19(2):357\u2013368","journal-title":"Org Process Res Dev"},{"issue":"7698","key":"460_CR77","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1038\/nature25978","volume":"555","author":"MHS Segler","year":"2018","unstructured":"Segler MHS, Preuss M, Waller MP (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555(7698):604\u2013610","journal-title":"Nature"},{"issue":"6","key":"460_CR78","doi-asserted-by":"publisher","first-page":"560","DOI":"10.1002\/wcms.1140","volume":"3","author":"WL Chen","year":"2013","unstructured":"Chen WL, Chen DZ, Taylor KT (2013) Automatic reaction mapping and reaction center detection. Wiley Interdiscip Rev Comput Mol Sci 3(6):560\u2013593","journal-title":"Wiley Interdiscip Rev Comput Mol Sci"},{"issue":"1","key":"460_CR79","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1002\/wcms.5","volume":"1","author":"H Ehrlich","year":"2011","unstructured":"Ehrlich H, Rarey M (2011) Maximum common subgraph isomorphism algorithms and their applications in molecular science: a review. WIREs Comput Mol Sci 1(1):68\u201379","journal-title":"WIREs Comput Mol Sci"},{"key":"460_CR80","doi-asserted-by":"publisher","first-page":"521","DOI":"10.1023\/A:1021271615909","volume":"16","author":"JW Raymond","year":"2002","unstructured":"Raymond JW, Willett P (2002) Maximum common subgraph isomorphism algorithms for the matching of chemical structures. 
J Comput Aided Mol Design 16:521\u2013533","journal-title":"J Comput Aided Mol Design"},{"issue":"1","key":"460_CR81","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1021\/ci5006614","volume":"55","author":"N Schneider","year":"2015","unstructured":"Schneider N, Lowe DM, Sayle RA, Landrum GA (2015) Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity. J Chem Inf Model 55(1):39\u201353","journal-title":"J Chem Inf Model"},{"issue":"5","key":"460_CR82","doi-asserted-by":"publisher","first-page":"1163","DOI":"10.1021\/ci800413m","volume":"49","author":"H Patel","year":"2009","unstructured":"Patel H, Bodkin MJ, Chen B, Gillet VJ (2009) Knowledge-based approach to de novo design using reaction vectors. J Chem Inf Model 49(5):1163\u20131184","journal-title":"J Chem Inf Model"},{"issue":"10","key":"460_CR83","doi-asserted-by":"publisher","first-page":"4167","DOI":"10.1021\/acs.jcim.9b00537","volume":"59","author":"GM Ghiandoni","year":"2019","unstructured":"Ghiandoni GM, Bodkin MJ, Chen B, Hristozov D, Wallace JEA, Webster J et al (2019) Development and application of a data-driven reaction classification model: comparison of an electronic lab notebook and medicinal chemistry literature. J Chem Inf Model 59(10):4167\u20134187","journal-title":"J Chem Inf Model"},{"issue":"6","key":"460_CR84","doi-asserted-by":"publisher","first-page":"2529","DOI":"10.1021\/acs.jcim.9b00286","volume":"59","author":"CW Coley","year":"2019","unstructured":"Coley CW, Green WH, Jensen KF (2019) RDChiral: an RDKit wrapper for handling stereochemistry in retrosynthetic template extraction and application. 
J Chem Inf Model 59(6):2529\u20132537","journal-title":"J Chem Inf Model"},{"issue":"1","key":"460_CR85","doi-asserted-by":"publisher","first-page":"1800129","DOI":"10.1002\/adts.201800129","volume":"2","author":"JS Peerless","year":"2019","unstructured":"Peerless JS, Milliken NJB, Oweida TJ, Manning MD, Yingling YG (2019) Soft matter informatics: current progress and challenges. Adv Theory Simulations. 2(1):1800129","journal-title":"Adv Theory Simulations."},{"issue":"5","key":"460_CR86","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1351\/pac198456050595","volume":"56","author":"Nomenclature and symbolism for amino acids and peptides","year":"1984","unstructured":"Nomenclature and symbolism for amino acids and peptides (1984) Pure Appl Chem 56(5):595\u2013624","journal-title":"Pure Appl Chem"},{"key":"460_CR87","doi-asserted-by":"publisher","first-page":"23","DOI":"10.3390\/ijms20235978","volume":"20","author":"P Minkiewicz","year":"2019","unstructured":"Minkiewicz P, Iwaniak A, Darewicz M (2019) BIOPEP-UWM database of bioactive peptides: current opportunities. Int J Mol Sci. 20:23","journal-title":"Int J Mol Sci."},{"issue":"6","key":"460_CR88","doi-asserted-by":"publisher","first-page":"1233","DOI":"10.1021\/acs.jcim.6b00442","volume":"57","author":"J Milton","year":"2017","unstructured":"Milton J, Zhang T, Bellamy C, Swayze E, Hart C, Weisser M et al (2017) HELM Software for Biopolymers. J Chem Inf Model 57(6):1233\u20131239","journal-title":"J Chem Inf Model"},{"issue":"9","key":"460_CR89","doi-asserted-by":"publisher","first-page":"2186","DOI":"10.1021\/ci2001988","volume":"51","author":"WL Chen","year":"2011","unstructured":"Chen WL, Leland BA, Durant JL, Grier DL, Christie BD, Nourse JG et al (2011) Self-contained sequence representation: bridging the gap between bioinformatics and cheminformatics. J Chem Inf Model 51(9):2186\u20132208","journal-title":"J Chem Inf Model"},{"key":"460_CR90","unstructured":"HELM - Pistoia Alliance. 
https:\/\/www.pistoiaalliance.org\/projects\/current-projects\/helm\/. Accessed 23 Apr 2020"},{"key":"460_CR91","unstructured":"Knispel R, B\u00fcki E, Horny\u00e1k G, Mihala N, Tomin A, Keresztes G, et al. Informatics tools leveraging the open HELM standard for managing and exploring databases of chemically modified complex biomolecules. https:\/\/chemaxon.com\/app\/uploads\/2016\/04\/biotoolkit_2016-04_102_A4.pdf. Accessed 27 May 2020"},{"issue":"11","key":"460_CR92","doi-asserted-by":"publisher","first-page":"1443","DOI":"10.4155\/tde.13.104","volume":"4","author":"BJ Bruno","year":"2013","unstructured":"Bruno BJ, Miller GD, Lim CS (2013) Basics and recent advances in peptide and protein drug delivery. Ther Deliv. 4(11):1443\u20131467","journal-title":"Ther Deliv."},{"issue":"2075","key":"460_CR93","first-page":"1","volume":"22","author":"P Minkiewicz","year":"2017","unstructured":"Minkiewicz P, Iwaniak A, Darewicz M (2017) Annotation of peptide structures using SMILES and other chemical codes-practical solutions. Molecules 22(2075):1\u201317","journal-title":"Molecules"},{"key":"460_CR94","doi-asserted-by":"publisher","first-page":"F1000","DOI":"10.12688\/f1000research.11587.1","volume":"6","author":"ZE Sauna","year":"2017","unstructured":"Sauna ZE, Lagass\u00e9 HAD, Alexaki A, Simhadri VL, Katagiri NH, Jankowski W et al (2017) Recent advances in (therapeutic protein) drug development. F1000 Research. 6:F1000","journal-title":"F1000 Research."},{"issue":"10","key":"460_CR95","doi-asserted-by":"publisher","first-page":"1678","DOI":"10.1039\/C9MD00292H","volume":"10","author":"P Valverde","year":"2019","unstructured":"Valverde P, Ard\u00e1 A, Reichardt NC, Jim\u00e9nez-Barbero J, Gimeno A (2019) Glycans in drug discovery. Medchemcomm. 
10(10):1678\u20131691","journal-title":"Medchemcomm."},{"issue":"18","key":"460_CR96","doi-asserted-by":"publisher","first-page":"3146","DOI":"10.1002\/pola.28703","volume":"55","author":"EF Connor","year":"2017","unstructured":"Connor EF, Lees I, Maclean D (2017) Polymers as drugs-Advances in therapeutic applications of polymer binding agents. J Polym Sci Part A: Polym Chem 55(18):3146\u20133157","journal-title":"J Polym Sci Part A: Polym Chem"},{"issue":"1","key":"460_CR97","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/S0008-6215(01)00230-0","volume":"336","author":"A Bohne-Lang","year":"2001","unstructured":"Bohne-Lang A, Lang E, F\u00f6rster T, Von der Lieth CW (2001) LINUCS: LInear notation for unique description of carbohydrate sequences. Carbohydr Res 336(1):1\u201311","journal-title":"Carbohydr Res"},{"issue":"12","key":"460_CR98","doi-asserted-by":"publisher","first-page":"2162","DOI":"10.1016\/j.carres.2008.03.011","volume":"343","author":"S Herget","year":"2008","unstructured":"Herget S, Ranzinger R, Maass K, Lieth CW (2008) GlycoCT-a unifying sequence format for carbohydrates. Carbohydr Res. 343(12):2162\u20132171","journal-title":"Carbohydr Res."},{"key":"460_CR99","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1016\/j.pisc.2016.05.013","volume":"11","author":"R Ranzinger","year":"2017","unstructured":"Ranzinger R, Kochut KJ, Miller JA, Eavenson M, L\u00fctteke T, York WS (2017) GLYDE-II: the GLYcan data exchange format. Perspect Sci 11:24\u201330","journal-title":"Perspect Sci"},{"issue":"3","key":"460_CR100","doi-asserted-by":"publisher","first-page":"1276","DOI":"10.1021\/acs.jcim.9b00744","volume":"60","author":"PV Toukach","year":"2020","unstructured":"Toukach PV, Egorova KS (2020) New features of carbohydrate structure database notation (CSDB Linear), as compared to other carbohydrate notations. 
J Chem Inf Model 60(3):1276\u20131289","journal-title":"J Chem Inf Model"},{"issue":"14","key":"460_CR101","doi-asserted-by":"publisher","first-page":"2434","DOI":"10.1093\/bioinformatics\/bty990","volume":"35","author":"S Tsuchiya","year":"2019","unstructured":"Tsuchiya S, Yamada I, Aoki-Kinoshita KF (2019) GlycanFormatConverter: a conversion tool for translating the complexities of glycans. Bioinformatics 35(14):2434\u20132440","journal-title":"Bioinformatics"},{"issue":"15","key":"460_CR102","doi-asserted-by":"publisher","first-page":"2679","DOI":"10.1093\/bioinformatics\/bty168","volume":"34","author":"IY Chernyshov","year":"2018","unstructured":"Chernyshov IY, Toukach PV (2018) REStLESS: automated translation of glycan sequences from residue-based notation to SMILES and atomic coordinates. Bioinformatics 34(15):2679\u20132681","journal-title":"Bioinformatics"},{"issue":"4","key":"460_CR103","doi-asserted-by":"publisher","first-page":"632","DOI":"10.1021\/acs.jcim.6b00650","volume":"57","author":"M Matsubara","year":"2017","unstructured":"Matsubara M, Aoki-Kinoshita KF, Aoki NP, Yamada I, Narimatsu H (2017) WURCS 2.0 update to encapsulate ambiguous carbohydrate structures. J Chem Inf Model. 57(4):632\u2013637","journal-title":"J Chem Inf Model."},{"issue":"10","key":"460_CR104","doi-asserted-by":"publisher","first-page":"915","DOI":"10.1093\/glycob\/cwx066","volume":"27","author":"M Tiemeyer","year":"2017","unstructured":"Tiemeyer M, Aoki K, Paulson J, Cummings RD, York WS, Karlsson NG et al (2017) GlyTouCan: an accessible glycan structure repository. Glycobiology 27(10):915\u2013919","journal-title":"Glycobiology"},{"key":"460_CR105","unstructured":"Pillong M, Schneider G (2012) Representing carbohydrates by pseudoreceptor models for virtual screening in drug discovery. pp 131\u201346"},{"key":"460_CR106","doi-asserted-by":"crossref","unstructured":"Bojar D, Camacho DM, Collins JJ (2020) Using Natural Language Processing to Learn the Grammar of Glycans. 
bioRxiv","DOI":"10.1101\/2020.01.10.902114"},{"issue":"9","key":"460_CR107","doi-asserted-by":"publisher","first-page":"1523","DOI":"10.1021\/acscentsci.9b00476","volume":"5","author":"TS Lin","year":"2019","unstructured":"Lin TS, Coley CW, Mochigase H, Beech HK, Wang W, Wang Z et al (2019) BigSMILES: a structurally-based line notation for describing macromolecules. ACS Cent Sci. 5(9):1523\u20131531","journal-title":"ACS Cent Sci."},{"issue":"2","key":"460_CR108","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1351\/pac200880020277","volume":"80","author":"J Brecher","year":"2008","unstructured":"Brecher J (2008) Graphical representation standards for chemical structure diagrams: (IUPAC Recommendations 2008). Pure Appl Chem 80(2):277\u2013410","journal-title":"Pure Appl Chem"},{"key":"460_CR109","unstructured":"Xemistry Chemoinformatics. https:\/\/www.xemistry.com\/. Accessed 10 Jun 2020"},{"key":"460_CR110","unstructured":"Molinspiration Cheminformatics. https:\/\/www.molinspiration.com\/. Accessed 10 Jun 2020"},{"key":"460_CR111","unstructured":"OASA. http:\/\/bkchem.zirael.org\/oasa_en.html. Accessed 10 Jun 2020"},{"issue":"1","key":"460_CR112","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-016-0187-6","volume":"9","author":"EL Willighagen","year":"2017","unstructured":"Willighagen EL, Mayfield JW, Alvarsson J, Berg A, Carlsson L, Jeliazkova N et al (2017) The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J Cheminform. 9(1):1\u201319","journal-title":"J Cheminform."},{"key":"460_CR113","unstructured":"Mayfield J (2016) Higher quality chemical depictions: lessons learned and advice"},{"key":"460_CR114","unstructured":"The Consortium for Functional Glycomics. http:\/\/www.functionalglycomics.org\/static\/consortium\/consortium.shtml. 
Accessed 27 May 2020"},{"issue":"9","key":"460_CR115","doi-asserted-by":"publisher","first-page":"540","DOI":"10.1021\/ml100164p","volume":"1","author":"K Stierand","year":"2010","unstructured":"Stierand K, Rarey M (2010) Drawing the PDB: protein-ligand complexes in two dimensions. ACS Med Chem Lett. 1(9):540\u2013545","journal-title":"ACS Med Chem Lett."},{"key":"460_CR116","unstructured":"Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural Message Passing for Quantum Chemistry. arXiv Prepr"},{"key":"460_CR117","doi-asserted-by":"crossref","unstructured":"Withnall M, Lindel\u00f6f E, Engkvist O, Chen H (2019) Building attention and edge convolution neural networks for bioactivity and physical-chemical property prediction. p 2","DOI":"10.26434\/chemrxiv.9873599"},{"issue":"8","key":"460_CR118","doi-asserted-by":"publisher","first-page":"3370","DOI":"10.1021\/acs.jcim.9b00237","volume":"59","author":"K Yang","year":"2019","unstructured":"Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H et al (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59(8):3370\u20133388","journal-title":"J Chem Inf Model"},{"issue":"8","key":"460_CR119","doi-asserted-by":"publisher","first-page":"1757","DOI":"10.1021\/acs.jcim.6b00601","volume":"57","author":"CW Coley","year":"2017","unstructured":"Coley CW, Barzilay R, Green WH, Jaakkola TS, Jensen KF (2017) Convolutional embedding of attributed molecular graphs for physical property prediction. J Chem Inf Model 57(8):1757\u20131772","journal-title":"J Chem Inf Model"},{"key":"460_CR120","unstructured":"Li Y, Vinyals O, Dyer C, Pascanu R, Battaglia P (2018) Learning Deep Generative Models of Graphs. 
arXiv Prepr"},{"issue":"1","key":"460_CR121","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-018-0287-6","volume":"10","author":"Y Li","year":"2018","unstructured":"Li Y, Zhang L, Liu Z (2018) Multi-objective de novo drug design with conditional graph generative model. J Cheminform. 10(1):1\u201324","journal-title":"J Cheminform."},{"key":"460_CR122","unstructured":"Jin W, Barzilay R, Jaakkola T (2018) Junction Tree Variational Autoencoder for Molecular Graph Generation. arXiv Prepr"},{"key":"460_CR123","unstructured":"Popova M, Shvets M, Oliva J, Isayev O (2019) MolecularRNN: Generating realistic molecular graphs with optimized properties. arXiv Prepr"},{"key":"460_CR124","doi-asserted-by":"crossref","unstructured":"Jin W, Barzilay R, Jaakkola T (2019) Multi-Resolution Autoregressive Graph-to-Graph Translation for Molecules. ChemRxiv. p 8266745","DOI":"10.26434\/chemrxiv.8266745.v1"},{"key":"460_CR125","unstructured":"Jin W, Yang K, Barzilay R, Jaakkola T (2018) Learning multimodal graph-to-graph translation for molecular optimization. arXiv Prepr. pp 1\u201314"},{"key":"460_CR126","doi-asserted-by":"crossref","unstructured":"Coley CW, Jin W, Rogers L, Jamison TF, Jaakkola TS, Green WH, et al (2018) A graph-convolutional neural network model for the prediction of chemical reactivity","DOI":"10.26434\/chemrxiv.7163189"},{"key":"460_CR127","unstructured":"Xu K, Hu W, Leskovec J, Jegelka S (2019) How powerful are graph neural networks? pp 1\u201316"},{"key":"460_CR128","unstructured":"Battaglia PW, Hamrick JB, Bapst V, Sanchez-Gonzalez A, Zambaldi V, Malinowski M, et al. Relational inductive biases, deep learning, and graph networks. 
2018;1\u201340"},{"issue":"3","key":"460_CR129","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1007\/s11030-006-9041-5","volume":"10","author":"M Hassan","year":"2006","unstructured":"Hassan M, Brown RD, Varma-OBrien S, Rogers D (2006) Cheminformatics analysis and learning in a data pipelining environment. Mol Divers. 10(3):283\u2013299","journal-title":"Mol Divers."},{"key":"460_CR130","doi-asserted-by":"crossref","unstructured":"Todeschini R, Consonni V (2007) Methods and principles in medicinal chemistry. pp 438\u2013438","DOI":"10.1002\/9783527610907.scard"},{"issue":"6","key":"460_CR131","doi-asserted-by":"publisher","first-page":"1241","DOI":"10.1016\/j.drudis.2018.01.039","volume":"23","author":"H Chen","year":"2018","unstructured":"Chen H, Engkvist O, Wang Y, Olivecrona M, Blaschke T (2018) The rise of deep learning in drug discovery. Drug Discov Today. 23(6):1241\u20131250","journal-title":"Drug Discov Today."},{"issue":"6400","key":"460_CR132","doi-asserted-by":"publisher","first-page":"360","DOI":"10.1126\/science.aat2663","volume":"361","author":"B Sanchez-Lengeling","year":"2018","unstructured":"Sanchez-Lengeling B, Aspuru-Guzik A (2018) Inverse molecular design using machine learning: generative models for matter engineering. Science. 361(6400):360\u2013365","journal-title":"Science."},{"issue":"7792","key":"460_CR133","doi-asserted-by":"publisher","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","volume":"577","author":"AW Senior","year":"2020","unstructured":"Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T et al (2020) Improved protein structure prediction using potentials from deep learning. 
Nature 577(7792):706\u2013710","journal-title":"Nature"},{"key":"460_CR134","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1136\/svn-2019-000290","volume":"4","author":"B Liu","year":"2019","unstructured":"Liu B, He H, Luo H, Zhang T, Jiang J (2019) Artificial intelligence and big data facilitated targeted drug discovery. Stroke Vasc Neurol. 4:290","journal-title":"Stroke Vasc Neurol."},{"key":"460_CR135","unstructured":"SureChEMBL: Non MedChem-Friendly SMARTS. https:\/\/www.surechembl.org\/knowledgebase\/169485-non-medchem-friendly-smarts. Accessed 5 Dec 2019"},{"issue":"8","key":"460_CR136","doi-asserted-by":"publisher","first-page":"2310","DOI":"10.1021\/ci300245q","volume":"52","author":"I Sushko","year":"2012","unstructured":"Sushko I, Salmina E, Potemkin VA, Poda G, Tetko IV (2012) ToxAlerts: a web server of structural alerts for toxic chemicals and compounds with potential adverse reactions. J Chem Inf Model 52(8):2310\u20132316","journal-title":"J Chem Inf Model"},{"issue":"7","key":"460_CR137","doi-asserted-by":"publisher","first-page":"2719","DOI":"10.1021\/jm901137j","volume":"53","author":"JB Baell","year":"2010","unstructured":"Baell JB, Holloway GA (2010) New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J Med Chem 53(7):2719\u20132740","journal-title":"J Med Chem"},{"key":"460_CR138","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-018-0323-6","volume":"11","author":"J Ar\u00fas-Pous","year":"2019","unstructured":"Ar\u00fas-Pous J, Johansson SV, Prykhodko O, Bjerrum EJ, Tyrchan C, Reymond JL et al (2019) Randomized SMILES strings improve the quality of molecular generative models. J Cheminform. 
11:1","journal-title":"J Cheminform."},{"issue":"7","key":"460_CR139","doi-asserted-by":"publisher","first-page":"10883","DOI":"10.18632\/oncotarget.14073","volume":"8","author":"A Kadurin","year":"2017","unstructured":"Kadurin A, Aliper A, Kazennov A, Mamoshina P, Vanhaelen Q, Khrabrov K et al (2017) The cornucopia of meaningful leads: applying deep adversarial autoencoders for new molecule development in oncology. Oncotarget. 8(7):10883\u201310890","journal-title":"Oncotarget."},{"key":"460_CR140","unstructured":"Duvenaud D, Maclaurin D, Aguilera-Iparraguirre J, G\u00f3mez-Bombarelli R, Hirzel T, Aspuru-Guzik A, et al. Convolutional networks on graphs for learning molecular fingerprints. In: Advances in neural information processing systems. 2015. pp 2224\u201332"},{"issue":"3","key":"460_CR141","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1039\/c0md00207k","volume":"2","author":"DA Urbanek","year":"2011","unstructured":"Urbanek DA, Proschak E, Tanrikulu Y, Becker S, Karas M, Schneider G (2011) Scaffold-hopping from aminoglycosides to small synthetic inhibitors of bacterial protein biosynthesis using a pseudoreceptor model. Medchemcomm. 2(3):181\u2013184","journal-title":"Medchemcomm."},{"issue":"1","key":"460_CR142","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1002\/prot.22424","volume":"77","author":"H Nassif","year":"2009","unstructured":"Nassif H, Al-Ali H, Khuri S, Keirouz W (2009) Prediction of protein-glucose binding sites using support vector machines. Proteins Struct Funct Bioinforma. 77(1):121\u2013132","journal-title":"Proteins Struct Funct Bioinforma."},{"issue":"10","key":"460_CR143","doi-asserted-by":"publisher","first-page":"2069","DOI":"10.1080\/07391102.2015.1106978","volume":"34","author":"PP Pai","year":"2016","unstructured":"Pai PP, Mondal S (2016) MOWGLI: prediction of protein\u2013MannOse interacting residues With ensemble classifiers usinG evoLutionary Information. 
J Biomol Struct Dyn 34(10):2069\u20132083","journal-title":"J Biomol Struct Dyn"},{"key":"460_CR144","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-020-3442-9","volume":"21","author":"Z Dezso","year":"2020","unstructured":"Dezso Z, Ceccarelli M (2020) Machine learning prediction of oncology drug targets based on protein and network properties. BMC Bioinf. 21:1","journal-title":"BMC Bioinf."},{"issue":"1","key":"460_CR145","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1007\/s13337-020-00571-5","volume":"31","author":"S Kumar","year":"2020","unstructured":"Kumar S, Maurya VK, Prasad AK, Bhatt MLB, Saxena SK (2020) Structural, glycosylation and antigenic variation between 2019 novel coronavirus (2019-nCoV) and SARS coronavirus (SARS-CoV). VirusDisease. 31(1):13\u201321","journal-title":"VirusDisease."},{"issue":"1","key":"460_CR146","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-019-0400-5","volume":"11","author":"A Nguyen","year":"2019","unstructured":"Nguyen A, Huang YC, Tremouilhac P, Jung N, Br\u00e4se S (2019) ChemScanner: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files. J Cheminform. 11(1):1\u20139","journal-title":"J Cheminform."},{"key":"460_CR147","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1021\/ci5002197","volume":"54","author":"P Frasconi","year":"2014","unstructured":"Frasconi P, Gabbrielli F, Lippi M, Marinai S (2014) Markov logic networks for optical chemical structure recognition. J Chem Inf Model 54:37","journal-title":"J Chem Inf Model"},{"issue":"3","key":"460_CR148","doi-asserted-by":"publisher","first-page":"1017","DOI":"10.1021\/acs.jcim.8b00669","volume":"59","author":"J Staker","year":"2019","unstructured":"Staker J, Marshall K, Abel R, McQuaw CM (2019) Molecular structure extraction from documents using deep learning. 
J Chem Inf Model 59(3):1017\u20131029","journal-title":"J Chem Inf Model"},{"key":"460_CR149","unstructured":"Picture - 5Y6N Zika virus helicase in complex with ADP. http:\/\/www.rcsb.org\/3d-view\/5Y6N. Accessed 8 Jan 2020"},{"key":"460_CR150","unstructured":"Picture - Lemon. https:\/\/pixabay.com\/sv\/vectors\/citron-citrus-mat-frukt-orange-148119\/. Accessed 8 Jan 2020"},{"key":"460_CR151","unstructured":"Picture - Orange. https:\/\/pixabay.com\/sv\/vectors\/apelsiner-frukt-saftiga-citrus-42394\/. Accessed 8 Jan 2020"},{"key":"460_CR152","unstructured":"Picture - Pills. https:\/\/pixabay.com\/fr\/photos\/thermom\u00e8tre-maux-de-t\u00eate-la-douleur-1539191\/. Accessed 30 Dec 2019"},{"key":"460_CR153","unstructured":"Picture - Rose Graphic Flower. https:\/\/pixabay.com\/vectors\/rose-graphic-flower-deco-398576\/. Accessed 31 Dec 2019"},{"key":"460_CR154","unstructured":"Picture - Red contact lens. https:\/\/unsplash.com\/photos\/R5CX8XDQLV0. Accessed 14 Jul 2020"},{"key":"460_CR155","unstructured":"Picture - Insulin. https:\/\/www.flickr.com\/photos\/102642344@N02\/10083633053\/. Accessed 26 Dec 2019"},{"key":"460_CR156","unstructured":"Picture - Cyclosporin A. https:\/\/pubchem.ncbi.nlm.nih.gov\/compound\/Cyclosporin-A#section=2D-Structure. Accessed 6 Dec 2019"},{"key":"460_CR157","unstructured":"Picture - Milk Bottle. https:\/\/pixabay.com\/vectors\/milk-bottle-glass-dairy-breakfast-2012800\/. Accessed 26 Dec 2019"},{"key":"460_CR158","unstructured":"Creative Commons\u2014Attribution 3.0 Unported\u2014CC BY 3.0. https:\/\/creativecommons.org\/licenses\/by\/3.0\/. 
Accessed 5 Dec 2019"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00460-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-020-00460-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00460-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,16]],"date-time":"2021-09-16T23:29:06Z","timestamp":1631834946000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-020-00460-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,9,17]]},"references-count":158,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["460"],"URL":"https:\/\/doi.org\/10.1186\/s13321-020-00460-5","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,9,17]]},"assertion":[{"value":"11 January 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 September 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 September 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"All the authors were employed by AstraZeneca and declare no competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"56"}}