{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,13]],"date-time":"2026-06-13T16:42:44Z","timestamp":1781368964711,"version":"3.54.1"},"reference-count":77,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,4,22]],"date-time":"2020-04-22T00:00:00Z","timestamp":1587513600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2020,4,22]],"date-time":"2020-04-22T00:00:00Z","timestamp":1587513600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000183","name":"Army Research Office","doi-asserted-by":"publisher","award":["W911NF-18-1-0315"],"award-info":[{"award-number":["W911NF-18-1-0315"]}],"id":[{"id":"10.13039\/100000183","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Deep neural networks can directly learn from chemical structures without extensive, user-driven selection of descriptors in order to predict molecular properties\/activities with high reliability. But these approaches typically require large training sets to learn the endpoint-specific structural features and ensure reasonable prediction accuracy. Even though large datasets are becoming the new normal in drug discovery, especially when it comes to high-throughput screening or metabolomics datasets, one should also consider smaller datasets with challenging endpoints to model and forecast. 
Thus, it would be highly relevant to better utilize the tremendous compendium of unlabeled compounds from publicly-available datasets for improving the model performances for the user\u2019s particular series of compounds. In this study, we propose the <jats:bold>Mol<\/jats:bold>ecular <jats:bold>P<\/jats:bold>rediction <jats:bold>Mo<\/jats:bold>del <jats:bold>Fi<\/jats:bold>ne-<jats:bold>T<\/jats:bold>uning (<jats:bold>MolPMoFiT<\/jats:bold>) approach, an effective transfer learning method based on self-supervised pre-training\u2009+\u2009task-specific fine-tuning for QSPR\/QSAR modeling. A large-scale molecular structure prediction model is pre-trained using one million unlabeled molecules from ChEMBL in a self-supervised learning manner, and can then be fine-tuned on various QSPR\/QSAR tasks for smaller chemical datasets with specific endpoints. Herein, the method is evaluated on four benchmark datasets (lipophilicity, FreeSolv, HIV, and blood\u2013brain barrier penetration). 
The results showed the method can achieve strong performances for all four datasets compared to other <jats:italic>state<\/jats:italic>-<jats:italic>of<\/jats:italic>-<jats:italic>the<\/jats:italic>-<jats:italic>art<\/jats:italic> machine learning modeling techniques reported in the literature so far.<\/jats:p>","DOI":"10.1186\/s13321-020-00430-x","type":"journal-article","created":{"date-parts":[[2020,4,22]],"date-time":"2020-04-22T14:03:32Z","timestamp":1587564212000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":115,"title":["Inductive transfer learning for molecular activity prediction: Next-Gen QSAR Models with MolPMoFiT"],"prefix":"10.1186","volume":"12","author":[{"given":"Xinhao","family":"Li","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5642-8303","authenticated-orcid":false,"given":"Denis","family":"Fourches","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2020,4,22]]},"reference":[{"key":"430_CR1","doi-asserted-by":"publisher","first-page":"4977","DOI":"10.1021\/jm4004285","volume":"57","author":"A Cherkasov","year":"2014","unstructured":"Cherkasov A, Muratov EN, Fourches D et al (2014) QSAR modeling: where have you been? Where are you going to? J Med Chem 57:4977\u20135010. https:\/\/doi.org\/10.1021\/jm4004285","journal-title":"J Med Chem"},{"key":"430_CR2","doi-asserted-by":"publisher","first-page":"2545","DOI":"10.1021\/acs.jcim.9b00266","volume":"59","author":"AC Mater","year":"2019","unstructured":"Mater AC, Coote ML (2019) Deep Learning in Chemistry. J Chem Inf Model 59:2545\u20132559. 
https:\/\/doi.org\/10.1021\/acs.jcim.9b00266","journal-title":"J Chem Inf Model"},{"key":"430_CR3","doi-asserted-by":"publisher","first-page":"476","DOI":"10.1002\/minf.201000061","volume":"29","author":"A Tropsha","year":"2010","unstructured":"Tropsha A (2010) Best practices for QSAR model development, validation, and exploitation. Mol Inform 29:476\u2013488. https:\/\/doi.org\/10.1002\/minf.201000061","journal-title":"Mol Inform"},{"key":"430_CR4","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1021\/ci500747n","volume":"55","author":"J Ma","year":"2015","unstructured":"Ma J, Sheridan RP, Liaw A et al (2015) Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55:263\u2013274. https:\/\/doi.org\/10.1021\/ci500747n","journal-title":"J Chem Inf Model"},{"key":"430_CR5","doi-asserted-by":"crossref","unstructured":"Fourches D, Williams AJ, Patlewicz G, et al (2018) Computational Tools for ADMET Profiling. In: Computational Toxicology. pp 211\u2013244","DOI":"10.1002\/9781119282594.ch8"},{"key":"430_CR6","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1021\/acs.chemrestox.9b00259","volume":"33","author":"X Li","year":"2020","unstructured":"Li X, Kleinstreuer NC, Fourches D (2020) Hierarchical quantitative structure-activity relationship modeling approach for integrating binary, multiclass, and regression models of acute oral systemic toxicity. Chem Res Toxicol 33:353\u2013366. https:\/\/doi.org\/10.1021\/acs.chemrestox.9b00259","journal-title":"Chem Res Toxicol"},{"key":"430_CR7","doi-asserted-by":"publisher","first-page":"1286","DOI":"10.1021\/acs.jcim.7b00048","volume":"57","author":"J Ash","year":"2017","unstructured":"Ash J, Fourches D (2017) Characterizing the chemical space of ERK2 kinase inhibitors using descriptors computed from molecular dynamics trajectories. J Chem Inf Model 57:1286\u20131299. 
https:\/\/doi.org\/10.1021\/acs.jcim.7b00048","journal-title":"J Chem Inf Model"},{"key":"430_CR8","doi-asserted-by":"publisher","DOI":"10.1080\/17460441.2019.1664467","author":"D Fourches","year":"2019","unstructured":"Fourches D, Ash J (2019) 4D- quantitative structure\u2013activity relationship modeling: making a comeback. Expert Opin Drug Discov. https:\/\/doi.org\/10.1080\/17460441.2019.1664467","journal-title":"Expert Opin Drug Discov"},{"key":"430_CR9","doi-asserted-by":"publisher","first-page":"363","DOI":"10.2174\/1386207003331454","volume":"3","author":"L Xue","year":"2012","unstructured":"Xue L, Bajorath J (2012) Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb Chem High Throughput Screen 3:363\u2013372. https:\/\/doi.org\/10.2174\/1386207003331454","journal-title":"Comb Chem High Throughput Screen"},{"key":"430_CR10","unstructured":"Gilmer J, Schoenholz SS, Riley PF, et al (2017) Neural message passing for quantum chemistry. http:\/\/arxiv.org\/abs\/1704.01212"},{"key":"430_CR11","doi-asserted-by":"publisher","first-page":"3564","DOI":"10.1021\/acs.chemmater.9b01294","volume":"31","author":"C Chen","year":"2019","unstructured":"Chen C, Ye W, Zuo Y et al (2019) Graph networks as a universal machine learning framework for molecules and crystals. Chem Mater 31:3564\u20133572. https:\/\/doi.org\/10.1021\/acs.chemmater.9b01294","journal-title":"Chem Mater"},{"key":"430_CR12","doi-asserted-by":"publisher","first-page":"3370","DOI":"10.1021\/acs.jcim.9b00237","volume":"59","author":"K Yang","year":"2019","unstructured":"Yang K, Swanson K, Jin W et al (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59:3370\u20133388. 
https:\/\/doi.org\/10.1021\/acs.jcim.9b00237","journal-title":"J Chem Inf Model"},{"key":"430_CR13","first-page":"2224","volume":"2015","author":"D Duvenaud","year":"2015","unstructured":"Duvenaud D, Maclaurin D, Aguilera-Iparraguirre J et al (2015) Convolutional networks on graphs for learning molecular fingerprints. Adv Neural Inf Process Syst 2015:2224\u20132232","journal-title":"Adv Neural Inf Process Syst"},{"key":"430_CR14","doi-asserted-by":"publisher","first-page":"1757","DOI":"10.1021\/acs.jcim.6b00601","volume":"57","author":"CW Coley","year":"2017","unstructured":"Coley CW, Barzilay R, Green WH et al (2017) Convolutional embedding of attributed molecular graphs for physical property prediction. J Chem Inf Model 57:1757\u20131772. https:\/\/doi.org\/10.1021\/acs.jcim.6b00601","journal-title":"J Chem Inf Model"},{"key":"430_CR15","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1039\/C7SC02664A","volume":"9","author":"Z Wu","year":"2018","unstructured":"Wu Z, Ramsundar B, Feinberg EN et al (2018) MoleculeNet: a benchmark for molecular machine learning. Chem Sci 9:513\u2013530. https:\/\/doi.org\/10.1039\/C7SC02664A","journal-title":"Chem Sci"},{"key":"430_CR16","doi-asserted-by":"crossref","unstructured":"Pham T, Tran T, Venkatesh S (2018) Graph memory networks for molecular activity prediction. In: Proceedings - international conference on pattern recognition. pp 639\u2013644","DOI":"10.1109\/ICPR.2018.8545246"},{"key":"430_CR17","doi-asserted-by":"publisher","DOI":"10.1021\/acs.jcim.9b00410","author":"X Wang","year":"2019","unstructured":"Wang X, Li Z, Jiang M et al (2019) Molecule property prediction based on spatial graph embedding. J Chem Inf Model. 
https:\/\/doi.org\/10.1021\/acs.jcim.9b00410","journal-title":"J Chem Inf Model"},{"key":"430_CR18","doi-asserted-by":"publisher","first-page":"1520","DOI":"10.1021\/acscentsci.8b00507","volume":"4","author":"EN Feinberg","year":"2018","unstructured":"Feinberg EN, Sur D, Wu Z et al (2018) PotentialNet for molecular property prediction. ACS Cent Sci 4:1520\u20131530. https:\/\/doi.org\/10.1021\/acscentsci.8b00507","journal-title":"ACS Cent Sci"},{"key":"430_CR19","doi-asserted-by":"publisher","first-page":"688","DOI":"10.1016\/j.cell.2020.01.021","volume":"180","author":"JM Stokes","year":"2020","unstructured":"Stokes JM, Yang K, Swanson K et al (2020) A deep learning approach to antibiotic discovery. Cell 180:688\u2013702.e13. https:\/\/doi.org\/10.1016\/j.cell.2020.01.021","journal-title":"Cell"},{"key":"430_CR20","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1186\/s13321-020-0414-z","volume":"12","author":"B Tang","year":"2020","unstructured":"Tang B, Kramer ST, Fang M et al (2020) A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility. J Cheminform 12:15. https:\/\/doi.org\/10.1186\/s13321-020-0414-z","journal-title":"J Cheminform"},{"key":"430_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-019-0407-y","volume":"12","author":"M Withnall","year":"2020","unstructured":"Withnall M, Lindel\u00f6f E, Engkvist O, Chen H (2020) Building attention and edge message passing neural networks for bioactivity and physical-chemical property prediction. J Cheminform 12:1\u201318. https:\/\/doi.org\/10.1186\/s13321-019-0407-y","journal-title":"J Cheminform"},{"key":"430_CR22","unstructured":"Goh GB, Hodas NO, Siegel C, Vishnu A (2017) SMILES2Vec: An interpretable general-purpose deep neural network for predicting chemical properties. 
http:\/\/arxiv.org\/abs\/1712.02034"},{"key":"430_CR23","doi-asserted-by":"publisher","first-page":"914","DOI":"10.1021\/acs.jcim.8b00803","volume":"59","author":"S Zheng","year":"2019","unstructured":"Zheng S, Yan X, Yang Y, Xu J (2019) Identifying structure-property relationships through SMILES syntax analysis with self-attention mechanism. J Chem Inf Model 59:914\u2013923. https:\/\/doi.org\/10.1021\/acs.jcim.8b00803","journal-title":"J Chem Inf Model"},{"key":"430_CR24","unstructured":"Kimber TB, Engelke S, Tetko I V, et al (2018) Synergy effect between convolutional neural networks and the multiplicity of SMILES for improvement of molecular prediction. http:\/\/arxiv.org\/abs\/1812.04439"},{"key":"430_CR25","unstructured":"Goh GB, Siegel C, Vishnu A, et al (2017) Chemception: a deep neural network with minimal chemistry knowledge matches the performance of expert-developed QSAR\/QSPR models. https:\/\/arxiv.org\/pdf\/1706.06689.pdf"},{"key":"430_CR26","doi-asserted-by":"crossref","unstructured":"Goh GB, Siegel C, Vishnu A, Hodas NO (2017) Using rule-based labels for weak supervised learning: a ChemNet for transferable chemical property prediction.","DOI":"10.1145\/3219819.3219838"},{"key":"430_CR27","unstructured":"Paul A, Jha D, Al-Bahrani R, et al (2018) CheMixNet: Mixed DNN architectures for predicting chemical properties using multiple molecular representations. http:\/\/arxiv.org\/abs\/1811.08283"},{"key":"430_CR28","unstructured":"Goh GB, Siegel C, Vishnu A, et al (2018) How much chemistry does a deep neural network need to know to make accurate predictions? In: Proceedings - 2018 IEEE winter conference on applications of computer vision, WACV 2018. 
pp 1340\u20131349"},{"key":"430_CR29","doi-asserted-by":"publisher","first-page":"1533","DOI":"10.1021\/acs.jcim.8b00338","volume":"58","author":"M Fernandez","year":"2018","unstructured":"Fernandez M, Ban F, Woo G et al (2018) Toxic colors: the use of deep learning for predicting toxicity of compounds merely from their graphic images. J Chem Inf Model 58:1533\u20131543. https:\/\/doi.org\/10.1021\/acs.jcim.8b00338","journal-title":"J Chem Inf Model"},{"key":"430_CR30","doi-asserted-by":"publisher","DOI":"10.1021\/acs.jcim.9b00713","author":"E Asilar","year":"2020","unstructured":"Asilar E, Hemmerich J, Ecker GF (2020) Image based liver toxicity prediction. J Chem Inf Model. https:\/\/doi.org\/10.1021\/acs.jcim.9b00713","journal-title":"J Chem Inf Model"},{"key":"430_CR31","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1007\/s10822-005-9008-0","volume":"19","author":"A Varnek","year":"2005","unstructured":"Varnek A, Fourches D, Hoonakker F, Solov\u2019ev VP (2005) Substructural fragments: an universal language to encode reactions, molecular and supramolecular structures. J Comput Aided Mol Des 19:693\u2013703. https:\/\/doi.org\/10.1007\/s10822-005-9008-0","journal-title":"J Comput Aided Mol Des"},{"key":"430_CR32","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Model 28:31\u201336. https:\/\/doi.org\/10.1021\/ci00057a005","journal-title":"J Chem Inf Model"},{"key":"430_CR33","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1021\/ci00062a008","volume":"29","author":"D Weininger","year":"1989","unstructured":"Weininger D, Weininger A, Weininger JL (1989) SMILES. 2. Algorithm for generation of unique SMILES notation. J Chem Inf Model 29:97\u2013101. 
https:\/\/doi.org\/10.1021\/ci00062a008","journal-title":"J Chem Inf Model"},{"key":"430_CR34","doi-asserted-by":"publisher","first-page":"2554","DOI":"10.1073\/pnas.79.8.2554","volume":"79","author":"JJ Hopfield","year":"1982","unstructured":"Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci 79:2554\u20132558. https:\/\/doi.org\/10.1073\/pnas.79.8.2554","journal-title":"Proc Natl Acad Sci"},{"key":"430_CR35","unstructured":"Lipton ZC, Berkowitz J, Elkan C (2015) A critical review of recurrent neural networks for sequence learning. http:\/\/arxiv.org\/abs\/1506.00019"},{"key":"430_CR36","unstructured":"Kim Y Convolutional neural networks for sentence classification. http:\/\/arxiv.org\/abs\/1408.5882"},{"key":"430_CR37","unstructured":"Vaswani A, Shazeer N, Parmar N, et al (2017) Attention Is All You Need. http:\/\/arxiv.org\/abs\/1706.03762"},{"key":"430_CR38","doi-asserted-by":"crossref","unstructured":"Deng J, Dong W, Socher R, et al (2009) ImageNet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248\u2013255","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"430_CR39","unstructured":"Canziani A, Paszke A, Culurciello E (2016) An analysis of deep neural network models for practical applications. http:\/\/arxiv.org\/abs\/1605.07678"},{"key":"430_CR40","unstructured":"Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. http:\/\/arxiv.org\/abs\/1301.3781"},{"key":"430_CR41","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning CD (2014) GloVe: Global vectors for word representation. In: Empirical methods in natural language processing (EMNLP). pp 1532\u20131543","DOI":"10.3115\/v1\/D14-1162"},{"key":"430_CR42","unstructured":"Joulin A, Grave E, Bojanowski P, et al (2016) FastText.zip: Compressing text classification models. 
http:\/\/arxiv.org\/abs\/1612.03651"},{"key":"430_CR43","doi-asserted-by":"crossref","unstructured":"Peters ME, Neumann M, Iyyer M, et al (2018) Deep contextualized word representations. http:\/\/allennlp.org\/elmo","DOI":"10.18653\/v1\/N18-1202"},{"key":"430_CR44","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: Pre-training of deep bidirectional transformers for language understanding. http:\/\/arxiv.org\/abs\/1810.04805"},{"key":"430_CR45","doi-asserted-by":"crossref","unstructured":"Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. http:\/\/arxiv.org\/abs\/1801.06146","DOI":"10.18653\/v1\/P18-1031"},{"key":"430_CR46","unstructured":"Yang Z, Dai Z, Yang Y, et al (2019) XLNet: Generalized autoregressive pretraining for language understanding. http:\/\/arxiv.org\/abs\/1906.08237"},{"key":"430_CR47","unstructured":"Liu Y, Ott M, Goyal N, et al (2019) RoBERTa: A robustly optimized BERT pretraining approach. http:\/\/arxiv.org\/abs\/1907.11692"},{"key":"430_CR48","doi-asserted-by":"publisher","first-page":"1100","DOI":"10.1093\/nar\/gkr777","volume":"40","author":"A Gaulton","year":"2012","unstructured":"Gaulton A, Bellis LJ, Bento AP et al (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:1100\u20131107. https:\/\/doi.org\/10.1093\/nar\/gkr777","journal-title":"Nucleic Acids Res"},{"key":"430_CR49","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1021\/acs.jcim.7b00616","volume":"58","author":"S Jaeger","year":"2018","unstructured":"Jaeger S, Fulle S, Turk S (2018) Mol2vec: unsupervised Machine Learning Approach with Chemical Intuition. J Chem Inf Model 58:27\u201335. https:\/\/doi.org\/10.1021\/acs.jcim.7b00616","journal-title":"J Chem Inf Model"},{"key":"430_CR50","unstructured":"Hu W, Liu B, Gomes J, et al (2019) Pre-training Graph Neural Networks. 
https:\/\/arxiv.org\/pdf\/1905.12265.pdf"},{"key":"430_CR51","doi-asserted-by":"publisher","first-page":"2490","DOI":"10.1021\/acs.jcim.7b00087","volume":"57","author":"Y Xu","year":"2017","unstructured":"Xu Y, Ma J, Liaw A et al (2017) Demystifying multitask deep neural networks for quantitative structure-activity relationships. J Chem Inf Model 57:2490\u20132504. https:\/\/doi.org\/10.1021\/acs.jcim.7b00087","journal-title":"J Chem Inf Model"},{"key":"430_CR52","doi-asserted-by":"publisher","first-page":"1062","DOI":"10.1021\/acs.jcim.8b00685","volume":"59","author":"S Sosnin","year":"2019","unstructured":"Sosnin S, Karlov D, Tetko IV, Fedorov MV (2019) Comparative study of multitask toxicity modeling on a broad chemical space. J Chem Inf Model 59:1062\u20131072. https:\/\/doi.org\/10.1021\/acs.jcim.8b00685","journal-title":"J Chem Inf Model"},{"key":"430_CR53","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1186\/s13321-018-0281-z","volume":"10","author":"A Le\u00f3n","year":"2018","unstructured":"Le\u00f3n A, Chen B, Gillet VJ (2018) Effect of missing data on multitask prediction methods. J Cheminform 10:26. https:\/\/doi.org\/10.1186\/s13321-018-0281-z","journal-title":"J Cheminform"},{"key":"430_CR54","doi-asserted-by":"publisher","first-page":"520","DOI":"10.1021\/acs.jcim.7b00558","volume":"58","author":"K Wu","year":"2018","unstructured":"Wu K, Wei G-W (2018) Quantitative toxicity prediction using topology based multitask deep neural networks. J Chem Inf Model 58:520\u2013531. https:\/\/doi.org\/10.1021\/acs.jcim.7b00558","journal-title":"J Chem Inf Model"},{"key":"430_CR55","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1021\/ci8002914","volume":"49","author":"A Varnek","year":"2009","unstructured":"Varnek A, Gaudin C, Marcou G et al (2009) Inductive transfer of knowledge: application of multi-task learning and feature net approaches to model tissue-air partition coefficients. J Chem Inf Model 49:133\u2013144. 
https:\/\/doi.org\/10.1021\/ci8002914","journal-title":"J Chem Inf Model"},{"key":"430_CR56","doi-asserted-by":"publisher","first-page":"2068","DOI":"10.1021\/acs.jcim.7b00146","volume":"57","author":"B Ramsundar","year":"2017","unstructured":"Ramsundar B, Liu B, Wu Z et al (2017) Is multitask deep learning practical for pharma? J Chem Inf Model 57:2068\u20132076. https:\/\/doi.org\/10.1021\/acs.jcim.7b00146","journal-title":"J Chem Inf Model"},{"key":"430_CR57","unstructured":"Merity S, Xiong C, Bradbury J, Socher R (2016) Pointer sentinel mixture models. http:\/\/arxiv.org\/abs\/1609.07843"},{"key":"430_CR58","doi-asserted-by":"crossref","unstructured":"Linzen T, Dupoux E, Goldberg Y (2016) Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies. http:\/\/arxiv.org\/abs\/1611.01368","DOI":"10.1162\/tacl_a_00115"},{"key":"430_CR59","doi-asserted-by":"crossref","unstructured":"Gulordava K, Bojanowski P, Grave E, et al (2018) Colorless green recurrent networks dream hierarchically. http:\/\/arxiv.org\/abs\/1803.11138","DOI":"10.18653\/v1\/N18-1108"},{"key":"430_CR60","unstructured":"Radford A, Jozefowicz R, Sutskever I (2017) Learning to generate reviews and discovering sentiment. http:\/\/arxiv.org\/abs\/1704.01444"},{"key":"430_CR61","unstructured":"Merity S, Keskar NS, Socher R (2017) Regularizing and optimizing LSTM language models. http:\/\/arxiv.org\/abs\/1708.02182"},{"key":"430_CR62","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735\u20131780. https:\/\/doi.org\/10.1162\/neco.1997.9.8.1735","journal-title":"Neural Comput"},{"key":"430_CR63","unstructured":"Smith LN (2018) A disciplined approach to neural network hyper-parameters: Part 1: learning rate, batch size, momentum, and weight decay. 
http:\/\/arxiv.org\/abs\/1803.09820"},{"key":"430_CR64","unstructured":"Yosinski J, Clune J, Bengio Y, Lipson H (2014) How transferable are features in deep neural networks? In: Advances in neural information processing systems. pp 3320\u20133328"},{"key":"430_CR65","unstructured":"Adam P, Sam G, et al (2017) Automatic differentiation in PyTorch. In: 31st Conf Neural Inf Process Syst (NIPS 2017)"},{"key":"430_CR66","doi-asserted-by":"publisher","first-page":"108","DOI":"10.3390\/info11020108","volume":"11","author":"J Howard","year":"2020","unstructured":"Howard J, Gugger S (2020) Fastai: a layered API for deep learning. Information 11:108. https:\/\/doi.org\/10.3390\/info11020108","journal-title":"Information"},{"key":"430_CR67","unstructured":"Swain M MolVS: Molecule validation and standardization. https:\/\/github.com\/mcs07\/MolVS"},{"key":"430_CR68","unstructured":"Landrum G RDKit: Open-source cheminformatics. http:\/\/www.rdkit.org"},{"key":"430_CR69","doi-asserted-by":"crossref","unstructured":"Fadaee M, Bisazza A, Monz C (2017) Data augmentation for low-resource neural machine translation. http:\/\/arxiv.org\/abs\/1705.00440","DOI":"10.18653\/v1\/P17-2090"},{"key":"430_CR70","doi-asserted-by":"crossref","unstructured":"Kobayashi S (2018) Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Association for Computational Linguistics, Stroudsburg, PA, USA, pp 452\u2013457","DOI":"10.18653\/v1\/N18-2072"},{"key":"430_CR71","doi-asserted-by":"crossref","unstructured":"Kafle K, Yousefhussien M, Kanan C (2017) Data Augmentation for Visual Question Answering. In: Proceedings of the 10th international conference on natural language generation. 
Association for Computational Linguistics, Stroudsburg, PA, USA, pp 198\u2013202","DOI":"10.18653\/v1\/W17-3529"},{"key":"430_CR72","doi-asserted-by":"crossref","unstructured":"Lei C, Hu B, Wang D, et al (2019) A preliminary study on data augmentation of deep learning for image classification. In: ACM International Conference Proceeding Series","DOI":"10.1145\/3361242.3361259"},{"key":"430_CR73","unstructured":"Bjerrum EJ (2017) SMILES enumeration as data augmentation for neural network modeling of molecules. http:\/\/arxiv.org\/abs\/1703.07076"},{"key":"430_CR74","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1186\/s13321-019-0341-z","volume":"11","author":"J Ar\u00fas-Pous","year":"2019","unstructured":"Ar\u00fas-Pous J, Blaschke T, Ulander S et al (2019) Exploring the GDB-13 chemical space using deep generative models. J Cheminform 11:20. https:\/\/doi.org\/10.1186\/s13321-019-0341-z","journal-title":"J Cheminform"},{"key":"430_CR75","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1186\/s13321-019-0393-0","volume":"11","author":"J Ar\u00fas-Pous","year":"2019","unstructured":"Ar\u00fas-Pous J, Johansson SV, Prykhodko O et al (2019) Randomized SMILES strings improve the quality of molecular generative models. J Cheminform 11:71. https:\/\/doi.org\/10.1186\/s13321-019-0393-0","journal-title":"J Cheminform"},{"key":"430_CR76","doi-asserted-by":"publisher","first-page":"2682","DOI":"10.1021\/acs.jcim.5b00570","volume":"55","author":"I Cortes-Ciriano","year":"2015","unstructured":"Cortes-Ciriano I, Bender A (2015) Improved chemical structure-activity modeling through data augmentation. J Chem Inf Model 55:2682\u20132692. 
https:\/\/doi.org\/10.1021\/acs.jcim.5b00570","journal-title":"J Chem Inf Model"},{"key":"430_CR77","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1021\/ci400084k","volume":"53","author":"RP Sheridan","year":"2013","unstructured":"Sheridan RP (2013) Time-split cross-validation as a method for estimating the goodness of prospective prediction. J Chem Inf Model 53:783\u2013790. https:\/\/doi.org\/10.1021\/ci400084k","journal-title":"J Chem Inf Model"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00430-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-020-00430-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-020-00430-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,4,21]],"date-time":"2021-04-21T20:04:18Z","timestamp":1619035458000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-020-00430-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,22]]},"references-count":77,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["430"],"URL":"https:\/\/doi.org\/10.1186\/s13321-020-00430-x","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv.9978743.v2","asserted-by":"object"},{"id-type":"doi","id":"10.26434\/chemrxiv.9978743.v1","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,22]]},"assertion":[{"value":"29 January 
2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 April 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 April 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing financial interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"27"}}