{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T02:08:12Z","timestamp":1777428492128,"version":"3.51.4"},"reference-count":66,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T00:00:00Z","timestamp":1698883200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T00:00:00Z","timestamp":1698883200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Shanghai Science and Technology Development Funds","award":["20QA1406400"],"award-info":[{"award-number":["20QA1406400"]}]},{"name":"Lingang Laboratory","award":["LG202102-01-03"],"award-info":[{"award-number":["LG202102-01-03"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["82003654"],"award-info":[{"award-number":["82003654"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2022YFC3400501"],"award-info":[{"award-number":["2022YFC3400501"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>With the continuous development of artificial 
intelligence technology, more and more computational models for generating new molecules are being developed. However, we are often confronted with the question of whether these compounds are easy or difficult to synthesize, that is, the synthetic accessibility of the compounds. In this study, a deep-learning-based computational model called DeepSA is proposed to predict the synthetic accessibility of compounds, providing a useful tool for choosing molecules. DeepSA is a chemical language model trained on a dataset of 3,593,053 molecules using various natural language processing (NLP) algorithms. It offers advantages over state-of-the-art methods, achieving a much higher area under the receiver operating characteristic curve (AUROC), i.e., 89.6%, in discriminating molecules that are difficult to synthesize. This helps users select less expensive molecules for synthesis, reducing the time and cost required for drug discovery and development. Interestingly, a comparison of DeepSA with a graph-attention-based method shows that SMILES alone can also be used to efficiently visualize and extract a compound\u2019s informative features. 
DeepSA is available online on our group's web server (<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/bailab.siais.shanghaitech.edu.cn\/services\/deepsa\/\">https:\/\/bailab.siais.shanghaitech.edu.cn\/services\/deepsa\/<\/jats:ext-link>), and the code is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/Shihang-Wang-58\/DeepSA\">https:\/\/github.com\/Shihang-Wang-58\/DeepSA<\/jats:ext-link>.<\/jats:p>","DOI":"10.1186\/s13321-023-00771-3","type":"journal-article","created":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T06:01:55Z","timestamp":1698904915000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["DeepSA: a deep-learning driven predictor of compound synthesis accessibility"],"prefix":"10.1186","volume":"15","author":[{"given":"Shihang","family":"Wang","sequence":"first","affiliation":[]},{"given":"Lin","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Fenglei","family":"Li","sequence":"additional","affiliation":[]},{"given":"Fang","family":"Bai","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,11,2]]},"reference":[{"key":"771_CR1","doi-asserted-by":"publisher","first-page":"1315","DOI":"10.1007\/s11030-021-10217-3","volume":"25","author":"R Gupta","year":"2021","unstructured":"Gupta R et al (2021) Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Div 25:1315\u20131360. 
https:\/\/doi.org\/10.1007\/s11030-021-10217-3","journal-title":"Mol Div"},{"key":"771_CR2","doi-asserted-by":"publisher","first-page":"949","DOI":"10.1080\/17460441.2021.1909567","volume":"16","author":"J Jimenez-Luna","year":"2021","unstructured":"Jimenez-Luna J, Grisoni F, Weskamp N, Schneider G (2021) Artificial intelligence in drug discovery: recent advances and future perspectives. Exp Opin Drug Disc 16:949\u2013959. https:\/\/doi.org\/10.1080\/17460441.2021.1909567","journal-title":"Exp Opin Drug Disc"},{"key":"771_CR3","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejmech.2021.113705","author":"VT Sabe","year":"2021","unstructured":"Sabe VT et al (2021) Current trends in computer aided drug design and a highlight of drugs discovered via computational techniques: a review. Eur J Med Chem. https:\/\/doi.org\/10.1016\/j.ejmech.2021.113705","journal-title":"Eur J Med Chem"},{"key":"771_CR4","doi-asserted-by":"publisher","DOI":"10.3390\/ijms22094688","author":"MM Salman","year":"2021","unstructured":"Salman MM et al (2021) Advances in Applying Computer-Aided Drug Design for Neurodegenerative Diseases. Int J Mol Sci. https:\/\/doi.org\/10.3390\/ijms22094688","journal-title":"Int J Mol Sci"},{"key":"771_CR5","doi-asserted-by":"publisher","first-page":"1040","DOI":"10.1038\/s42256-021-00410-2","volume":"3","author":"ZQ Chen","year":"2021","unstructured":"Chen ZQ, Min MR, Parthasarathy S, Ning X (2021) A deep generative model for molecule optimization via one fragment modification. Nat Mach Intell 3:1040\u20131049. https:\/\/doi.org\/10.1038\/s42256-021-00410-2","journal-title":"Nat Mach Intell"},{"key":"771_CR6","doi-asserted-by":"publisher","DOI":"10.1038\/s41401-022-00999-z","author":"QL Han","year":"2022","unstructured":"Han QL et al (2022) Discovery, evaluation and mechanism study of WDR5-targeted small molecular inhibitors for neuroblastoma. Acta Pharmacologica Sinica. 
https:\/\/doi.org\/10.1038\/s41401-022-00999-z","journal-title":"Acta Pharmacologica Sinica"},{"key":"771_CR7","doi-asserted-by":"publisher","first-page":"788","DOI":"10.1038\/s41401-021-00735-z","volume":"43","author":"L Wang","year":"2022","unstructured":"Wang L et al (2022) Discovery of potential small molecular SARS-CoV-2 entry blockers targeting the spike protein. Acta Pharmacologica Sinica 43:788\u2013796. https:\/\/doi.org\/10.1038\/s41401-021-00735-z","journal-title":"Acta Pharmacologica Sinica"},{"key":"771_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2022.108581","author":"JC Yu","year":"2022","unstructured":"Yu JC, Xu TY, Rong Y, Huang JZ, He R (2022) Structure-aware conditional variational auto-encoder for constrained molecule optimization. Pattern Recogn. https:\/\/doi.org\/10.1016\/j.patcog.2022.108581","journal-title":"Pattern Recogn"},{"key":"771_CR9","doi-asserted-by":"publisher","DOI":"10.1002\/minf.202100045","author":"YJ Lee","year":"2021","unstructured":"Lee YJ, Kahng H, Kim SB (2021) Generative adversarial networks for de novo molecular design. Mol Inform. https:\/\/doi.org\/10.1002\/minf.202100045","journal-title":"Mol Inform"},{"key":"771_CR10","doi-asserted-by":"publisher","first-page":"4863","DOI":"10.1021\/acs.jcim.2c00838","volume":"62","author":"SR Atance","year":"2022","unstructured":"Atance SR, Diez JV, Engkvist O, Olsson S, Mercado R (2022) De novo drug design using reinforcement learning with graph-based deep generative models. J Chem Inform Model 62:4863\u20134872. https:\/\/doi.org\/10.1021\/acs.jcim.2c00838","journal-title":"J Chem Inform Model"},{"key":"771_CR11","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbab333","author":"FQ Lu","year":"2021","unstructured":"Lu FQ, Li MF, Min XP, Li CY, Zeng XX (2021) De novo generation of dual-target ligands using adversarial training and reinforcement learning. Brief Bioinform. 
https:\/\/doi.org\/10.1093\/bib\/bbab333","journal-title":"Brief Bioinform"},{"key":"771_CR12","doi-asserted-by":"publisher","first-page":"914","DOI":"10.1038\/s42256-021-00403-1","volume":"3","author":"JK Wang","year":"2021","unstructured":"Wang JK et al (2021) Multi-constraint molecular generation based on conditional transformer, knowledge distillation and reinforcement learning. Nat Mach Intell 3:914\u2013922. https:\/\/doi.org\/10.1038\/s42256-021-00403-1","journal-title":"Nat Mach Intell"},{"key":"771_CR13","unstructured":"Yang K, et al. (2021) In IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 6684\u20136694"},{"key":"771_CR14","unstructured":"Zang CX, Wang F (2020) In: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 617\u2013626"},{"key":"771_CR15","doi-asserted-by":"publisher","first-page":"14011","DOI":"10.1021\/acs.jmedchem.1c00927","volume":"64","author":"XC Tong","year":"2021","unstructured":"Tong XC et al (2021) Generative models for de novo drug design. J Med Chem 64:14011\u201314027. https:\/\/doi.org\/10.1021\/acs.jmedchem.1c00927","journal-title":"J Med Chem"},{"key":"771_CR16","doi-asserted-by":"publisher","first-page":"5343","DOI":"10.1021\/acs.jcim.0c01496","volume":"61","author":"T Sousa","year":"2021","unstructured":"Sousa T, Correia J, Pereira V, Rocha M (2021) Generative deep learning for targeted compound design. J Chem Inform Model 61:5343\u20135361. https:\/\/doi.org\/10.1021\/acs.jcim.0c01496","journal-title":"J Chem Inform Model"},{"key":"771_CR17","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1016\/j.ejmech.2012.06.024","volume":"54","author":"P Bonnet","year":"2012","unstructured":"Bonnet P (2012) Is chemical synthetic accessibility computationally predictable for drug and lead-like molecules? A comparative assessment between medicinal and computational chemists. Eur J Med Chem 54:679\u2013689. 
https:\/\/doi.org\/10.1016\/j.ejmech.2012.06.024","journal-title":"Eur J Med Chem"},{"key":"771_CR18","doi-asserted-by":"publisher","DOI":"10.1186\/1758-2946-1-8","author":"P Ertl","year":"2009","unstructured":"Ertl P, Schuffenhauer A (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminform. https:\/\/doi.org\/10.1186\/1758-2946-1-8","journal-title":"J Cheminform"},{"key":"771_CR19","doi-asserted-by":"publisher","first-page":"252","DOI":"10.1021\/acs.jcim.7b00622","volume":"58","author":"CW Coley","year":"2018","unstructured":"Coley CW, Rogers L, Green WH, Jensen KF (2018) SCScore: synthetic complexity learned from a reaction corpus. J Chem Inform Model 58:252\u2013261. https:\/\/doi.org\/10.1021\/acs.jcim.7b00622","journal-title":"J Chem Inform Model"},{"key":"771_CR20","doi-asserted-by":"publisher","first-page":"3339","DOI":"10.1039\/d0sc05401a","volume":"12","author":"A Thakkar","year":"2021","unstructured":"Thakkar A, Chadimova V, Bjerrum EJ, Engkvist O, Reymond JL (2021) Retrosynthetic accessibility score (RAscore) - rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem Sci 12:3339\u20133349. https:\/\/doi.org\/10.1039\/d0sc05401a","journal-title":"Chem Sci"},{"key":"771_CR21","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-020-00439-2","author":"M Vorsilak","year":"2020","unstructured":"Vorsilak M, Kolar M, Cmelo I, Svozil D (2020) SYBA: Bayesian estimation of synthetic accessibility of organic compounds. J Cheminform. https:\/\/doi.org\/10.1186\/s13321-020-00439-2","journal-title":"J Cheminform"},{"key":"771_CR22","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-023-00678-z","author":"G Skoraczynski","year":"2023","unstructured":"Skoraczynski G, Kitlas M, Miasojedow B, Gambin A (2023) Critical assessment of synthetic accessibility scores in computer-assisted synthesis planning. J Cheminform. 
https:\/\/doi.org\/10.1186\/s13321-023-00678-z","journal-title":"J Cheminform"},{"key":"771_CR23","doi-asserted-by":"publisher","first-page":"2293","DOI":"10.1021\/acs.jcim.1c01476","volume":"62","author":"CH Liu","year":"2022","unstructured":"Liu CH et al (2022) RetroGNN: fast estimation of synthesizability for virtual screening and de novo design by learning from slow retrosynthesis software. J Chem Inform Model 62:2293\u20132300. https:\/\/doi.org\/10.1021\/acs.jcim.1c01476","journal-title":"J Chem Inform Model"},{"key":"771_CR24","doi-asserted-by":"publisher","first-page":"2973","DOI":"10.1021\/acs.jcim.2c00038","volume":"62","author":"JH Yu","year":"2022","unstructured":"Yu JH et al (2022) Organic compound synthetic accessibility prediction based on the graph attention mechanism. J Chem Inform Model 62:2973\u20132986. https:\/\/doi.org\/10.1021\/acs.jcim.2c00038","journal-title":"J Chem Inform Model"},{"key":"771_CR25","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-020-00472-1","author":"S Genheden","year":"2020","unstructured":"Genheden S et al (2020) AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning. J Cheminform. https:\/\/doi.org\/10.1186\/s13321-020-00472-1","journal-title":"J Cheminform"},{"key":"771_CR26","unstructured":"Chen BH, Li CT, Dai HJ, Song L (2020) in International Conference on Machine Learning (ICML)"},{"key":"771_CR27","doi-asserted-by":"publisher","first-page":"D930","DOI":"10.1093\/nar\/gky1075","volume":"47","author":"D Mendez","year":"2019","unstructured":"Mendez D et al (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 47:D930\u2013D940. https:\/\/doi.org\/10.1093\/nar\/gky1075","journal-title":"Nucleic Acids Res"},{"key":"771_CR28","doi-asserted-by":"publisher","DOI":"10.3389\/fchem.2020.00046","author":"S Buhlmann","year":"2020","unstructured":"Buhlmann S, Reymond JL (2020) ChEMBL-Likeness Score and Database GDBChEMBL. Front Chem. 
https:\/\/doi.org\/10.3389\/fchem.2020.00046"},{"key":"771_CR29","doi-asserted-by":"publisher","first-page":"2324","DOI":"10.1021\/acs.jcim.5b00559","volume":"55","author":"T Sterling","year":"2015","unstructured":"Sterling T, Irwin JJ (2015) ZINC 15-Ligand Discovery for Everyone. J Chem Inform Model 55:2324\u20132337. https:\/\/doi.org\/10.1021\/acs.jcim.5b00559","journal-title":"J Chem Inform Model"},{"key":"771_CR30","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-017-0206-2","author":"M Vorsilak","year":"2017","unstructured":"Vorsilak M, Svozil D (2017) Nonpher: computational method for design of hard-to-synthesize structures. J Cheminform. https:\/\/doi.org\/10.1186\/s13321-017-0206-2","journal-title":"J Cheminform"},{"key":"771_CR31","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1021\/acs.orglett.0c04000","volume":"23","author":"Z Huang","year":"2021","unstructured":"Huang Z, Ji X, Lumb JP (2021) Total Synthesis of (S)-Cularine via Nucleophilic Substitution on a Catechol. Org Lett 23:236\u2013241. https:\/\/doi.org\/10.1021\/acs.orglett.0c04000","journal-title":"Org Lett"},{"key":"771_CR32","doi-asserted-by":"publisher","first-page":"3416","DOI":"10.1021\/ol501341b","volume":"16","author":"SQ Zhou","year":"2014","unstructured":"Zhou SQ, Jia YX (2014) Total Synthesis of (-)-Goniomitine. Org Lett 16:3416\u20133418. https:\/\/doi.org\/10.1021\/ol501341b","journal-title":"Org Lett"},{"key":"771_CR33","doi-asserted-by":"publisher","DOI":"10.1002\/chem.202103558","author":"AC Schmidt","year":"2022","unstructured":"Schmidt AC, Hiersemann M (2022) Total synthesis and structural assignment of (-)-fusaequisin A. Chemistry. https:\/\/doi.org\/10.1002\/chem.202103558","journal-title":"Chemistry"},{"key":"771_CR34","doi-asserted-by":"publisher","first-page":"9666","DOI":"10.1002\/anie.202016343","volume":"60","author":"Y Jin","year":"2021","unstructured":"Jin Y et al (2021) Total synthesis of haliclonin A. Angewandte Chemie-Int Ed 60:9666\u20139671. 
https:\/\/doi.org\/10.1002\/anie.202016343","journal-title":"Angewandte Chemie-Int Ed"},{"key":"771_CR35","doi-asserted-by":"publisher","first-page":"1416","DOI":"10.1021\/acs.orglett.1c00090","volume":"23","author":"K Parmar","year":"2021","unstructured":"Parmar K, Haghshenas P, Gravel M (2021) Total synthesis of (+)-hyacinthacine a(1) using a chemoselective cross-benzoin reaction and a furan photooxygenation-amine cyclization strategy. Org Lett 23:1416\u20131421. https:\/\/doi.org\/10.1021\/acs.orglett.1c00090","journal-title":"Org Lett"},{"key":"771_CR36","doi-asserted-by":"publisher","first-page":"6424","DOI":"10.1021\/ol503246k","volume":"16","author":"SJ Gharpure","year":"2014","unstructured":"Gharpure SJ, Nanda LN, Shukla MK (2014) Donor-acceptor substituted cyclopropane to butanolide and butenolide natural products: enantiospecific first total synthesis of (+)-hydroxyancepsenolide. Org Lett 16:6424\u20136427. https:\/\/doi.org\/10.1021\/ol503246k","journal-title":"Org Lett"},{"key":"771_CR37","doi-asserted-by":"publisher","first-page":"7968","DOI":"10.1002\/anie.201502696","volume":"54","author":"S Sieber","year":"2015","unstructured":"Sieber S et al (2015) Isolation and total synthesis of kirkamide, an aminocyclitol from an obligate leaf nodule symbiont. Angewandte Chemie-Int Ed 54:7968\u20137970. https:\/\/doi.org\/10.1002\/anie.201502696","journal-title":"Angewandte Chemie-Int Ed"},{"key":"771_CR38","doi-asserted-by":"publisher","first-page":"3725","DOI":"10.1016\/j.tet.2012.03.021","volume":"68","author":"RS Perali","year":"2012","unstructured":"Perali RS, Kalapati S (2012) First enantioselective total synthesis of (S)-(-)-longianone. Tetrahedron 68:3725\u20133728. 
https:\/\/doi.org\/10.1016\/j.tet.2012.03.021","journal-title":"Tetrahedron"},{"key":"771_CR39","doi-asserted-by":"publisher","first-page":"5596","DOI":"10.1021\/acs.orglett.9b01945","volume":"21","author":"M Ohtawa","year":"2019","unstructured":"Ohtawa M et al (2019) Total synthesis and absolute configuration of simpotentin, a potentiator of amphotericin B activity. Org Lett 21:5596\u20135599. https:\/\/doi.org\/10.1021\/acs.orglett.9b01945","journal-title":"Org Lett"},{"key":"771_CR40","doi-asserted-by":"publisher","first-page":"12784","DOI":"10.1021\/jacs.5b08398","volume":"137","author":"C Bucher","year":"2015","unstructured":"Bucher C, Deans RM, Burns NZ (2015) Highly Selective Synthesis of Halomon, Plocamenone, and Isoplocamenone. J Am Chem Soc 137:12784\u201312787. https:\/\/doi.org\/10.1021\/jacs.5b08398","journal-title":"J Am Chem Soc"},{"key":"771_CR41","doi-asserted-by":"publisher","first-page":"6426","DOI":"10.1039\/c6ob00806b","volume":"14","author":"NN Yadav","year":"2016","unstructured":"Yadav NN, Choi J, Ha HJ (2016) One-pot multiple reactions: asymmetric synthesis of 2,6-cis-disubstituted piperidine alkaloids from chiral aziridine. Org Biomol Chem 14:6426\u20136434. https:\/\/doi.org\/10.1039\/c6ob00806b","journal-title":"Org Biomol Chem"},{"key":"771_CR42","doi-asserted-by":"publisher","DOI":"10.1002\/anie.202112427","author":"GL Wu","year":"2022","unstructured":"Wu GL et al (2022) Enantioselective allenation of terminal alkynes catalyzed by copper halides of mixed oxidation states and its application to the total synthesis of scorodonin. Angewandte Chemie-Int Ed. 
https:\/\/doi.org\/10.1002\/anie.202112427","journal-title":"Angewandte Chemie-Int Ed"},{"key":"771_CR43","doi-asserted-by":"publisher","first-page":"4035","DOI":"10.1021\/ol301932d","volume":"14","author":"LF Tietze","year":"2012","unstructured":"Tietze LF, Wolfram T, Holstein JJ, Dittrich B (2012) First enantioselective total synthesis of (+)-(r)-pinnatolide using an asymmetric domino allylation reaction. Org Lett 14:4035\u20134037. https:\/\/doi.org\/10.1021\/ol301932d","journal-title":"Org Lett"},{"key":"771_CR44","doi-asserted-by":"publisher","first-page":"8733","DOI":"10.1002\/anie.201004328","volume":"49","author":"B Gourdet","year":"2010","unstructured":"Gourdet B, Lam HW (2010) Catalytic Asymmetric Dihydroxylation of Enamides and Application to the Total Synthesis of (+)-Tanikolide. Angewandte Chemie-Int Ed 49:8733\u20138737. https:\/\/doi.org\/10.1002\/anie.201004328","journal-title":"Angewandte Chemie-Int Ed"},{"key":"771_CR45","doi-asserted-by":"publisher","first-page":"242","DOI":"10.1021\/ol302769r","volume":"15","author":"H Ren","year":"2013","unstructured":"Ren H, Wulff WD (2013) Total synthesis of sedum alkaloids via catalyst controlled aza-cope rearrangement and hydroformylation with formaldehyde. Org Lett 15:242\u2013245. https:\/\/doi.org\/10.1021\/ol302769r","journal-title":"Org Lett"},{"key":"771_CR46","doi-asserted-by":"publisher","first-page":"5904","DOI":"10.1021\/ol3028016","volume":"14","author":"MA Purino","year":"2012","unstructured":"Purino MA, Ramirez MA, Daranas AH, Martin VS, Padron JI (2012) Iron(III) catalyzed direct synthesis of cis-2,7-Disubstituted Oxepanes. The shortest total synthesis of (+)-Isolaurepan. Organic Letters 14:5904\u20135907. 
https:\/\/doi.org\/10.1021\/ol3028016","journal-title":"Organic Letters"},{"key":"771_CR47","doi-asserted-by":"publisher","first-page":"4441","DOI":"10.1021\/acs.joc.0c00167","volume":"85","author":"F Saito","year":"2020","unstructured":"Saito F, Becker J, Schreiner PR (2020) Synthesis and conformational analysis of parent perhydroazulenes reveal an energetically preferred cis ring fusion. J Org Chem 85:4441\u20134447. https:\/\/doi.org\/10.1021\/acs.joc.0c00167","journal-title":"J Org Chem"},{"key":"771_CR48","doi-asserted-by":"publisher","first-page":"1537","DOI":"10.1002\/anie.201410186","volume":"54","author":"M Nagatomo","year":"2015","unstructured":"Nagatomo M, Nishiyama H, Fujino H, Inoue M (2015) Decarbonylative radical coupling of alpha-aminoacyl tellurides: single-step preparation of gamma-amino and alpha, beta-diamino acids and rapid synthesis of gabapentin and manzacidin A. Angewandte Chemie-International Edition 54:1537\u20131541. https:\/\/doi.org\/10.1002\/anie.201410186","journal-title":"Angewandte Chemie-International Edition"},{"key":"771_CR49","doi-asserted-by":"publisher","first-page":"451","DOI":"10.1093\/bib\/bbz152","volume":"22","author":"Y Chu","year":"2021","unstructured":"Chu Y et al (2021) DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 22:451\u2013462. https:\/\/doi.org\/10.1093\/bib\/bbz152","journal-title":"Brief Bioinform"},{"key":"771_CR50","doi-asserted-by":"publisher","first-page":"4577","DOI":"10.1021\/acs.jcim.9b00749","volume":"59","author":"X Shan","year":"2019","unstructured":"Shan X et al (2019) Prediction of CYP450 enzyme-substrate selectivity based on the network-based label space division method. J Chem Inform Model 59:4577\u20134586. 
https:\/\/doi.org\/10.1021\/acs.jcim.9b00749","journal-title":"J Chem Inform Model"},{"key":"771_CR51","doi-asserted-by":"publisher","DOI":"10.3390\/molecules26247414","author":"X Cheng","year":"2021","unstructured":"Cheng X, Wang J, Li QY, Liu TG (2021) BiLSTM-5mC: a bidirectional long short-term memory-based approach for predicting 5-methylcytosine sites in genome-wide DNA promoters. Molecules. https:\/\/doi.org\/10.3390\/molecules26247414","journal-title":"Molecules"},{"key":"771_CR52","doi-asserted-by":"publisher","DOI":"10.3390\/molecules26092487","author":"HT Han","year":"2021","unstructured":"Han HT, Ding CC, Cheng X, Sang XZ, Liu TG (2021) iT4SE-EP: accurate identification of bacterial type IV secreted effectors by exploring evolutionary features from two PSI-BLAST Profiles. Molecules. https:\/\/doi.org\/10.3390\/molecules26092487","journal-title":"Molecules"},{"key":"771_CR53","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"771_CR54","unstructured":"Landrum G (2022) \"RDKit: Open-source cheminformatics. https:\/\/www.rdkit.org\""},{"key":"771_CR55","doi-asserted-by":"publisher","first-page":"1234","DOI":"10.1093\/bioinformatics\/btz682","volume":"36","author":"J Lee","year":"2020","unstructured":"Lee J et al (2020) BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36:1234\u20131240. https:\/\/doi.org\/10.1093\/bioinformatics\/btz682","journal-title":"Bioinformatics"},{"key":"771_CR56","unstructured":"Chithrananda S, Grand G, Ramsundar B (2019) ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. ArXiv abs\/2010.09885"},{"key":"771_CR57","doi-asserted-by":"crossref","unstructured":"Bhargava P, Drozd A, Rogers A (2021) Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics. 
arXiv:2110.01518 (2021). https:\/\/ui.adsabs.harvard.edu\/abs\/2021arXiv211001518B.","DOI":"10.18653\/v1\/2021.insights-1.18"},{"key":"771_CR58","unstructured":"Liu Y, et al. (2019) RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv:1907.11692. https:\/\/ui.adsabs.harvard.edu\/abs\/2019arXiv190711692L"},{"key":"771_CR59","unstructured":"He P, Liu X, Gao J, Chen W (2020) DeBERTa: Decoding-enhanced BERT with Disentangled Attention. arXiv:2006.03654. https:\/\/ui.adsabs.harvard.edu\/abs\/2020arXiv200603654H."},{"key":"771_CR60","unstructured":"Guo D, et al. (2020) GraphCodeBERT: Pre-training Code Representations with Data Flow. arXiv:2009.08366. https:\/\/ui.adsabs.harvard.edu\/abs\/2020arXiv200908366G"},{"key":"771_CR61","unstructured":"Clark K, Luong M-T, Le QV, Manning CD (2020) ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv:2003.10555. https:\/\/ui.adsabs.harvard.edu\/abs\/2020arXiv200310555C."},{"key":"771_CR62","unstructured":"Ahmad W, Simon E, Chithrananda S, Grand G, Ramsundar B (2022) ChemBERTa-2: Towards Chemical Foundation Models. arXiv:2209.01712. https:\/\/ui.adsabs.harvard.edu\/abs\/2022arXiv220901712A"},{"key":"771_CR63","unstructured":"Erickson N, et al. (2020) AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data. ArXiv abs\/2003.06505"},{"key":"771_CR64","unstructured":"Shi X, Mueller J, Erickson N, Li M, Smola AJ (2021) Benchmarking Multimodal AutoML for Tabular Data with Text Fields. ArXiv abs\/2111.02705"},{"key":"771_CR65","unstructured":"Wortsman M, et al. (2022) Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. arXiv:2203.05482. 
https:\/\/ui.adsabs.harvard.edu\/abs\/2022arXiv220305482W"},{"key":"771_CR66","doi-asserted-by":"crossref","unstructured":"Wang S, Guo Y, Wang Y, Sun H, Huang J (2019) In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics 429\u2013436 (Association for Computing Machinery, Niagara Falls, NY, USA, 2019)","DOI":"10.1145\/3307339.3342186"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00771-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00771-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00771-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,1]],"date-time":"2024-11-01T06:49:39Z","timestamp":1730443779000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00771-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,2]]},"references-count":66,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["771"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00771-3","relation":{"references":[{"id-type":"uri","id":"","asserted-by":"subject"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,2]]},"assertion":[{"value":"5 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 October 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"2 November 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"103"}}