{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T17:36:33Z","timestamp":1755797793367},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,6,8]],"date-time":"2023-06-08T00:00:00Z","timestamp":1686182400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,8]],"date-time":"2023-06-08T00:00:00Z","timestamp":1686182400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Retrosynthesis is an important task in organic chemistry. Recently, numerous data-driven approaches have achieved promising results in this task. However, in practice, these data-driven methods might lead to sub-optimal outcomes by making predictions based on the training data distribution, a phenomenon we refer as frequency bias. For example, in template-based approaches, low-ranked predictions are typically generated by less common templates with low confidence scores which might be too low to be comparable, and it is observed that recorded reactants can be among these low-ranked predictions. In this work, we introduce RetroRanker, a ranking model built upon graph neural networks, designed to mitigate the frequency bias in predictions of existing retrosynthesis models through re-ranking. RetroRanker incorporates potential reaction changes of each set of predicted reactants in obtaining the given product to lower the rank of chemically unreasonable predictions. The predicted re-ranked results on publicly available retrosynthesis benchmarks demonstrate that we can achieve improvement on most state-of-the-art models with RetroRanker. Our preliminary studies also indicate that RetroRanker can enhance the performance of multi-step retrosynthesis.<\/jats:p>","DOI":"10.1186\/s13321-023-00727-7","type":"journal-article","created":{"date-parts":[[2023,6,8]],"date-time":"2023-06-08T08:01:43Z","timestamp":1686211303000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["RetroRanker: leveraging reaction changes to improve retrosynthesis prediction through re-ranking"],"prefix":"10.1186","volume":"15","author":[{"given":"Junren","family":"Li","sequence":"first","affiliation":[]},{"given":"Lei","family":"Fang","sequence":"additional","affiliation":[]},{"given":"Jian-Guang","family":"Lou","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,8]]},"reference":[{"issue":"1","key":"727_CR1","doi-asserted-by":"publisher","first-page":"3","DOI":"10.31635\/ccschem.019.20190006","volume":"1","author":"K Nicolaou","year":"2019","unstructured":"Nicolaou K, Rigol S, Yu R (2019) Total synthesis endeavors and their contributions to science and society: a personal account. CCS Chem 1(1):3\u201337","journal-title":"CCS Chem"},{"issue":"3902","key":"727_CR2","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1126\/science.166.3902.178","volume":"166","author":"EJ Corey","year":"1969","unstructured":"Corey EJ, Wipke WT (1969) Computer-assisted design of complex organic syntheses. Science 166(3902):178\u2013192","journal-title":"Science"},{"key":"727_CR3","doi-asserted-by":"crossref","unstructured":"Pensak DA, Corey EJ (1977) Lhasa-logic and heuristics applied to synthetic analysis, Chap. 1. pp 1\u201332 .","DOI":"10.1021\/bk-1977-0061.ch001"},{"issue":"20","key":"727_CR4","doi-asserted-by":"publisher","first-page":"5904","DOI":"10.1002\/anie.201506101","volume":"55","author":"S Szymku\u0107","year":"2016","unstructured":"Szymku\u0107 S, Gajewska EP, Klucznik T, Molga K, Dittwald P, Startek M, Bajczyk M, Grzybowski BA (2016) Computer-assisted synthetic planning: the end of the beginning. Angew Chem Int Ed 55(20):5904\u20135937","journal-title":"Angew Chem Int Ed"},{"key":"727_CR5","doi-asserted-by":"publisher","DOI":"10.1016\/j.eng.2022.04.021","author":"Y Jiang","year":"2022","unstructured":"Jiang Y, Yu Y, Kong M, Mei Y, Yuan L, Huang Z, Kuang K, Wang Z, Yao H, Zou J, Coley CW, Wei Y (2022) Artificial intelligence for retrosynthesis prediction. Engineering. https:\/\/doi.org\/10.1016\/j.eng.2022.04.021","journal-title":"Engineering"},{"issue":"10","key":"727_CR6","doi-asserted-by":"publisher","first-page":"1103","DOI":"10.1021\/acscentsci.7b00303","volume":"3","author":"B Liu","year":"2017","unstructured":"Liu B, Ramsundar B, Kawthekar P, Shi J, Gomes J, Luu Nguyen Q, Ho S, Sloane J, Wender P, Pande V (2017) Retrosynthetic reaction prediction using neural sequence-to-sequence models. ACS Cent Sci 3(10):1103\u20131113","journal-title":"ACS Cent Sci"},{"issue":"3","key":"727_CR7","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1016\/j.chempr.2018.02.002","volume":"4","author":"T Klucznik","year":"2018","unstructured":"Klucznik T, Mikulak-Klucznik B, McCormack MP, Lima H, Szymku\u0107 S, Bhowmick M, Molga K, Zhou Y, Rickershauser L, Gajewska EP, Toutchkine A, Dittwald P, Startek MP, Kirkovits GJ, Roszak R, Adamski A, Sieredzi\u0144ska B, Mrksich M, Trice SLJ, Grzybowski BA (2018) Efficient syntheses of diverse, medicinally relevant targets planned by computer and executed in the laboratory. Chem 4(3):522\u2013532","journal-title":"Chem"},{"issue":"12","key":"727_CR8","doi-asserted-by":"publisher","first-page":"1237","DOI":"10.1021\/acscentsci.7b00355","volume":"3","author":"CW Coley","year":"2017","unstructured":"Coley CW, Rogers L, Green WH, Jensen KF (2017) Computer-assisted retrosynthesis based on molecular similarity. ACS Cent Sci 3(12):1237\u20131245","journal-title":"ACS Cent Sci"},{"issue":"9","key":"727_CR9","doi-asserted-by":"publisher","first-page":"772","DOI":"10.1038\/s42256-022-00526-z","volume":"4","author":"S Chen","year":"2022","unstructured":"Chen S, Jung Y (2022) A generalized-template-based graph neural network for accurate organic reactivity prediction. Nat Mach Intell 4(9):772\u2013780","journal-title":"Nat Mach Intell"},{"issue":"1","key":"727_CR10","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules. J Chem Inf Comput Sci 28(1):31\u201336","journal-title":"J Chem Inf Comput Sci"},{"issue":"2","key":"727_CR11","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1021\/ci00062a008","volume":"29","author":"D Weininger","year":"1989","unstructured":"Weininger D, Weininger A, Weininger JL (1989) Smiles. 2. algorithm for generation of unique smiles notation. J Chem Inf Comput Sci 29(2):97\u2013101","journal-title":"J Chem Inf Comput Sci"},{"key":"727_CR12","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.03762","author":"A Vaswani","year":"2017","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst. https:\/\/doi.org\/10.48550\/arXiv.1706.03762","journal-title":"Adv Neural Inf Process Syst"},{"issue":"9","key":"727_CR13","doi-asserted-by":"publisher","first-page":"1572","DOI":"10.1021\/acscentsci.9b00576","volume":"5","author":"P Schwaller","year":"2019","unstructured":"Schwaller P, Laino T, Gaudin T, Bolgar P, Hunter CA, Bekas C, Lee AA (2019) Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent Sci 5(9):1572\u20131583","journal-title":"ACS Cent Sci"},{"issue":"1","key":"727_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-020-19266-y","volume":"11","author":"IV Tetko","year":"2020","unstructured":"Tetko IV, Karpov P, Van Deursen R, Godin G (2020) State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis. Nat Commun 11(1):1\u201311","journal-title":"Nat Commun"},{"key":"727_CR15","doi-asserted-by":"publisher","first-page":"9023","DOI":"10.1039\/D2SC02763A","volume":"13","author":"Z Zhong","year":"2022","unstructured":"Zhong Z, Song J, Feng Z, Liu T, Jia L, Yao S, Wu M, Hou T, Song M (2022) Root-aligned smiles: a tight representation for chemical reaction prediction. Chem Sci 13:9023\u20139034","journal-title":"Chem Sci"},{"key":"727_CR16","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1007\/978-3-030-30493-5_78","volume-title":"Artificial neural networks and machine learning - ICANN 2019: workshop and special sessions","author":"P Karpov","year":"2019","unstructured":"Karpov P, Godin G, Tetko IV (2019) A transformer model for retrosynthesis. In: Tetko IV, K\u016frkov\u00e1 V, Karpov P, Theis F (eds) Artificial neural networks and machine learning - ICANN 2019: workshop and special sessions. Springer, Cham, pp 817\u2013830"},{"issue":"7","key":"727_CR17","doi-asserted-by":"publisher","first-page":"3273","DOI":"10.1021\/acs.jcim.1c00537","volume":"61","author":"M Sacha","year":"2021","unstructured":"Sacha M, B\u0142az M, Byrski P, Dabrowski-Tumanski P, Chrominski M, Loska R, W\u0142odarczyk-Pruszynski P, Jastrzebski S (2021) Molecule edit graph attention network: modeling chemical reactions as sequences of graph edits. J Chem Inf Model 61(7):3273\u20133284","journal-title":"J Chem Inf Model"},{"key":"727_CR18","unstructured":"Shi C, Xu M, Guo H, Zhang M, Tang J (2020) A graph to graphs framework for retrosynthesis prediction. arXiv"},{"issue":"1","key":"727_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-021-21895-w","volume":"12","author":"DP Kov\u00e1cs","year":"2021","unstructured":"Kov\u00e1cs DP, McCorkindale W, Lee AA (2021) Quantitative interpretation explains machine learning models for chemical reaction prediction and uncovers bias. Nat Commun 12(1):1\u20139","journal-title":"Nat Commun"},{"issue":"1","key":"727_CR20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-020-00472-1","volume":"12","author":"S Genheden","year":"2020","unstructured":"Genheden S, Thakkar A, Chadimov\u00e1 V, Reymond J-L, Engkvist O, Bjerrum E (2020) Aizynthfinder: a fast, robust and flexible open-source software for retrosynthetic planning. J Cheminformatics 12(1):1\u20139","journal-title":"J Cheminformatics"},{"issue":"7698","key":"727_CR21","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1038\/nature25978","volume":"555","author":"MH Segler","year":"2018","unstructured":"Segler MH, Preuss M, Waller MP (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555(7698):604\u2013610","journal-title":"Nature"},{"key":"727_CR22","first-page":"10186","volume":"34","author":"R Sun","year":"2021","unstructured":"Sun R, Dai H, Li L, Kearnes S, Dai B (2021) Towards understanding retrosynthesis by energy-based models. Adv Neural Inf Process Syst 34:10186\u201310194","journal-title":"Adv Neural Inf Process Syst"},{"issue":"1","key":"727_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-022-00594-8","volume":"14","author":"MH Lin","year":"2022","unstructured":"Lin MH, Tu Z, Coley CW (2022) Improving the performance of models for one-step retrosynthesis through re-ranking. J Cheminformatics 14(1):1\u201313","journal-title":"J Cheminformatics"},{"issue":"12","key":"727_CR24","doi-asserted-by":"publisher","first-page":"2336","DOI":"10.1021\/acs.jcim.6b00564","volume":"56","author":"N Schneider","year":"2016","unstructured":"Schneider N, Stiefl N, Landrum GA (2016) What\u2019s what: the (nearly) definitive guide to reaction role assignment. J Chem Inf Model 56(12):2336\u20132346","journal-title":"J Chem Inf Model"},{"key":"727_CR25","unstructured":"Lowe DM (2012) Extraction of chemical structures and reactions from the literature. PhD thesis, University of Cambridge"},{"key":"727_CR26","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2001.01408","author":"H Dai","year":"2019","unstructured":"Dai H, Li C, Coley C, Dai B, Song L (2019) Retrosynthesis prediction with conditional graph logic network. Adv Neural Inf Process Syst. https:\/\/doi.org\/10.48550\/arXiv.2001.01408","journal-title":"Adv Neural Inf Process Syst"},{"issue":"3","key":"727_CR27","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1561\/1500000016","volume":"3","author":"T-Y Liu","year":"2009","unstructured":"Liu T-Y (2009) Learning to rank for information retrieval. Found Trends Inf Retr 3(3):225\u2013331","journal-title":"Found Trends Inf Retr"},{"issue":"2","key":"727_CR28","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1039\/D1DD00006C","volume":"1","author":"D Probst","year":"2022","unstructured":"Probst D, Schwaller P, Reymond J-L (2022) Reaction classification and yield prediction using the differential reaction fingerprint DRFP. Dig Discov 1(2):91\u201397","journal-title":"Dig Discov"},{"key":"727_CR29","unstructured":"Tavakoli M, Shmakov A, Ceccarelli F, Baldi P (2022) Rxn hypergraph: a hypergraph attention model for chemical reaction representation. arXiv preprint http:\/\/arxiv.org\/abs\/2201.01196"},{"key":"727_CR30","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2106.14232","author":"M Li","year":"2021","unstructured":"Li M, Zhou J, Hu J, Fan W, Zhang Y, Gu Y, Karypis G (2021) Dgl-lifesci: an open-source toolkit for deep learning on graphs in life science. ACS Omega. https:\/\/doi.org\/10.48550\/arXiv.2106.14232","journal-title":"ACS Omega"},{"issue":"15","key":"727_CR31","doi-asserted-by":"publisher","first-page":"4166","DOI":"10.1126\/sciadv.abe4166","volume":"7","author":"P Schwaller","year":"2021","unstructured":"Schwaller P, Hoover B, Reymond J-L, Strobelt H, Laino T (2021) Extraction of organic chemistry grammar from unsupervised learning of chemical reactions. Sci Adv 7(15):4166","journal-title":"Sci Adv"},{"issue":"8","key":"727_CR32","doi-asserted-by":"publisher","first-page":"3370","DOI":"10.1021\/acs.jcim.9b00237","volume":"59","author":"K Yang","year":"2019","unstructured":"Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59(8):3370\u20133388","journal-title":"J Chem Inf Model"},{"issue":"16","key":"727_CR33","doi-asserted-by":"publisher","first-page":"8749","DOI":"10.1021\/acs.jmedchem.9b00959","volume":"63","author":"Z Xiong","year":"2019","unstructured":"Xiong Z, Wang D, Liu X, Zhong F, Wan X, Li X, Li Z, Luo X, Chen K, Jiang H (2019) Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J Med Chem 63(16):8749\u20138760","journal-title":"J Med Chem"},{"key":"727_CR34","unstructured":"Ying C, Cai T, Luo S, Zheng S, Ke G, He D, Shen Y, Liu T-Y (2021) Do transformers really perform badly for graph representation? In: Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Vaughan JW. (eds.) Advances in Neural Information Processing Systems, vol 34, pp 28877\u201328888"},{"key":"727_CR35","unstructured":"Veli\u010dkovi\u0107 P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv preprint http:\/\/arxiv.org\/abs\/1710.10903"},{"issue":"1","key":"727_CR36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-020-00479-8","volume":"13","author":"D Jiang","year":"2021","unstructured":"Jiang D, Wu Z, Hsieh C-Y, Chen G, Liao B, Wang Z, Shen C, Cao D, Wu J, Hou T (2021) Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models. J Cheminformatics 13(1):1\u201323","journal-title":"J Cheminformatics"},{"issue":"15","key":"727_CR37","doi-asserted-by":"publisher","first-page":"3503","DOI":"10.1021\/acs.jcim.2c00321","volume":"62","author":"Z Tu","year":"2022","unstructured":"Tu Z, Coley CW (2022) Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction. J Chem Inf Model 62(15):3503\u20133513","journal-title":"J Chem Inf Model"},{"issue":"9","key":"727_CR38","doi-asserted-by":"publisher","first-page":"2064","DOI":"10.1021\/acs.jcim.1c00600","volume":"62","author":"V Bagal","year":"2021","unstructured":"Bagal V, Aggarwal R, Vinod P, Priyakumar UD (2021) Molgpt: molecular generation using a transformer-decoder model. J Chem Inf Model 62(9):2064\u20132076","journal-title":"J Chem Inf Model"},{"key":"727_CR39","unstructured":"Shi Y, Zheng S, Ke G, Shen Y, You J, He J, Luo S, Liu C, He D, Liu TY (2022) Benchmarking graphormer on large-scale molecular modeling datasets. arXiv preprint http:\/\/arxiv.org\/abs\/2203.04810"},{"key":"727_CR40","first-page":"11248","volume":"33","author":"C Yan","year":"2020","unstructured":"Yan C, Ding Q, Zhao P, Zheng S, Yang J, Yu Y, Huang J (2020) Retroxpert: decompose retrosynthesis prediction like a chemist. Adv Neural Inf Process Syst 33:11248\u201311258","journal-title":"Adv Neural Inf Process Syst"},{"issue":"9","key":"727_CR41","doi-asserted-by":"publisher","first-page":"4385","DOI":"10.1021\/acs.jmedchem.6b00153","volume":"59","author":"N Schneider","year":"2016","unstructured":"Schneider N, Lowe DM, Sayle RA, Tarselli MA, Landrum GA (2016) Big data from pharmaceutical patents: a computational analysis of medicinal chemists\u2019 bread and butter. J Med Chem 59(9):4385\u20134402","journal-title":"J Med Chem"},{"key":"727_CR42","unstructured":"Chen B, Li C, Dai H, Song L (2020) Retro*: learning retrosynthetic planning with neural guided a* search. In: International Conference on Machine Learning, PMLR, pp 1608\u20131616 ."},{"key":"727_CR43","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1709.04555","author":"W Jin","year":"2017","unstructured":"Jin W, Coley C, Barzilay R, Jaakkola T (2017) Predicting organic reaction outcomes with Weisfeiler-Lehman network. Adv Neural Inf Process Syst. https:\/\/doi.org\/10.48550\/arXiv.1709.04555","journal-title":"Adv Neural Inf Process Syst"},{"key":"727_CR44","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1007\/s10822-016-9938-8","volume":"30","author":"S Kearnes","year":"2016","unstructured":"Kearnes S, McCloskey K, Berndl M, Pande V, Riley P (2016) Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 30:595\u2013608","journal-title":"J Comput Aided Mol Des"},{"key":"727_CR45","doi-asserted-by":"publisher","first-page":"3355","DOI":"10.1039\/C9SC03666K","volume":"11","author":"K Lin","year":"2020","unstructured":"Lin K, Xu Y, Pei J, Lai L (2020) Automatic retrosynthetic route planning using template-free models. Chem Sci 11:3355\u20133364","journal-title":"Chem Sci"},{"issue":"25","key":"727_CR46","doi-asserted-by":"publisher","first-page":"5966","DOI":"10.1002\/chem.201605499","volume":"23","author":"MH Segler","year":"2017","unstructured":"Segler MH, Waller MP (2017) Neural-symbolic machine learning for retrosynthesis and reaction prediction. Chem A Eur J 23(25):5966\u20135971","journal-title":"Chem A Eur J"},{"key":"727_CR47","unstructured":"Hassen AK, Torren-Peraire P, Genheden S, Verhoeven J, Preuss M, Tetko IV (2022) Mind the retrosynthesis gap: Bridging the divide between single-step and multi-step retrosynthesis prediction. In: NeurIPS 2022 AI for Science: Progress and Promises."}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00727-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-023-00727-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-023-00727-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,8]],"date-time":"2023-06-08T08:05:11Z","timestamp":1686211511000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-023-00727-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,8]]},"references-count":47,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["727"],"URL":"https:\/\/doi.org\/10.1186\/s13321-023-00727-7","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,8]]},"assertion":[{"value":"31 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 June 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"58"}}