{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T08:49:26Z","timestamp":1773391766060,"version":"3.50.1"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T00:00:00Z","timestamp":1739750400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T00:00:00Z","timestamp":1739750400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>In recent years, the integration of machine learning techniques into chemical reaction product prediction has opened new avenues for understanding and predicting the behaviour of chemical substances. The necessity for such predictive methods stems from the growing regulatory and social awareness of the environmental consequences associated with the persistence and accumulation of chemical residues. Traditional biodegradation prediction methods rely on expert knowledge to perform predictions. However, creating this expert knowledge is becoming increasingly prohibitive due to the complexity and diversity of newer datasets, leaving existing methods unable to perform predictions on these datasets. We formulate the product prediction problem as a sequence-to-sequence generation task and take inspiration from natural language processing and other reaction prediction tasks. In doing so, we reduce the need for the expensive manual creation of expert-based rules.<\/jats:p>","DOI":"10.1186\/s13321-025-00969-7","type":"journal-article","created":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T21:04:22Z","timestamp":1739826262000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Predictive modeling of biodegradation pathways using transformer architectures"],"prefix":"10.1186","volume":"17","author":[{"given":"Liam","family":"Brydon","sequence":"first","affiliation":[]},{"given":"Kunyang","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Gillian","family":"Dobbie","sequence":"additional","affiliation":[]},{"given":"Katerina","family":"Ta\u0161kova","sequence":"additional","affiliation":[]},{"given":"J\u00f6rg Simon","family":"Wicker","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,2,17]]},"reference":[{"key":"969_CR1","unstructured":"E Union (2020) Regulation (EC) No 1907\/2006 of the European Parliament and of the Council of 18 December 2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency, amending Directive 1999\/45\/EC and repealing Council Regulation (EEC) No 793\/93 and Commission Regulation (EC) No 1488\/94 as well as Council Directive 76\/769\/EEC and Commission Directives 91\/155\/EEC, 93\/67\/EEC, 93\/105\/EC and 2000\/21\/EC (Text with EEA relevance) Text with EEA relevance. Legislative Body: OP_DATPRO. http:\/\/data.europa.eu\/eli\/reg\/2006\/1907\/2020-08-24\/eng. Accessed on 12 Mar 2024"},{"key":"969_CR2","unstructured":"E Union (2012) Regulation (EU) No 528\/2012 of the European Parliament and of the Council of 22 May 2012 concerning the making available on the market and use of biocidal products Text with EEA relevance. Legislative Body: CONSIL, EP. http:\/\/data.europa.eu\/eli\/reg\/2012\/528\/oj\/eng. Accessed 2 Dec 2024"},{"issue":"Database issue","key":"969_CR3","doi-asserted-by":"publisher","first-page":"517","DOI":"10.1093\/nar\/gkj076","volume":"34","author":"LBM Ellis","year":"2006","unstructured":"Ellis LBM, Roe D, Wackett LP (2006) The University of Minnesota Biocatalysis\/Biodegradation Database: the first decade. Nucleic Acids Res 34(Database issue):517\u2013521. https:\/\/doi.org\/10.1093\/nar\/gkj076","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"969_CR4","doi-asserted-by":"publisher","first-page":"814","DOI":"10.1093\/bioinformatics\/btq024","volume":"26","author":"J Wicker","year":"2010","unstructured":"Wicker J, Fenner K, Ellis L, Wackett L, Kramer S (2010) Predicting biodegradation products and pathways: a hybrid knowledge- and machine learning-based approach. Bioinformatics 26(6):814\u2013821. https:\/\/doi.org\/10.1093\/bioinformatics\/btq024","journal-title":"Bioinformatics"},{"issue":"18","key":"969_CR5","doi-asserted-by":"publisher","first-page":"2079","DOI":"10.1093\/bioinformatics\/btn378","volume":"24","author":"K Fenner","year":"2008","unstructured":"Fenner K, Gao J, Kramer S, Ellis L, Wackett L (2008) Data-driven extraction of relative reasoning rules to limit combinatorial explosion in biodegradation pathway prediction. Bioinformatics 24(18):2079\u20132085. https:\/\/doi.org\/10.1093\/bioinformatics\/btn378","journal-title":"Bioinformatics"},{"issue":"7","key":"969_CR6","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1093\/bioinformatics\/btad407","volume":"39","author":"K Zhang","year":"2023","unstructured":"Zhang K, Fenner K (2023) enviRule: an end-to-end system for automatic extraction of reaction patterns from environmental contaminant biotransformation pathways. Bioinformatics 39(7):407. https:\/\/doi.org\/10.1093\/bioinformatics\/btad407","journal-title":"Bioinformatics"},{"issue":"1","key":"969_CR7","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1186\/s13321-018-0295-6","volume":"10","author":"N Kochev","year":"2018","unstructured":"Kochev N, Avramova S, Jeliazkova N (2018) Ambit-SMIRKS: a software module for reaction representation, reaction search and structure transformation. J Cheminform 10(1):42. https:\/\/doi.org\/10.1186\/s13321-018-0295-6","journal-title":"J Cheminform"},{"issue":"D1","key":"969_CR8","doi-asserted-by":"publisher","first-page":"502","DOI":"10.1093\/nar\/gkv1229","volume":"44","author":"J Wicker","year":"2016","unstructured":"Wicker J, Lorsbach T, G\u00fctlein M, Schmid E, Latino D, Kramer S, Fenner K (2016) enviPath\u2014the environmental contaminant biotransformation pathway resource. Nucleic Acids Res 44(D1):502\u2013508. https:\/\/doi.org\/10.1093\/nar\/gkv1229","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"969_CR9","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1186\/s13321-024-00881-6","volume":"16","author":"J Hafner","year":"2024","unstructured":"Hafner J, Lorsbach T, Schmidt S, Brydon L, Dost K, Zhang K, Fenner K, Wicker J (2024) Advancements in biotransformation pathway prediction: enhancements, datasets, and novel functionalities in enviPath. J Cheminform 16(1):93. https:\/\/doi.org\/10.1186\/s13321-024-00881-6","journal-title":"J Cheminform"},{"issue":"Database issue","key":"969_CR10","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1093\/nar\/gkp771","volume":"38","author":"J Gao","year":"2010","unstructured":"Gao J, Ellis LBM, Wackett LP (2010) The University of Minnesota Biocatalysis\/Biodegradation Database: improving public access. Nucleic Acids Res 38(Database issue):488\u2013491. https:\/\/doi.org\/10.1093\/nar\/gkp771","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"969_CR11","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1039\/c6em00697c","volume":"19","author":"DARS Latino","year":"2017","unstructured":"Latino DARS, Wicker J, G\u00fctlein M, Schmid E, Kramer S, Fenner K (2017) Eawag-Soil in enviPath: a new resource for exploring regulatory pesticide soil biodegradation pathways and half-life data. Environ Sci Process Impacts 19(3):449\u2013464. https:\/\/doi.org\/10.1039\/c6em00697c","journal-title":"Environ Sci Process Impacts"},{"issue":"8","key":"969_CR12","doi-asserted-by":"publisher","first-page":"1322","DOI":"10.1039\/D3EM00161J","volume":"25","author":"L Trostel","year":"2023","unstructured":"Trostel L, Coll C, Fenner K, Hafner J (2023) Combining predictive and analytical methods to elucidate pharmaceutical biotransformation in activated sludge. Environ Sci Process Impacts 25(8):1322\u20131336. https:\/\/doi.org\/10.1039\/D3EM00161J. (Publisher: Royal Society of Chemistry. Accessed 2024-01-29)","journal-title":"Environ Sci Process Impacts"},{"key":"969_CR13","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, vol. 30. Curran Associates, Inc., Newry. https:\/\/papers.nips.cc\/paper_files\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html. Accessed on 28 April 2023"},{"issue":"9","key":"969_CR14","doi-asserted-by":"publisher","first-page":"1572","DOI":"10.1021\/acscentsci.9b00576","volume":"5","author":"P Schwaller","year":"2019","unstructured":"Schwaller P, Laino T, Gaudin T, Bolgar P, Hunter CA, Bekas C, Lee AA (2019) Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Central Sci 5(9):1572\u20131583. https:\/\/doi.org\/10.1021\/acscentsci.9b00576","journal-title":"ACS Central Sci"},{"issue":"1","key":"969_CR15","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac3ffb","volume":"3","author":"R Irwin","year":"2022","unstructured":"Irwin R, Dimitriadis S, He J, Bjerrum EJ (2022) Chemformer: a pre-trained transformer for computational chemistry. Mach Learn Sci Technol 3(1):015022. https:\/\/doi.org\/10.1088\/2632-2153\/ac3ffb","journal-title":"Mach Learn Sci Technol"},{"issue":"1","key":"969_CR16","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1109\/JPROC.2020.3004555","volume":"109","author":"F Zhuang","year":"2021","unstructured":"Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2021) A comprehensive survey on transfer learning. Proc IEEE 109(1):43\u201376. https:\/\/doi.org\/10.1109\/JPROC.2020.3004555. (Conference Name: Proceedings of the IEEE)","journal-title":"Proc IEEE"},{"issue":"65","key":"969_CR17","doi-asserted-by":"publisher","first-page":"9368","DOI":"10.1039\/D0CC02657C","volume":"56","author":"L Wang","year":"2020","unstructured":"Wang L, Zhang C, Bai R, Li J, Duan H (2020) Heck reaction prediction using a transformer model based on a transfer learning strategy. Chem Commun 56(65):9368\u20139371. https:\/\/doi.org\/10.1039\/D0CC02657C","journal-title":"Chem Commun"},{"issue":"19","key":"969_CR18","doi-asserted-by":"publisher","first-page":"4579","DOI":"10.1021\/acs.jcim.2c00588","volume":"62","author":"Z Wu","year":"2022","unstructured":"Wu Z, Cai X, Zhang C, Qiao H, Wu Y, Zhang Y, Wang X, Xie H, Luo F, Duan H (2022) Self-supervised molecular pretraining strategy for low-resource reaction prediction scenarios. J Chem Inf Model 62(19):4579\u20134590. https:\/\/doi.org\/10.1021\/acs.jcim.2c00588","journal-title":"J Chem Inf Model"},{"issue":"1","key":"969_CR19","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28(1):31\u201336. https:\/\/doi.org\/10.1021\/ci00057a005","journal-title":"J Chem Inf Comput Sci"},{"issue":"1","key":"969_CR20","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1186\/s13321-021-00543-x","volume":"13","author":"JYC Tam","year":"2021","unstructured":"Tam JYC, Lorsbach T, Schmidt S, Wicker JS (2021) Holistic evaluation of biodegradation pathway prediction: assessing multi-step reactions and intermediate products. J Cheminform 13(1):63. https:\/\/doi.org\/10.1186\/s13321-021-00543-x. (Accessed 2023-06-18)","journal-title":"J Cheminform"},{"issue":"W1","key":"969_CR21","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1093\/nar\/gkac313","volume":"50","author":"DS Wishart","year":"2022","unstructured":"Wishart DS, Tian S, Allen D, Oler E, Peters H, Lui V, Gautam V, Djoumbou-Feunang Y, Greiner R, Metz T (2022) BioTransformer 3.0\u2014a web server for accurately predicting metabolic transformation products. Nucleic Acids Res 50(W1):115\u2013123. https:\/\/doi.org\/10.1093\/nar\/gkac313","journal-title":"Nucleic Acids Res"},{"key":"969_CR22","unstructured":"Lowe DM (2012) Extraction of chemical structures and reactions from the literature. Doctor of Philosophy (Ph.D.), University of Cambridge. http:\/\/www.dspace.cam.ac.uk\/handle\/1810\/244727. Accessed on 15 Oct 2023"},{"issue":"7","key":"969_CR23","doi-asserted-by":"publisher","first-page":"1415","DOI":"10.1039\/D0QO01636E","volume":"8","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Wang L, Wang X, Zhang C, Ge J, Tang J, Su A, Duan H (2021) Data augmentation and transfer learning strategies for reaction prediction in low chemical data regimes. Org Chem Front 8(7):1415\u20131423. https:\/\/doi.org\/10.1039\/D0QO01636E","journal-title":"Org Chem Front"},{"issue":"28","key":"969_CR24","doi-asserted-by":"publisher","first-page":"6091","DOI":"10.1039\/C8SC02339E","volume":"9","author":"P Schwaller","year":"2018","unstructured":"Schwaller P, Gaudin T, L\u00e1nyi D, Bekas C, Laino T (2018) \u201cFound in Translation\u2019\u2019: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem Sci 9(28):6091\u20136098. https:\/\/doi.org\/10.1039\/C8SC02339E","journal-title":"Chem Sci"},{"key":"969_CR25","doi-asserted-by":"publisher","unstructured":"Wicker J, Fenner K, Kramer S (2016) A hybrid machine learning and knowledge based approach to limit combinatorial explosion in biodegradation prediction. In: L\u00e4ssig J, Kersting K, Morik K (eds) Computational sustainability, pp 75\u201397. Springer, Cham. https:\/\/doi.org\/10.1007\/978-3-319-31858-5_5. Accessed on 29 Sept 2024","DOI":"10.1007\/978-3-319-31858-5_5"},{"issue":"1","key":"969_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10462-009-9124-7","volume":"33","author":"L Rokach","year":"2010","unstructured":"Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33(1):1\u201339. https:\/\/doi.org\/10.1007\/s10462-009-9124-7","journal-title":"Artif Intell Rev"},{"key":"969_CR27","doi-asserted-by":"publisher","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, K\u00f6pf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, high-performance deep learning library. arXiv. arXiv:1912.01703 [cs, stat]. https:\/\/doi.org\/10.48550\/arXiv.1912.01703 . http:\/\/arxiv.org\/abs\/1912.01703. Accessed on 28 Feb 2024","DOI":"10.48550\/arXiv.1912.01703"},{"key":"969_CR28","doi-asserted-by":"publisher","unstructured":"Brydon L, Zhang K, Dobbie G, Ta\u0161kova K, Simon Wicker J (2024) v1.0.1 EnviFormer. Zenodo. https:\/\/doi.org\/10.5281\/ZENODO.13858575","DOI":"10.5281\/ZENODO.13858575"},{"key":"969_CR29","doi-asserted-by":"publisher","unstructured":"Brydon L (2024) USPTO dataset used by enviFormer. Zenodo. https:\/\/doi.org\/10.5281\/zenodo.13858535","DOI":"10.5281\/zenodo.13858535"},{"issue":"6","key":"969_CR30","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/s10295-004-0144-7","volume":"31","author":"BK Hou","year":"2004","unstructured":"Hou BK, Ellis LBM, Wackett LP (2004) Encoding microbial metabolic logic: predicting biodegradation. J Ind Microbiol Biotechnol 31(6):261\u2013272. https:\/\/doi.org\/10.1007\/s10295-004-0144-7","journal-title":"J Ind Microbiol Biotechnol"},{"issue":"D1","key":"969_CR31","doi-asserted-by":"publisher","first-page":"498","DOI":"10.1093\/nar\/gkaa1025","volume":"49","author":"A Chang","year":"2021","unstructured":"Chang A, Jeske L, Ulbrich S, Hofmann J, Koblitz J, Schomburg I, Neumann-Schaal M, Jahn D, Schomburg D (2021) BRENDA, the ELIXIR core data resource in 2021: new developments and updates. Nucleic Acids Res 49(D1):498\u2013508. https:\/\/doi.org\/10.1093\/nar\/gkaa1025","journal-title":"Nucleic Acids Res"},{"issue":"34","key":"969_CR32","doi-asserted-by":"publisher","first-page":"4114","DOI":"10.1039\/D1CC00586C","volume":"57","author":"Y Wu","year":"2021","unstructured":"Wu Y, Zhang C, Wang L, Duan H (2021) A graph-convolutional neural network for addressing small-scale reaction prediction. Chem Commun 57(34):4114\u20134117. https:\/\/doi.org\/10.1039\/D1CC00586C","journal-title":"Chem Commun"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-00969-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-025-00969-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-00969-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T21:04:28Z","timestamp":1739826268000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-025-00969-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,17]]},"references-count":32,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["969"],"URL":"https:\/\/doi.org\/10.1186\/s13321-025-00969-7","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-5200860\/v2","asserted-by":"object"},{"id-type":"doi","id":"10.21203\/rs.3.rs-5200860\/v1","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,17]]},"assertion":[{"value":"4 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 February 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"JSW is one of the founders of enviPath UG & Co. KG, a scientific software development company that develops and maintains enviPath. The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"21"}}