{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T15:15:30Z","timestamp":1777043730202,"version":"3.51.4"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T00:00:00Z","timestamp":1774310400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"University of Z\u00fcrich"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,4,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Microbial biotransformation plays a central role in the environmental degradation of chemical contaminants, driven by the catalytic activities of diverse enzymes. However, linking specific enzymes to contaminant removal and predicting associated transformation products (TPs) under real-world conditions remain a major challenge. In this study, we present a self-supervised, contrastive fine-tuning strategy for reaction fingerprint learning, designed to improve the chemical relevance of BERT-based reaction embeddings for environmental biotransformation reactions. Specifically, we fine-tuned a BERT encoder such that the cosine similarity between its reaction fingerprints aligns with the Tanimoto similarity of traditional structure-based fingerprints.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>The resulting compact, 256-dimensional fingerprints, termed crxnfp, showed an improved ability to cluster reactions according to transformation type and focus attention on chemically meaningful reaction centers. Our crxnfp fingerprints were further validated in reaction classification tasks across multiple datasets, achieving superior or comparable performance relative to existing methods. Importantly, they enabled a similarity-based association of biotransformation rules and reactions from enviPath with enzyme annotations from the Rhea and UniProt databases, offering a scalable approach to enrich environmental biotransformation datasets with enzymatic information. Additionally, crxnfp was employed to identify specific enzyme classes involved in contaminant biotransformation, which were subsequently validated through experiments conducted in this study, achieving 91.3% accuracy at the third-level enzyme classification. The crxnfp fingerprints offer a promising solution to advance the understanding of contaminant biotransformation and guide the development of enzyme-informed strategies for contaminant management across diverse environmental contexts.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Code is available at https:\/\/github.com\/zhangky12\/crxnfp and https:\/\/github.com\/zhangky12\/crxnfp_knn.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btag142","type":"journal-article","created":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T12:37:22Z","timestamp":1774096642000},"source":"Crossref","is-referenced-by-count":0,"title":["Enzyme association for environmental biotransformation reactions through contrastive learning of reaction center-specific fingerprints"],"prefix":"10.1093","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4588-9487","authenticated-orcid":false,"given":"Kunyang","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Environmental Chemistry, Eawag , D\u00fcbendorf 8600,","place":["Switzerland"]},{"name":"Department of Chemistry, University of Z\u00fcrich , Z\u00fcrich 8057,","place":["Switzerland"]}]},{"given":"Thierry D","family":"Marti","sequence":"additional","affiliation":[{"name":"Department of Environmental Microbiology, Eawag , D\u00fcbendorf 8600,","place":["Switzerland"]},{"name":"Department of Environmental Systems Science, ETH Z\u00fcrich , Z\u00fcrich 8057,","place":["Switzerland"]}]},{"given":"Silke I","family":"Probst","sequence":"additional","affiliation":[{"name":"Department of Environmental Microbiology, Eawag , D\u00fcbendorf 8600,","place":["Switzerland"]}]},{"given":"Serina L","family":"Robinson","sequence":"additional","affiliation":[{"name":"Department of Environmental Microbiology, Eawag , D\u00fcbendorf 8600,","place":["Switzerland"]},{"name":"Department of Environmental Systems Science, ETH Z\u00fcrich , Z\u00fcrich 8057,","place":["Switzerland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6068-8220","authenticated-orcid":false,"given":"Kathrin","family":"Fenner","sequence":"additional","affiliation":[{"name":"Department of Environmental Chemistry, Eawag , D\u00fcbendorf 8600,","place":["Switzerland"]},{"name":"Department of Chemistry, University of Z\u00fcrich , Z\u00fcrich 8057,","place":["Switzerland"]}]}],"member":"286","published-online":{"date-parts":[[2026,3,24]]},"reference":[{"key":"2026042409463598700_btag142-B1","doi-asserted-by":"crossref","first-page":"D693","DOI":"10.1093\/nar\/gkab1016","article-title":"Rhea, the reaction knowledgebase in 2022","volume":"50","author":"Bansal","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2026042409463598700_btag142-B2","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1021\/acs.jcim.8b00801","article-title":"Enhancing retrosynthetic reaction prediction with deep learning using multiscale reaction classification","volume":"59","author":"Baylon","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2026042409463598700_btag142-B3","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1186\/s12859-018-2368-y","article-title":"ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature","volume":"19","author":"Dalkiran","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2026042409463598700_btag142-B4","author":"Devlin","year":"2019"},{"key":"2026042409463598700_btag142-B5","doi-asserted-by":"crossref","first-page":"752","DOI":"10.1126\/science.1236281","article-title":"Evaluating pesticide degradation in the environment: blind spots and emerging opportunities","volume":"341","author":"Fenner","year":"2013","journal-title":"Science"},{"key":"2026042409463598700_btag142-B6","doi-asserted-by":"crossref","first-page":"1541","DOI":"10.1021\/acsestwater.1c00025","article-title":"Methodological advances to study contaminant biotransformation: new prospects for understanding and reducing environmental persistence?","volume":"1","author":"Fenner","year":"2021","journal-title":"ACS ES T Water"},{"key":"2026042409463598700_btag142-B7","doi-asserted-by":"crossref","first-page":"2342","DOI":"10.3390\/jcm9082342","article-title":"In vitro assessment of fluoropyrimidine-metabolizing enzymes: dihydropyrimidine dehydrogenase, dihydropyrimidinase, and \u03b2-ureidopropionase","volume":"9","author":"Hishinuma","year":"2020","journal-title":"J Clin Med"},{"key":"2026042409463598700_btag142-B8","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1109\/TBDATA.2019.2921572","article-title":"Billion-scale similarity search with GPUs","volume":"7","author":"Johnson","year":"2021","journal-title":"IEEE Trans Big Data"},{"key":"2026042409463598700_btag142-B9","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1039\/C6EM00697C","article-title":"Eawag-Soil in enviPath: a new resource for exploring regulatory pesticide soil biodegradation pathways and half-life data","volume":"19","author":"Latino","year":"2017","journal-title":"Environ Sci Process Impacts"},{"key":"2026042409463598700_btag142-B10","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1186\/s13321-023-00732-w","article-title":"A deep learning framework for accurate reaction prediction and its application on high-throughput experimentation data","volume":"15","author":"Li","year":"2023","journal-title":"J Cheminform"},{"key":"2026042409463598700_btag142-B11","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1093\/bioinformatics\/btx680","article-title":"DEEPre: sequence-based enzyme EC number prediction by deep learning","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2026042409463598700_btag142-B12","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1021\/c160017a018","article-title":"The generation of a unique machine description for chemical structures\u2014a technique developed at chemical abstracts service","volume":"5","author":"Morgan","year":"1965","journal-title":"J Chem Doc"},{"key":"2026042409463598700_btag142-B13","doi-asserted-by":"crossref","first-page":"2153","DOI":"10.1021\/acs.biochem.4c00336","article-title":"Enzyme catalyzed formation of CoA adducts of fluorinated hexanoic acid analogues using a long-chain acyl-CoA synthetase from Gordonia sp. strain NB4-1Y","volume":"63","author":"Mothersole","year":"2024","journal-title":"Biochemistry"},{"key":"2026042409463598700_btag142-B14","doi-asserted-by":"crossref","first-page":"1245","DOI":"10.1021\/ci900043r","article-title":"APIF: a new interaction fingerprint based on atom pairs and its application to virtual screening","volume":"49","author":"P\u00e9rez-Nueno","year":"2009","journal-title":"J Chem Inf Model"},{"key":"2026042409463598700_btag142-B15","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/s13321-020-0416-x","article-title":"Visualization of very large high-dimensional data sets as minimum spanning trees","volume":"12","author":"Probst","year":"2020","journal-title":"J Cheminform"},{"key":"2026042409463598700_btag142-B16","doi-asserted-by":"crossref","first-page":"964","DOI":"10.1038\/s41467-022-28536-w","article-title":"Biocatalysed synthesis planning using data-driven learning","volume":"13","author":"Probst","year":"2022","journal-title":"Nat Commun"},{"key":"2026042409463598700_btag142-B17","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1039\/D1DD00006C","article-title":"Reaction classification and yield prediction using the differential reaction fingerprint DRFP","volume":"1","author":"Probst","year":"2022","journal-title":"Digit Discov"},{"key":"2026042409463598700_btag142-B18","author":"Probst","year":"2023"},{"key":"2026042409463598700_btag142-B19","doi-asserted-by":"crossref","first-page":"e2504122122","DOI":"10.1073\/pnas.2504122122","article-title":"Enzymatic carbon\u2013fluorine bond cleavage by human gut microbes","volume":"122","author":"Probst","year":"2025","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2026042409463598700_btag142-B20","author":"Reimers","year":"2019"},{"key":"2026042409463598700_btag142-B21","doi-asserted-by":"crossref","first-page":"100152","DOI":"10.1016\/j.wroa.2022.100152","article-title":"Microbial paracetamol degradation involves a high diversity of novel amidase enzyme candidates","volume":"16","author":"Rios-Miguel","year":"2022","journal-title":"Water Research X"},{"key":"2026042409463598700_btag142-B22","doi-asserted-by":"crossref","first-page":"1044","DOI":"10.1111\/j.1742-4658.2012.08495.x","article-title":"Kinetics of hydrolysis and mutational analysis of N,N-diethyl-m-toluamide hydrolase from Pseudomonas putida DTB","volume":"279","author":"Rivera-Cancel","year":"2012","journal-title":"FEBS J"},{"key":"2026042409463598700_btag142-B23","author":"Schmid","year":"2021"},{"key":"2026042409463598700_btag142-B24","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1021\/ci5006614","article-title":"Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity","volume":"55","author":"Schneider","year":"2015","journal-title":"J Chem Inf Model"},{"key":"2026042409463598700_btag142-B25","doi-asserted-by":"crossref","first-page":"1499","DOI":"10.1021\/ci2002318","article-title":"TFD: torsion fingerprints as a new measure to compare small molecule conformations","volume":"52","author":"Schulz-Gasch","year":"2012","journal-title":"J Chem Inf Model"},{"key":"2026042409463598700_btag142-B26","doi-asserted-by":"crossref","first-page":"eabe4166","DOI":"10.1126\/sciadv.abe4166","article-title":"Extraction of organic chemistry grammar from unsupervised learning of chemical reactions","volume":"7","author":"Schwaller","year":"2021","journal-title":"Sci Adv"},{"key":"2026042409463598700_btag142-B27","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1038\/s42256-020-00284-w","article-title":"Mapping the space of chemical reactions using attention-based neural networks","volume":"3","author":"Schwaller","year":"2021","journal-title":"Nat Mach Intell"},{"key":"2026042409463598700_btag142-B28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s43586-021-00022-5","article-title":"Automation and computer-assisted planning for chemical synthesis","volume":"1","author":"Shen","year":"2021","journal-title":"Nat Rev Methods Primers"},{"key":"2026042409463598700_btag142-B29","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1038\/s41564-022-01226-5","article-title":"Host and gut bacteria share metabolic pathways for anti-cancer drug metabolism","volume":"7","author":"Spanogiannopoulos","year":"2022","journal-title":"Nat Microbiol"},{"key":"2026042409463598700_btag142-B30","doi-asserted-by":"crossref","first-page":"805","DOI":"10.1038\/nrg1709","article-title":"Metagenomics: DNA sequencing of environmental samples","volume":"6","author":"Tringe","year":"2005","journal-title":"Nat Rev Genet"},{"key":"2026042409463598700_btag142-B31","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: a worldwide hub of protein knowledge","volume":"47","author":"UniProt Consortium","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2026042409463598700_btag142-B32","doi-asserted-by":"crossref","first-page":"1446","DOI":"10.1039\/D1SC06515G","article-title":"Improving machine learning performance on small chemical reaction data with unsupervised contrastive pretraining","volume":"13","author":"Wen","year":"2022","journal-title":"Chem Sci"},{"key":"2026042409463598700_btag142-B33","doi-asserted-by":"crossref","first-page":"D502","DOI":"10.1093\/nar\/gkv1229","article-title":"enviPath\u2014the environmental contaminant biotransformation pathway resource","volume":"44","author":"Wicker","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2026042409463598700_btag142-B34","doi-asserted-by":"crossref","first-page":"i179","DOI":"10.1093\/bioinformatics\/btp223","article-title":"E-zyme: predicting potential EC numbers from the chemical transformation pattern of substrate\u2013product pairs","volume":"25","author":"Yamanishi","year":"2009","journal-title":"Bioinformatics"},{"key":"2026042409463598700_btag142-B35","first-page":"3094","article-title":"Care: a benchmark suite for the classification and retrieval of enzymes","volume":"37","author":"Yang","year":"2024","journal-title":"Adv Neural Inf Process Syst"},{"key":"2026042409463598700_btag142-B36","doi-asserted-by":"crossref","first-page":"1358","DOI":"10.1126\/science.adf2465","article-title":"Enzyme function prediction using contrastive learning","volume":"379","author":"Yu","year":"2023","journal-title":"Science"},{"key":"2026042409463598700_btag142-B37","doi-asserted-by":"crossref","first-page":"121593","DOI":"10.1016\/j.watres.2024.121593","article-title":"Substrate promiscuity of xenobiotic-transforming hydrolases from stream biofilms impacted by treated wastewater","volume":"256","author":"Yu","year":"2024","journal-title":"Water Res"},{"key":"2026042409463598700_btag142-B38","doi-asserted-by":"crossref","first-page":"eado2957","DOI":"10.1126\/sciadv.ado2957","article-title":"Electron bifurcation and fluoride efflux systems implicated in defluorination of perfluorinated unsaturated carboxylic acids by Acetobacterium spp","volume":"10","author":"Yu","year":"2024","journal-title":"Sci Adv"},{"key":"2026042409463598700_btag142-B39","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/s13321-024-00944-8","article-title":"CLAIRE: a contrastive learning-based predictor for EC number of chemical reactions","volume":"17","author":"Zeng","year":"2025","journal-title":"J Cheminform"},{"key":"2026042409463598700_btag142-B40","doi-asserted-by":"crossref","first-page":"btad407","DOI":"10.1093\/bioinformatics\/btad407","article-title":"enviRule: an end-to-end system for automatic extraction of reaction patterns from environmental contaminant biotransformation pathways","volume":"39","author":"Zhang","year":"2023","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag142\/67485334\/btag142.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/4\/btag142\/67485334\/btag142.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/4\/btag142\/67485334\/btag142.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T13:46:48Z","timestamp":1777038408000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btag142\/8537913"}},"subtitle":[],"editor":[{"given":"Daisuke","family":"Kihara","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,3,24]]},"references-count":40,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,4,7]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btag142","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,4]]},"published":{"date-parts":[[2026,3,24]]},"article-number":"btag142"}}