{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T00:58:42Z","timestamp":1773881922012,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T00:00:00Z","timestamp":1687564800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s H2020 research and innovation program"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Transformation products (TPs) of man-made chemicals, formed through microbially mediated transformation in the environment, can have serious adverse environmental effects, yet the analytical identification of TPs is challenging. Rule-based prediction tools are successful in predicting TPs, especially in environmental chemistry applications that typically have to rely on small datasets, by imparting the existing knowledge on enzyme-mediated biotransformation reactions. However, the rules extracted from biotransformation reaction databases usually face the issue of being over\/under-generalized and are not flexible to be updated with new reactions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed an automatic rule extraction tool called enviRule. It clusters biotransformation reactions into different groups based on the similarities of reaction fingerprints, and then automatically extracts and generalizes rules for each reaction group in SMARTS format. It optimizes the genericity of automatic rules against the downstream TP prediction task. Models trained with automatic rules outperformed the models trained with manually curated rules by 30% in the area under curve (AUC) scores. Moreover, automatic rules can be easily updated with new reactions, highlighting enviRule\u2019s strengths for both automatic extraction of optimized reactions rules and automated updating thereof.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>enviRule code is freely available at https:\/\/github.com\/zhangky12\/enviRule.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad407","type":"journal-article","created":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T17:32:04Z","timestamp":1687627924000},"source":"Crossref","is-referenced-by-count":12,"title":["enviRule: an end-to-end system for automatic extraction of reaction patterns from environmental contaminant biotransformation pathways"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-4588-9487","authenticated-orcid":false,"given":"Kunyang","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Environmental Chemistry, Eawag , D\u00fcbendorf 8600, Switzerland"},{"name":"Department of Chemistry, University of Z\u00fcrich , Z\u00fcrich 8057, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6068-8220","authenticated-orcid":false,"given":"Kathrin","family":"Fenner","sequence":"additional","affiliation":[{"name":"Department of Environmental Chemistry, Eawag , D\u00fcbendorf 8600, Switzerland"},{"name":"Department of Chemistry, University of Z\u00fcrich , Z\u00fcrich 8057, Switzerland"}]}],"member":"286","published-online":{"date-parts":[[2023,6,24]]},"reference":[{"key":"2023070604461972600_btad407-B1","doi-asserted-by":"crossref","first-page":"D633","DOI":"10.1093\/nar\/gkx935","article-title":"The MetaCyc database of metabolic pathways and enzymes","volume":"46","author":"Caspi","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B2","doi-asserted-by":"crossref","first-page":"434","DOI":"10.1021\/acscentsci.7b00064","article-title":"Prediction of organic reaction outcomes using machine learning","volume":"3","author":"Coley","year":"2017","journal-title":"ACS Cent Sci"},{"key":"2023070604461972600_btad407-B3","doi-asserted-by":"crossref","first-page":"11737","DOI":"10.1021\/es503425w","article-title":"Environmental designer drugs: when transformation may not eliminate risk","volume":"48","author":"Cwiertny","year":"2014","journal-title":"Environ Sci Technol"},{"key":"2023070604461972600_btad407-B4","doi-asserted-by":"crossref","first-page":"579","DOI":"10.1016\/j.copbio.2008.10.004","article-title":"Systems biology approaches to bioremediation","volume":"19","author":"de Lorenzo","year":"2008","journal-title":"Curr Opin Biotechnol"},{"key":"2023070604461972600_btad407-B5","doi-asserted-by":"crossref","first-page":"W477","DOI":"10.1093\/nar\/gkaa230","article-title":"novoPathFinder: a webserver of designing novel-pathway with integrating GEM-model","volume":"48","author":"Ding","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B6","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/s13321-018-0324-5","article-title":"BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification","volume":"11","author":"Djoumbou-Feunang","year":"2019","journal-title":"J Cheminform"},{"key":"2023070604461972600_btad407-B7","doi-asserted-by":"crossref","first-page":"D1229","DOI":"10.1093\/nar\/gky940","article-title":"RetroRules: a database of reaction rules for engineering biology","volume":"47","author":"Duigou","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B8","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1021\/ci010132r","article-title":"Reoptimization of MDL keys for use in drug discovery","volume":"42","author":"Durant","year":"2002","journal-title":"J Chem Inf Comput Sci"},{"key":"2023070604461972600_btad407-B9","doi-asserted-by":"crossref","first-page":"2572","DOI":"10.1021\/acs.jcim.9b00249","article-title":"Comparing molecular patterns using the example of SMARTS: applications and filter collection analysis","volume":"59","author":"Ehmki","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2023070604461972600_btad407-B10","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1093\/nar\/27.1.373","article-title":"The University of Minnesota Biocatalysis\/Biodegradation Database: specialized metabolism for functional genomics","volume":"27","author":"Ellis","year":"1999","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B11","doi-asserted-by":"crossref","first-page":"D517","DOI":"10.1093\/nar\/gkj076","article-title":"The University of Minnesota Biocatalysis\/Biodegradation Database: the first decade","volume":"34","author":"Ellis","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B12","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.watres.2016.03.045","article-title":"Identification of transformation products of antiviral drugs formed during biological wastewater treatment and their occurrence in the urban water cycle","volume":"98","author":"Funke","year":"2016","journal-title":"Water Res"},{"key":"2023070604461972600_btad407-B13","doi-asserted-by":"crossref","first-page":"D488","DOI":"10.1093\/nar\/gkp771","article-title":"The University of Minnesota Biocatalysis\/Biodegradation Database: improving public access","volume":"38","author":"Gao","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B14","doi-asserted-by":"crossref","first-page":"6621","DOI":"10.1021\/es100970m","article-title":"High-throughput identification of microbial transformation products of organic micropollutants","volume":"44","author":"Helbling","year":"2010","journal-title":"Environ Sci Technol"},{"key":"2023070604461972600_btad407-B15","doi-asserted-by":"crossref","first-page":"155","DOI":"10.2174\/1386207024607338","article-title":"Grouping of coefficients for the calculation of inter-molecular similarity and dissimilarity using 2D fragment bit-strings","volume":"5","author":"Holliday","year":"2002","journal-title":"Comb Chem High Throughput Screen"},{"key":"2023070604461972600_btad407-B16","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1007\/s11101-015-9448-7","article-title":"Dereplication strategies in natural product research: how many tools and methodologies behind the same concept?","volume":"16","author":"Hubert","year":"2017","journal-title":"Phytochem Rev"},{"key":"2023070604461972600_btad407-B17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13321-015-0087-1","article-title":"MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics","volume":"7","author":"Jeffryes","year":"2015","journal-title":"J Cheminform"},{"key":"2023070604461972600_btad407-B18","doi-asserted-by":"crossref","first-page":"2100","DOI":"10.1039\/c0em00238k","article-title":"A tiered procedure for assessing the formation of biotransformation products of pharmaceuticals and biocides during activated sludge treatment","volume":"12","author":"Kern","year":"2010","journal-title":"J Environ Monit"},{"key":"2023070604461972600_btad407-B19","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1039\/C6EM00697C","article-title":"Eawag-Soil in enviPath: a new resource for exploring regulatory pesticide soil biodegradation pathways and half-life data","volume":"19","author":"Latino","year":"2017","journal-title":"Environ Sci Process Impacts"},{"key":"2023070604461972600_btad407-B20","doi-asserted-by":"crossref","first-page":"5051","DOI":"10.1016\/j.ces.2004.09.021","article-title":"Computational discovery of biochemical routes to specialty chemicals","volume":"59","author":"Li","year":"2004","journal-title":"Chem Eng Sci"},{"key":"2023070604461972600_btad407-B21","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1021\/acs.estlett.2c00446","article-title":"GREENER pharmaceuticals for more sustainable healthcare","volume":"9","author":"Moermond","year":"2022","journal-title":"Environ Sci Technol Lett"},{"key":"2023070604461972600_btad407-B22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-022-29238-z","article-title":"Expanding biochemical knowledge and illuminating metabolic dark matter with ATLASx","volume":"13","author":"MohammadiPeyhani","year":"2022","journal-title":"Nat Commun"},{"key":"2023070604461972600_btad407-B24","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/j.ymben.2021.02.006","article-title":"Curating a comprehensive set of enzymatic reaction rules for efficient novel biosynthetic pathway design","volume":"65","author":"Ni","year":"2021","journal-title":"Metab Eng"},{"key":"2023070604461972600_btad407-B25","doi-asserted-by":"crossref","first-page":"22691","DOI":"10.1007\/s11356-016-7398-2","article-title":"Microbial biotransformation of furosemide for environmental risk assessment: identification of metabolites and toxicological evaluation","volume":"23","author":"Olvera-Vargas","year":"2016","journal-title":"Environ Sci Pollut Res Int"},{"key":"2023070604461972600_btad407-B26","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1038\/nmeth.2803","article-title":"EC-BLAST: a tool to automatically search and compare enzyme reactions","volume":"11","author":"Rahman","year":"2014","journal-title":"Nat Methods"},{"key":"2023070604461972600_btad407-B27","doi-asserted-by":"crossref","first-page":"2065","DOI":"10.1093\/bioinformatics\/btw096","article-title":"Reaction decoder tool (RDT): extracting features from chemical reactions","volume":"32","author":"Rahman","year":"2016","journal-title":"Bioinformatics"},{"key":"2023070604461972600_btad407-B28","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1007\/s10994-011-5256-5","article-title":"Classifier chains for multi-label classification","volume":"85","author":"Read","year":"2011","journal-title":"Mach Learn"},{"key":"2023070604461972600_btad407-B29","first-page":"1","article-title":"Meka: a multi-label\/multi-target extension to Weka","volume":"17","author":"Read","year":"2016","journal-title":"J Mach Learn Res"},{"key":"2023070604461972600_btad407-B30","doi-asserted-by":"crossref","first-page":"48","DOI":"10.2533\/chimia.2023.48","article-title":"Can AI help improve water quality? Towards the prediction of degradation of micropollutants in wastewater","volume":"77","author":"Satoh","year":"2023","journal-title":"Chimia"},{"key":"2023070604461972600_btad407-B31","doi-asserted-by":"crossref","first-page":"2560","DOI":"10.1021\/acs.jcim.9b00250","article-title":"Comparing molecular patterns using the example of SMARTS: theory and algorithms","volume":"59","author":"Schmidt","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2023070604461972600_btad407-B32","doi-asserted-by":"crossref","first-page":"1572","DOI":"10.1021\/acscentsci.9b00576","article-title":"Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction","volume":"5","author":"Schwaller","year":"2019","journal-title":"ACS Cent Sci"},{"key":"2023070604461972600_btad407-B33","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1038\/s42256-020-00284-w","article-title":"Mapping the space of chemical reactions using attention-based neural networks","volume":"3","author":"Schwaller","year":"2021","journal-title":"Nat Mach Intell"},{"key":"2023070604461972600_btad407-B34","doi-asserted-by":"crossref","first-page":"015016","DOI":"10.1088\/2632-2153\/abc81d","article-title":"Prediction of chemical reaction yields using deep learning","volume":"2","author":"Schwaller","year":"2021","journal-title":"Mach Learn Sci Technol"},{"key":"2023070604461972600_btad407-B35","doi-asserted-by":"crossref","first-page":"5966","DOI":"10.1002\/chem.201605499","article-title":"Neural-symbolic machine learning for retrosynthesis and reaction prediction","volume":"23","author":"Segler","year":"2017","journal-title":"Chemistry"},{"key":"2023070604461972600_btad407-B36","doi-asserted-by":"crossref","first-page":"102722","DOI":"10.1016\/j.copbio.2022.102722","article-title":"Computational tools and resources for designing new pathways to small molecules","volume":"76","author":"Sveshnikova","year":"2022","journal-title":"Curr Opin Biotechnol"},{"key":"2023070604461972600_btad407-B37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13321-021-00543-x","article-title":"Holistic evaluation of biodegradation pathway prediction: assessing multi-step reactions and intermediate products","volume":"13","author":"Tam","year":"2021","journal-title":"J Cheminform"},{"key":"2023070604461972600_btad407-B38","doi-asserted-by":"crossref","first-page":"e01536","DOI":"10.1128\/AEM.01536-18","article-title":"Blame it on the metabolite: 3,5-dichloroaniline rather than the parent compound is responsible for the decreasing diversity and function of soil microorganisms","volume":"84","author":"Vasileiadis","year":"2018","journal-title":"Appl Environ Microbiol"},{"key":"2023070604461972600_btad407-B39","doi-asserted-by":"crossref","first-page":"814","DOI":"10.1093\/bioinformatics\/btq024","article-title":"Predicting biodegradation products and pathways: a hybrid knowledge- and machine learning-based approach","volume":"26","author":"Wicker","year":"2010","journal-title":"Bioinformatics"},{"key":"2023070604461972600_btad407-B40","doi-asserted-by":"crossref","first-page":"D502","DOI":"10.1093\/nar\/gkv1229","article-title":"enviPath\u2014the environmental contaminant biotransformation pathway resource","volume":"44","author":"Wicker","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B41","doi-asserted-by":"crossref","first-page":"W115","DOI":"10.1093\/nar\/gkac313","article-title":"BioTransformer 3.0\u2014a web server for accurately predicting metabolic transformation products","volume":"50","author":"Wishart","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2023070604461972600_btad407-B42","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1145\/507338.507355","article-title":"Data mining: practical machine learning tools and techniques with java implementations","volume":"31","author":"Witten","year":"2002","journal-title":"SIGMOD Rec"},{"key":"2023070604461972600_btad407-B43","doi-asserted-by":"crossref","first-page":"462","DOI":"10.1038\/s41586-019-1291-3","article-title":"Mapping human microbiome drug metabolism by gut bacteria and their genes","volume":"570","author":"Zimmermann","year":"2019","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad407\/50696368\/btad407.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/7\/btad407\/50827500\/btad407.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/7\/btad407\/50827500\/btad407.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,6]],"date-time":"2023-07-06T04:47:11Z","timestamp":1688618831000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad407\/7206883"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,6,24]]},"references-count":42,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2023,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad407","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,7,1]]},"published":{"date-parts":[[2023,6,24]]},"article-number":"btad407"}}