{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T03:44:58Z","timestamp":1774064698372,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T00:00:00Z","timestamp":1676419200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["CCF-1909536"],"award-info":[{"award-number":["CCF-1909536"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>While traditionally utilized for identifying site-specific metabolic activity within a compound to alter its interaction with a metabolizing enzyme, predicting the site-of-metabolism (SOM) is essential in analyzing the promiscuity of enzymes on substrates. The successful prediction of SOMs and the relevant promiscuous products has a wide range of applications that include creating extended metabolic models (EMMs) that account for enzyme promiscuity and the construction of novel heterologous synthesis pathways. There is therefore a need to develop generalized methods that can predict molecular SOMs for a wide range of metabolizing enzymes.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>This article develops a Graph Neural Network (GNN) model for the classification of an atom (or a bond) being an SOM. Our model, GNN-SOM, is trained on enzymatic interactions, available in the KEGG database, that span all enzyme commission numbers. We demonstrate that GNN-SOM consistently outperforms baseline machine learning models, when trained on all enzymes, on Cytochrome P450 (CYP) enzymes, or on non-CYP enzymes. We showcase the utility of GNN-SOM in prioritizing predicted enzymatic products due to enzyme promiscuity for two biological applications: the construction of EMMs and the construction of synthesis pathways.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>A python implementation of the trained SOM predictor model can be found at https:\/\/github.com\/HassounLab\/GNN-SOM.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad089","type":"journal-article","created":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T05:42:03Z","timestamp":1676439723000},"source":"Crossref","is-referenced-by-count":23,"title":["Using graph neural networks for site-of-metabolism prediction and its applications to ranking promiscuous enzymatic products"],"prefix":"10.1093","volume":"39","author":[{"given":"Vladimir","family":"Porokhin","sequence":"first","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford, MA 02155, USA"}]},{"given":"Li-Ping","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford, MA 02155, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9477-2199","authenticated-orcid":false,"given":"Soha","family":"Hassoun","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford, MA 02155, USA"},{"name":"Department of Chemical and Biological Engineering, Tufts University , Medford, MA 02155, USA"}]}],"member":"286","published-online":{"date-parts":[[2023,2,15]]},"reference":[{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1186\/s12934-019-1156-3","article-title":"Towards creating an extended metabolic model (EMM) for E. coli using enzyme promiscuity prediction and metabolomics data","volume":"18","author":"Amin","year":"2019","journal-title":"Microb. Cell Fact"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"D603","DOI":"10.1093\/nar\/gkab1106","article-title":"eQuilibrator 3.0: a database solution for thermodynamic constant estimation","volume":"50","author":"Beber","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"897","DOI":"10.1016\/j.biortech.2015.10.107","article-title":"Enhanced production of 3-hydroxypropionic acid from glucose via malonyl-CoA pathway by engineered Escherichia coli","volume":"200","author":"Cheng","year":"2016","journal-title":"Bioresour. Technol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"1146","DOI":"10.1021\/acs.jcim.9b00836","article-title":"The metabolic rainbow: deep learning phase I metabolism in five colors","volume":"60","author":"Dang","year":"2020","journal-title":"J. Chem. Inf. Model"},{"key":"2023030719292906100_","doi-asserted-by":"publisher","author":"Defferrard","year":"2016","DOI":"10.48550\/ARXIV.1606.09375"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"e1323","DOI":"10.1002\/wcms.1323","article-title":"Recent advances in the prediction of non-CYP450-mediated drug metabolism","volume":"7","author":"Dixit","year":"2017","journal-title":"WIREs Comput. Mol. Sci"},{"key":"2023030719292906100_","author":"Donti","year":"2017"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"D1229","DOI":"10.1093\/nar\/gky940","article-title":"RetroRules: a database of reaction rules for engineering biology","volume":"47","author":"Duigou","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023030719292906100_","author":"Duvenaud","year":"2015"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Patt. Recognit. Lett"},{"key":"2023030719292906100_","author":"Fey","year":"2019"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"2281","DOI":"10.1002\/cmdc.201800309","article-title":"MetScore: site of metabolism prediction beyond cytochrome P450 enzymes","volume":"13","author":"Finkelmann","year":"2018","journal-title":"ChemMedChem"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"D625","DOI":"10.1093\/nar\/gks992","article-title":"ECMDB: the E. coli metabolome database","volume":"41","author":"Guo","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1093\/bioinformatics\/btw617","article-title":"Site of metabolism prediction for oxidation reactions mediated by oxidoreductases based on chemical bond","volume":"33","author":"He","year":"2017","journal-title":"Bioinformatics"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1093\/bioinformatics\/btaa881","article-title":"Learning graph representations of biochemical networks and its application to enzymatic link prediction","volume":"37","author":"Jiang","year":"2021","journal-title":"Bioinformatics"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"e2000605","DOI":"10.1002\/biot.202000605","article-title":"A deep learning approach to evaluate the feasibility of enzymatic reactions generated by retrobiosynthesis","volume":"16","author":"Kim","year":"2021","journal-title":"Biotechnol. J"},{"key":"2023030719292906100_","volume-title":"Adam: A Method for Stochastic Optimization","author":"Kingma","year":"2014"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"2896","DOI":"10.1021\/ci400503s","article-title":"FAst MEtabolizer (FAME): a rapid and accurate predictor of sites of metabolism in multiple species by endogenous enzymes","volume":"53","author":"Kirchmair","year":"2013","journal-title":"J. Chem. Inf. Model"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"16487","DOI":"10.1021\/ja0466457","article-title":"Computational assignment of the EC numbers for genomic-scale analysis of enzymatic reactions","volume":"126","author":"Kotera","year":"2004","journal-title":"J. Am. Chem. Soc"},{"key":"2023030719292906100_","first-page":"263","article-title":"Basic review of the cytochrome p450 system","volume":"4","author":"McDonnell","year":"2013","journal-title":"J. Adv. Pract. Oncol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"904","DOI":"10.1038\/nbt.3956","article-title":"iML1515, a knowledgebase that computes Escherichia coli traits","volume":"35","author":"Monk","year":"2017","journal-title":"Nat. Biotechnol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1038\/nbt1519","article-title":"Protein promiscuity and its implications for biotechnology","volume":"27","author":"Nobeli","year":"2009","journal-title":"Nat. Biotechnol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"e1003098","DOI":"10.1371\/journal.pcbi.1003098","article-title":"Consistent estimation of Gibbs energy using component contributions","volume":"9","author":"Noor","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.ymben.2020.11.012","article-title":"Automated engineering of synthetic metabolic pathways for efficient biomanufacturing","volume":"63","author":"Otero-Muras","year":"2021","journal-title":"Metab. Eng"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"e00170","DOI":"10.1016\/j.mec.2021.e00170","article-title":"Analysis of metabolic network disruption in engineered microbial hosts due to enzyme promiscuity","volume":"12","author":"Porokhin","year":"2021","journal-title":"Metab. Eng. Commun"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"633","DOI":"10.1016\/j.jbiotec.2011.06.008","article-title":"Production of 3-hydroxypropionic acid via malonyl-CoA pathway using recombinant Escherichia coli strains","volume":"157","author":"Rathnasingh","year":"2012","journal-title":"J. Biotechnol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1021\/ml100016x","article-title":"SMARTCyp: a 2D method for prediction of cytochrome P450-Mediated drug metabolism","volume":"1","author":"Rydberg","year":"2010","journal-title":"ACS Med. Chem. Lett"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"2507","DOI":"10.1093\/bioinformatics\/btm344","article-title":"A review of feature selection techniques in bioinformatics","volume":"23","author":"Saeys","year":"2007","journal-title":"Bioinformatics"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"D495","DOI":"10.1093\/nar\/gkv1060","article-title":"ECMDB 2.0: a richer resource for understanding the biochemistry of E. coli","volume":"44","author":"Sajed","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"3522","DOI":"10.1093\/bioinformatics\/btw491","article-title":"ReactPRED: a tool to predict and analyze biochemical reactions","volume":"32","author":"Sivakumar","year":"2016","journal-title":"Bioinformatics"},{"issue":"13","key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"3484","DOI":"10.1093\/bioinformatics\/btac331","article-title":"MINE 2.0: enhanced biochemical coverage for peak identification in untargeted metabolomics","volume":"38","author":"Strutz","year":"2022","journal-title":"Bioinformatics"},{"key":"2023030719292906100_","first-page":"1260","article-title":"Enzyme promiscuity and evolution in light of cellular metabolism","volume":"287","author":"Tawfik","year":"2020","journal-title":"Wiley Online Library"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"e75370","DOI":"10.1371\/journal.pone.0075370","article-title":"Steady-state metabolite concentrations reflect a balance between maximizing enzyme efficiency and minimizing total metabolite load","volume":"8","author":"Tepper","year":"2013","journal-title":"PLoS One"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"549","DOI":"10.1016\/j.drudis.2012.01.017","article-title":"Reactions and enzymes in the metabolism of drugs and other xenobiotics","volume":"17","author":"Testa","year":"2012","journal-title":"Drug Discov. Today"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1111\/cbdd.13445","article-title":"Computational methods and tools to predict cytochrome P450 metabolism for drug discovery","volume":"93","author":"Tyzack","year":"2019","journal-title":"Chem. Biol. Drug Des"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"2257","DOI":"10.1007\/s10529-014-1600-8","article-title":"Metabolic engineering of Escherichia coli for poly(3-hydroxypropionate) production from glycerol and glucose","volume":"36","author":"Wang","year":"2014","journal-title":"Biotechnol. Lett"},{"key":"2023030719292906100_","author":"Xu","year":"2018"},{"key":"2023030719292906100_","author":"Yamada","year":"2005"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1186\/s12918-015-0241-4","article-title":"PROXIMAL: a method for prediction of xenobiotic metabolism","volume":"9","author":"Yousofshahi","year":"2015","journal-title":"BMC Syst. Biol"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1016\/j.pharmthera.2012.12.007","article-title":"Cytochrome P450 enzymes in drug metabolism: regulation of gene expression, enzyme activities, and impact of genetic variation","volume":"138","author":"Zanger","year":"2013","journal-title":"Pharmacol. Ther"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"3373","DOI":"10.1021\/ci400518g","article-title":"XenoSite: accurately predicting CYP-mediated sites of metabolism with neural networks","volume":"53","author":"Zaretzki","year":"2013","journal-title":"J. Chem. Inf. Model"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"1637","DOI":"10.1021\/ci300009z","article-title":"RS-Predictor models augmented with SMARTCyp reactivities: robust metabolic regioselectivity predictions for nine CYP isozymes","volume":"52","author":"Zaretzki","year":"2012","journal-title":"J. Chem. Inf. Model"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/s40649-019-0069-y","article-title":"Graph convolutional networks: a comprehensive review","volume":"6","author":"Zhang","year":"2019","journal-title":"Comput. Soc. Netw"},{"key":"2023030719292906100_","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.aiopen.2021.01.001","article-title":"Graph neural networks: a review of methods and applications","volume":"1","author":"Zhou","year":"2020","journal-title":"AI Open"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad089\/49196853\/btad089.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/3\/btad089\/49436629\/btad089.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/3\/btad089\/49436629\/btad089.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,25]],"date-time":"2023-03-25T08:50:12Z","timestamp":1679734212000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad089\/7039680"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,2,15]]},"references-count":44,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad089","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,3,1]]},"published":{"date-parts":[[2023,2,15]]},"article-number":"btad089"}}