{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T13:17:09Z","timestamp":1770470229065,"version":"3.49.0"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2020,10,14]],"date-time":"2020-10-14T00:00:00Z","timestamp":1602633600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1909536"],"award-info":[{"award-number":["1909536"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1850358"],"award-info":[{"award-number":["1850358"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1908617"],"award-info":[{"award-number":["1908617"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01GM132391"],"award-info":[{"award-number":["R01GM132391"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The complete characterization of enzymatic activities between molecules remains incomplete, hindering biological engineering and limiting biological discovery. We develop in this work a technique, enzymatic link prediction (ELP), for predicting the likelihood of an enzymatic transformation between two molecules. ELP models enzymatic reactions cataloged in the KEGG database as a graph. ELP is innovative over prior works in using graph embedding to learn molecular representations that capture not only molecular and enzymatic attributes but also graph connectivity.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We explore transductive (test nodes included in the training graph) and inductive (test nodes not part of the training graph) learning models. We show that ELP achieves high AUC when learning node embeddings using both graph connectivity and node attributes. Further, we show that graph embedding improves link prediction by 30% in area under curve over fingerprint-based similarity approaches and by 8% over support vector machines. We compare ELP against rule-based methods. We also evaluate ELP for predicting links in pathway maps and for reconstruction of edges in reaction networks of four common gut microbiota phyla: actinobacteria, bacteroidetes, firmicutes and proteobacteria. To emphasize the importance of graph embedding in the context of biochemical networks, we illustrate how graph embedding can guide visualization.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The code and datasets are available through https:\/\/github.com\/HassounLab\/ELP.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa881","type":"journal-article","created":{"date-parts":[[2020,9,29]],"date-time":"2020-09-29T21:35:55Z","timestamp":1601415355000},"page":"793-799","source":"Crossref","is-referenced-by-count":10,"title":["Learning graph representations of biochemical networks and its application to enzymatic link prediction"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4260-282X","authenticated-orcid":false,"given":"Julie","family":"Jiang","sequence":"first","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford 02155, USA"}]},{"given":"Li-Ping","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford 02155, USA"}]},{"given":"Soha","family":"Hassoun","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford 02155, USA"},{"name":"Department of Chemical and Biological Engineering, Tufts University , Medford 02155, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,10,14]]},"reference":[{"key":"2023051705194417500_btaa881-B1","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1016\/j.cbpa.2011.03.008","article-title":"Toward mechanistic classification of enzyme functions","volume":"15","author":"Almonacid","year":"2011","journal-title":"Curr. Opin. Chem. Biol"},{"key":"2023051705194417500_btaa881-B2","doi-asserted-by":"crossref","first-page":"1616","DOI":"10.1109\/TKDE.2018.2807452","article-title":"A comprehensive survey of graph embedding: problems, techniques, and applications","volume":"30","author":"Cai","year":"2018","journal-title":"IEEE Trans. Knowl. Data Eng"},{"key":"2023051705194417500_btaa881-B3","first-page":"5119","article-title":"Learning graph representations with embedding propagation","author":"Garc\u00eda-Dur\u00e1n","year":"2017"},{"key":"2023051705194417500_btaa881-B4","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1021\/ci010132r","article-title":"Reoptimization of mdl keys for use in drug discovery","volume":"42","author":"Durant","year":"2002","journal-title":"J. Chem. Inform. Comput. Sci"},{"key":"2023051705194417500_btaa881-B5","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1016\/j.knosys.2018.03.022","article-title":"Graph embedding techniques, applications, and performance: a survey","volume":"151","author":"Goyal","year":"2018","journal-title":"Knowl. Based Syst"},{"key":"2023051705194417500_btaa881-B6","first-page":"855","author":"Grover","year":"2016"},{"key":"2023051705194417500_btaa881-B7","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1016\/j.tibtech.2007.03.002","article-title":"Enzyme promiscuity: mechanism and applications","volume":"25","author":"Hult","year":"2007","journal-title":"Trends Biotechnol"},{"key":"2023051705194417500_btaa881-B8","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"Kegg: Kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023051705194417500_btaa881-B9","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1146\/annurev-biochem-030409-143718","article-title":"Enzyme promiscuity: a mechanistic and evolutionary perspective","volume":"79","author":"Khersonsky","year":"2010","journal-title":"Annu. Rev. Biochem"},{"key":"2023051705194417500_btaa881-B10","doi-asserted-by":"crossref","first-page":"D1202","DOI":"10.1093\/nar\/gkv951","article-title":"PubChem substance and compound databases","volume":"44","author":"Kim","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023051705194417500_btaa881-B11","doi-asserted-by":"crossref","first-page":"16487","DOI":"10.1021\/ja0466457","article-title":"Computational assignment of the EC numbers for genomic-scale analysis of enzymatic reactions","volume":"126","author":"Kotera","year":"2004","journal-title":"J. Am. Chem. Soc"},{"key":"2023051705194417500_btaa881-B12","doi-asserted-by":"crossref","first-page":"2335","DOI":"10.1021\/ci800213g","article-title":"Eliciting possible reaction equations and metabolic pathways involving orphan metabolites","volume":"48","author":"Kotera","year":"2008","journal-title":"J. Chem. Inform. Model"},{"key":"2023051705194417500_btaa881-B13","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/1752-0509-7-S6-S2","article-title":"KCF-S: KEGG Chemical Function and Substructure for improved interpretability and prediction in chemical bioinformatics","volume":"7","author":"Kotera","year":"2013","journal-title":"BMC Syst. Biol"},{"key":"2023051705194417500_btaa881-B14","doi-asserted-by":"crossref","first-page":"i135","DOI":"10.1093\/bioinformatics\/btt244","article-title":"Supervised de novo reconstruction of metabolic pathways from metabolome-scale compound sets","volume":"29","author":"Kotera","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051705194417500_btaa881-B15","doi-asserted-by":"crossref","first-page":"i165","DOI":"10.1093\/bioinformatics\/btu265","article-title":"Metabolome-scale prediction of intermediate compounds in multistep metabolic pathways with a recursive supervised approach","volume":"30","author":"Kotera","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051705194417500_btaa881-B16","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.pisc.2014.02.003","article-title":"Predictive genomic and metabolomic analysis for the standardization of enzyme data","volume":"1","author":"Kotera","year":"2014","journal-title":"Perspect. Sci"},{"key":"2023051705194417500_btaa881-B17","doi-asserted-by":"crossref","DOI":"10.2174\/0929867325666181101115314","article-title":"Survey of similarity-based prediction of drug-protein interactions","author":"Kurgan","year":"2018","journal-title":"Curr Med Chem"},{"key":"2023051705194417500_btaa881-B18","doi-asserted-by":"crossref","first-page":"5051","DOI":"10.1016\/j.ces.2004.09.021","article-title":"Computational discovery of biochemical routes to specialty chemicals","volume":"59","author":"Li","year":"2004","journal-title":"Chem. Eng. Sci"},{"key":"2023051705194417500_btaa881-B19","doi-asserted-by":"crossref","first-page":"1019","DOI":"10.1002\/asi.20591","article-title":"The link-prediction problem for social networks","volume":"58","author":"Liben-Nowell","year":"2007","journal-title":"J. Am. Soc. Inform. Sci. Technol"},{"key":"2023051705194417500_btaa881-B20","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Maaten","year":"2008","journal-title":"J. Mach. Learn. Res"},{"key":"2023051705194417500_btaa881-B21","doi-asserted-by":"crossref","first-page":"929","DOI":"10.1105\/tpc.113.122242","article-title":"Systematic structural characterization of metabolites in Arabidopsis via candidate substrate-product pair networks","volume":"26","author":"Morreel","year":"2014","journal-title":"Plant Cell"},{"key":"2023051705194417500_btaa881-B22","doi-asserted-by":"crossref","first-page":"970","DOI":"10.1016\/j.jmb.2019.01.013","article-title":"Discovery and characterization of fmn-binding \u03b2-glucuronidases in the human gut microbiome","volume":"431","author":"Pellock","year":"2019","journal-title":"J. Mol. Biol"},{"key":"2023051705194417500_btaa881-B23","first-page":"701","author":"Perozzi","year":"2014"},{"key":"2023051705194417500_btaa881-B24","doi-asserted-by":"crossref","first-page":"1016","DOI":"10.1093\/bioinformatics\/btu760","article-title":"Efficient searching and annotation of metabolic networks using chemical similarity","volume":"31","author":"Pertusi","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051705194417500_btaa881-B25","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1038\/nmeth.2803","article-title":"Ec-blast: a tool to automatically search and compare enzyme reactions","volume":"11","author":"Rahman","year":"2014","journal-title":"Nat. Methods"},{"key":"2023051705194417500_btaa881-B26","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1126\/science.1073374","article-title":"Hierarchical organization of modularity in metabolic networks","volume":"297","author":"Ravasz","year":"2002","journal-title":"Science"},{"key":"2023051705194417500_btaa881-B27","doi-asserted-by":"crossref","first-page":"14","DOI":"10.3390\/microorganisms7010014","article-title":"What is the healthy gut microbiota composition? A changing ecosystem across age, environment, diet, and diseases","volume":"7","author":"Rinninella","year":"2019","journal-title":"Microorganisms"},{"key":"2023051705194417500_btaa881-B28","doi-asserted-by":"crossref","first-page":"6118","DOI":"10.1002\/chem.201604556","article-title":"Modelling chemical reasoning to predict and invent reactions","volume":"23","author":"Segler","year":"2017","journal-title":"Chemistry"},{"key":"2023051705194417500_btaa881-B29","doi-asserted-by":"crossref","first-page":"3522","DOI":"10.1093\/bioinformatics\/btw491","article-title":"ReactPRED: a tool to predict and analyze biochemical reactions","volume":"32","author":"Sivakumar","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051705194417500_btaa881-B30","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1186\/s12859-018-2248-5","article-title":"Simcal: a flexible tool to compute biochemical reaction similarity","volume":"19","author":"Sivakumar","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051705194417500_btaa881-B31","doi-asserted-by":"crossref","first-page":"i278","DOI":"10.1093\/bioinformatics\/btw260","article-title":"Simultaneous prediction of enzyme orthologs from chemical transformation patterns for de novo metabolic pathway reconstruction","volume":"32","author":"Tabei","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051705194417500_btaa881-B32","author":"Tang","year":"2015"},{"key":"2023051705194417500_btaa881-B33","doi-asserted-by":"crossref","first-page":"i161","DOI":"10.1093\/bioinformatics\/btv224","article-title":"Metabolome-scale de novo pathway reconstruction using regioisomer-sensitive graph alignments","volume":"31","author":"Yamanishi","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051705194417500_btaa881-B34","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1016\/j.ymben.2011.01.006","article-title":"Probabilistic pathway construction","volume":"13","author":"Yousofshahi","year":"2011","journal-title":"Metabol. Eng"},{"key":"2023051705194417500_btaa881-B35","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1186\/s12918-015-0241-4","article-title":"PROXIMAL: a method for prediction of xenobiotic metabolism","volume":"9","author":"Yousofshahi","year":"2015","journal-title":"BMC Syst. Biol"},{"key":"2023051705194417500_btaa881-B36","year":"2019"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa881\/34841253\/btaa881.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/6\/793\/50356282\/btaa881.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/6\/793\/50356282\/btaa881.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T05:21:03Z","timestamp":1684300863000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/6\/793\/5922818"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,10,14]]},"references-count":36,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2021,5,5]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa881","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,3,15]]},"published":{"date-parts":[[2020,10,14]]}}}