{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:29:09Z","timestamp":1773271749320,"version":"3.50.1"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2020,10,20]],"date-time":"2020-10-20T00:00:00Z","timestamp":1603152000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100008762","name":"Genome Canada","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100008762","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000233","name":"Genome British Columbia","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000233","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Compute\/Calcul Canada"},{"name":"UBC four-year doctoral fellowship","award":["4YF"],"award-info":[{"award-number":["4YF"]}]},{"name":"UBC Graduate Program in Bioinformatics"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Metabolic pathway reconstruction from genomic sequence information is a key step in predicting regulatory and functional potential of cells at the individual, population and community levels of organization. Although the most common methods for metabolic pathway reconstruction are gene-centric e.g. mapping annotated proteins onto known pathways using a reference database, pathway-centric methods based on heuristics or machine learning to infer pathway presence provide a powerful engine for hypothesis generation in biological systems. Such methods rely on rule sets or rich feature information that may not be known or readily accessible.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we present pathway2vec, a software package consisting of six representational learning modules used to automatically generate features for pathway inference. Specifically, we build a three-layered network composed of compounds, enzymes and pathways, where nodes within a layer manifest inter-interactions and nodes between layers manifest betweenness interactions. This layered architecture captures relevant relationships used to learn a neural embedding-based low-dimensional space of metabolic features. We benchmark pathway2vec performance based on node-clustering, embedding visualization and pathway prediction using MetaCyc as a trusted source. In the pathway prediction task, results indicate that it is possible to leverage embeddings to improve prediction outcomes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The software package and installation instructions are published on http:\/\/github.com\/pathway2vec.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa906","type":"journal-article","created":{"date-parts":[[2020,10,8]],"date-time":"2020-10-08T15:23:26Z","timestamp":1602170606000},"page":"822-829","source":"Crossref","is-referenced-by-count":12,"title":["Leveraging heterogeneous network embedding for metabolic pathway prediction"],"prefix":"10.1093","volume":"37","author":[{"given":"Abdur Rahman","family":"M A Basher","sequence":"first","affiliation":[{"name":"Graduate Program in Bioinformatics, University of British Columbia , Vancouver, BC V6T 1Z3, Canada"}]},{"given":"Steven J","family":"Hallam","sequence":"additional","affiliation":[{"name":"Graduate Program in Bioinformatics, University of British Columbia , Vancouver, BC V6T 1Z3, Canada"},{"name":"Department of Microbiology & Immunology, University of British Columbia , Vancouver, BC V6T 1Z3, Canada"},{"name":"Genome Science and Technology Program, University of British Columbia , Vancouver, BC V6T 1Z3, Canada"},{"name":"Life Sciences Institute, University of British Columbia , Vancouver, BC V6T 1Z3, Canada"},{"name":"ECOSCOPE Training Program, University of British Columbia , Vancouver, BC V6T 1Z3, Canada"}]}],"member":"286","published-online":{"date-parts":[[2020,10,20]]},"reference":[{"key":"2023051705194444100_btaa906-B1","first-page":"265","volume-title":"12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16)","author":"Abadi","year":"2016"},{"key":"2023051705194444100_btaa906-B2","doi-asserted-by":"crossref","first-page":"e1002358","DOI":"10.1371\/journal.pcbi.1002358","article-title":"Metabolic reconstruction for metagenomic data and its application to the human microbiome","volume":"8","author":"Abubucker","year":"2012","journal-title":"PLoS Comput. Biol"},{"key":"2023051705194444100_btaa906-B3","first-page":"9180","volume-title":"Advances in Neural Information Processing Systems","author":"Abu-El-Haija","year":"2018"},{"key":"2023051705194444100_btaa906-B4","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/j.nbt.2008.12.009","article-title":"Next-generation DNA sequencing techniques","volume":"25","author":"Ansorge","year":"2009","journal-title":"N. Biotechnol"},{"key":"2023051705194444100_btaa906-B5","first-page":"1027","volume-title":"Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms","author":"Arthur","year":"2007"},{"key":"2023051705194444100_btaa906-B7","doi-asserted-by":"crossref","first-page":"2153","DOI":"10.1093\/bioinformatics\/bty065","article-title":"Selenzyme: enzyme selection tool for pathway design","volume":"34","author":"Carbonell","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051705194444100_btaa906-B8","doi-asserted-by":"crossref","first-page":"lb192","DOI":"10.1096\/fasebj.30.1_supplement.lb192","article-title":"BioCyc: online resource for genome and metabolic pathway analysis","volume":"30","author":"Caspi","year":"2016","journal-title":"FASEB J"},{"key":"2023051705194444100_btaa906-B9","doi-asserted-by":"crossref","first-page":"D471","DOI":"10.1093\/nar\/gkv1164","article-title":"The metaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway\/genome databases","volume":"44","author":"Caspi","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023051705194444100_btaa906-B10","first-page":"1321","volume-title":"International Conference on Machine Learning","author":"Cohen","year":"2019"},{"key":"2023051705194444100_btaa906-B11","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/1471-2105-11-15","article-title":"Machine learning methods for metabolic pathway prediction","volume":"11","author":"Dale","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023051705194444100_btaa906-B12","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1145\/3097983.3098036","volume-title":"Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Dong","year":"2017"},{"key":"2023051705194444100_btaa906-B13","doi-asserted-by":"crossref","first-page":"3013","DOI":"10.1021\/cr950057h","article-title":"Structure- function relationships of alternative nitrogenases","volume":"96","author":"Eady","year":"1996","journal-title":"Chem. Rev"},{"key":"2023051705194444100_btaa906-B14","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.physrep.2009.11.002","article-title":"Community detection in graphs","volume":"486","author":"Fortunato","year":"2010","journal-title":"Phys. Rep"},{"key":"2023051705194444100_btaa906-B15","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1145\/3132847.3132953","volume-title":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","author":"Fu","year":"2017"},{"key":"2023051705194444100_btaa906-B16","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1145\/2939672.2939754","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Grover","year":"2016"},{"key":"2023051705194444100_btaa906-B17","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1145\/2339530.2339723","volume-title":"Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Henderson","year":"2012"},{"key":"2023051705194444100_btaa906-B18","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1145\/3269206.3271777","volume-title":"Proceedings of the 27th ACM International Conference on Information and Knowledge Management","author":"Hussein","year":"2018"},{"key":"2023051705194444100_btaa906-B19","doi-asserted-by":"crossref","first-page":"e1002981","DOI":"10.1371\/journal.pcbi.1002981","article-title":"Probabilistic inference of biochemical reactions in microbial communities from metagenomic sequences","volume":"9","author":"Jiao","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023051705194444100_btaa906-B20","doi-asserted-by":"crossref","first-page":"D353","DOI":"10.1093\/nar\/gkw1092","article-title":"KEGG: new perspectives on genomes, pathways, diseases and drugs","volume":"45","author":"Kanehisa","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023051705194444100_btaa906-B21","doi-asserted-by":"crossref","first-page":"877","DOI":"10.1093\/bib\/bbv079","article-title":"Pathway tools version 19.0 update: software for pathway\/genome informatics and systems biology","volume":"17","author":"Karp","year":"2016","journal-title":"Brief. Bioinform"},{"key":"2023051705194444100_btaa906-B22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1128\/ecosalplus.ESP-0006-2018","article-title":"The EcoCyc Database","volume":"8","author":"Karp","year":"2018","journal-title":"EcoSal Plus"},{"key":"2023051705194444100_btaa906-B23","doi-asserted-by":"crossref","first-page":"725","DOI":"10.1038\/s41579-019-0255-9","article-title":"Common principles and best practices for engineering microbiomes","volume":"17","author":"Lawson","year":"2019","journal-title":"Nat. Rev. Microbiol"},{"key":"2023051705194444100_btaa906-B6","doi-asserted-by":"crossref","first-page":"e1008174","DOI":"10.1371\/journal.pcbi.1008174","article-title":"Metabolic pathway inference using multi-label classification with rich pathway features","volume":"16","author":"M.A.Basher","year":"2020","journal-title":"PLoS Comput. Biol"},{"key":"2023051705194444100_btaa906-B24","doi-asserted-by":"crossref","first-page":"861","DOI":"10.21105\/joss.00861","article-title":"UMAP: uniform manifold approximation and projection","volume":"3","author":"McInnes","year":"2018","journal-title":"J. Open Source Softw"},{"key":"2023051705194444100_btaa906-B25","first-page":"3111","volume-title":"Advances in Neural Information Processing Systems","author":"Mikolov","year":"2013"},{"key":"2023051705194444100_btaa906-B26","doi-asserted-by":"crossref","first-page":"8577","DOI":"10.1073\/pnas.0601602103","article-title":"Modularity and community structure in networks","volume":"103","author":"Newman","year":"2006","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051705194444100_btaa906-B27","article-title":"Geom-GCN: geometric graph convolutional networks","author":"Pei","year":"2020","journal-title":"In International Conference on Learning Representations, Addis Ababa, Ethiopia."},{"key":"2023051705194444100_btaa906-B28","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1145\/2623330.2623732","volume-title":"Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Perozzi","year":"2014"},{"key":"2023051705194444100_btaa906-B29","doi-asserted-by":"crossref","first-page":"e1003918","DOI":"10.1371\/journal.pcbi.1003918","article-title":"BiomeNet: a Bayesian model for inference of metabolic divergence among microbial communities","volume":"10","author":"Shafiei","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023051705194444100_btaa906-B30","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1109\/TKDE.2016.2598561","article-title":"A survey of heterogeneous information network analysis","volume":"29","author":"Shi","year":"2017","journal-title":"IEEE Trans. Knowl. Data Eng"},{"key":"2023051705194444100_btaa906-B31","doi-asserted-by":"crossref","first-page":"992","DOI":"10.14778\/3402707.3402736","article-title":"PathSim: meta path-based top-K similarity search in heterogeneous information networks","volume":"4","author":"Sun","year":"2011","journal-title":"Proc. VLDB Endow"},{"key":"2023051705194444100_btaa906-B32","doi-asserted-by":"crossref","first-page":"i278","DOI":"10.1093\/bioinformatics\/btw260","article-title":"Simultaneous prediction of enzyme orthologs from chemical transformation patterns for de novo metabolic pathway reconstruction","volume":"32","author":"Tabei","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051705194444100_btaa906-B33","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1038\/s42003-019-0440-4","article-title":"Combined network analysis and machine learning allows the prediction of metabolic pathways from tomato metabolomics data","volume":"2","author":"Toubiana","year":"2019","journal-title":"Commun. Biol"},{"key":"2023051705194444100_btaa906-B34","doi-asserted-by":"crossref","first-page":"1225","DOI":"10.1145\/2939672.2939753","volume-title":"Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining","author":"Wang","year":"2016"},{"key":"2023051705194444100_btaa906-B35","doi-asserted-by":"crossref","first-page":"e1000465","DOI":"10.1371\/journal.pcbi.1000465","article-title":"A parsimony approach to biological pathway reconstruction\/inference for genomes and metagenomes","volume":"5","author":"Ye","year":"2009","journal-title":"PLoS Comput. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa906\/34841286\/btaa906.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/6\/822\/50356644\/btaa906.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/6\/822\/50356644\/btaa906.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,15]],"date-time":"2024-08-15T14:37:17Z","timestamp":1723732637000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/6\/822\/5932372"}},"subtitle":[],"editor":[{"given":"Cowen","family":"Lenore","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,10,20]]},"references-count":35,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2021,5,5]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa906","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.02.20.940205","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,3,15]]},"published":{"date-parts":[[2020,10,20]]}}}