{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T13:59:36Z","timestamp":1770818376923,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T00:00:00Z","timestamp":1649721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1909536"],"award-info":[{"award-number":["1909536"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1908617"],"award-info":[{"award-number":["1908617"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"NIGMS of the National Institutes of Health","award":["R01GM132391"],"award-info":[{"award-number":["R01GM132391"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,5,13]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Despite experimental and curation efforts, the extent of enzyme promiscuity on substrates continues to be largely unexplored and under documented. Providing computational tools for the exploration of the enzyme\u2013substrate interaction space can expedite experimentation and benefit applications such as constructing synthesis pathways for novel biomolecules, identifying products of metabolism on ingested compounds, and elucidating xenobiotic metabolism. 
Recommender systems (RS), which are currently unexplored for the enzyme\u2013substrate interaction prediction problem, can be utilized to provide enzyme recommendations for substrates, and vice versa. The performance of Collaborative-Filtering (CF) RSs, however, hinges on the quality of embedding vectors of users and items (enzymes and substrates in our case). Importantly, enhancing CF embeddings with heterogeneous auxiliary data, especially relational data (e.g. hierarchical, pairwise or groupings), remains a challenge.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose an innovative general RS framework, termed Boost-RS, that enhances RS performance by \u2018boosting\u2019 embedding vectors through auxiliary data. Specifically, Boost-RS is trained and dynamically tuned on multiple relevant auxiliary learning tasks. Boost-RS utilizes contrastive learning tasks to exploit relational data. To show the efficacy of Boost-RS for the enzyme\u2013substrate interaction prediction problem, we apply the Boost-RS framework to several baseline CF models. We show that each of our auxiliary tasks boosts learning of the embedding vectors, and that contrastive learning using Boost-RS outperforms attribute concatenation and multi-label learning. We also show that Boost-RS outperforms similarity-based models. Ablation studies and visualization of learned representations highlight the importance of using contrastive learning on some of the auxiliary data in boosting the embedding vectors.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>A Python implementation for Boost-RS is provided at https:\/\/github.com\/HassounLab\/Boost-RS. 
The enzyme-substrate interaction data is available from the KEGG database (https:\/\/www.genome.jp\/kegg\/).<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btac201","type":"journal-article","created":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T13:24:41Z","timestamp":1649769881000},"page":"2832-2838","source":"Crossref","is-referenced-by-count":1,"title":["Boost-RS: boosted embeddings for recommender systems and its application to enzyme\u2013substrate interaction prediction"],"prefix":"10.1093","volume":"38","author":[{"given":"Xinmeng","family":"Li","sequence":"first","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford, MA 02155, USA"}]},{"given":"Li-Ping","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford, MA 02155, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9477-2199","authenticated-orcid":false,"given":"Soha","family":"Hassoun","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Tufts University , Medford, MA 02155, USA"},{"name":"Department of Chemical and Biological Engineering, Tufts University , Medford, MA 02155, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,4,12]]},"reference":[{"key":"2023020109082412300_btac201-B1","first-page":"802","author":"Acun","year":"2021"},{"key":"2023020109082412300_btac201-B2","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1093\/bib\/bbz157","article-title":"Machine learning approaches and databases for prediction of drug\u2013target interaction: a survey paper","volume":"22","author":"Bagherian","year":"2021","journal-title":"Brief. 
Bioinform"},{"key":"2023020109082412300_btac201-B3","author":"Belharbi","year":"2016"},{"key":"2023020109082412300_btac201-B4","doi-asserted-by":"crossref","first-page":"766","DOI":"10.1016\/j.tibtech.2019.12.024","article-title":"Synthetic biochemistry: the bio-inspired cell-free approach to commodity chemical production","volume":"38","author":"Bowie","year":"2020","journal-title":"Trends Biotechnol"},{"key":"2023020109082412300_btac201-B5","doi-asserted-by":"crossref","first-page":"1273","DOI":"10.1021\/ci010132r","article-title":"Reoptimization of MDL keys for use in drug discovery","volume":"42","author":"Durant","year":"2002","journal-title":"J. Chem. Inf. Comput. Sci"},{"key":"2023020109082412300_btac201-B6","author":"Gao","year":"2018"},{"key":"2023020109082412300_btac201-B7","first-page":"173","author":"He","year":"2017"},{"key":"2023020109082412300_btac201-B8","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1093\/bioinformatics\/btaa881","article-title":"Learning graph representations of biochemical networks and its application to enzymatic link prediction","volume":"37","author":"Jiang","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020109082412300_btac201-B9","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"KEGG: Kyoto Encyclopedia of Genes and Genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023020109082412300_btac201-B10","doi-asserted-by":"crossref","first-page":"D457","DOI":"10.1093\/nar\/gkv1070","article-title":"KEGG as a reference resource for gene and protein annotation","volume":"44","author":"Kanehisa","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023020109082412300_btac201-B11","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1146\/annurev-biochem-030409-143718","article-title":"Enzyme promiscuity: a mechanistic and evolutionary 
perspective","volume":"79","author":"Khersonsky","year":"2010","journal-title":"Annu. Rev. Biochem"},{"key":"2023020109082412300_btac201-B12","volume-title":"ICLR, San Diego, CA, United States","author":"Kingma","year":"2015"},{"key":"2023020109082412300_btac201-B13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1752-0509-7-S6-S2","article-title":"KCF-S: KEGG chemical function and substructure for improved interpretability and prediction in chemical bioinformatics","volume":"7","author":"Kotera","year":"2013","journal-title":"BMC Syst. Biol"},{"key":"2023020109082412300_btac201-B14","doi-asserted-by":"crossref","first-page":"e1005135","DOI":"10.1371\/journal.pcbi.1005135","article-title":"Large-scale off-target identification using fast and accurate dual regularized one-class collaborative filtering and its application to drug repurposing","volume":"12","author":"Lim","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023020109082412300_btac201-B15","doi-asserted-by":"crossref","first-page":"e1004760","DOI":"10.1371\/journal.pcbi.1004760","article-title":"Neighborhood regularized logistic matrix factorization for drug\u2013target interaction prediction","volume":"12","author":"Liu","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023020109082412300_btac201-B16","first-page":"9977","article-title":"Loss-balanced task weighting to reduce negative transfer in multi-task learning","volume":"33","author":"Liu","year":"2019","journal-title":"Proc. AAAI Conf. Artif. Intell"},{"key":"2023020109082412300_btac201-B18","doi-asserted-by":"crossref","first-page":"518","DOI":"10.1021\/acssynbio.5b00294","article-title":"Semisupervised gaussian process for automated enzyme search","volume":"5","author":"Mellor","year":"2016","journal-title":"ACS Synth. 
Biol"},{"key":"2023020109082412300_btac201-B19","article-title":"Probabilistic matrix factorization","author":"Mnih","year":"2008"},{"key":"2023020109082412300_btac201-B20","doi-asserted-by":"crossref","first-page":"e00170","DOI":"10.1016\/j.mec.2021.e00170","article-title":"Analysis of metabolic network disruption in engineered microbial hosts due to enzyme promiscuity","volume":"12","author":"Porokhin","year":"2021","journal-title":"Metab. Eng. Commun"},{"key":"2023020109082412300_btac201-B21","first-page":"821","article-title":"SYGMA: combining expert knowledge and empirical scoring in the prediction of metabolites","volume":"3","author":"Ridder","year":"2008","journal-title":"ChemMedChem Chem. Enabling Drug Discov"},{"key":"2023020109082412300_btac201-B22","doi-asserted-by":"crossref","first-page":"866","DOI":"10.1038\/nrm2805","article-title":"Exploring protein fitness landscapes by directed evolution","volume":"10","author":"Romero","year":"2009","journal-title":"Nat. Rev. Mol. Cell Biol"},{"key":"2023020109082412300_btac201-B23","doi-asserted-by":"crossref","first-page":"100879","DOI":"10.1016\/j.elerap.2019.100879","article-title":"Research commentary on recommendations with side information: a survey and research directions","volume":"37","author":"Sun","year":"2019","journal-title":"Electron. Commer. Res. Appl"},{"key":"2023020109082412300_btac201-B24","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1111\/cbdd.13445","article-title":"Computational methods and tools to predict cytochrome p450 metabolism for drug discovery","volume":"93","author":"Tyzack","year":"2019","journal-title":"Chem. Biol. Drug Des"},{"key":"2023020109082412300_btac201-B25","first-page":"2579","article-title":"Visualizing data using T-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J. Mach. Learn. 
Res"},{"key":"2023020109082412300_btac201-B27","doi-asserted-by":"crossref","first-page":"2017","DOI":"10.1093\/bioinformatics\/btab054","article-title":"Enzyme promiscuity prediction using hierarchy-informed multi-label classification","volume":"37","author":"Visani","year":"2021","journal-title":"Bioinformatics"},{"key":"2023020109082412300_btac201-B28","first-page":"165","author":"Wang","year":"2019"},{"key":"2023020109082412300_btac201-B29","doi-asserted-by":"crossref","first-page":"e0251162","DOI":"10.1371\/journal.pone.0251162","article-title":"Multitask feature learning approach for knowledge graph enhanced recommendations with ripplenet","volume":"16","author":"Wang","year":"2021","journal-title":"PLoS One"},{"key":"2023020109082412300_btac201-B30","volume-title":"Enzyme Nomenclature 1992. Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes","year":"1992"},{"key":"2023020109082412300_btac201-B31","first-page":"207","article-title":"Distance metric learning for large margin nearest neighbor classification","volume":"10","author":"Weinberger","year":"2009","journal-title":"J. Mach. Learn. 
Res"},{"key":"2023020109082412300_btac201-B32","first-page":"3203","volume-title":"IJCAI International Joint Conference on Artificial Intelligence, Melbourne, Australia","author":"Xue","year":"2017"},{"key":"2023020109082412300_btac201-B33","doi-asserted-by":"crossref","first-page":"3474","DOI":"10.1093\/bioinformatics\/btaa157","article-title":"A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks","volume":"36","author":"Zhang","year":"2020","journal-title":"Bioinformatics"},{"key":"2023020109082412300_btac201-B34","first-page":"1025","author":"Zheng","year":"2013"},{"key":"2023020109082412300_btac201-B35","first-page":"1409","author":"Zhu","year":"2017"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btac201\/43404396\/btac201.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/10\/2832\/49010108\/btac201.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/10\/2832\/49010108\/btac201.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T21:01:57Z","timestamp":1675285317000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/10\/2832\/6567355"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,4,12]]},"references-count":33,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2022,5,13]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btac201","relation":{},"ISSN":["
1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,5,15]]},"published":{"date-parts":[[2022,4,12]]}}}