{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,8]],"date-time":"2026-05-08T07:57:17Z","timestamp":1778227037540,"version":"3.51.4"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,6,12]],"date-time":"2021-06-12T00:00:00Z","timestamp":1623456000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,6,12]],"date-time":"2021-06-12T00:00:00Z","timestamp":1623456000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia e Innovaci\u00f3n","doi-asserted-by":"publisher","award":["BIO2015-72091-EXP"],"award-info":[{"award-number":["BIO2015-72091-EXP"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004837","name":"Ministerio de Ciencia e Innovaci\u00f3n","doi-asserted-by":"publisher","award":["BIO2010-22109"],"award-info":[{"award-number":["BIO2010-22109"]}],"id":[{"id":"10.13039\/501100004837","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004895","name":"European Social Fund","doi-asserted-by":"publisher","award":["BIO2010-22109"],"award-info":[{"award-number":["BIO2010-22109"]}],"id":[{"id":"10.13039\/501100004895","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Assignment of chemical compounds to biological pathways is a crucial step to understand the relationship between the chemical repertory of an organism and its biology. 
Protein sequence profiles are very successful in capturing the main structural and functional features of a protein family, and can be used to assign new members to it based on matching of their sequences against these profiles. In this work, we extend this idea to chemical compounds, constructing a profile-inspired model for a set of related metabolites (those in the same biological pathway), based on a fragment-based vectorial representation of their chemical structures.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We use this representation to predict the biological pathway of a chemical compound with good overall accuracy (AUC 0.74\u20130.90 depending on the database tested), and analyzed some factors that affect performance. The approach, which is compared with equivalent methods, can in addition detect those molecular fragments characteristic of a pathway.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>The method is available as a graphical interactive web server <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/csbg.cnb.csic.es\/iFragMent\">http:\/\/csbg.cnb.csic.es\/iFragMent<\/jats:ext-link>.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-021-04252-y","type":"journal-article","created":{"date-parts":[[2021,6,12]],"date-time":"2021-06-12T18:02:27Z","timestamp":1623520947000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Predicting biological pathways of chemical compounds with a profile-inspired 
approach"],"prefix":"10.1186","volume":"22","author":[{"given":"Javier","family":"Lopez-Iba\u00f1ez","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Florencio","family":"Pazos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6911-1591","authenticated-orcid":false,"given":"Monica","family":"Chagoyen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,6,12]]},"reference":[{"issue":"7019","key":"4252_CR1","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1038\/nature03192","volume":"432","author":"CM Dobson","year":"2004","unstructured":"Dobson CM. Chemical space and biology. Nature. 2004;432(7019):824\u20138. https:\/\/doi.org\/10.1038\/nature03192.","journal-title":"Nature"},{"issue":"2","key":"4252_CR2","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/s11030-008-9085-9","volume":"12","author":"YD Cai","year":"2008","unstructured":"Cai YD, Qian Z, Lu L, Feng KY, Meng X, Niu B, et al. Prediction of compounds\u2019 biological function (metabolic pathways) based on functional group composition. Mol Divers. 2008;12(2):131\u20137. https:\/\/doi.org\/10.1007\/s11030-008-9085-9.","journal-title":"Mol Divers"},{"issue":"8","key":"4252_CR3","doi-asserted-by":"publisher","first-page":"969","DOI":"10.2174\/092986609788923374","volume":"16","author":"J Lu","year":"2009","unstructured":"Lu J, Niu B, Liu L, Lu WC, Cai YD. Prediction of small molecules\u2019 metabolic pathways based on functional group composition. Protein Pept Lett. 2009;16(8):969\u201376.","journal-title":"Protein Pept Lett"},{"issue":"12","key":"4252_CR4","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0029491","volume":"6","author":"LL Hu","year":"2011","unstructured":"Hu LL, Chen C, Huang T, Cai YD, Chou KC. 
Predicting biological functions of compounds based on chemical-chemical interactions. PLoS ONE. 2011;6(12): e29491. https:\/\/doi.org\/10.1371\/journal.pone.0029491.","journal-title":"PLoS ONE"},{"issue":"9","key":"4252_CR5","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0045944","volume":"7","author":"YF Gao","year":"2012","unstructured":"Gao YF, Chen L, Cai YD, Feng KY, Huang T, Jiang Y. Predicting metabolic pathways of small molecules and enzymes based on interaction information of chemicals and proteins. PLoS ONE. 2012;7(9): e45944. https:\/\/doi.org\/10.1371\/journal.pone.0045944.","journal-title":"PLoS ONE"},{"issue":"2","key":"4252_CR6","doi-asserted-by":"publisher","first-page":"136","DOI":"10.2174\/1386207319666151110122453","volume":"19","author":"L Chen","year":"2016","unstructured":"Chen L, Chu C, Feng K. Predicting the types of metabolic pathway of compounds using molecular fragments and sequential minimal optimization. Comb Chem High Throughput Screen. 2016;19(2):136\u201343.","journal-title":"Comb Chem High Throughput Screen"},{"issue":"8","key":"4252_CR7","doi-asserted-by":"publisher","first-page":"2547","DOI":"10.1093\/bioinformatics\/btz954","volume":"36","author":"M Baranwal","year":"2020","unstructured":"Baranwal M, Magner A, Elvati P, Saldinger J, Violi A, Hero AO. A deep learning architecture for metabolic pathway prediction. Bioinformatics. 2020;36(8):2547\u201353.","journal-title":"Bioinformatics"},{"issue":"10","key":"4252_CR8","doi-asserted-by":"publisher","first-page":"2272","DOI":"10.1021\/ci900196u","volume":"49","author":"A Macchiarulo","year":"2009","unstructured":"Macchiarulo A, Thornton JM, Nobeli I. Mapping human metabolic pathways in the small molecule chemical space. J Chem Inf Model. 2009;49(10):2272\u201389. 
https:\/\/doi.org\/10.1021\/ci900196u.","journal-title":"J Chem Inf Model"},{"issue":"D1","key":"4252_CR9","doi-asserted-by":"publisher","first-page":"D353","DOI":"10.1093\/nar\/gkw1092","volume":"45","author":"M Kanehisa","year":"2017","unstructured":"Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017;45(D1):D353\u201361. https:\/\/doi.org\/10.1093\/nar\/gkw1092.","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"4252_CR10","doi-asserted-by":"publisher","first-page":"709","DOI":"10.1021\/ci500517v","volume":"55","author":"MA Hamdalla","year":"2015","unstructured":"Hamdalla MA, Rajasekaran S, Grant DF, Mandoiu II. Metabolic pathway predictions for metabolomics: a molecular structure matching approach. J Chem Inf Model. 2015;55(3):709\u201318. https:\/\/doi.org\/10.1021\/ci500517v.","journal-title":"J Chem Inf Model"},{"key":"4252_CR11","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological sequence analysis","author":"R Durbin","year":"1998","unstructured":"Durbin R, Eddy SR, Krogh A, Mitchison G. Biological sequence analysis. Cambridge: Cambridge University Press; 1998."},{"key":"4252_CR12","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gkx1132","author":"A Fabregat","year":"2017","unstructured":"Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, et al. The reactome pathway knowledgebase. Nucleic Acids Res. 2017. https:\/\/doi.org\/10.1093\/nar\/gkx1132.","journal-title":"Nucleic Acids Res"},{"issue":"Database issue","key":"4252_CR13","doi-asserted-by":"publisher","first-page":"D478","DOI":"10.1093\/nar\/gkt1067","volume":"42","author":"T Jewison","year":"2014","unstructured":"Jewison T, Su Y, Disfany FM, Liang Y, Knox C, Maciejewski A, et al. SMPDB 2.0: big improvements to the small molecule pathway database. Nucleic Acids Res. 2014;42(Database issue):D478\u201384. 
https:\/\/doi.org\/10.1093\/nar\/gkt1067.","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"4252_CR14","doi-asserted-by":"publisher","first-page":"D502","DOI":"10.1093\/nar\/gkv1229","volume":"44","author":"J Wicker","year":"2016","unstructured":"Wicker J, Lorsbach T, Gutlein M, Schmid E, Latino D, Kramer S, et al. enviPath\u2013The environmental contaminant biotransformation pathway resource. Nucleic Acids Res. 2016;44(D1):D502\u20138. https:\/\/doi.org\/10.1093\/nar\/gkv1229.","journal-title":"Nucleic Acids Res"},{"issue":"9","key":"4252_CR15","doi-asserted-by":"publisher","first-page":"2997","DOI":"10.1093\/nar\/10.9.2997","volume":"10","author":"GD Stormo","year":"1982","unstructured":"Stormo GD, Schneider TD, Gold L, Ehrenfeucht A. Use of the \u201cPerceptron\u201d algorithm to distinguish translational initiation sites in E. coli. Nucleic Acids Res. 1982;10(9):2997\u20133011.","journal-title":"Nucleic Acids Res"},{"issue":"13","key":"4252_CR16","doi-asserted-by":"publisher","first-page":"4355","DOI":"10.1073\/pnas.84.13.4355","volume":"84","author":"M Gribskov","year":"1987","unstructured":"Gribskov M, McLachlan AD, Eisenberg D. Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci USA. 1987;84(13):4355\u20138.","journal-title":"Proc Natl Acad Sci USA"},{"key":"4252_CR17","first-page":"47","volume":"1","author":"M Brown","year":"1993","unstructured":"Brown M, Hughey R, Krogh A, Mian IS, Sjolander K, Haussler D. Using Dirichlet mixture priors to derive hidden Markov models for protein families. Proc Int Conf Intell Syst Mol Biol. 1993;1:47\u201355.","journal-title":"Proc Int Conf Intell Syst Mol Biol"},{"issue":"5","key":"4252_CR18","doi-asserted-by":"publisher","first-page":"1501","DOI":"10.1006\/jmbi.1994.1104","volume":"235","author":"A Krogh","year":"1994","unstructured":"Krogh A, Brown M, Mian IS, Sjolander K, Haussler D. Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol. 
1994;235(5):1501\u201331. https:\/\/doi.org\/10.1006\/jmbi.1994.1104.","journal-title":"J Mol Biol"},{"issue":"4","key":"4252_CR19","doi-asserted-by":"publisher","first-page":"1201","DOI":"10.1006\/jmbi.1998.2221","volume":"284","author":"J Park","year":"1998","unstructured":"Park J, Karplus K, Barrett C, Hughey R, Haussler D, Hubbard T, et al. Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods. J Mol Biol. 1998;284(4):1201\u201310. https:\/\/doi.org\/10.1006\/jmbi.1998.2221.","journal-title":"J Mol Biol"},{"issue":"7270","key":"4252_CR20","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1038\/nature08506","volume":"462","author":"MJ Keiser","year":"2009","unstructured":"Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ, et al. Predicting new molecular targets for known drugs. Nature. 2009;462(7270):175\u201381. https:\/\/doi.org\/10.1038\/nature08506.","journal-title":"Nature"},{"issue":"7403","key":"4252_CR21","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1038\/nature11159","volume":"486","author":"E Lounkine","year":"2012","unstructured":"Lounkine E, Keiser MJ, Whitebread S, Mikhailov D, Hamon J, Jenkins JL, et al. Large-scale prediction and testing of drug activity on side-effect targets. Nature. 2012;486(7403):361\u20137. https:\/\/doi.org\/10.1038\/nature11159.","journal-title":"Nature"},{"issue":"39","key":"4252_CR22","doi-asserted-by":"publisher","first-page":"11853","DOI":"10.1021\/ja036030u","volume":"125","author":"M Hattori","year":"2003","unstructured":"Hattori M, Okuno Y, Goto S, Kanehisa M. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J Am Chem Soc. 2003;125(39):11853\u201365. 
https:\/\/doi.org\/10.1021\/ja036030u.","journal-title":"J Am Chem Soc"},{"issue":"3","key":"4252_CR23","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1002\/bies.201300153","volume":"36","author":"V de Lorenzo","year":"2014","unstructured":"de Lorenzo V. From the selfish gene to selfish metabolism: revisiting the central dogma. BioEssays. 2014;36(3):226\u201335. https:\/\/doi.org\/10.1002\/bies.201300153.","journal-title":"BioEssays"},{"issue":"D1","key":"4252_CR24","doi-asserted-by":"publisher","first-page":"D1214","DOI":"10.1093\/nar\/gkv1031","volume":"44","author":"J Hastings","year":"2016","unstructured":"Hastings J, Owen G, Dekker A, Ennis M, Kale N, Muthukrishnan V, et al. ChEBI in 2016: Improved services and an expanding collection of metabolites. Nucleic Acids Res. 2016;44(D1):D1214\u20139. https:\/\/doi.org\/10.1093\/nar\/gkv1031.","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"4252_CR25","doi-asserted-by":"publisher","first-page":"855","DOI":"10.1002\/minf.201000099","volume":"29","author":"F Ruggiu","year":"2010","unstructured":"Ruggiu F, Marcou G, Varnek A, Horvath D. ISIDA Property-labelled fragment descriptors. Mol Inform. 2010;29(12):855\u201368. https:\/\/doi.org\/10.1002\/minf.201000099.","journal-title":"Mol Inform"},{"issue":"2","key":"4252_CR26","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1038\/nbt1284","volume":"25","author":"MJ Keiser","year":"2007","unstructured":"Keiser MJ, Roth BL, Armburuster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nat Biotechnol. 
2007;25(2):197\u2013206.","journal-title":"Nat Biotechnol"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04252-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-021-04252-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04252-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,6,12]],"date-time":"2021-06-12T18:02:43Z","timestamp":1623520963000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-021-04252-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,12]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["4252"],"URL":"https:\/\/doi.org\/10.1186\/s12859-021-04252-y","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,12]]},"assertion":[{"value":"15 April 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 June 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 June 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Florencio Pazos is a member of the editorial board (Associate Editor).","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"320"}}