{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T13:33:05Z","timestamp":1762522385698},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"S17","license":[{"start":{"date-parts":[[2012,12,1]],"date-time":"2012-12-01T00:00:00Z","timestamp":1354320000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2012,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Prediction of biochemical (metabolic) pathways has a wide range of applications, including the optimization of drug candidates, and the elucidation of toxicity mechanisms. Recently, several methods have been developed for pathway prediction to derive a goal compound from a start compound. However, these methods require high computational costs, and cannot perform comprehensive prediction of novel metabolic pathways. Our aim of this study is to develop a <jats:italic>de novo<\/jats:italic> prediction method for reconstructions of metabolic pathways and predictions of unknown biosynthetic pathways in the sense that it does not require any initial network such as KEGG metabolic network to be explored.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We formulated pathway prediction between a start compound and a goal compound as the shortest path search problem in terms of the number of enzyme reactions applied. We propose an efficient search method based on A* algorithm and heuristic techniques utilizing Linear Programming (LP) solution for estimation of the distance to the goal. First, a chemical compound is represented by a feature vector which counts frequencies of substructure occurrences in the structural formula. Second, an enzyme reaction is represented as an operator vector by detecting the structural changes to compounds before and after the reaction. By defining compound vectors as nodes and operator vectors as edges, prediction of the reaction pathway is reduced to the shortest path search problem in the vector space. In experiments on the DDT degradation pathway, we verify that the shortest paths predicted by our method are biologically correct pathways registered in the KEGG database. The results also demonstrate that the LP heuristics can achieve significant reduction in computation time. Furthermore, we apply our method to a secondary metabolite pathway of plant origin, and successfully find a novel biochemical pathway which cannot be predicted by the existing method. For the reconstruction of a known biochemical pathway, our method is over 40 times as fast as the existing method.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Our method enables fast and accurate <jats:italic>de novo<\/jats:italic> pathway predictions and novel pathway detection.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-13-s17-s8","type":"journal-article","created":{"date-parts":[[2019,12,11]],"date-time":"2019-12-11T02:00:38Z","timestamp":1576029638000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["An efficient algorithm for de novo predictions of biochemical pathways between chemical compounds"],"prefix":"10.1186","volume":"13","author":[{"given":"Masaomi","family":"Nakamura","sequence":"first","affiliation":[]},{"given":"Tsuyoshi","family":"Hachiya","sequence":"additional","affiliation":[]},{"given":"Yutaka","family":"Saito","sequence":"additional","affiliation":[]},{"given":"Kengo","family":"Sato","sequence":"additional","affiliation":[]},{"given":"Yasubumi","family":"Sakakibara","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,12,13]]},"reference":[{"key":"5472_CR1","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1186\/1752-0509-4-35","volume":"4","author":"A Cho","year":"2010","unstructured":"Cho A, Yun H, Park J, Lee S, Park S: Prediction of novel synthetic pathways for the production of desired chemicals. BMC Systems Biology. 2010, 4: 35-10.1186\/1752-0509-4-35.","journal-title":"BMC Systems Biology"},{"issue":"2","key":"5472_CR2","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1038\/nrd728","volume":"1","author":"J Nicholson","year":"2002","unstructured":"Nicholson J, Connelly J, Lindon J, Holmes E: Metabonomics: a platform for studying drug toxicity and gene function. Nature Reviews Drug Discovery. 2002, 1 (2): 153-162. 10.1038\/nrd728.","journal-title":"Nature Reviews Drug Discovery"},{"issue":"3","key":"5472_CR3","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1038\/nrmicro2717","volume":"10","author":"M Medema","year":"2012","unstructured":"Medema M, van Raaphorst R, Takano E, Breitling R: Computational tools for the synthetic design of biochemical pathways. Nature Reviews Microbiology. 2012, 10 (3): 191-202. 10.1038\/nrmicro2717.","journal-title":"Nature Reviews Microbiology"},{"issue":"0","key":"5472_CR4","doi-asserted-by":"publisher","first-page":"736","DOI":"10.2197\/ipsjdc.3.736","volume":"3","author":"Y Tohsato","year":"2007","unstructured":"Tohsato Y, Nishimura Y: Metabolic pathway alignment based on similarity between chemical structures. IPSJ Digital Courier. 2007, 3 (0): 736-745.","journal-title":"IPSJ Digital Courier"},{"issue":"12","key":"5472_CR5","doi-asserted-by":"publisher","first-page":"2335","DOI":"10.1021\/ci800213g","volume":"48","author":"M Kotera","year":"2008","unstructured":"Kotera M, McDonald A, Boyce S, Tipton K: Eliciting possible reaction equations and metabolic pathways involving orphan metabolites. Journal of Chemical Information and Modeling. 2008, 48 (12): 2335-2349. 10.1021\/ci800213g.","journal-title":"Journal of Chemical Information and Modeling"},{"issue":"23","key":"5472_CR6","doi-asserted-by":"publisher","first-page":"3135","DOI":"10.1093\/bioinformatics\/btp549","volume":"25","author":"M Leber","year":"2009","unstructured":"Leber M, Egelhofer V, Schomburg I, Schomburg D: Automatic assignment of reaction operators to enzymatic reactions. Bioinformatics. 2009, 25 (23): 3135-3142. 10.1093\/bioinformatics\/btp549.","journal-title":"Bioinformatics"},{"issue":"2","key":"5472_CR7","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E: The Chemistry development kit (CDK): An open-source Java library for chemo-and bioinformatics. Journal of chemical information and computer sciences. 2003, 43 (2): 493-500. 10.1021\/ci025584y.","journal-title":"Journal of chemical information and computer sciences"},{"key":"5472_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-1-12","volume":"1","author":"S Rahman","year":"2009","unstructured":"Rahman S, Bashton M, Holliday G, Schrader R, Thornton J: Small molecule subgraph detector (SMSD) toolkit. Journal of cheminformatics. 2009, 1: 1-13. 10.1186\/1758-2946-1-1.","journal-title":"Journal of cheminformatics"},{"issue":"3","key":"5472_CR9","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1021\/ci00031a005","volume":"21","author":"J McGregor","year":"1981","unstructured":"McGregor J, Willett P: Use of a maximum common subgraph algorithm in the automatic identification of ostensible bond changes occurring in chemical reactions. Journal of Chemical Information and Computer Sciences. 1981, 21 (3): 137-140. 10.1021\/ci00031a005.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"issue":"3","key":"5472_CR10","doi-asserted-by":"publisher","first-page":"542","DOI":"10.1021\/ci050011h","volume":"45","author":"M Stahl","year":"2005","unstructured":"Stahl M, Mauser H: Database clustering with a combination of fingerprint and maximum common substructure methods. Journal of chemical information and modeling. 2005, 45 (3): 542-548. 10.1021\/ci050011h.","journal-title":"Journal of chemical information and modeling"},{"issue":"6","key":"5472_CR11","doi-asserted-by":"publisher","first-page":"639","DOI":"10.1021\/ci00010a009","volume":"32","author":"Y Takahashi","year":"1992","unstructured":"Takahashi Y, Sukekawa M, Sasaki S: Automatic identification of molecular similarity using reduced-graph representation of chemical structure. Journal of chemical information and computer sciences. 1992, 32 (6): 639-643. 10.1021\/ci00010a009.","journal-title":"Journal of chemical information and computer sciences"},{"key":"5472_CR12","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1021\/c160016a007","volume":"5","author":"E Sussenguth Jr","year":"1965","unstructured":"Sussenguth E: A graph-theoretic algorithm for matching chemical structures. Journal of Chemical Documentation. 1965, 5: 36-43. 10.1021\/c160016a007.","journal-title":"Journal of Chemical Documentation"},{"key":"5472_CR13","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1023\/A:1016387816342","volume":"16","author":"J Raymond","year":"2002","unstructured":"Raymond J, Willett P: Effectiveness of graph-based and fingerprint-based similarity measures for virtual screening of 2D chemical structure databases. Journal of computer-aided molecular design. 2002, 16: 59-71. 10.1023\/A:1016387816342.","journal-title":"Journal of computer-aided molecular design"},{"issue":"7","key":"5472_CR14","doi-asserted-by":"publisher","first-page":"521","DOI":"10.1023\/A:1021271615909","volume":"16","author":"J Raymond","year":"2002","unstructured":"Raymond J, Willett P: Maximum common subgraph isomorphism algorithms for the matching of chemical structures. Journal of computer-aided molecular design. 2002, 16 (7): 521-533. 10.1023\/A:1021271615909.","journal-title":"Journal of computer-aided molecular design"},{"issue":"2","key":"5472_CR15","doi-asserted-by":"publisher","first-page":"305","DOI":"10.1021\/ci010381f","volume":"42","author":"J Raymond","year":"2002","unstructured":"Raymond J, Gardiner E, Willett P: Heuristics for similarity searching of chemical graphs using a maximum common edge subgraph algorithm. Journal of chemical information and computer sciences. 2002, 42 (2): 305-316. 10.1021\/ci010381f.","journal-title":"Journal of chemical information and computer sciences"},{"issue":"13","key":"5472_CR16","doi-asserted-by":"publisher","first-page":"i366","DOI":"10.1093\/bioinformatics\/btn186","volume":"24","author":"Y Cao","year":"2008","unstructured":"Cao Y, Jiang T, Girke T: A maximum common substructure-based algorithm for searching and predicting drug-like compounds. Bioinformatics. 2008, 24 (13): i366-10.1093\/bioinformatics\/btn186.","journal-title":"Bioinformatics"},{"issue":"8","key":"5472_CR17","doi-asserted-by":"publisher","first-page":"1603","DOI":"10.1093\/bioinformatics\/bti213","volume":"21","author":"V Hatzimanikatis","year":"2005","unstructured":"Hatzimanikatis V, Li C, Ionita J, Henry C, Jankowski M, Broadbelt L: Exploring the diversity of complex metabolic networks. Bioinformatics. 2005, 21 (8): 1603-1609. 10.1093\/bioinformatics\/bti213.","journal-title":"Bioinformatics"},{"issue":"22-23","key":"5472_CR18","doi-asserted-by":"publisher","first-page":"5051","DOI":"10.1016\/j.ces.2004.09.021","volume":"59","author":"C Li","year":"2004","unstructured":"Li C, Henry C, Jankowski M, Ionita J, Hatzimanikatis V, Broadbelt L: Computational discovery of biochemical routes to specialty chemicals. Chemical engineering science. 2004, 59 (22-23): 5051-5060. 10.1016\/j.ces.2004.09.021.","journal-title":"Chemical engineering science"},{"issue":"6","key":"5472_CR19","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/s10295-004-0144-7","volume":"31","author":"B Hou","year":"2004","unstructured":"Hou B, Ellis L, Wackett L: Encoding microbial metabolic logic: predicting biodegradation. Journal of industrial microbiology & biotechnology. 2004, 31 (6): 261-272.","journal-title":"Journal of industrial microbiology & biotechnology"},{"issue":"3","key":"5472_CR20","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1016\/S0169-409X(02)00011-X","volume":"54","author":"J Langowski","year":"2002","unstructured":"Langowski J, Long A: Computer systems for the prediction of xenobiotic metabolism. Advanced drug delivery reviews. 2002, 54 (3): 407-415. 10.1016\/S0169-409X(02)00011-X.","journal-title":"Advanced drug delivery reviews"},{"issue":"4","key":"5472_CR21","doi-asserted-by":"publisher","first-page":"1702","DOI":"10.1021\/ci700006f","volume":"47","author":"M Oh","year":"2007","unstructured":"Oh M, Yamada T, Hattori M, Goto S, Kanehisa M: Systematic analysis of enzyme-catalyzed reaction patterns and prediction of microbial biodegradation pathways. Journal of chemical information and modeling. 2007, 47 (4): 1702-1712. 10.1021\/ci700006f.","journal-title":"Journal of chemical information and modeling"},{"issue":"6","key":"5472_CR22","doi-asserted-by":"publisher","first-page":"1326","DOI":"10.1021\/ci00022a015","volume":"34","author":"J Talafous","year":"1994","unstructured":"Talafous J, Sayre L, Mieyal J, Klopman G: META. 2. A dictionary model of mammalian xenobiotic metabolism. Journal of chemical information and computer sciences. 1994, 34 (6): 1326-1333. 10.1021\/ci00022a015.","journal-title":"Journal of chemical information and computer sciences"},{"key":"5472_CR23","doi-asserted-by":"crossref","unstructured":"Moriya Y, Shigemizu D, Hattori M, Tokimatsu T, Kotera M, Goto S, Kanehisa M: PathPred: an enzyme-catalyzed metabolic pathway prediction server. Nucleic acids research. 2010, W138-W143. 38 Web Server","DOI":"10.1093\/nar\/gkq318"},{"key":"5472_CR24","doi-asserted-by":"crossref","unstructured":"Gao J, Ellis L, Wackett L: The university of Minnesota pathway prediction system: multi-level prediction and visualization. Nucleic acids research. 2011, W406-W411. 39 Web Server","DOI":"10.1093\/nar\/gkr200"},{"issue":"27","key":"5472_CR25","doi-asserted-by":"publisher","first-page":"9930","DOI":"10.1021\/ja051586y","volume":"127","author":"J Gonzalez-Lergier","year":"2005","unstructured":"Gonzalez-Lergier J, Broadbelt L, Hatzimanikatis V: Theoretical considerations and computational analysis of the complexity in polyketide synthesis pathways. Journal of the American Chemical Society. 2005, 127 (27): 9930-9938. 10.1021\/ja051586y.","journal-title":"Journal of the American Chemical Society"},{"issue":"suppl 1","key":"5472_CR26","doi-asserted-by":"publisher","first-page":"i468","DOI":"10.1093\/bioinformatics\/bti1012","volume":"21","author":"Y Yamanishi","year":"2005","unstructured":"Yamanishi Y, Vert J, Kanehisa M: Supervised enzyme network inference from the integration of genomic data and chemical information. Bioinformatics. 2005, 21 (suppl 1): i468-i477. 10.1093\/bioinformatics\/bti1012.","journal-title":"Bioinformatics"},{"key":"5472_CR27","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1038\/msb4100155","volume":"3","author":"A Feist","year":"2007","unstructured":"Feist A, Henry C, Reed J, Krummenacker M, Joyce A, Karp P, Broadbelt L, Hatzimanikatis V, Palsson B: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Molecular Systems Biology. 2007, 3: 121-","journal-title":"Molecular Systems Biology"},{"key":"5472_CR28","doi-asserted-by":"publisher","first-page":"D109","DOI":"10.1093\/nar\/gkr988","volume":"40","author":"M Kanehisa","year":"2012","unstructured":"Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M: KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Research. 2012, 40: D109-D114. 10.1093\/nar\/gkr988.","journal-title":"Nucleic Acids Research"},{"key":"5472_CR29","unstructured":"KEGG PATHWAY Database. [http:\/\/www.kegg.jp\/kegg\/pathway.html]"},{"issue":"Supple 1","key":"5472_CR30","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1093\/bioinformatics\/bti1055","volume":"21","author":"SJ Swamidass","year":"2005","unstructured":"Swamidass SJ, Chen J, Bruand J, Phung P, Ralaivola L, Baldi P: Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity. Bioinformatics. 2005, 21 (Supple 1): 359-368.","journal-title":"Bioinformatics"},{"issue":"15","key":"5472_CR31","doi-asserted-by":"publisher","first-page":"2004","DOI":"10.1093\/bioinformatics\/btm266","volume":"23","author":"N Nagamine","year":"2007","unstructured":"Nagamine N, Sakakibara Y: Statistical prediction of protein chemical interactions based on chemical structure and mass spectrometry data. Bioinformatics. 2007, 23 (15): 2004-2012. 10.1093\/bioinformatics\/btm266.","journal-title":"Bioinformatics"},{"key":"5472_CR32","volume-title":"Bioinformatics","author":"Y Sakakibara","year":"2012","unstructured":"Sakakibara Y, Hachiya T, Uchida M, Nagamine N, Sugawara Y, Yokota M, Nakamura M, Popendorf K, Komori T, Sato K: COPICAT: A software system for predicting interactions between proteins and chemical compounds. Bioinformatics. 2012, doi:10.1093\/bioinformatics\/bts031"},{"key":"5472_CR33","unstructured":"IBM ILOG CPLEX. [http:\/\/www-06.ibm.com\/software\/jp\/websphere\/ilog\/optimization\/core-products-technologies\/cplex\/]"},{"key":"5472_CR34","unstructured":"DDT degradation - Reference pathway. [http:\/\/www.kegg.jp\/kegg-bin\/show_pathway?map00351]"},{"key":"5472_CR35","unstructured":"Higginson J: DDT: Epidemiological evidence. IARC scientific publications. 1985, 107-117. 65"},{"issue":"3","key":"5472_CR36","doi-asserted-by":"publisher","first-page":"756","DOI":"10.1007\/s11356-011-0601-6","volume":"19","author":"M Manaca","year":"2011","unstructured":"Manaca M, Grimalt J, Gari M, Sacarlal J, Sunyer J, Gonzalez R, Doba\u00f1o C, Menendez C, Alonso P: Assessment of exposure to DDT and metabolites after indoor residual spraying through the analysis of thatch material from rural African dwellings. Environmental Science and Pollution Research. 2011, 19 (3): 756-762.","journal-title":"Environmental Science and Pollution Research"},{"key":"5472_CR37","unstructured":"PathPred: Pathway Prediction server. [http:\/\/www.genome.jp\/tools\/pathpred\/]"},{"issue":"39","key":"5472_CR38","doi-asserted-by":"publisher","first-page":"11853","DOI":"10.1021\/ja036030u","volume":"125","author":"M Hattori","year":"2003","unstructured":"Hattori M, Okuno Y, Goto S, Kanehisa M: Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. Journal of the American Chemical Society. 2003, 125 (39): 11853-11865. 10.1021\/ja036030u.","journal-title":"Journal of the American Chemical Society"},{"issue":"suppl 1","key":"5472_CR39","doi-asserted-by":"publisher","first-page":"S268","DOI":"10.1093\/bioinformatics\/18.suppl_1.S268","volume":"18","author":"K Tsuda","year":"2002","unstructured":"Tsuda K, Kin T, Asai K: Marginalized kernels for biological sequences. Bioinformatics. 2002, 18 (suppl 1): S268-10.1093\/bioinformatics\/18.suppl_1.S268.","journal-title":"Bioinformatics"},{"issue":"6","key":"5472_CR40","doi-asserted-by":"publisher","first-page":"e1000397","DOI":"10.1371\/journal.pcbi.1000397","volume":"5","author":"N Nagamine","year":"2009","unstructured":"Nagamine N, Shirakawa T, Minato Y, Torii K, Kobayashi H, Imoto M, Sakakibara Y: Integrating statistical predictions and experimental verifications for enhancing protein-chemical interaction predictions in virtual screening. PLoS Computational Biology. 2009, 5 (6): e1000397-10.1371\/journal.pcbi.1000397.","journal-title":"PLoS Computational Biology"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S17-S8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/1471-2105-13-S17-S8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S17-S8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T21:09:10Z","timestamp":1630530550000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-13-S17-S8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,12]]},"references-count":40,"journal-issue":{"issue":"S17","published-print":{"date-parts":[[2012,12]]}},"alternative-id":["5472"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-13-s17-s8","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,12]]},"assertion":[{"value":"13 December 2012","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S8"}}