{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T12:25:41Z","timestamp":1775219141570,"version":"3.50.1"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Current efforts in Metabolomics, such as the Human Metabolome Project, collect structures of biological metabolites as well as data for their characterisation, such as spectra for identification of substances and measurements of their concentration. Still, only a fraction of existing metabolites and their spectral fingerprints are known. Computer-Assisted Structure Elucidation (CASE) of biological metabolites will be an important tool to leverage this lack of knowledge. Indispensable for CASE are modules to predict spectra for hypothetical structures. This paper evaluates different statistical and machine learning methods to perform predictions of proton NMR spectra based on data from our open database NMRShiftDB.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>A mean absolute error of 0.18 ppm was achieved for the prediction of proton NMR shifts ranging from 0 to 11 ppm. Random forest, J48 decision tree and support vector machines achieved similar overall errors. 
HOSE codes being a notably simple method achieved a comparatively good result of 0.17 ppm mean absolute error.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>NMR prediction methods applied in the course of this work delivered precise predictions which can serve as a building block for Computer-Assisted Structure Elucidation for biological metabolites.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-400","type":"journal-article","created":{"date-parts":[[2008,9,25]],"date-time":"2008-09-25T18:13:56Z","timestamp":1222366436000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":111,"title":["Building blocks for automated elucidation of metabolites: Machine learning methods for NMR prediction"],"prefix":"10.1186","volume":"9","author":[{"given":"Stefan","family":"Kuhn","sequence":"first","affiliation":[]},{"given":"Bj\u00f6rn","family":"Egert","sequence":"additional","affiliation":[]},{"given":"Steffen","family":"Neumann","sequence":"additional","affiliation":[]},{"given":"Christoph","family":"Steinbeck","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2008,9,25]]},"reference":[{"issue":"2","key":"2385_CR1","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1016\/j.ab.2007.04.036","volume":"367","author":"J Boerner","year":"2007","unstructured":"Boerner J, Buchinger S, Schomburg D: A high-throughput method for microbial metabolome analysis using gas chromatography\/mass spectrometry. Anal Biochem 2007, 367(2):143\u2013151. 
10.1016\/j.ab.2007.04.036","journal-title":"Anal Biochem"},{"key":"2385_CR2","volume-title":"Nucleic Acids Res","author":"DS Wishart","year":"2007","unstructured":"Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, Cheng D, Jewell K, Arndt D, Sawhney S, Fung C, Nikolai L, Lewis M, Coutouly MA, Forsythe I, Tang P, Shrivastava S, Jeroncic K, Stothard P, Amegbey G, Block D, Hau DD, Wagner J, Miniaci J, Clements M, Gebremedhin M, Guo N, Zhang Y, Duggan GE, Macinnis GD, Weljie AM, Dowlatabadi R, Bamforth F, Clive D, Greiner R, Li L, Marrie T, Sykes BD, Vogel HJ, Querengesser L: HMDB: the Human Metabolome Database. Nucleic Acids Res 2007., (35 Database):"},{"key":"2385_CR3","first-page":"338","volume":"4","author":"C Steinbeck","year":"2001","unstructured":"Steinbeck C: The Automation of Natural Product Structure Elucidation. Current Opinion in Drug Discovery and Development 2001, 4: 338\u2013342.","journal-title":"Current Opinion in Drug Discovery and Development"},{"key":"2385_CR4","doi-asserted-by":"publisher","first-page":"512","DOI":"10.1039\/b400678j","volume":"21","author":"C Steinbeck","year":"2004","unstructured":"Steinbeck C: Recent Developments in Automated Structure Elucidation of Natural Products. Natural Product Reports 2004, 21: 512\u2013518. 10.1039\/b400678j","journal-title":"Natural Product Reports"},{"key":"2385_CR5","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1016\/S0003-2670(01)83100-7","volume":"103","author":"W Bremser","year":"1978","unstructured":"Bremser W: HOSE \u2013 A Novel Substructure Code. Analytica Chimica Acta 1978, 103: 355\u2013365. 10.1016\/S0003-2670(01)83100-7","journal-title":"Analytica Chimica Acta"},{"key":"2385_CR6","unstructured":"2006. 
[http:\/\/www.modgraph.co.uk\/product_nmr.htm]"},{"key":"2385_CR7","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1007\/s002160050531","volume":"359","author":"V Schutz","year":"1997","unstructured":"Schutz V, Purtuc V, Felsinger S, Robien W: Csearch-Stereo \u2013 A New Generation of NMR Database Systems Allowing Three-Dimensional Spectrum Prediction. Fresenius Journal of Analytical Chemistry 1997, 359: 33\u201341. 10.1007\/s002160050531","journal-title":"Fresenius Journal of Analytical Chemistry"},{"key":"2385_CR8","doi-asserted-by":"publisher","first-page":"1898","DOI":"10.1002\/hlca.19820650624","volume":"65","author":"H Egli","year":"1982","unstructured":"Egli H, Smith DH, Djerassi C: Computer-Assisted Structural Interpretation of 1H-NMR Spectral Data. Helvetica Chimica Acta 1982, 65: 1898\u20131920. 10.1002\/hlca.19820650624","journal-title":"Helvetica Chimica Acta"},{"key":"2385_CR9","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1021\/ac010737m","volume":"74","author":"J Aires-De-Sousa","year":"2002","unstructured":"Aires-De-Sousa J, Hemmer M, Gasteiger J: Prediction of 1H-NMR Chemical Shifts Using Neural Networks. Anal Chem 2002, 74: 80\u201390. 10.1021\/ac010737m","journal-title":"Anal Chem"},{"key":"2385_CR10","doi-asserted-by":"publisher","first-page":"128","DOI":"10.1021\/ci700256n","volume":"48","author":"Y Smurnyy","year":"2008","unstructured":"Smurnyy Y, Blinov K, Churanova T, Elyashberg M, Williams A: Toward More Reliable 13C and 1H Chemical Shift Prediction: A Systematic Comparison of Neural-Network and Least-Squares Regression Based Approaches. Journal of Chemical Information and Modeling 2008, 48: 128\u2013134. 
10.1021\/ci700256n","journal-title":"Journal of Chemical Information and Modeling"},{"key":"2385_CR11","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1016\/0003-2670(94)80116-9","volume":"290","author":"RB Schaller","year":"1994","unstructured":"Schaller RB, Pretsch EA: A computer program for the automatic estimation of 1H NMR chemical shifts. Analytica Chimica Acta 1994, 290: 295\u2013302. 10.1016\/0003-2670(94)80116-9","journal-title":"Analytica Chimica Acta"},{"key":"2385_CR12","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1023\/A:1023060720156","volume":"26","author":"J Meiler","year":"2003","unstructured":"Meiler J: PROSHIFT: Protein chemical shift prediction using artificial neural networks. J of Biomolecular NMR 2003, 26: 25\u201337. 10.1023\/A:1023060720156","journal-title":"J of Biomolecular NMR"},{"key":"2385_CR13","doi-asserted-by":"publisher","first-page":"946","DOI":"10.1021\/ci034229k","volume":"44","author":"Y Binev","year":"2004","unstructured":"Binev Y, Corvo M, Aires-de Sousa J: The Impact of Available Experimental Data on the Prediction of 1H NMR Chemical Shifts by Neural Networks. J Chem Inf Comput Sci 2004, 44: 946\u2013949.","journal-title":"J Chem Inf Comput Sci"},{"key":"2385_CR14","doi-asserted-by":"publisher","first-page":"940","DOI":"10.1021\/ci034228s","volume":"44","author":"Y Binev","year":"2004","unstructured":"Binev Y, Aires-de Sousa J: Structure-Based Predictions of 1H NMR Chemical Shifts Using Feed-Forward Neural Networks. J Chem Inf Comput Sci 2004, 44: 940\u2013945.","journal-title":"J Chem Inf Comput Sci"},{"issue":"2","key":"2385_CR15","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han YQ, Kuhn S, Horlacher O, Luttmann E, Willighagen E: The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinformatics. 
J Chem Inf Comput Sci 2003, 43(2):493\u2013500.","journal-title":"J Chem Inf Comput Sci"},{"key":"2385_CR16","doi-asserted-by":"publisher","first-page":"2111","DOI":"10.2174\/138161206777585274","volume":"12","author":"C Steinbeck","year":"2006","unstructured":"Steinbeck C, Hoppe C, Kuhn S, Floris M, Guha R, Willighagen E: Recent Developments of the Chemistry Development Kit (CDK) \u2013 An Open-Source Java Library for Chemo- and Bioinformatics. Curr Pharm Des 2006, 12: 2111\u20132120. 10.2174\/138161206777585274","journal-title":"Curr Pharm Des"},{"key":"2385_CR17","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1016\/S0924-2031(99)00014-4","volume":"19","author":"M Hemmer","year":"1999","unstructured":"Hemmer M, Steinhauer V, Gasteiger J: Deriving the 3D structure of organic molecules from their infrared spectra. Vib Spectrosc 1999, 19: 151\u2013164. 10.1016\/S0924-2031(99)00014-4","journal-title":"Vib Spectrosc"},{"issue":"6","key":"2385_CR18","doi-asserted-by":"publisher","first-page":"1733","DOI":"10.1021\/ci0341363","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Kuhn S, Krause S: NMRShiftDB \u2013 Constructing a Chemical Information System with Open Source Components. J Chem Inf Comput Sci 2003, 43(6):1733\u20131739.","journal-title":"J Chem Inf Comput Sci"},{"issue":"19","key":"2385_CR19","doi-asserted-by":"publisher","first-page":"2711","DOI":"10.1016\/j.phytochem.2004.08.027","volume":"65","author":"C Steinbeck","year":"2004","unstructured":"Steinbeck C, Kuhn S: NMRShiftDB \u2013 Compound identification and structure elucidation support through a free community-built web database. Phytochemistry 2004, 65(19):2711\u20132717. 
10.1016\/j.phytochem.2004.08.027","journal-title":"Phytochemistry"},{"key":"2385_CR20","volume-title":"Journal of Chemical Information and Modeling","author":"K Blinov","year":"2008","unstructured":"Blinov K, Smurnyy Y, Elyashberg M, Churanova T, Kvasha M, Steinbeck C, Lefebvre B, Williams A: Performance Validation of Neural Network Based 13C NMR Prediction Using a Publicly Available Data Source. Journal of Chemical Information and Modeling 2008."},{"key":"2385_CR21","unstructured":"2008. [http:\/\/pubchem.ncbi.nlm.nih.gov\/]"},{"key":"2385_CR22","doi-asserted-by":"publisher","first-page":"1000","DOI":"10.1021\/ci00020a039","volume":"34","author":"J Sadowski","year":"1994","unstructured":"Sadowski J, Gasteiger J, Klebe G: Comparison of Automatic Three-Dimensional Model Builders Using 639 X-Ray Structures. J Chem Inf Comput Sci 1994, 34: 1000\u20131008. 10.1021\/ci00020a039","journal-title":"J Chem Inf Comput Sci"},{"key":"2385_CR23","unstructured":"2008. [http:\/\/www.molecular-networks.com]"},{"key":"2385_CR24","volume-title":"R: A Language and Environment for Statistical Computing","author":"R Development Core Team","year":"2006","unstructured":"R Development Core Team:R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2006. [http:\/\/www.R-project.org]"},{"key":"2385_CR25","volume-title":"Data Mining: Practical Machine Learning Tools and Techniques","author":"IH Witten","year":"2005","unstructured":"Witten IH, Frank E: Data Mining: Practical Machine Learning Tools and Techniques. 2nd edition. San Francisco: Morgan Kaufmann; 2005.","edition":"2"},{"key":"2385_CR26","first-page":"37","volume":"6","author":"D Aha","year":"1991","unstructured":"Aha D, Kibler D: Instance-based learning algorithms. 
Machine Learning 1991, 6: 37\u201366.","journal-title":"Machine Learning"},{"key":"2385_CR27","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory","author":"V Vapnik","year":"1995","unstructured":"Vapnik V: The Nature of Statistical Learning Theory. Heidelberg: Springer; 1995."},{"key":"2385_CR28","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-663-05681-2","volume-title":"Methoden wissensbasierter Systeme. Grundlagen-Algorithmen-Anwendungen","author":"C Beierle","year":"2003","unstructured":"Beierle C, Kern-Isberner G: Methoden wissensbasierter Systeme. Grundlagen-Algorithmen-Anwendungen. Wiesbaden: Vieweg; 2003."},{"issue":"1","key":"2385_CR29","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L: Random forests. Machine Learning 2001, 45(1):5\u201332. 10.1023\/A:1010933404324","journal-title":"Machine Learning"},{"issue":"2","key":"2385_CR30","first-page":"123","volume":"24","author":"L Breiman","year":"1996","unstructured":"Breiman L: Bagging predictors. Machine Learning 1996, 24(2):123\u2013140.","journal-title":"Machine Learning"},{"issue":"8","key":"2385_CR31","doi-asserted-by":"publisher","first-page":"832","DOI":"10.1109\/34.709601","volume":"20","author":"Ho","year":"1998","unstructured":"Ho TK: The Random Subspace Method for Constructing Decision Forests. IEEE Trans on Pattern Analysis and Machine Intelligence 1998, 20(8):832\u2013844. (Preceding Work) 10.1109\/34.709601","journal-title":"IEEE Trans on Pattern Analysis and Machine Intelligence"},{"issue":"5","key":"2385_CR32","doi-asserted-by":"publisher","first-page":"1651","DOI":"10.1214\/aos\/1024691352","volume":"26","author":"R Schapire","year":"1998","unstructured":"Schapire R, Freund Y, Bartlett P, Lee W: Boosting the margin: A new explanation for the effectiveness of voting methods. 
Annals of Statistics 1998, 26(5):1651\u20131686. 10.1214\/aos\/1024691352","journal-title":"Annals of Statistics"},{"key":"2385_CR33","volume-title":"Putting ACD\/NMR Predictors to the Test","author":"A Williams","year":"2006","unstructured":"Williams A, Lefebvre B, Sasaki R: Putting ACD\/NMR Predictors to the Test.2006. [http:\/\/www.acdlabs.com\/products\/spec_lab\/predict_nmr\/chemnmr\/]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-400.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T10:59:58Z","timestamp":1630493998000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-400"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,9,25]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2385"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-400","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,9,25]]},"assertion":[{"value":"18 April 2008","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 September 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 September 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"400"}}