{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T00:47:37Z","timestamp":1775263657846,"version":"3.50.1"},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":772,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Data analysis for metabolomics suffers from uncertainty because of the noisy measurement technology and the small sample size of experiments. Noise and the small sample size lead to a high probability of false findings. Further, individual compounds have natural variation between samples, which in many cases renders them unreliable as biomarkers. However, the levels of similar compounds are typically highly correlated, which is a phenomenon that we model in this work.<\/jats:p><jats:p>Results: We propose a hierarchical Bayesian model for inferring differences between groups of samples more accurately in metabolomic studies, where the observed compounds are collinear. We discover that the method decreases the error of weak and non-existent covariate effects, and thereby reduces false-positive findings. To achieve this, the method makes use of the mass spectral peak data by clustering similar peaks into latent compounds, and by further clustering latent compounds into groups that respond in a coherent way to the experimental covariates. We demonstrate the method with three simulated studies and validate it with a metabolomic benchmark dataset.<\/jats:p><jats:p>Availability and implementation: An implementation in R is available at http:\/\/research.ics.aalto.fi\/mi\/software\/peakANOVA\/.<\/jats:p><jats:p>Contact: \u00a0samuel.kaski@aalto.fi.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu455","type":"journal-article","created":{"date-parts":[[2014,8,26]],"date-time":"2014-08-26T11:23:57Z","timestamp":1409052237000},"page":"i461-i467","source":"Crossref","is-referenced-by-count":9,"title":["Stronger findings for metabolomics through Bayesian modeling of multiple peaks and compound correlations"],"prefix":"10.1093","volume":"30","author":[{"given":"Tommi","family":"Suvitaival","sequence":"first","affiliation":[{"name":"1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, 2School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and 3Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland"}]},{"given":"Simon","family":"Rogers","sequence":"additional","affiliation":[{"name":"1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, 2School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and 3Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland"}]},{"given":"Samuel","family":"Kaski","sequence":"additional","affiliation":[{"name":"1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, 2School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and 3Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland"},{"name":"1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, FI-00076 Espoo, Finland, 2School of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK and 3Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland"}]}],"member":"286","published-online":{"date-parts":[[2014,8,22]]},"reference":[{"key":"2023012711541641100_btu455-B1","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. B Methodol."},{"key":"2023012711541641100_btu455-B2","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1007\/s11306-006-0037-z","article-title":"Statistical strategies for avoiding false discoveries in metabolomics and related experiments","volume":"2","author":"Broadhurst","year":"2006","journal-title":"Metabolomics"},{"key":"2023012711541641100_btu455-B3","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1017\/CBO9780511584589.011","article-title":"Model-based clustering for expression data via a Dirichlet process mixture model","volume-title":"Bayesian Inference for Gene Expression and Proteomics","author":"Dahl","year":"2006"},{"key":"2023012711541641100_btu455-B4","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1198\/016214504000000205","article-title":"An ANOVA model for dependent random measures","volume":"99","author":"De Iorio","year":"2004","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012711541641100_btu455-B5","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1080\/01621459.1994.10476468","article-title":"Estimating normal means with a Dirichlet process prior","volume":"89","author":"Escobar","year":"1994","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012711541641100_btu455-B6","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1002\/cem.1420","article-title":"A benchmark spike-in data set for biomarker identification in metabolomics","volume":"26","author":"Franceschi","year":"2012","journal-title":"J. Chemom."},{"key":"2023012711541641100_btu455-B7","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1007\/s10618-009-0142-5","article-title":"Two-way analysis of high-dimensional collinear data","volume":"19","author":"Huopaniemi","year":"2009","journal-title":"Data Min. Knowl. Discov."},{"key":"2023012711541641100_btu455-B8","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1007\/s11306-012-0414-8","article-title":"Individual differences in metabolomics: individualised responses and between-metabolite relationships","volume":"8","author":"Jansen","year":"2012","journal-title":"Metabolomics"},{"key":"2023012711541641100_btu455-B9","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1016\/j.chroma.2007.04.021","article-title":"Data processing for mass spectrometry-based metabolomics","volume":"1158","author":"Katajamaa","year":"2007","journal-title":"J. Chromatogr. A"},{"key":"2023012711541641100_btu455-B10","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1038\/nature10213","article-title":"Human nutrition, the gut microbiome and the immune system","volume":"474","author":"Kau","year":"2011","journal-title":"Nature"},{"key":"2023012711541641100_btu455-B11","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1214\/10-BA505","article-title":"Bayesian functional ANOVA modeling using Gaussian process prior distributions","volume":"5","author":"Kaufman","year":"2010","journal-title":"Bayesian Anal."},{"key":"2023012711541641100_btu455-B12","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1021\/ac202450g","article-title":"CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography\/mass spectrometry data sets","volume":"84","author":"Kuhl","year":"2012","journal-title":"Anal. Chem."},{"key":"2023012711541641100_btu455-B13","doi-asserted-by":"crossref","first-page":"1023","DOI":"10.1080\/01621459.1988.10478694","article-title":"Bayesian variable selection in linear regression","volume":"83","author":"Mitchell","year":"1988","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012711541641100_btu455-B14","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1016\/j.numecd.2009.04.018","article-title":"Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction","volume":"19","author":"Ore\u0161i\u010d","year":"2009","journal-title":"Nutr. Metab. Cardiovasc. Dis."},{"key":"2023012711541641100_btu455-B15","doi-asserted-by":"crossref","first-page":"2331","DOI":"10.1002\/rcm.1627","article-title":"Ultra-performance liquid chromatography coupled to quadrupole-orthogonal time-of-flight mass spectrometry","volume":"18","author":"Plumb","year":"2004","journal-title":"Rapid Commun. Mass Spectrom."},{"key":"2023012711541641100_btu455-B16","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1186\/1471-2105-11-395","article-title":"MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data","volume":"11","author":"Pluskal","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012711541641100_btu455-B17","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1093\/bioinformatics\/btn642","article-title":"Probabilistic assignment of formulas to mass peaks in metabolomics experiments","volume":"25","author":"Rogers","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012711541641100_btu455-B18","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1007\/s11306-013-0598-6","article-title":"Reflections on univariate and multivariate analysis of metabolomics data","volume":"10","author":"Saccenti","year":"2014","journal-title":"Metabolomics"},{"key":"2023012711541641100_btu455-B19","doi-asserted-by":"crossref","first-page":"3043","DOI":"10.1093\/bioinformatics\/bti476","article-title":"ANOVA-simultaneous component analysis (ASCA): a new tool for analyzing designed metabolomics data","volume":"21","author":"Smilde","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012711541641100_btu455-B20","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1093\/bib\/bbl009","article-title":"Review: on the analysis and interpretation of correlations in metabolomic data","volume":"7","author":"Steuer","year":"2006","journal-title":"Brief. Bioinform."},{"key":"2023012711541641100_btu455-B21","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1186\/1471-2105-15-208","article-title":"Stronger findings from mass spectral data through multi-peak modeling","volume":"15","author":"Suvitaival","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023012711541641100_btu455-B22","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1007\/s11306-011-0368-2","article-title":"MSClust: a tool for unsupervised mass spectra extraction of chromatography-mass spectrometry ion-wise aligned data","volume":"8","author":"Tikunov","year":"2012","journal-title":"Metabolomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/17\/i461\/48927159\/bioinformatics_30_17_i461.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/17\/i461\/48927159\/bioinformatics_30_17_i461.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,1]],"date-time":"2024-06-01T22:43:41Z","timestamp":1717281821000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/17\/i461\/200468"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8,22]]},"references-count":22,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2014,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu455","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,9,1]]},"published":{"date-parts":[[2014,8,22]]}}}