{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,9]],"date-time":"2026-01-09T18:24:51Z","timestamp":1767983091264,"version":"3.49.0"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"20","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Peak detection is a key step in the preprocessing of untargeted metabolomics data generated from high-resolution liquid chromatography-mass spectrometry (LC\/MS). The common practice is to use filters with predetermined parameters to select peaks in the LC\/MS profile. This rigid approach can cause suboptimal performance when the choice of peak model and parameters do not suit the data characteristics.<\/jats:p>\n               <jats:p>Results: Here we present a method that learns directly from various data features of the extracted ion chromatograms (EICs) to differentiate between true peak regions from noise regions in the LC\/MS profile. It utilizes the knowledge of known metabolites, as well as robust machine learning approaches. Unlike currently available methods, this new approach does not assume a parametric peak shape model and allows maximum flexibility. We demonstrate the superiority of the new approach using real data. Because matching to known metabolites entails uncertainties and cannot be considered a gold standard, we also developed a probabilistic receiver-operating characteristic (pROC) approach that can incorporate uncertainties.<\/jats:p>\n               <jats:p>Availability and implementation: The new peak detection approach is implemented as part of the apLCMS package available at http:\/\/web1.sph.emory.edu\/apLCMS\/<\/jats:p>\n               <jats:p>Contact: \u00a0tyu8@emory.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu430","type":"journal-article","created":{"date-parts":[[2014,7,9]],"date-time":"2014-07-09T04:13:37Z","timestamp":1404879217000},"page":"2941-2948","source":"Crossref","is-referenced-by-count":56,"title":["Improving peak detection in high-resolution LC\/MS metabolomics data using preexisting knowledge and machine learning approach"],"prefix":"10.1093","volume":"30","author":[{"given":"Tianwei","family":"Yu","sequence":"first","affiliation":[{"name":"1 Department of Biostatistics and Bioinformatics, Rollins School of Public Health and 2 Department of Medicine, School of Medicine, Emory University, Atlanta, GA 30322, USA"}]},{"given":"Dean P.","family":"Jones","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and Bioinformatics, Rollins School of Public Health and 2 Department of Medicine, School of Medicine, Emory University, Atlanta, GA 30322, USA"}]}],"member":"286","published-online":{"date-parts":[[2014,7,7]]},"reference":[{"key":"2023012711555719700_btu430-B1","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.chroma.2008.03.033","article-title":"Feature detection and alignment of hyphenated chromatographic-mass spectrometric data. Extraction of pure ion chromatograms using Kalman tracking","volume":"1192","author":"Aberg","year":"2008","journal-title":"J. Chromatogr. A"},{"key":"2023012711555719700_btu430-B2","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1038\/nbt0208-162","article-title":"Metabolite identification via the Madison Metabolomics Consortium Database","volume":"26","author":"Cui","year":"2008","journal-title":"Nat. Biotechnol."},{"key":"2023012711555719700_btu430-B3","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recognit. Lett."},{"key":"2023012711555719700_btu430-B4","volume-title":"The Elements of Statistical Learning: Data Mining, Inference: Prediction","author":"Hastie","year":"2009"},{"key":"2023012711555719700_btu430-B5","doi-asserted-by":"crossref","first-page":"2183","DOI":"10.1002\/jssc.200900152","article-title":"Analytical and statistical approaches to metabolomics research","volume":"32","author":"Issaq","year":"2009","journal-title":"J. Sep. Sci."},{"key":"2023012711555719700_btu430-B6","doi-asserted-by":"crossref","first-page":"2864","DOI":"10.1039\/c0an00333f","article-title":"A practical approach to detect unique metabolic patterns for personalized medicine","volume":"135","author":"Johnson","year":"2010","journal-title":"Analyst"},{"key":"2023012711555719700_btu430-B7","doi-asserted-by":"crossref","first-page":"634","DOI":"10.1093\/bioinformatics\/btk039","article-title":"MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data","volume":"22","author":"Katajamaa","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012711555719700_btu430-B8","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1016\/j.chroma.2007.04.021","article-title":"Data processing for mass spectrometry-based metabolomics","volume":"1158","author":"Katajamaa","year":"2007","journal-title":"J. Chromatogr. A"},{"key":"2023012711555719700_btu430-B9","first-page":"179","article-title":"Addressing the curse of imbalanced data sets: one-sided sampling","volume-title":"Proceedings of the 14th International conference on Machine Learning","author":"Kubat","year":"1997"},{"key":"2023012711555719700_btu430-B10","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1021\/ac202450g","article-title":"CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography\/mass spectrometry data sets","volume":"84","author":"Kuhl","year":"2012","journal-title":"Anal. Chem."},{"key":"2023012711555719700_btu430-B11","doi-asserted-by":"crossref","first-page":"468","DOI":"10.1016\/j.csl.2005.06.002","article-title":"A study in machine learning from imbalanced data for sentence boundary detection in speech","volume":"20","author":"Liu","year":"2006","journal-title":"Comput. Speech Lang."},{"key":"2023012711555719700_btu430-B12","doi-asserted-by":"crossref","first-page":"3637","DOI":"10.1021\/pr8005099","article-title":"The metabolome-wide association study: a new look at human disease risk factors","volume":"7","author":"Nicholson","year":"2008","journal-title":"J. Proteome Res."},{"key":"2023012711555719700_btu430-B13","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1038\/nrm3314","article-title":"Innovation: Metabolomics: the apogee of the omics trilogy","volume":"13","author":"Patti","year":"2012","journal-title":"Nat. Rev. Mol. Cell Biol."},{"key":"2023012711555719700_btu430-B14","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1097\/01.ftd.0000179845.53213.39","article-title":"METLIN: a metabolite mass spectral database","volume":"27","author":"Smith","year":"2005","journal-title":"Ther. Drug Monit."},{"key":"2023012711555719700_btu430-B15","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1021\/ac051437y","article-title":"XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification","volume":"78","author":"Smith","year":"2006","journal-title":"Anal. Chem."},{"key":"2023012711555719700_btu430-B16","doi-asserted-by":"crossref","first-page":"975","DOI":"10.1021\/ac050980b","article-title":"Second-order peak detection for multicomponent high-resolution LC\/MS data","volume":"78","author":"Stolt","year":"2006","journal-title":"Anal. Chem."},{"key":"2023012711555719700_btu430-B17","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1186\/1471-2105-12-259","article-title":"AMDORAP: non-targeted metabolic profiling based on high-resolution LC-MS","volume":"12","author":"Takahashi","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023012711555719700_btu430-B18","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1186\/1471-2105-9-504","article-title":"Highly sensitive feature detection for high resolution LC\/MS","volume":"9","author":"Tautenhahn","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012711555719700_btu430-B19","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1007\/978-1-61737-985-7_17","article-title":"Processing and analysis of GC\/LC-MS-based metabolomics data","volume":"708","author":"Want","year":"2011","journal-title":"Methods Mol. Biol."},{"key":"2023012711555719700_btu430-B20","doi-asserted-by":"crossref","first-page":"7963","DOI":"10.1021\/ac3016856","article-title":"Data preprocessing method for liquid chromatography-mass spectrometry based metabolomics","volume":"84","author":"Wei","year":"2012","journal-title":"Anal. Chem."},{"key":"2023012711555719700_btu430-B21","doi-asserted-by":"crossref","first-page":"D603","DOI":"10.1093\/nar\/gkn810","article-title":"HMDB: a knowledgebase for the human metabolome","volume":"37","author":"Wishart","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012711555719700_btu430-B22","doi-asserted-by":"crossref","first-page":"e40598","DOI":"10.1371\/journal.pone.0040598","article-title":"ROCS: receiver operating characteristic surface for class-skewed high-throughput data","volume":"7","author":"Yu","year":"2012","journal-title":"PloS One"},{"key":"2023012711555719700_btu430-B23","doi-asserted-by":"crossref","first-page":"1930","DOI":"10.1093\/bioinformatics\/btp291","article-title":"apLCMS\u2014adaptive processing of high-resolution LC\/MS data","volume":"25","author":"Yu","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012711555719700_btu430-B24","doi-asserted-by":"crossref","first-page":"1419","DOI":"10.1021\/pr301053d","article-title":"Hybrid feature detection and information accumulation using high-resolution LC-MS metabolomics data","volume":"12","author":"Yu","year":"2013","journal-title":"J. Proteome Res."},{"key":"2023012711555719700_btu430-B25","first-page":"83","article-title":"Analyzing LC\/MS metabolic profiling data in the context of existing metabolic networks","volume":"1","author":"Yu","year":"2013","journal-title":"Curr. Metabolomics"},{"key":"2023012711555719700_btu430-B26","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1186\/1471-2105-11-559","article-title":"Quantification and deconvolution of asymmetric LC-MS peaks using the bi-Gaussian mixture model and statistical model selection","volume":"11","author":"Yu","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012711555719700_btu430-B27","doi-asserted-by":"crossref","first-page":"470","DOI":"10.1039\/C1MB05350G","article-title":"LC-MS-based metabolomics","volume":"8","author":"Zhou","year":"2012","journal-title":"Mol. Biosyst."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/20\/2941\/48929820\/bioinformatics_30_20_2941.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/20\/2941\/48929820\/bioinformatics_30_20_2941.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T12:42:44Z","timestamp":1674823364000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/20\/2941\/2422280"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,7,7]]},"references-count":27,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2014,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu430","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,10,15]]},"published":{"date-parts":[[2014,7,7]]}}}