{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T21:14:48Z","timestamp":1770844488394,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Matching both the retention index (RI) and the mass spectrum of an unknown compound against a mass spectral reference library provides strong evidence for a correct identification of that compound. Data on retention indices are, however, available for only a small fraction of the compounds in such libraries. We propose a quantitative structure-RI model that enables the ranking and filtering of putative identifications of compounds for which the predicted RI falls outside a predefined window.<\/jats:p>\n               <jats:p>Results: We constructed multiple linear regression and support vector regression (SVR) models using a set of descriptors obtained with a genetic algorithm as variable selection method. The SVR model is a significant improvement over previous models built for structurally diverse compounds as it covers a large range (360\u20134100) of RI values and gives better prediction of isomer compounds. The hit list reduction varied from 41% to 60% and depended on the size of the original hit list. 
Large hit lists were reduced to a greater extent compared with small hit lists.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/appliedbioinformatics.wur.nl\/GC-MS<\/jats:p>\n               <jats:p>Contact: \u00a0roeland.vanham@wur.nl<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp056","type":"journal-article","created":{"date-parts":[[2009,1,29]],"date-time":"2009-01-29T01:31:13Z","timestamp":1233192673000},"page":"787-794","source":"Crossref","is-referenced-by-count":36,"title":["Automated procedure for candidate compound selection in GC-MS metabolomics based on prediction of Kovats retention index"],"prefix":"10.1093","volume":"25","author":[{"given":"V. V.","family":"Mihaleva","sequence":"first","affiliation":[{"name":"1 Applied Bioinformatics, Plant Research International, 2Centre for BioSystems Genomics (CBSG), Droevendaalsesteeg 1 and 3Laboratory of Bioinformatics, Wageningen University, Dreijenlaan 3, Wageningen, The Netherlands"}]},{"given":"H. A.","family":"Verhoeven","sequence":"additional","affiliation":[{"name":"1 Applied Bioinformatics, Plant Research International, 2Centre for BioSystems Genomics (CBSG), Droevendaalsesteeg 1 and 3Laboratory of Bioinformatics, Wageningen University, Dreijenlaan 3, Wageningen, The Netherlands"}]},{"given":"R. C. H.","family":"de Vos","sequence":"additional","affiliation":[{"name":"1 Applied Bioinformatics, Plant Research International, 2Centre for BioSystems Genomics (CBSG), Droevendaalsesteeg 1 and 3Laboratory of Bioinformatics, Wageningen University, Dreijenlaan 3, Wageningen, The Netherlands"}]},{"given":"R. 
D.","family":"Hall","sequence":"additional","affiliation":[{"name":"1 Applied Bioinformatics, Plant Research International, 2Centre for BioSystems Genomics (CBSG), Droevendaalsesteeg 1 and 3Laboratory of Bioinformatics, Wageningen University, Dreijenlaan 3, Wageningen, The Netherlands"}]},{"given":"R. C. H. J.","family":"van Ham","sequence":"additional","affiliation":[{"name":"1 Applied Bioinformatics, Plant Research International, 2Centre for BioSystems Genomics (CBSG), Droevendaalsesteeg 1 and 3Laboratory of Bioinformatics, Wageningen University, Dreijenlaan 3, Wageningen, The Netherlands"}]}],"member":"286","published-online":{"date-parts":[[2009,1,28]]},"reference":[{"key":"2023051209122036800_B1","volume-title":"Identification of Essential Oil Components by Gas Chromatography\/Quadrupole Mass Spectrometry.","author":"Adams","year":"2001"},{"key":"2023051209122036800_B2","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/S1044-0305(98)00159-7","article-title":"The critical evaluation of a comprehensive mass spectral library","volume":"10","author":"Ausloos","year":"1999","journal-title":"J. Am. Soc. Mass Spectrom."},{"key":"2023051209122036800_B3","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1016\/S0003-2670(97)00065-2","article-title":"Genetic algorithms as a method for variable selection in multiple linear regression and partial least squares regression, with applications to pyrolysis mass spectrometry","volume":"348","author":"Broadhurst","year":"1997","journal-title":"Anal. Chim. 
Acta"},{"key":"2023051209122036800_B4","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1002\/ijc.23689","article-title":"Vitamin E and cancer: an insight into the anticancer activities of vitamin E isomers and analogs","volume":"123","author":"Constantinou","year":"2008","journal-title":"Int. J. Cancer"},{"key":"2023051209122036800_B5","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511801389","volume-title":"An Introduction to Support Vector Machines and other Kernel-based Learning Methods.","author":"Cristianini","year":"2000"},{"key":"2023051209122036800_B6","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/j.aca.2003.08.003","article-title":"Use of boiling point-Lee retention index correlation for rapid review of gas chromatography-mass spectrometry data","volume":"494","author":"Eckel","year":"2003","journal-title":"Anal. Chim. Acta"},{"key":"2023051209122036800_B7","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.chemolab.2004.01.012","article-title":"Quantitative structure-retention relationships XIV - Prediction of gas chromatographic retention indices for saturated O-, N-, and S-heterocyclic compounds","volume":"72","author":"Farkas","year":"2004","journal-title":"Chemom. Intell. Lab. Syst."},{"key":"2023051209122036800_B8","doi-asserted-by":"crossref","first-page":"1769","DOI":"10.1021\/jf048575t","article-title":"Structure-function analysis of the vanillin molecule and its antifungal properties","volume":"53","author":"Fitzgerald","year":"2005","journal-title":"J. Agric. Food Chem."},{"key":"2023051209122036800_B9","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1016\/S1093-3263(01)00122-X","article-title":"Enhancement of binary QSAR analysis by a GA-based variable selection method","volume":"20","author":"Gao","year":"2002","journal-title":"J. Mol. 
Graphics Modell."},{"key":"2023051209122036800_B10","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/j.chroma.2003.12.003","article-title":"Prediction of gas chromatographic retention indices of a diverse set of toxicologically relevant compounds","volume":"1028","author":"Garkani-Nejad","year":"2004","journal-title":"J. Chromatogr. A"},{"key":"2023051209122036800_B11","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/j.chroma.2007.03.108","article-title":"Quantitative structure-(chromatographic) retention relationships","volume":"1158","author":"Heberger","year":"2007","journal-title":"J. Chromatogr. A"},{"key":"2023051209122036800_B12","doi-asserted-by":"crossref","first-page":"306","DOI":"10.1021\/ci960047x","article-title":"GA strategy for variable selection in QSAR studies: GA-based PLS analysis of calcium channel antagonists","volume":"37","author":"Hasegawa","year":"1997","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023051209122036800_B13","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1016\/j.aca.2007.04.009","article-title":"Quantitative structure-retention relationship for the Kovats retention indices of a large set of terpenes: a combined data splitting-feature selection strategy","volume":"592","author":"Hemmateenejad","year":"2007","journal-title":"Anal. Chim. 
Acta"},{"key":"2023051209122036800_B14","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1016\/j.talanta.2005.04.034","article-title":"QSPR prediction of GC retention indices for nitrogen-containing polycyclic aromatic compounds from heuristically computed molecular descriptors","volume":"68","author":"Hu","year":"2005","journal-title":"Talanta"},{"key":"2023051209122036800_B15","doi-asserted-by":"crossref","first-page":"1328","DOI":"10.1021\/ci0342270","article-title":"Use of computer-assisted methods for the modeling of the retention time of a variety of volatile organic compounds: a PCA-MLR-ANN approach","volume":"44","author":"Jalali-Heravi","year":"2004","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023051209122036800_B16","doi-asserted-by":"crossref","first-page":"978","DOI":"10.1124\/jpet.104.075994","article-title":"Positional isomerism markedly affects the growth inhibition of colon cancer cells by nitric oxide-donating aspirin in vitro and in vivo","volume":"312","author":"Kashfi","year":"2005","journal-title":"J. Pharmacol. Exp. Ther."},{"key":"2023051209122036800_B17","doi-asserted-by":"crossref","first-page":"1915","DOI":"10.1002\/hlca.19580410703","article-title":"Gas-Chromatographische Charakterisierung Organischer Verbindungen. 1. Retentionsindices Aliphatischer Halogenide, Alkohole, Aldehyde Und Ketone","volume":"41","author":"Kovats","year":"1958","journal-title":"Helv. Chim. Acta"},{"key":"2023051209122036800_B18","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/j.aca.2004.12.085","article-title":"Prediction of retention time of a variety of volatile organic compounds based on the heuristic method and support vector machine","volume":"537","author":"Luan","year":"2005","journal-title":"Anal. Chim. Acta"},{"key":"2023051209122036800_B19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/0169-7439(93)80079-W","article-title":"Understanding and using genetic algorithms. 1. 
Concepts, properties and context","volume":"19","author":"Lucasius","year":"1993","journal-title":"Chemom. Intell. Lab. Syst."},{"key":"2023051209122036800_B20","doi-asserted-by":"crossref","first-page":"5147","DOI":"10.1021\/es060709r","article-title":"Nonylphenol isomers differ in estrogenic activity","volume":"40","author":"Preuss","year":"2006","journal-title":"Environ. Sci. Technol."},{"key":"2023051209122036800_B21","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1021\/ci0001031","article-title":"Novel shape descriptors for molecular graphs","volume":"41","author":"Randic","year":"2001","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023051209122036800_B22","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/j.chroma.2003.07.002","article-title":"Predicting gas chromatographic retention times for the 209 polybrominated diphenyl ether congeners","volume":"1016","author":"Rayne","year":"2003","journal-title":"J. Chromatogr. A"},{"key":"2023051209122036800_B23","doi-asserted-by":"crossref","first-page":"854","DOI":"10.1021\/ci00020a020","article-title":"Application of genetic function approximation to quantitative structure-activity-relationships and quantitative structure-property relationships","volume":"34","author":"Rogers","year":"1994","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023051209122036800_B24","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1002\/qsar.200530008","article-title":"Use of topological indices of organic sulfur compounds in quantitative structure-retention relationship study","volume":"24","author":"Safa","year":"2005","journal-title":"QSAR Comb. Sci."},{"key":"2023051209122036800_B25","doi-asserted-by":"crossref","first-page":"770","DOI":"10.1016\/S1044-0305(99)00047-1","article-title":"An integrated method for spectrum extraction and compound identification from gas chromatography\/mass spectrometry data","volume":"10","author":"Stein","year":"1999","journal-title":"J. Am. Soc. 
Mass Spectrom."},{"key":"2023051209122036800_B26","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1021\/ci00019a016","article-title":"Estimation of normal boiling points from group contributions","volume":"34","author":"Stein","year":"1994","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023051209122036800_B27","first-page":"U304","article-title":"Open standards for chemical information - the IUPAC chemical identifier and data dictionary projects","volume":"226","author":"Stein","year":"2003","journal-title":"Abstr. Pap. Am. Chem. Soc."},{"key":"2023051209122036800_B28","doi-asserted-by":"crossref","first-page":"975","DOI":"10.1021\/ci600548y","article-title":"Estimation of Kovats retention indices using group contributions","volume":"47","author":"Stein","year":"2007","journal-title":"J. Chem. Inf. Model."},{"key":"2023051209122036800_B29","doi-asserted-by":"crossref","first-page":"1125","DOI":"10.1104\/pp.105.068130","article-title":"A novel approach for nontargeted data analysis for metabolomics. Large-scale profiling of tomato fruit volatiles","volume":"139","author":"Tikunov","year":"2005","journal-title":"Plant Physiol."},{"key":"2023051209122036800_B30","author":"Todeschini","year":"2003","journal-title":"DragonX 1.2."},{"key":"2023051209122036800_B31","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1021\/ci960149n","article-title":"The detour matrix in chemistry","volume":"37","author":"Trinajstic","year":"1997","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"2023051209122036800_B32","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1006\/taap.1996.0080","article-title":"Isomer-specific acute toxicity and cell proliferation in livers of B6C3F1 mice exposed to dichlorobenzene","volume":"137","author":"Umemura","year":"1996","journal-title":"Toxicol. Appl. 
Pharmacol."},{"key":"2023051209122036800_B33","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory.","author":"Vapnik","year":"1995"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/6\/787\/50286424\/bioinformatics_25_6_787.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/6\/787\/50286424\/bioinformatics_25_6_787.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T09:13:08Z","timestamp":1683882788000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/6\/787\/251924"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,1,28]]},"references-count":33,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2009,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp056","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,3,15]]},"published":{"date-parts":[[2009,1,28]]}}}