{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:01Z","timestamp":1772138041989,"version":"3.50.1"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2020,11,27]],"date-time":"2020-11-27T00:00:00Z","timestamp":1606435200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["310107"],"award-info":[{"award-number":["310107"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Aalto Science-IT infrastructure"},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/R018634\/1"],"award-info":[{"award-number":["EP\/R018634\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Scottish Informatics and Computing Science Alliance"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Identification of small molecules in a biological sample remains a major bottleneck in molecular biology, despite a decade of rapid development of computational approaches for predicting molecular structures using mass spectrometry (MS) data. Recently, there has been increasing interest in utilizing other information sources, such as liquid chromatography (LC) retention time (RT), to improve identifications solely based on MS information, such as precursor mass-per-charge and tandem mass spectrometry (MS2).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We put forward a probabilistic modelling framework to integrate MS and RT data of multiple features in an LC-MS experiment. We model the MS measurements and all pairwise retention order information as a Markov random field and use efficient approximate inference for scoring and ranking potential molecular structures. Our experiments show improved identification accuracy by combining MS2 data and retention orders using our approach, thereby outperforming state-of-the-art methods. Furthermore, we demonstrate the benefit of our model when only a subset of LC-MS features has MS2 measurements available besides MS1.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Software and data are freely available at https:\/\/github.com\/aalto-ics-kepaco\/msms_rt_score_integration.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa998","type":"journal-article","created":{"date-parts":[[2020,11,17]],"date-time":"2020-11-17T15:12:39Z","timestamp":1605625959000},"page":"1724-1731","source":"Crossref","is-referenced-by-count":10,"title":["Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification"],"prefix":"10.1093","volume":"37","author":[{"given":"Eric","family":"Bach","sequence":"first","affiliation":[{"name":"Department of Computer Science, School of Science, Aalto University , Espoo, Finland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3578-4477","authenticated-orcid":false,"given":"Simon","family":"Rogers","sequence":"additional","affiliation":[{"name":"School of Computing Science, University of Glasgow , Glasgow, UK"}]},{"given":"John","family":"Williamson","sequence":"additional","affiliation":[{"name":"School of Computing Science, University of Glasgow , Glasgow, UK"}]},{"given":"Juho","family":"Rousu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, School of Science, Aalto University , Espoo, Finland"}]}],"member":"286","published-online":{"date-parts":[[2020,11,27]]},"reference":[{"key":"2023051709551403400_btaa998-B1","doi-asserted-by":"crossref","first-page":"0054","DOI":"10.1038\/s41570-017-0054","article-title":"Global chemical analysis of biology by mass spectrometry","volume":"1","author":"Aksenov","year":"2017","journal-title":"Nat. Rev. Chem"},{"key":"2023051709551403400_btaa998-B2","doi-asserted-by":"crossref","first-page":"W94","DOI":"10.1093\/nar\/gku436","article-title":"CFM-ID: a web server for annotation, spectrum prediction and metabolite identification from tandem mass spectra","volume":"42","author":"Allen","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023051709551403400_btaa998-B3","doi-asserted-by":"crossref","first-page":"i875","DOI":"10.1093\/bioinformatics\/bty590","article-title":"Liquid-chromatography retention order prediction for metabolite identification","volume":"34","author":"Bach","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051709551403400_btaa998-B4","doi-asserted-by":"crossref","first-page":"31","DOI":"10.3390\/metabo8020031","article-title":"Software tools and approaches for compound identification of LC-MS\/MS data in metabolomics","volume":"8","author":"Bla\u017eenovi\u0107","year":"2018","journal-title":"Metabolites"},{"key":"2023051709551403400_btaa998-B5","doi-asserted-by":"crossref","first-page":"i28","DOI":"10.1093\/bioinformatics\/btw246","article-title":"Fast metabolite identification with Input Output Kernel Regression","volume":"32","author":"Brouard","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051709551403400_btaa998-B6","doi-asserted-by":"crossref","first-page":"160","DOI":"10.3390\/metabo9080160","article-title":"Improved small molecule identification through learning combinations of kernel regression models","volume":"9","author":"Brouard","year":"2019","journal-title":"Metabolites"},{"key":"2023051709551403400_btaa998-B7","doi-asserted-by":"crossref","first-page":"12549","DOI":"10.1073\/pnas.1516878112","article-title":"Illuminating the dark matter in metabolomics","volume":"112","author":"da Silva","year":"2015","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051709551403400_btaa998-B8","doi-asserted-by":"crossref","first-page":"12799","DOI":"10.1021\/acs.analchem.9b02354","article-title":"Integrated probabilistic annotation (IPA): a Bayesian-based annotation method for metabolomic profiles integrating biochemical connections, isotope patterns and adduct relationships","volume":"91","author":"Del Carratore","year":"2019","journal-title":"Anal. Chem"},{"key":"2023051709551403400_btaa998-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-13680-7","article-title":"The METLIN small molecule dataset for machine learning-based retention time prediction","volume":"10","author":"Domingo-Almenara","year":"2019","journal-title":"Nat. Commun"},{"key":"2023051709551403400_btaa998-B10","doi-asserted-by":"crossref","first-page":"12580","DOI":"10.1073\/pnas.1509788112","article-title":"Searching molecular structure databases with tandem mass spectra using CSI: FingerID","volume":"112","author":"D\u00fchrkop","year":"2015","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051709551403400_btaa998-B11","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1038\/s41592-019-0344-8","article-title":"SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information","volume":"16","author":"D\u00fchrkop","year":"2019","journal-title":"Nat. Methods"},{"key":"2023051709551403400_btaa998-B12","doi-asserted-by":"crossref","first-page":"703","DOI":"10.1002\/jms.1777","article-title":"MassBank: a public repository for sharing mass spectral data for life sciences","volume":"45","author":"Horai","year":"2010","journal-title":"J. Mass Spectrom"},{"key":"2023051709551403400_btaa998-B13","doi-asserted-by":"crossref","first-page":"1931","DOI":"10.1007\/s00216-018-0857-5","article-title":"Performance of combined fragmentation and retention prediction for the identification of organic micropollutants by LC-HRMS","volume":"410","author":"Hu","year":"2018","journal-title":"Anal. Bioanal. Chem"},{"key":"2023051709551403400_btaa998-B14","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1007\/s10994-007-5018-6","article-title":"A note on Platt\u2019s probabilistic outputs for support vector machines","volume":"68","author":"Lin","year":"2007","journal-title":"Mach. Learn"},{"key":"2023051709551403400_btaa998-B15","doi-asserted-by":"crossref","first-page":"3443","DOI":"10.3390\/ijms20143443","article-title":"Quantitative structure\u2013retention relationships with non-linear programming for prediction of chromatographic elution order","volume":"20","author":"Liu","year":"2019","journal-title":"Int. J. Mol. Sci"},{"key":"2023051709551403400_btaa998-B16","volume-title":"Information Theory, Inference and Learning Algorithms","author":"MacKay","year":"2005"},{"key":"2023051709551403400_btaa998-B17","first-page":"873","volume-title":"NIPS","author":"Marchand","year":"2014"},{"key":"2023051709551403400_btaa998-B18","doi-asserted-by":"crossref","first-page":"2028","DOI":"10.1093\/bib\/bby066","article-title":"Recent advances and prospects of computational methods for metabolite identification: a review with emphasis on machine learning approaches","volume":"20","author":"Nguyen","year":"2018","journal-title":"Brief. Bioinform"},{"key":"2023051709551403400_btaa998-B19","doi-asserted-by":"crossref","first-page":"i323","DOI":"10.1093\/bioinformatics\/bty252","article-title":"Simple: sparse interaction model over peaks of molecules for fast, interpretable metabolite identification from tandem mass spectra","volume":"34","author":"Nguyen","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051709551403400_btaa998-B20","doi-asserted-by":"crossref","first-page":"i164","DOI":"10.1093\/bioinformatics\/btz319","article-title":"ADAPTIVE: leArning DAta-dePendenT, concIse molecular VEctors for fast, accurate metabolite identification from tandem mass spectra","volume":"35","author":"Nguyen","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051709551403400_btaa998-B21","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1021\/ed100697w","article-title":"ChemSpider: an online chemical information resource","volume":"87","author":"Pence","year":"2010","journal-title":"J. Chem. Educ"},{"key":"2023051709551403400_btaa998-B22","doi-asserted-by":"crossref","first-page":"5191","DOI":"10.1021\/acs.analchem.8b05821","article-title":"Predicting ion mobility collision cross-sections using a deep neural network: DeepCCS","volume":"91","author":"Plante","year":"2019","journal-title":"Anal. Chem"},{"key":"2023051709551403400_btaa998-B23","volume-title":"Advances in Large Margin Classifiers","author":"Platt","year":"2000"},{"key":"2023051709551403400_btaa998-B24","first-page":"408","article-title":"Spanning tree approximations for conditional random fields","volume":"5","author":"Pletscher","year":"2009","journal-title":"PMLR"},{"key":"2023051709551403400_btaa998-B25","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/j.neunet.2005.07.009","article-title":"Graph kernels for chemical informatics","volume":"18","author":"Ralaivola","year":"2005","journal-title":"Neural Netw"},{"key":"2023051709551403400_btaa998-B26","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/s13321-016-0115-9","article-title":"MetFrag relaunched: incorporating strategies beyond in silico fragmentation","volume":"8","author":"Ruttkies","year":"2016","journal-title":"J. Cheminform"},{"key":"2023051709551403400_btaa998-B27","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1186\/s12859-019-2954-7","article-title":"Improving MetFrag with statistical learning of fragment annotations","volume":"20","author":"Ruttkies","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023051709551403400_btaa998-B28","doi-asserted-by":"crossref","first-page":"1329","DOI":"10.3389\/fpls.2019.01329","article-title":"Taxonomically informed scoring enhances confidence in natural products annotation","volume":"10","author":"Rutz","year":"2019","journal-title":"Front. Plant Sci"},{"key":"2023051709551403400_btaa998-B29","doi-asserted-by":"crossref","first-page":"12752","DOI":"10.1021\/acs.analchem.8b03118","article-title":"Evaluation of an artificial neural network retention index model for chemical structure identification in nontargeted metabolomics","volume":"90","author":"Samaraweera","year":"2018","journal-title":"Anal. Chem"},{"key":"2023051709551403400_btaa998-B30","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1186\/s13321-017-0207-1","article-title":"Critical assessment of small molecule identification 2016: automated methods","volume":"9","author":"Schymanski","year":"2017","journal-title":"J. Cheminform"},{"key":"2023051709551403400_btaa998-B31","doi-asserted-by":"crossref","first-page":"9421","DOI":"10.1021\/acs.analchem.5b02287","article-title":"PredRet: prediction of retention time by direct mapping between multiple chromatographic systems","volume":"87","author":"Stanstrup","year":"2015","journal-title":"Anal. Chem"},{"key":"2023051709551403400_btaa998-B32","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1007\/s10994-014-5465-9","article-title":"Multilabel classification through random graph ensembles","volume":"99","author":"Su","year":"2015","journal-title":"Mach. Learn"},{"key":"2023051709551403400_btaa998-B33","doi-asserted-by":"crossref","first-page":"3697","DOI":"10.1109\/TIT.2005.856938","article-title":"Map estimation via agreement on trees: message-passing and linear programming","volume":"51","author":"Wainwright","year":"2005","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023051709551403400_btaa998-B34","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1038\/nbt.3597","article-title":"Sharing and community curation of mass spectrometry data with global natural products social molecular networking","volume":"34","author":"Wang","year":"2016","journal-title":"Nat. Biotechnol"},{"key":"2023051709551403400_btaa998-B35","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/s13321-017-0220-4","article-title":"The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching","volume":"9","author":"Willighagen","year":"2017","journal-title":"J. Cheminform"},{"key":"2023051709551403400_btaa998-B36","doi-asserted-by":"crossref","first-page":"1746","DOI":"10.1002\/jssc.202000060","article-title":"Current status of retention time prediction in metabolite identification","volume":"43","author":"Witting","year":"2020","journal-title":"J. Sep. Sci"},{"key":"2023051709551403400_btaa998-B37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.trac.2011.08.009","article-title":"Metabolite identification and quantitation in LC-MS\/MS-based metabolomics","volume":"32","author":"Xiao","year":"2012","journal-title":"Trends Analyt. Chem"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa998\/34899743\/btaa998.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1724\/50361247\/btaa998.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1724\/50361247\/btaa998.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T06:36:59Z","timestamp":1684305419000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/12\/1724\/6007259"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,11,27]]},"references-count":37,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2021,7,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa998","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.08.19.255653","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,6,15]]},"published":{"date-parts":[[2020,11,27]]}}}