{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,1]],"date-time":"2025-11-01T13:49:38Z","timestamp":1762004978894},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2018,6,27]],"date-time":"2018-06-27T00:00:00Z","timestamp":1530057600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"MEXT KAKENHI","award":["16H02868","JPMJAC1503"],"award-info":[{"award-number":["16H02868","JPMJAC1503"]}]},{"name":"ACCEL JST"},{"name":"FiDiPro Tekes"},{"name":"AIPSE Academy of Finland"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Recent success in metabolite identification from tandem mass spectra has been led by machine learning, which has two stages: mapping mass spectra to molecular fingerprint vectors and then retrieving candidate molecules from the database. In the first stage, i.e. fingerprint prediction, spectrum peaks are features and considering their interactions would be reasonable for more accurate identification of unknown metabolites. Existing approaches of fingerprint prediction are based on only individual peaks in the spectra, without explicitly considering the peak interactions. Also the current cutting-edge method is based on kernels, which are computationally heavy and difficult to interpret.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We propose two learning models that allow to incorporate peak interactions for fingerprint prediction. First, we extend the state-of-the-art kernel learning method by developing kernels for peak interactions to combine with kernels for peaks through multiple kernel learning (MKL). Second, we formulate a sparse interaction model for metabolite peaks, which we call SIMPLE, which is computationally light and interpretable for fingerprint prediction. The formulation of SIMPLE is convex and guarantees global optimization, for which we develop an alternating direction method of multipliers (ADMM) algorithm. Experiments using the MassBank dataset show that both models achieved comparative prediction accuracy with the current top-performance kernel method. Furthermore SIMPLE clearly revealed individual peaks and peak interactions which contribute to enhancing the performance of fingerprint prediction.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The code will be accessed through http:\/\/mamitsukalab.org\/tools\/SIMPLE\/.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty252","type":"journal-article","created":{"date-parts":[[2018,4,16]],"date-time":"2018-04-16T06:34:05Z","timestamp":1523860445000},"page":"i323-i332","source":"Crossref","is-referenced-by-count":29,"title":["SIMPLE: Sparse Interaction Model over Peaks of moLEcules for fast, interpretable metabolite identification from tandem mass spectra"],"prefix":"10.1093","volume":"34","author":[{"given":"Dai Hai","family":"Nguyen","sequence":"first","affiliation":[{"name":"Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan"}]},{"given":"Canh Hao","family":"Nguyen","sequence":"additional","affiliation":[{"name":"Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan"}]},{"given":"Hiroshi","family":"Mamitsuka","sequence":"additional","affiliation":[{"name":"Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan"},{"name":"Department of Computer Science, Alato University, Espoo, Finland"}]}],"member":"286","published-online":{"date-parts":[[2018,6,27]]},"reference":[{"key":"2023051604251561400_bty252-B1","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/978-3-319-23525-7_2","volume-title":"Machine Learning and Knowledge Discovery in Databases","author":"Blondel","year":"2015"},{"key":"2023051604251561400_bty252-B2","doi-asserted-by":"crossref","first-page":"i49","DOI":"10.1093\/bioinformatics\/btn270","article-title":"Towards de novo identification of metabolites by analyzing tandem mass spectra","volume":"24","author":"B\u00f6cker","year":"2008","journal-title":"Bioinformatics"},{"key":"2023051604251561400_bty252-B3","doi-asserted-by":"crossref","first-page":"1529","DOI":"10.1021\/acs.jcim.5b00261","article-title":"Relevance vector machines: sparse classification methods for qsar","volume":"55","author":"Burden","year":"2015","journal-title":"J. Chem. Inf. Model"},{"key":"2023051604251561400_bty252-B4","doi-asserted-by":"crossref","first-page":"1956","DOI":"10.1137\/080738970","article-title":"A singular value thresholding algorithm for matrix completion","volume":"20","author":"Cai","year":"2010","journal-title":"SIAM J. Optim"},{"key":"2023051604251561400_bty252-B5","first-page":"795","article-title":"Algorithms for learning kernels based on centered alignment","volume":"13","author":"Cortes","year":"2012","journal-title":"J. Mach. Learn. Res"},{"key":"2023051604251561400_bty252-B6","volume-title":"Mass Spectrometry, Principles and Applications","author":"de Hoffmann","year":"2007","edition":"3rd edn."},{"key":"2023051604251561400_bty252-B7","doi-asserted-by":"crossref","first-page":"12580","DOI":"10.1073\/pnas.1509788112","article-title":"Searching molecular structure databases with tandem mass spectra using csi: fingerid","volume":"112","author":"D\u00fchrkop","year":"2015","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051604251561400_bty252-B8","first-page":"2211","article-title":"Multiple kernel learning algorithms","volume":"12","author":"G\u00f6nen","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023051604251561400_bty252-B9","doi-asserted-by":"crossref","first-page":"2333","DOI":"10.1093\/bioinformatics\/bts437","article-title":"Metabolite identification and molecular fingerprint prediction through machine learning","volume":"28","author":"Heinonen","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051604251561400_bty252-B10","doi-asserted-by":"crossref","first-page":"703","DOI":"10.1002\/jms.1777","article-title":"Massbank: a public repository for sharing mass spectral data for life sciences","volume":"45","author":"Horai","year":"2010","journal-title":"J. Mass Spectrom"},{"key":"2023051604251561400_bty252-B11","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1007\/s11306-010-0198-7","article-title":"Decision tree supported substructure prediction of metabolites from gc-ms profiles","volume":"6","author":"Hummel","year":"2010","journal-title":"Metabolomics"},{"key":"2023051604251561400_bty252-B12","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1016\/j.jprot.2008.04.005","article-title":"Mass spectrometric and linear discriminant analysis of n-glycans of human serum alpha-1-acid glycoprotein in cancer patients and healthy individuals","volume":"71","author":"Imre","year":"2008","journal-title":"J. Proteomics"},{"key":"2023051604251561400_bty252-B13","first-page":"819","article-title":"Probability product kernels","volume":"5","author":"Jebara","year":"2004","journal-title":"J. Mach. Learn. Res"},{"key":"2023051604251561400_bty252-B14","first-page":"953","article-title":"lp-norm multiple kernel learning","volume":"12","author":"Kloft","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023051604251561400_bty252-B15","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1007\/s10107-009-0306-5","article-title":"Fixed point and bregman iterative methods for matrix rank minimization","volume":"128","author":"Ma","year":"2011","journal-title":"Math. Program"},{"key":"2023051604251561400_bty252-B16","doi-asserted-by":"crossref","first-page":"33.","DOI":"10.1186\/1758-2946-3-33","article-title":"Open babel: an open chemical toolbox","volume":"3","author":"O'Boyle","year":"2011","journal-title":"J. Cheminf"},{"key":"2023051604251561400_bty252-B17","doi-asserted-by":"crossref","first-page":"1243","DOI":"10.1021\/ac101825k","article-title":"Computing fragmentation trees from tandem mass spectrometry data","volume":"83","author":"Rasche","year":"2011","journal-title":"Anal. Chem"},{"key":"2023051604251561400_bty252-B18","doi-asserted-by":"crossref","first-page":"12.","DOI":"10.1186\/1758-2946-5-12","article-title":"Computational mass spectrometry for small molecules","volume":"5","author":"Scheubert","year":"2013","journal-title":"J. Cheminf"},{"key":"2023051604251561400_bty252-B19","doi-asserted-by":"crossref","first-page":"i157","DOI":"10.1093\/bioinformatics\/btu275","article-title":"Metabolite identification through multiple kernel learning on fragmentation trees","volume":"30","author":"Shen","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051604251561400_bty252-B20","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1994","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023051604251561400_bty252-B21","first-page":"5989","author":"Watanabe","year":"2014"},{"key":"2023051604251561400_bty252-B22","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1093\/bib\/bbm030","article-title":"Current progress in computational metabolomics","volume":"8","author":"Wishart","year":"2007","journal-title":"Brief. Bioinf"},{"key":"2023051604251561400_bty252-B23","doi-asserted-by":"crossref","first-page":"D801","DOI":"10.1093\/nar\/gks1065","article-title":"Hmdb 3.0 \u2013 the human metabolome database in 2013","volume":"41","author":"Wishart","year":"2012","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i323\/50316274\/bioinformatics_34_13_i323.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/13\/i323\/50316274\/bioinformatics_34_13_i323.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,6]],"date-time":"2024-07-06T01:19:37Z","timestamp":1720228777000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/13\/i323\/5045791"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,6,27]]},"references-count":23,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2018,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty252","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,7,1]]},"published":{"date-parts":[[2018,6,27]]}}}