{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,5]],"date-time":"2024-08-05T08:00:14Z","timestamp":1722844814164},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2020,3,24]],"date-time":"2020-03-24T00:00:00Z","timestamp":1585008000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The discrimination ability of score functions to separate correct from incorrect peptide-spectrum-matches in database-searching-based spectrum identification is hindered by many superfluous peaks belonging to unexpected fragmentation ions or by the lacking peaks of anticipated fragmentation ions.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we present a new method, called BoltzMatch, to learn score functions using a particular stochastic neural networks, called restricted Boltzmann machines, in order to enhance their discrimination ability. BoltzMatch learns chemically explainable patterns among peak pairs in the spectrum data, and it can augment peaks depending on their semantic context or even reconstruct lacking peaks of expected ions during its internal scoring mechanism. As a result, BoltzMatch achieved 50% and 33% more annotations on high- and low-resolution MS2 data than XCorr at a 0.1% false discovery rate in our benchmark; conversely, XCorr yielded the same number of spectrum annotations as BoltzMatch, albeit with 4\u20136 times more errors. 
In addition, BoltzMatch alone does yield 14% more annotations than Prosit (which runs with Percolator), and BoltzMatch with Percolator yields 32% more annotations than Prosit at 0.1% FDR level in our benchmark.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>BoltzMatch is freely available at: https:\/\/github.com\/kfattila\/BoltzMatch.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Contact<\/jats:title>\n                  <jats:p>akerteszfarkas@hse.ru<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supporting information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa206","type":"journal-article","created":{"date-parts":[[2020,3,20]],"date-time":"2020-03-20T12:28:31Z","timestamp":1584707311000},"page":"3781-3787","source":"Crossref","is-referenced-by-count":4,"title":["Annotation of tandem mass spectrometry data using stochastic neural networks in shotgun proteomics"],"prefix":"10.1093","volume":"36","author":[{"given":"Pavel","family":"Sulimov","sequence":"first","affiliation":[{"name":"Faculty of Computer Science , School of Data Analysis and Artificial Intelligence, Moscow 101000, Russia"}]},{"given":"Anastasia","family":"Voronkova","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science , School of Data Analysis and Artificial Intelligence, Moscow 101000, Russia"}]},{"given":"Attila","family":"Kert\u00e9sz-Farkas","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science , School of Data Analysis and Artificial Intelligence, Moscow 101000, 
Russia"}]}],"member":"286","published-online":{"date-parts":[[2020,3,24]]},"reference":[{"key":"2023063011072133000_btaa206-B1","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1038\/nature01511","article-title":"Mass spectrometry-based proteomics","volume":"422","author":"Aebersold","year":"2003","journal-title":"Nature"},{"key":"2023063011072133000_btaa206-B2","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1016\/j.cels.2017.05.009","article-title":"An optimized shotgun strategy for the rapid generation of comprehensive human proteomes","volume":"4","author":"Bekker-Jensen","year":"2017","journal-title":"Cell Syst"},{"key":"2023063011072133000_btaa206-B3","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1074\/mcp.M113.032813","article-title":"Proteome informatics research group (iPRG)_2012: a study on detecting modified peptides in a complex mixture","volume":"13","author":"Chalkley","year":"2014","journal-title":"Mol. Cell. Proteomics"},{"key":"2023063011072133000_btaa206-B4","doi-asserted-by":"crossref","first-page":"1794","DOI":"10.1021\/pr101065j","article-title":"Andromeda: a peptide search engine integrated into the maxquant environment","volume":"10","author":"Cox","year":"2011","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B5","doi-asserted-by":"crossref","first-page":"2354","DOI":"10.1021\/acs.jproteome.8b00991","article-title":"Bias in false discovery rate estimation in mass-spectrometry-based peptide identification","volume":"18","author":"Danilova","year":"2019","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B6","doi-asserted-by":"crossref","first-page":"3679","DOI":"10.1021\/pr500202e","article-title":"MS Amanda, a universal identification algorithm optimized for high accuracy tandem mass spectra","volume":"13","author":"Dorfer","year":"2014","journal-title":"J. 
Proteome Res"},{"key":"2023063011072133000_btaa206-B7","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/nmeth1019","article-title":"Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry","volume":"4","author":"Elias","year":"2007","journal-title":"Nat. Methods"},{"key":"2023063011072133000_btaa206-B8","doi-asserted-by":"crossref","first-page":"976","DOI":"10.1016\/1044-0305(94)80016-2","article-title":"An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database","volume":"5","author":"Eng","year":"1994","journal-title":"J. Am. Soc. Mass Spectrom"},{"key":"2023063011072133000_btaa206-B9","doi-asserted-by":"crossref","first-page":"4598","DOI":"10.1021\/pr800420s","article-title":"A fast sequest cross correlation algorithm","volume":"7","author":"Eng","year":"2008","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B10","doi-asserted-by":"crossref","first-page":"1865","DOI":"10.1007\/s13361-015-1179-x","article-title":"A deeper look into comet-implementation and features","volume":"26","author":"Eng","year":"2015","journal-title":"J. Am. Soc. Mass Spectrom"},{"key":"2023063011072133000_btaa206-B11","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1021\/ac0258709","article-title":"A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes","volume":"75","author":"Feny\u00f6","year":"2003","journal-title":"Anal. Chem"},{"key":"2023063011072133000_btaa206-B12","author":"Fischer","year":"2012"},{"key":"2023063011072133000_btaa206-B13","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1021\/pr0499491","article-title":"Open mass spectrometry search algorithm","volume":"3","author":"Geer","year":"2004","journal-title":"J. 
Proteome Res"},{"key":"2023063011072133000_btaa206-B14","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1038\/s41592-019-0426-7","article-title":"Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning","volume":"16","author":"Gessulat","year":"2019","journal-title":"Nat. Methods"},{"key":"2023063011072133000_btaa206-B15","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1038\/msb.2008.75","article-title":"An integrated workflow for charting the human interaction proteome: insights into the pp2a system","volume":"5","author":"Glatter","year":"2009","journal-title":"Mol. Syst. Biol"},{"key":"2023063011072133000_btaa206-B16","first-page":"320","article-title":"Learning peptide-spectrum alignment models for tandem mass spectrometry","volume":"30","author":"Halloran","year":"2014","journal-title":"Uncertain. Artif. Intell"},{"key":"2023063011072133000_btaa206-B17","author":"Hinton","year":"2012"},{"key":"2023063011072133000_btaa206-B18","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"key":"2023063011072133000_btaa206-B19","doi-asserted-by":"crossref","first-page":"2467","DOI":"10.1074\/mcp.O113.036327","article-title":"Computing exact p-values for a cross-correlation shotgun proteomics score function","volume":"13","author":"Howbert","year":"2014","journal-title":"Mol. Cell. Proteomics"},{"key":"2023063011072133000_btaa206-B20","doi-asserted-by":"crossref","first-page":"923","DOI":"10.1038\/nmeth1113","article-title":"Semi-supervised learning for peptide identification from shotgun proteomics datasets","volume":"4","author":"K\u00e4ll","year":"2007","journal-title":"Nat. 
Methods"},{"key":"2023063011072133000_btaa206-B21","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1021\/pr5010983","article-title":"On the importance of well-calibrated scores for identifying shotgun proteomics spectra","volume":"14","author":"Keich","year":"2015","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B22","doi-asserted-by":"crossref","first-page":"3148","DOI":"10.1021\/acs.jproteome.5b00081","article-title":"Improved false discovery rate estimation procedure for shotgun proteomics","volume":"14","author":"Keich","year":"2015","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B23","doi-asserted-by":"crossref","first-page":"221","DOI":"10.2174\/157489312800604354","article-title":"Database searching in mass spectrometry based proteomics","volume":"7","author":"Kert\u00e9sz-Farkas","year":"2012","journal-title":"Curr. Bioinform"},{"key":"2023063011072133000_btaa206-B24","doi-asserted-by":"crossref","first-page":"3027","DOI":"10.1021\/pr501173s","article-title":"Tandem mass spectrum identification via cascaded search","volume":"14","author":"Kertesz-Farkas","year":"2015","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B25","doi-asserted-by":"crossref","first-page":"5277","DOI":"10.1038\/ncomms6277","article-title":"Ms-gf+ makes progress towards a universal database search tool for proteomics","volume":"5","author":"Kim","year":"2014","journal-title":"Nat. Commun"},{"key":"2023063011072133000_btaa206-B26","doi-asserted-by":"crossref","first-page":"3354","DOI":"10.1021\/pr8001244","article-title":"Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases","volume":"7","author":"Kim","year":"2008","journal-title":"J. 
Proteome Res"},{"key":"2023063011072133000_btaa206-B27","doi-asserted-by":"crossref","first-page":"2840","DOI":"10.1074\/mcp.M110.003731","article-title":"The generating function of CID, ETD, and CID\/ETD pairs of tandem mass spectra: applications to database search","volume":"9","author":"Kim","year":"2010","journal-title":"Mol. Cell. Proteomics"},{"key":"2023063011072133000_btaa206-B28","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2023063011072133000_btaa206-B29","doi-asserted-by":"crossref","first-page":"3644","DOI":"10.1021\/acs.jproteome.8b00206","article-title":"Combining high-resolution and exact calibration to boost statistical power: a well-calibrated score function for high-resolution ms2 data","volume":"17","author":"Lin","year":"2018","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B30","doi-asserted-by":"crossref","first-page":"4488","DOI":"10.1021\/pr500741y","article-title":"Crux: rapid open source protein tandem mass spectrometry analysis","volume":"13","author":"McIlwain","year":"2014","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B31","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1109\/TASL.2011.2109382","article-title":"Acoustic modeling using deep belief networks","volume":"20","author":"Mohamed","year":"2012","journal-title":"IEEE Trans. Audio Speech Lang. Process"},{"key":"2023063011072133000_btaa206-B32","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/S1359-6446(03)02978-7","article-title":"Analysis, statistical validation and dissemination of large-scale proteomics datasets generated by tandem MS","volume":"9","author":"Nesvizhskii","year":"2004","journal-title":"Drug Discov. 
Today"},{"key":"2023063011072133000_btaa206-B33","doi-asserted-by":"crossref","first-page":"e1002296","DOI":"10.1371\/journal.pcbi.1002296","article-title":"Computational and statistical analysis of protein mass spectrometry data","volume":"8","author":"Noble","year":"2012","journal-title":"PLoS Comput. Biol"},{"key":"2023063011072133000_btaa206-B34","doi-asserted-by":"crossref","first-page":"4028","DOI":"10.1021\/pr400394g","article-title":"Global analysis of protein expression and phosphorylation of three stages of plasmodium falciparum intraerythrocytic development","volume":"12","author":"Pease","year":"2013","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B35","doi-asserted-by":"crossref","first-page":"3551","DOI":"10.1002\/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2","article-title":"Probability-based protein identification by searching sequence databases using mass spectrometry data","volume":"20","author":"Perkins","year":"1999","journal-title":"Electrophoresis"},{"key":"2023063011072133000_btaa206-B36","author":"Salakhutdinov","year":"2007"},{"key":"2023063011072133000_btaa206-B37","first-page":"1481","author":"Sulimov","year":"2020"},{"key":"2023063011072133000_btaa206-B38","author":"Sulimov","year":"2020"},{"key":"2023063011072133000_btaa206-B39","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1038\/s41592-019-0427-6","article-title":"High-quality MS\/MS spectrum prediction for data-dependent and data-independent acquisition data analysis","volume":"16","author":"Tiwary","year":"2019","journal-title":"Nat. Methods"},{"key":"2023063011072133000_btaa206-B40","doi-asserted-by":"crossref","first-page":"8247","DOI":"10.1073\/pnas.1705691114","article-title":"De novo peptide sequencing by deep learning","volume":"114","author":"Tran","year":"2017","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023063011072133000_btaa206-B41","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1021\/pr301024c","article-title":"A proteomics search algorithm specifically designed for high-resolution tandem mass spectra","volume":"12","author":"Wenger","year":"2013","journal-title":"J. Proteome Res"},{"key":"2023063011072133000_btaa206-B42","doi-asserted-by":"crossref","first-page":"1426","DOI":"10.1021\/ac00104a020","article-title":"Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database","volume":"67","author":"Yates","year":"1995","journal-title":"Anal. Chem"},{"key":"2023063011072133000_btaa206-B43","doi-asserted-by":"crossref","first-page":"12690","DOI":"10.1021\/acs.analchem.7b02566","article-title":"pDeep: predicting MS\/MS spectra of peptides with deep learning","volume":"89","author":"Zhou","year":"2017","journal-title":"Anal. Chem"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa206\/33241671\/btaa206.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/12\/3781\/50749226\/bioinformatics_36_12_3781.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/12\/3781\/50749226\/bioinformatics_36_12_3781.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,30]],"date-time":"2023-06-30T11:08:20Z","timestamp":1688123300000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/12\/3781\/5811231"}},"subtitle":[],"editor":[{"given":"Pier","family":"Luigi 
Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,3,24]]},"references-count":43,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa206","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,6,15]]},"published":{"date-parts":[[2020,3,24]]}}}