{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T05:10:02Z","timestamp":1774933802347,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2021,7,12]],"date-time":"2021-07-12T00:00:00Z","timestamp":1626048000000},"content-version":"vor","delay-in-days":11,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 Research and Innovation Program","award":["668858"],"award-info":[{"award-number":["668858"]}]},{"name":"European Union\u2019s Horizon 2020 Research and Innovation Program","award":["826121"],"award-info":[{"award-number":["826121"]}]},{"DOI":"10.13039\/501100000947","name":"Australian Cancer Research Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000947","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001171","name":"Cancer Institute NSW","doi-asserted-by":"publisher","award":["2017\/TPG001"],"award-info":[{"award-number":["2017\/TPG001"]}],"id":[{"id":"10.13039\/501100001171","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,8,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>In recent years, SWATH-MS has become the proteomic method of choice for data-independent acquisition, as it enables high proteome coverage, accuracy and reproducibility. However, data analysis is convoluted and requires prior information and expert curation. Furthermore, as quantification is limited to a small set of peptides, potentially important biological information may be discarded. 
Here we demonstrate that deep learning can be used to learn discriminative features directly from raw MS data, thereby eliminating the need for elaborate data processing pipelines. Using transfer learning to overcome sample sparsity, we exploit a collection of publicly available deep learning models already trained for the task of natural image classification. These models are used to produce feature vectors from each mass spectrometry (MS) raw image, which are later used as input for a classifier trained to distinguish tumor from normal prostate biopsies. Although the deep learning models were originally trained for a completely different classification task and no additional fine-tuning is performed on them, we achieve a classification performance of 0.876 AUC. We investigate different types of image preprocessing and encoding, and whether including the secondary MS2 spectra improves classification performance. For all tested models, we use standard protein expression vectors as the gold standard. Even with our na\u00efve implementation, our results suggest that the application of deep learning and transfer learning techniques might pave the way to the broader usage of raw mass spectrometry data in real-time diagnosis.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The open source code used to generate the results from MS images is available on GitHub: https:\/\/ibm.biz\/mstransc. 
The data, including the MS images, their encodings, classification labels and results, can be accessed at the following link: https:\/\/ibm.ent.box.com\/v\/mstc-supplementary<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab311","type":"journal-article","created":{"date-parts":[[2021,4,26]],"date-time":"2021-04-26T20:46:37Z","timestamp":1619469997000},"page":"i245-i253","source":"Crossref","is-referenced-by-count":22,"title":["On the feasibility of deep learning applications using raw mass spectrometry data"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4410-2805","authenticated-orcid":false,"given":"Joris","family":"Cadow","sequence":"first","affiliation":[{"name":"Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8872-0269","authenticated-orcid":false,"given":"Matteo","family":"Manica","sequence":"additional","affiliation":[{"name":"Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1115-6508","authenticated-orcid":false,"given":"Roland","family":"Mathis","sequence":"additional","affiliation":[{"name":"Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland"}]},{"given":"Roger R","family":"Reddel","sequence":"additional","affiliation":[{"name":"ProCan\u00ae, Children\u2019s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia"}]},{"given":"Phillip J","family":"Robinson","sequence":"additional","affiliation":[{"name":"ProCan\u00ae, Children\u2019s Medical Research Institute, Faculty of 
Medicine and Health, The University of Sydney, Westmead, NSW, Australia"}]},{"given":"Peter J","family":"Wild","sequence":"additional","affiliation":[{"name":"University Hospital Frankfurt Dr. Senckenberg Institute of Pathology, Frankfurt am Main, Germany"}]},{"given":"Peter G","family":"Hains","sequence":"additional","affiliation":[{"name":"ProCan\u00ae, Children\u2019s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia"}]},{"given":"Natasha","family":"Lucas","sequence":"additional","affiliation":[{"name":"ProCan\u00ae, Children\u2019s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia"}]},{"given":"Qing","family":"Zhong","sequence":"additional","affiliation":[{"name":"ProCan\u00ae, Children\u2019s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3869-7651","authenticated-orcid":false,"given":"Tiannan","family":"Guo","sequence":"additional","affiliation":[{"name":"Institute of Basic Medical Sciences, School of Life Science, Westlake University, Hangzhou 310024, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9576-3267","authenticated-orcid":false,"given":"Ruedi","family":"Aebersold","sequence":"additional","affiliation":[{"name":"Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich 8093, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3766-4233","authenticated-orcid":false,"given":"Mar\u00eda","family":"Rodr\u00edguez Mart\u00ednez","sequence":"additional","affiliation":[{"name":"Cognitive Computing & Industry Solutions, IBM Research Europe - Zurich, Rueschlikon 8803, Switzerland"}]}],"member":"286","published-online":{"date-parts":[[2021,7,12]]},"reference":[{"key":"2023062410304119800_btab311-B1","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1038\/nature01511","article-title":"Mass 
spectrometry-based proteomics","volume":"422","author":"Aebersold","year":"2003","journal-title":"Nature"},{"key":"2023062410304119800_btab311-B2","author":"Alain","year":"2016"},{"key":"2023062410304119800_btab311-B3","author":"Alom","year":"2018"},{"key":"2023062410304119800_btab311-B4","doi-asserted-by":"crossref","first-page":"918","DOI":"10.1038\/nbt.2377","article-title":"A cross-platform toolkit for mass spectrometry and proteomics","volume":"30","author":"Chambers","year":"2012","journal-title":"Nat. Biotechnol"},{"key":"2023062410304119800_btab311-B5","author":"Charmpi","year":"2020"},{"key":"2023062410304119800_btab311-B6","first-page":"785","author":"Chen","year":"2016"},{"key":"2023062410304119800_btab311-B7","doi-asserted-by":"crossref","first-page":"20170387","DOI":"10.1098\/rsif.2017.0387","article-title":"Opportunities and obstacles for deep learning in biology and medicine","volume":"15","author":"Ching","year":"2018","journal-title":"J. R. Soc. Interface"},{"key":"2023062410304119800_btab311-B8","first-page":"4109","author":"Cui","year":"2018"},{"key":"2023062410304119800_btab311-B9","doi-asserted-by":"crossref","first-page":"D655","DOI":"10.1093\/nar\/gkj040","article-title":"The PeptideAtlas project","volume":"34","author":"Desiere","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023062410304119800_btab311-B10","first-page":"592","volume-title":"Bengio,S. and Wallach,H. and Larochelle,H.and Grauman,K. and Cesa-Bianchi,N. and Garnett,R.","author":"Dhurandhar","year":"2018"},{"key":"2023062410304119800_btab311-B11","doi-asserted-by":"crossref","first-page":"O111","DOI":"10.1074\/mcp.O111.016717","article-title":"Targeted data extraction of the MS\/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis","volume":"11","author":"Gillet","year":"2012","journal-title":"Mol. Cell. 
Proteomics"},{"key":"2023062410304119800_btab311-B12","doi-asserted-by":"crossref","first-page":"O111.016717","DOI":"10.1074\/mcp.O111.016717","article-title":"Targeted data extraction of the MS\/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis","volume":"11","author":"Gillet","year":"2012","journal-title":"Mol. Cell. Proteomics MCP"},{"key":"2023062410304119800_btab311-B13","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1038\/nm.3807","article-title":"Rapid mass spectrometric conversion of tissue biopsy samples into permanent quantitative digital proteome maps","volume":"21","author":"Guo","year":"2015","journal-title":"Nat. Med"},{"key":"2023062410304119800_btab311-B14","first-page":"630","volume-title":"European Conference on Computer Vision","author":"He","year":"2016"},{"key":"2023062410304119800_btab311-B16","author":"Howard","year":"2017"},{"key":"2023062410304119800_btab311-B17","author":"Ioffe","year":"2015"},{"key":"2023062410304119800_btab311-B18","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1002\/aic.690370209","article-title":"Nonlinear principal component analysis using autoassociative neural networks","volume":"37","author":"Kramer","year":"1991","journal-title":"AIChE J"},{"key":"2023062410304119800_btab311-B19","first-page":"1900358","author":"Liang","year":"2020"},{"key":"2023062410304119800_btab311-B20","first-page":"19","author":"Liu","year":"2018"},{"key":"2023062410304119800_btab311-B21","doi-asserted-by":"crossref","first-page":"e8126","DOI":"10.15252\/msb.20178126","article-title":"Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial","volume":"14","author":"Ludwig","year":"2018","journal-title":"Mol. Syst. 
Biol"},{"key":"2023062410304119800_btab311-B22","doi-asserted-by":"crossref","first-page":"1130","DOI":"10.1038\/nbt.3685","article-title":"A multicenter study benchmarks software tools for label-free proteome quantification","volume":"34","author":"Navarro","year":"2016","journal-title":"Nat. Biotechnol"},{"key":"2023062410304119800_btab311-B23","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","article-title":"A survey on transfer learning","volume":"22","author":"Pan","year":"2010","journal-title":"IEEE Trans. Knowledge Data Eng"},{"key":"2023062410304119800_btab311-B24","first-page":"677","article-title":"Transfer learning via dimensionality reduction","volume":"8","author":"Pan","year":"2008","journal-title":"AAAI"},{"key":"2023062410304119800_btab311-B25","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023062410304119800_btab311-B26","doi-asserted-by":"crossref","first-page":"1459","DOI":"10.1038\/nbt1031","article-title":"A common open representation of mass spectrometry data and its application to proteomics research","volume":"22","author":"Pedrioli","year":"2004","journal-title":"Nat. Biotechnol"},{"key":"2023062410304119800_btab311-B27","doi-asserted-by":"crossref","first-page":"D442","DOI":"10.1093\/nar\/gky1106","article-title":"The PRIDE database and related tools and resources in 2019: improving support for quantification data","volume":"47","author":"Perez-Riverol","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023062410304119800_btab311-B28","author":"Real","year":"2019"},{"key":"2023062410304119800_btab311-B29","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1038\/nbt.2841","article-title":"OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data","volume":"32","author":"R\u00f6st","year":"2014","journal-title":"Nat. 
Biotechnol"},{"key":"2023062410304119800_btab311-B30","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis"},{"key":"2023062410304119800_btab311-B15","first-page":"4510","author":"Sandler","year":"2018"},{"key":"2023062410304119800_btab311-B31","first-page":"806","author":"Sharif Razavian","year":"2014"},{"key":"2023062410304119800_btab311-B32","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1186\/s12859-015-0478-3","article-title":"Removing batch effects from purified plasma cell gene expression microarrays with modified combat","volume":"16","author":"Stein","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023062410304119800_btab311-B33","first-page":"1","author":"Szegedy","year":"2015"},{"key":"2023062410304119800_btab311-B34","first-page":"2818","author":"Szegedy","year":"2016"},{"key":"2023062410304119800_btab311-B35","volume-title":"Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, California, USA","author":"Szegedy","year":"2017"},{"key":"2023062410304119800_btab311-B36","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1038\/nmeth.4390","article-title":"Pecan: library-free peptide detection for data-independent acquisition tandem mass spectrometry data","volume":"14","author":"Ting","year":"2017","journal-title":"Nat. Methods"},{"key":"2023062410304119800_btab311-B37","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1038\/s41592-018-0260-3","article-title":"Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry","volume":"16","author":"Tran","year":"2019","journal-title":"Nat. 
Methods"},{"key":"2023062410304119800_btab311-B38","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1186\/1471-2490-8-9","article-title":"ProCOC: the prostate cancer outcomes cohort study","volume":"8","author":"Umbehr","year":"2008","journal-title":"BMC Urology"},{"key":"2023062410304119800_btab311-B39","first-page":"8769","author":"Van Horn","year":"2018"},{"key":"2023062410304119800_btab311-B40","first-page":"3371","article-title":"Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion","volume":"11","author":"Vincent","year":"2010","journal-title":"J. Mach. Learn. Res"},{"key":"2023062410304119800_btab311-B41","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1093\/bioinformatics\/btaa1088","article-title":"ProteomeExpert: a docker image based web-server for exploring, modeling, visualizing, and mining quantitative proteomic data sets","volume":"37","author":"Zhu","year":"2021","journal-title":"Bioinformatics"},{"key":"2023062410304119800_btab311-B42","first-page":"8697","author":"Zoph","year":"2018"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i245\/50694345\/btab311.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i245\/50694345\/btab311.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,25]],"date-time":"2023-06-25T00:23:16Z","timestamp":1687652596000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/Supplement_1\/i245\/6319670"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,1]]},"references-count":42,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2021,8,4
]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab311","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,7,1]]},"published":{"date-parts":[[2021,7,1]]}}}