{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T02:19:09Z","timestamp":1772504349425,"version":"3.50.1"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,11,8]],"date-time":"2023-11-08T00:00:00Z","timestamp":1699401600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,11,8]],"date-time":"2023-11-08T00:00:00Z","timestamp":1699401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-18-CE45-004"],"award-info":[{"award-number":["ANR-18-CE45-004"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>In proteomics, the interpretation of mass spectra representing peptides carrying multiple complex modifications remains challenging, as it is difficult to strike a balance between reasonable execution time, a limited number of false positives, and a huge search space allowing any number of modifications without a priori. The scientific community needs new developments in this area to aid in the discovery of novel post-translational modifications that may play important roles in disease.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>To make progress on this issue, we implemented SpecGlobX (SpecGlob eXTended to eXperimental spectra), a standalone Java application that quickly determines the best spectral alignments of a (possibly very large) list of Peptide-to-Spectrum Matches (PSMs) provided by any open modification search method, or generated by the user. As input, SpecGlobX reads a file containing spectra in MGF or mzML format and a semicolon-delimited spreadsheet describing the PSMs. SpecGlobX returns the best alignment for each PSM as output, splitting the mass difference between the spectrum and the peptide into one or more shifts while considering the possibility of non-aligned masses (a phenomenon resulting from many situations including neutral losses).\u00a0SpecGlobX is fast, able to align one million PSMs in about 1.5 min on a standard desktop. Firstly, we remind the foundations of the algorithm and detail how we adapted SpecGlob (the method we previously developed following the same aim, but limited to the interpretation of perfect simulated spectra) to the interpretation of imperfect experimental spectra. Then, we highlight the interest of SpecGlobX as a complementary tool downstream to three open modification search methods on a large simulated spectra dataset. Finally, we ran SpecGlobX on a proteome-wide dataset downloaded from PRIDE to demonstrate that SpecGlobX functions just as well on simulated and experimental spectra. We then carefully analyzed a limited set of interpretations.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusions<\/jats:title>\n                    <jats:p>SpecGlobX is helpful as a decision support tool, providing keys to interpret peptides carrying complex modifications still poorly considered by current open modification search software. Better alignment of PSMs enhances confidence in the identification of spectra provided by open modification search methods and should improve the interpretation rate of spectra.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-023-05555-y","type":"journal-article","created":{"date-parts":[[2023,11,8]],"date-time":"2023-11-08T02:02:48Z","timestamp":1699408968000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Fast alignment of mass spectra in large proteomics datasets, capturing dissimilarities arising from multiple complex modifications of peptides"],"prefix":"10.1186","volume":"24","author":[{"given":"Gr\u00e9goire","family":"Prunier","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mehdi","family":"Cherkaoui","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Albane","family":"Lysiak","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Olivier","family":"Langella","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"M\u00e9lisande","family":"Blein-Nicolas","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Virginie","family":"Lollier","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Emile","family":"Benoist","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"G\u00e9raldine","family":"Jean","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guillaume","family":"Fertin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"H\u00e9l\u00e8ne","family":"Rogniaux","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dominique","family":"Tessier","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,11,8]]},"reference":[{"issue":"8","key":"5555_CR1","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1038\/nmeth.3902","volume":"13","author":"J Griss","year":"2016","unstructured":"Griss J, Perez-Riverol Y, Lewis S, Tabb DL, Dianes JA, Del-Toro N, et al. Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Nat Methods. 2016;13(8):651\u20136.","journal-title":"Nat Methods"},{"issue":"8","key":"5555_CR2","doi-asserted-by":"publisher","first-page":"2791","DOI":"10.1074\/mcp.M115.055103","volume":"15","author":"B Bogdanow","year":"2016","unstructured":"Bogdanow B, Zauber H, Selbach M. Systematic errors in peptide and protein identification and quantification by modified peptides. Mol Cell Proteomics. 2016;15(8):2791\u2013801.","journal-title":"Mol Cell Proteomics"},{"issue":"6","key":"5555_CR3","doi-asserted-by":"publisher","first-page":"1534","DOI":"10.1002\/pmic.200300744","volume":"4","author":"DM Creasy","year":"2004","unstructured":"Creasy DM, Cottrell JS. Unimod: protein modifications for mass spectrometry. Proteomics. 2004;4(6):1534\u20136.","journal-title":"Proteomics"},{"issue":"1","key":"5555_CR4","doi-asserted-by":"publisher","first-page":"foz088","DOI":"10.1093\/femsyr\/foz088","volume":"20","author":"M den Ridder","year":"2020","unstructured":"den Ridder M, Daran-Lapujade P, Pabst M. Shot-gun proteomics: why thousands of unidentified signals matter. FEMS Yeast Res. 2020;20(1):foz088.","journal-title":"FEMS Yeast Res"},{"issue":"12","key":"5555_CR5","doi-asserted-by":"publisher","first-page":"5555","DOI":"10.1021\/pr200913a","volume":"10","author":"N Colaert","year":"2011","unstructured":"Colaert N, Degroeve S, Helsens K, Martens L. Analysis of the resolution limitations of peptide identification algorithms. J Proteome Res. 2011;10(12):5555\u201361.","journal-title":"J Proteome Res"},{"issue":"11","key":"5555_CR6","doi-asserted-by":"publisher","first-page":"7469","DOI":"10.1021\/acsomega.0c05997","volume":"6","author":"F Bugyi","year":"2021","unstructured":"Bugyi F, Szab\u00f3 D, Szab\u00f3 G, R\u00e9v\u00e9sz \u00c1, Pape VFS, Solt\u00e9sz-Katona E, et al. Influence of post-translational modifications on protein identification in database searches. ACS Omega. 2021;6(11):7469\u201377.","journal-title":"ACS Omega"},{"issue":"5","key":"5555_CR7","doi-asserted-by":"publisher","first-page":"935","DOI":"10.1074\/mcp.T500034-MCP200","volume":"5","author":"MM Savitski","year":"2006","unstructured":"Savitski MM, Nielsen ML, Zubarev RA. ModifiComb, a new proteomic tool for mapping substoichiometric post-translational modifications, finding novel types of modifications, and fingerprinting complex protein mixtures. Mol Cell Proteomics. 2006;5(5):935\u201348.","journal-title":"Mol Cell Proteomics"},{"issue":"8","key":"5555_CR8","doi-asserted-by":"publisher","first-page":"3501","DOI":"10.1021\/acs.analchem.1c04101","volume":"94","author":"M Riffle","year":"2022","unstructured":"Riffle M, Hoopmann MR, Jaschob D, Zhong G, Moritz RL, MacCoss MJ, et al. Discovery and visualization of uncharacterized drug-protein adducts using mass spectrometry. Anal Chem. 2022;94(8):3501\u20139.","journal-title":"Anal Chem"},{"issue":"5","key":"5555_CR9","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1038\/nmeth.4256","volume":"14","author":"AT Kong","year":"2017","unstructured":"Kong AT, Leprevost FV, Avtonomov DM, Mellacheruvu D, Nesvizhskii AI. MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics. Nat Methods. 2017;14(5):513\u201320.","journal-title":"Nat Methods"},{"issue":"3","key":"5555_CR10","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1073\/pnas.0811739106","volume":"106","author":"Y Chen","year":"2009","unstructured":"Chen Y, Chen W, Cobb MH, Zhao Y. PTMap\u2013a sequence alignment software for unrestricted, accurate, and full-spectrum identification of post-translational modification sites. Proc Natl Acad Sci U S A. 2009;106(3):761\u20136.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"3","key":"5555_CR11","doi-asserted-by":"publisher","first-page":"721","DOI":"10.1021\/acs.jproteome.5b00877","volume":"15","author":"O Horlacher","year":"2016","unstructured":"Horlacher O, Lisacek F, M\u00fcller M. Mining large scale tandem mass spectrometry data for protein modifications using spectral libraries. J Proteome Res. 2016;15(3):721\u201331.","journal-title":"J Proteome Res"},{"issue":"4","key":"5555_CR12","doi-asserted-by":"publisher","first-page":"1835","DOI":"10.1021\/acs.jproteome.0c00638","volume":"20","author":"P Cifani","year":"2021","unstructured":"Cifani P, Li Z, Luo D, Grivainis M, Intlekofer AM, Feny\u00f6 D, et al. Discovery of protein modifications using differential tandem mass spectrometry proteomics. J Proteome Res. 2021;20(4):1835\u201348.","journal-title":"J Proteome Res"},{"issue":"5","key":"5555_CR13","doi-asserted-by":"publisher","first-page":"1844","DOI":"10.1021\/acs.jproteome.7b00873","volume":"17","author":"SK Solntsev","year":"2018","unstructured":"Solntsev SK, Shortreed MR, Frey BL, Smith LM. Enhanced global post-translational modification discovery with MetaMorpheus. J Proteome Res. 2018;17(5):1844\u201351.","journal-title":"J Proteome Res"},{"issue":"17","key":"5555_CR14","doi-asserted-by":"publisher","first-page":"11324","DOI":"10.1021\/acs.analchem.9b02445","volume":"91","author":"S Na","year":"2019","unstructured":"Na S, Kim J, Paek E. MODplus: robust and unrestrictive identification of post-translational modifications using mass spectrometry. Anal Chem. 2019;91(17):11324\u201333.","journal-title":"Anal Chem"},{"issue":"11","key":"5555_CR15","doi-asserted-by":"publisher","first-page":"1059","DOI":"10.1038\/nbt.4236","volume":"36","author":"H Chi","year":"2018","unstructured":"Chi H, Liu C, Yang H, Zeng WF, Wu L, Zhou WJ, et al. Comprehensive identification of peptides in tandem mass spectra using an efficient open search engine. Nat Biotechnol. 2018;36(11):1059\u201361.","journal-title":"Nat Biotechnol."},{"issue":"4","key":"5555_CR16","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1038\/s41587-019-0067-5","volume":"37","author":"A Devabhaktuni","year":"2019","unstructured":"Devabhaktuni A, Lin S, Zhang L, Swaminathan K, Gonzalez CG, Olsson N, et al. TagGraph reveals vast protein modification landscapes from large tandem mass spectrometry datasets. Nat Biotechnol. 2019;37(4):469\u201379.","journal-title":"Nat Biotechnol"},{"issue":"5","key":"5555_CR17","doi-asserted-by":"publisher","first-page":"1924","DOI":"10.1021\/acs.jproteome.6b00988","volume":"16","author":"MC Burke","year":"2017","unstructured":"Burke MC, Mirokhin YA, Tchekhovskoi DV, Markey SP, Heidbrink Thompson J, Larkin C, et al. The hybrid search: a mass spectral library search method for discovery of modifications in proteomics. J Proteome Res. 2017;16(5):1924\u201335.","journal-title":"J Proteome Res"},{"issue":"10","key":"5555_CR18","doi-asserted-by":"publisher","first-page":"3463","DOI":"10.1021\/acs.jproteome.8b00359","volume":"17","author":"W Bittremieux","year":"2018","unstructured":"Bittremieux W, Meysman P, Noble WS, Laukens K. Fast open modification spectral library searching through approximate nearest neighbor indexing. J Proteome Res. 2018;17(10):3463\u201374.","journal-title":"J Proteome Res"},{"key":"5555_CR19","doi-asserted-by":"publisher","unstructured":"Lysiak A, Fertin G, Jean G, Tessier D. SpecGlob: rapid and accurate alignment of mass spectra differing from their peptide models by several unknown modifications. bioRxiv. 2022; doi: https:\/\/doi.org\/10.1101\/2022.05.31.494131.","DOI":"10.1101\/2022.05.31.494131"},{"issue":"6","key":"5555_CR20","doi-asserted-by":"publisher","first-page":"777","DOI":"10.1089\/10665270050514927","volume":"7","author":"P Pevzner","year":"2000","unstructured":"Pevzner P, Dancik V, Tang C. Mutation-tolerant protein identification by mass spectrometry. J Comput Biol. 2000;7(6):777\u201387.","journal-title":"J Comput Biol"},{"issue":"2","key":"5555_CR21","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1101\/gr.154101","volume":"11","author":"PA Pevzner","year":"2001","unstructured":"Pevzner PA, Mulyukov Z, Dancik V, Tang CL. Efficiency of database search for identification of mutated and modified proteins via mass spectrometry. Genome Res. 2001;11(2):290\u20139.","journal-title":"Genome Res"},{"key":"5555_CR22","doi-asserted-by":"crossref","unstructured":"Bandeira N, Tsur D, Frank A, Pevzner PA. Protein identification by spectral networks analysis. 2007.","DOI":"10.1073\/pnas.0701130104"},{"issue":"8","key":"5555_CR23","doi-asserted-by":"publisher","first-page":"3030","DOI":"10.1021\/acs.jproteome.7b00308","volume":"16","author":"M David","year":"2017","unstructured":"David M, Fertin G, Rogniaux H, Tessier D. SpecOMS: a full open modification search method performing all-to-all spectra comparisons within minutes. J Proteome Res. 2017;16(8):3030\u20138.","journal-title":"J Proteome Res"},{"issue":"7","key":"5555_CR24","doi-asserted-by":"publisher","first-page":"743","DOI":"10.1038\/nbt.3267","volume":"33","author":"JM Chick","year":"2015","unstructured":"Chick JM, Kolippakkam D, Nusinow DP, Zhai B, Rad R, Huttlin EL, et al. A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides. Nat Biotechnol. 2015;33(7):743\u20139.","journal-title":"Nat Biotechnol"},{"key":"5555_CR25","doi-asserted-by":"crossref","unstructured":"Cliquet F, Fertin G, Rusu I, Tessier D, editors. Comparison of spectra in unsequenced species. 4th Brazilian Symposium on Bioinformatics (BSB 2009); 2009; Porto Alegre, Brazil.","DOI":"10.1007\/978-3-642-03223-3_3"},{"issue":"6","key":"5555_CR26","doi-asserted-by":"publisher","first-page":"795","DOI":"10.1002\/pmic.201100578","volume":"12","author":"J Griss","year":"2012","unstructured":"Griss J, Reisinger F, Hermjakob H, Vizca\u00edno JA. jmzReader: a Java parser library to process and visualize multiple text and XML-based mass spectrometry data formats. Proteomics. 2012;12(6):795\u20138.","journal-title":"Proteomics"},{"issue":"D1","key":"5555_CR27","first-page":"D682","volume":"48","author":"AD Yates","year":"2020","unstructured":"Yates AD, Achuthan P, Akanni W, Allen J, Alvarez-Jarreta J, Amode MR, et al. Ensembl. Nucleic Acids Res. 2020;48(D1):D682\u20138.","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"5555_CR28","doi-asserted-by":"publisher","first-page":"918","DOI":"10.1038\/nbt.2377","volume":"30","author":"MC Chambers","year":"2012","unstructured":"Chambers MC, Maclean B, Burke R, Amodei D, Ruderman DL, Neumann S, et al. A cross-platform toolkit for mass spectrometry and proteomics. Nat Biotechnol. 2012;30(10):918\u201320.","journal-title":"Nat Biotechnol"},{"issue":"26","key":"5555_CR29","doi-asserted-by":"publisher","first-page":"E1743","DOI":"10.1073\/pnas.1203689109","volume":"109","author":"J Watrous","year":"2012","unstructured":"Watrous J, Roach P, Alexandrov T, Heath BS, Yang JY, Kersten RD, et al. Mass spectral molecular networking of living microbial colonies. Proc Natl Acad Sci U S A. 2012;109(26):E1743\u201352.","journal-title":"Proc Natl Acad Sci U S A"},{"key":"5555_CR30","doi-asserted-by":"crossref","unstructured":"Bastian M, Heymann S, Jacomy M. Gephi: An Open Source Software for Exploring and Manipulating Networks. 03, 2009.","DOI":"10.1609\/icwsm.v3i1.13937"},{"issue":"6","key":"5555_CR31","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0098679","volume":"9","author":"M Jacomy","year":"2014","unstructured":"Jacomy M, Venturini T, Heymann S, Bastian M. ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS ONE. 2014;9(6): e98679.","journal-title":"PLoS ONE"},{"key":"5555_CR32","doi-asserted-by":"crossref","unstructured":"Giese SH, Belsom A, Sinn L, Fischer L, Rappsilber J. Noncovalently associated peptides observed during liquid chromatography-mass spectrometry and their effect on cross-link analyses. 2019.","DOI":"10.1101\/502351"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05555-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05555-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05555-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,8]],"date-time":"2023-11-08T02:02:53Z","timestamp":1699408973000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05555-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,8]]},"references-count":32,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5555"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05555-y","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.03.09.531667","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,8]]},"assertion":[{"value":"13 March 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 October 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 November 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable, human data used was publicly available.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"421"}}