{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T08:10:23Z","timestamp":1774944623535,"version":"3.50.1"},"reference-count":19,"publisher":"Springer Science and Business Media LLC","issue":"S16","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2012,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>PeptideProphet is a post-processing algorithm designed to evaluate the confidence in identifications of MS\/MS spectra returned by a database search. In this manuscript we describe the \"what and how\" of PeptideProphet in a manner aimed at statisticians and life scientists who would like to gain a more in-depth understanding of the underlying statistical modeling. The theory and rationale behind the mixture-modeling approach taken by PeptideProphet is discussed from a statistical model-building perspective followed by a description of how a model can be used to express confidence in the identification of individual peptides or sets of peptides. We also demonstrate how to evaluate the quality of model fit and select an appropriate model from several available alternatives. We illustrate the use of PeptideProphet in association with the Trans-Proteomic Pipeline, a free suite of software used for protein identification.<\/jats:p>","DOI":"10.1186\/1471-2105-13-s16-s1","type":"journal-article","created":{"date-parts":[[2012,12,5]],"date-time":"2012-12-05T22:19:42Z","timestamp":1354745982000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":119,"title":["A statistical model-building perspective to identification of MS\/MS spectra with PeptideProphet"],"prefix":"10.1186","volume":"13","author":[{"given":"Kelvin","family":"Ma","sequence":"first","affiliation":[]},{"given":"Olga","family":"Vitek","sequence":"additional","affiliation":[]},{"given":"Alexey I","family":"Nesvizhskii","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,11,5]]},"reference":[{"key":"5421_CR1","doi-asserted-by":"publisher","first-page":"976","DOI":"10.1016\/1044-0305(94)80016-2","volume":"5","author":"J Eng","year":"1994","unstructured":"Eng J, McCormack A, Yates J: An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. American Society for Mass Spectrometry. 1994, 5: 976-989. 10.1016\/1044-0305(94)80016-2.","journal-title":"American Society for Mass Spectrometry"},{"issue":"9","key":"5421_CR2","doi-asserted-by":"publisher","first-page":"1466","DOI":"10.1093\/bioinformatics\/bth092","volume":"20","author":"R Craig","year":"2004","unstructured":"Craig R, Beavis R: TANDEM: matching proteins with tandem mass spectra. Bioinformatics. 2004, 20 (9): 1466-1467. 10.1093\/bioinformatics\/bth092.","journal-title":"Bioinformatics"},{"issue":"22","key":"5421_CR3","doi-asserted-by":"publisher","first-page":"2830","DOI":"10.1093\/bioinformatics\/btl379","volume":"22","author":"B MacLean","year":"2006","unstructured":"MacLean B, Eng J, Beavis R, McIntosh M: General framework for developing and evaluating database scoring algorithms using the TANDEM search engine. Bioinformatics. 2006, 22 (22): 2830-2832. 10.1093\/bioinformatics\/btl379.","journal-title":"Bioinformatics"},{"key":"5421_CR4","doi-asserted-by":"publisher","first-page":"5383","DOI":"10.1021\/ac025747h","volume":"74","author":"A Keller","year":"2002","unstructured":"Keller A, Nesvizhskii A, Kolker E, Aebersold R: Empirical statistical model to estimate the accuracy of peptide identifications made by MS\/MS and database search. Analytical Chemistry. 2002, 74: 5383-5392. 10.1021\/ac025747h.","journal-title":"Analytical Chemistry"},{"key":"5421_CR5","doi-asserted-by":"publisher","first-page":"2092","DOI":"10.1016\/j.jprot.2010.08.009","volume":"73","author":"A Nesvizhskii","year":"2010","unstructured":"Nesvizhskii A: A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. Journal of Proteomics. 2010, 73: 2092-2123. 10.1016\/j.jprot.2010.08.009.","journal-title":"Journal of Proteomics"},{"key":"5421_CR6","doi-asserted-by":"publisher","first-page":"828","DOI":"10.1021\/pr0604920","volume":"6","author":"J Whiteaker","year":"2007","unstructured":"Whiteaker J, Zhang H, Eng J, Fang R, Piening B, Feng L, Lorentzen T, Schoenherr R, Keane J, Holzman T, Fitzgibbon M, Lin C, Zhang H, Cooke K, Liu T, II DC, Anderson L, Watts J, Smith R, McIntosh M, Paulovich A: Head-to-head comparison of serum fractionation techniques. Journal of Proteome Research. 2007, 6: 828-836. 10.1021\/pr0604920.","journal-title":"Journal of Proteome Research"},{"key":"5421_CR7","doi-asserted-by":"publisher","first-page":"254","DOI":"10.1021\/pr070542g","volume":"7","author":"H Choi","year":"2008","unstructured":"Choi H, Nesvizhskii A: Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics. Journal of Proteome Research. 2008, 7: 254-265. 10.1021\/pr070542g.","journal-title":"Journal of Proteome Research"},{"key":"5421_CR8","doi-asserted-by":"publisher","first-page":"96","DOI":"10.1021\/pr070244j","volume":"7","author":"J Klimek","year":"2007","unstructured":"Klimek J, Eddes J, Hohmann L, Jackson J, Peterson A, Letarte S, Gafken P, Katz J, Mallick P, Lee H, Schmidt A, Ossola R, Eng J, Aebersold R, Martin D: The standard protein mix database: a diverse data set to assist in the production of improved peptide and protein identification software tools. Journal of proteome research. 2007, 7: 96-103.","journal-title":"Journal of proteome research"},{"issue":"3","key":"5421_CR9","doi-asserted-by":"publisher","first-page":"479","DOI":"10.1111\/1467-9868.00346","volume":"64","author":"J Storey","year":"2002","unstructured":"Storey J: A direct approach to false discovery rates. Journal of the Royal Statistical Society. Series B. 2002, 64 (3): 479-498. 10.1111\/1467-9868.00346.","journal-title":"Journal of the Royal Statistical Society. Series B"},{"key":"5421_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1214\/07-STS236","volume":"23","author":"B Efron","year":"2008","unstructured":"Efron B: Microarrays, empirical Bayes and the two-groups model. Statistical Science. 2008, 23: 1-22. 10.1214\/07-STS236.","journal-title":"Statistical Science"},{"key":"5421_CR11","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1021\/pr700739d","volume":"7","author":"L Kall","year":"2008","unstructured":"Kall L, Storey J, MacCoss M: Posterior error probabilities and false discovery rates: two sides of the same coin. Journal of Proteome Research. 2008, 7: 40-44. 10.1021\/pr700739d.","journal-title":"Journal of Proteome Research"},{"key":"5421_CR12","doi-asserted-by":"publisher","first-page":"286","DOI":"10.1021\/pr7006818","volume":"7","author":"H Choi","year":"2008","unstructured":"Choi H, Ghosh D, Nesvizhskii A: Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling. Journal of Proteome Research. 2008, 7: 286-292. 10.1021\/pr7006818.","journal-title":"Journal of Proteome Research"},{"key":"5421_CR13","doi-asserted-by":"publisher","first-page":"4878","DOI":"10.1021\/pr800484x","volume":"7","author":"Y Ding","year":"2008","unstructured":"Ding Y, Choi H, Nesvizhskii A: Adaptive discriminant function analysis and reranking of MS\/MS database search results for improved peptide identification in shotgun proteomics. Journal of Proteome Research. 2008, 7: 4878-4889. 10.1021\/pr800484x.","journal-title":"Journal of Proteome Research"},{"key":"5421_CR14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","volume":"39","author":"A Dempster","year":"1977","unstructured":"Dempster A, Laird N, Rubin D: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B. 1977, 39: 1-38. [http:\/\/www.jstor.org\/discover\/10.2307\/2984875?uid=3738032&uid=2&uid=4&sid=21101269442551]","journal-title":"Journal of the Royal Statistical Society. Series B"},{"issue":"6","key":"5421_CR15","doi-asserted-by":"publisher","first-page":"2013","DOI":"10.1214\/aos\/1074290335","volume":"31","author":"J Storey","year":"2003","unstructured":"Storey J: The positive false discovery rate: a Bayesian interpretation and the q-value. Annals of Statistics. 2003, 31 (6): 2013-2035. 10.1214\/aos\/1074290335.","journal-title":"Annals of Statistics"},{"issue":"3","key":"5421_CR16","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1038\/nmeth1019","volume":"4","author":"J Elias","year":"2007","unstructured":"Elias J, Gygi S: Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nature Methods. 2007, 4 (3): 207-214. 10.1038\/nmeth1019.","journal-title":"Nature Methods"},{"key":"5421_CR17","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1021\/pr700600n","volume":"7","author":"L K\u00e4ll","year":"2008","unstructured":"K\u00e4ll L, Storey J, MacCoss M, Noble W: Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. Journal of Proteome Research. 2008, 7: 29-34. 10.1021\/pr700600n.","journal-title":"Journal of Proteome Research"},{"key":"5421_CR18","doi-asserted-by":"publisher","first-page":"1150","DOI":"10.1002\/pmic.200900375","volume":"10","author":"E Deutsch","year":"2010","unstructured":"Deutsch E, Mendoza L, Shteynberg D, Farrah T, Lam H, Tasman N, Sun Z, Nilsson E, Pratt B, Prazen B, Eng JK, Martin DB, Nesvizhskii AI, Aebersold R: A guided tour of the Trans Proteomic Pipeline. Proteomics. 2010, 10: 1150-1159. 10.1002\/pmic.200900375.","journal-title":"Proteomics"},{"key":"5421_CR19","volume-title":"Analytical Chemistry","author":"A Nesvizhskii","year":"2003","unstructured":"Nesvizhskii A, Keller A, Kolker E, Aebersold R: A statistical model for identifying proteins by tandem mass spectrometry. Analytical Chemistry. 2003, 75: [http:\/\/pubs.acs.org\/doi\/abs\/10.1021\/ac0341261]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S16-S1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,2]],"date-time":"2024-05-02T21:00:50Z","timestamp":1714683650000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-13-S16-S1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,11]]},"references-count":19,"journal-issue":{"issue":"S16","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["5421"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-13-s16-s1","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,11]]},"assertion":[{"value":"5 November 2012","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S1"}}