{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,4]],"date-time":"2026-02-04T15:32:14Z","timestamp":1770219134358,"version":"3.49.0"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Statistical evaluation of the confidence of peptide and protein identifications made by tandem mass spectrometry is a critical component for appropriately interpreting the experimental data and conducting downstream analysis. Although many approaches have been developed to assign confidence measure from different perspectives, a unified statistical framework that integrates the uncertainty of peptides and proteins is still missing.<\/jats:p><jats:p>Results: We developed a hierarchical statistical model (HSM) that jointly models the uncertainty of the identified peptides and proteins and can be applied to any scoring system. With data sets of a standard mixture and the yeast proteome, we demonstrate that the HSM offers a reliable or at least conservative false discovery rate (FDR) estimate for peptide and protein identifications. 
The probability measure of HSM also offers a powerful discriminating score for peptide identification.<\/jats:p><jats:p>Availability: The algorithm is available upon request from the authors.<\/jats:p><jats:p>Contact: \u00a0chashen@iupui.edu<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm555","type":"journal-article","created":{"date-parts":[[2007,11,18]],"date-time":"2007-11-18T01:26:02Z","timestamp":1195349162000},"page":"202-208","source":"Crossref","is-referenced-by-count":31,"title":["A hierarchical statistical model to assess the confidence of peptides and proteins inferred from tandem mass spectrometry"],"prefix":"10.1093","volume":"24","author":[{"given":"Changyu","family":"Shen","sequence":"first","affiliation":[{"name":"1 Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 and 2Department of Chemistry, University of Louisville, Louisville, KY 40292, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiping","family":"Wang","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 and 2Department of Chemistry, University of Louisville, Louisville, KY 40292, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ganesh","family":"Shankar","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 and 2Department of Chemistry, University of Louisville, Louisville, KY 40292, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiang","family":"Zhang","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 and 2Department of Chemistry, 
University of Louisville, Louisville, KY 40292, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lang","family":"Li","sequence":"additional","affiliation":[{"name":"1 Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 and 2Department of Chemistry, University of Louisville, Louisville, KY 40292, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2007,11,17]]},"reference":[{"key":"2023020209480673200_B1","doi-asserted-by":"crossref","first-page":"S13","DOI":"10.1093\/bioinformatics\/17.suppl_1.S13","article-title":"SCOPE: a probabilistic model for scoring tandem mass spectra against a peptide database","volume":"17","author":"Bafna","year":"2001","journal-title":"Bioinformatics"},{"key":"2023020209480673200_B2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023020209480673200_B3","doi-asserted-by":"crossref","first-page":"1454","DOI":"10.1002\/pmic.200300485","article-title":"OLAV: towards high-throughput tandem mass spectrometry data identification","volume":"3","author":"Colinge","year":"2003","journal-title":"Proteomics"},{"key":"2023020209480673200_B4","doi-asserted-by":"crossref","first-page":"1234","DOI":"10.1021\/pr049882h","article-title":"Open source system for analyzing, validating, and storing protein identification data","volume":"3","author":"Craig","year":"2004","journal-title":"J. 
Proteome Res."},{"key":"2023020209480673200_B5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm (with discussion)","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023020209480673200_B6","doi-asserted-by":"crossref","first-page":"1151","DOI":"10.1198\/016214501753382129","article-title":"Empirical Bayes analysis of a microarray experiment","volume":"96","author":"Efron","year":"2001","journal-title":"J. Am. Stat. Assoc."},{"key":"2023020209480673200_B7","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/nmeth1019","article-title":"Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry","volume":"4","author":"Elias","year":"2007","journal-title":"Nat. Methods"},{"key":"2023020209480673200_B8","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1038\/nmeth785","article-title":"Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations","volume":"2","author":"Elias","year":"2005","journal-title":"Nat. Methods"},{"key":"2023020209480673200_B9","doi-asserted-by":"crossref","first-page":"976","DOI":"10.1016\/1044-0305(94)80016-2","article-title":"An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database","volume":"5","author":"Eng","year":"1994","journal-title":"J. Am. Soc. Mass Spectrom."},{"key":"2023020209480673200_B10","doi-asserted-by":"crossref","first-page":"3901","DOI":"10.1021\/ac070202e","article-title":"Probability model for assessing proteins assembled from peptide sequences inferred from tandem mass spectrometry data","volume":"79","author":"Feng","year":"2007","journal-title":"Anal. 
Chem."},{"key":"2023020209480673200_B11","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1021\/ac0258709","article-title":"A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes","volume":"75","author":"Fenyo","year":"2003","journal-title":"Anal. Chem."},{"key":"2023020209480673200_B12","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1021\/pr0499491","article-title":"Open mass spectrometry search algorithm","volume":"3","author":"Geer","year":"2004","journal-title":"J. Proteome Res."},{"key":"2023020209480673200_B13","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1021\/ac0258913","article-title":"Intensity-based statistical scorer for tandem mass spectrometry","volume":"75","author":"Havilio","year":"2003","journal-title":"Anal. Chem."},{"key":"2023020209480673200_B14","doi-asserted-by":"crossref","first-page":"1758","DOI":"10.1021\/pr0605320","article-title":"Estimating the statistical significance of peptide identifications from shotgun proteomics experiments","volume":"6","author":"Higgs","year":"2007","journal-title":"J. Proteome Res."},{"key":"2023020209480673200_B15","doi-asserted-by":"crossref","first-page":"5383","DOI":"10.1021\/ac025747h","article-title":"Empirical statistical model to estimate the accuracy of peptide identifications made by MS\/MS and database search","volume":"74","author":"Keller","year":"2002","journal-title":"Anal. 
Chem."},{"key":"2023020209480673200_B16","doi-asserted-by":"crossref","first-page":"2338","DOI":"10.1021\/pr050264q","article-title":"VEMS 3.0: algorithms and computational tools for tandem mass spectrometry based identification of post-translational modifications in proteins","volume":"4","author":"Matthiesen","year":"2005","journal-title":"J. Proteome Res."},{"key":"2023020209480673200_B17","doi-asserted-by":"crossref","first-page":"767","DOI":"10.1021\/ac960799q","article-title":"Direct analysis and identification of proteins in mixtures by LC\/MS\/MS and database searching at the low-femtomole level","volume":"69","author":"McCormack","year":"1997","journal-title":"Anal. Chem."},{"key":"2023020209480673200_B18","doi-asserted-by":"crossref","first-page":"4646","DOI":"10.1021\/ac0341261","article-title":"A statistical model for identifying proteins by tandem mass spectrometry","volume":"75","author":"Nesvizhskii","year":"2003","journal-title":"Anal. Chem."},{"key":"2023020209480673200_B19","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1093\/biostatistics\/5.2.155","article-title":"Detecting differential gene expression with a semiparametric hierarchical mixture method","volume":"5","author":"Newton","year":"2004","journal-title":"Biostatistics"},{"key":"2023020209480673200_B20","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1021\/pr025556v","article-title":"Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC\/LC-MS\/MS) for large-scale protein analysis: the yeast proteome","volume":"2","author":"Peng","year":"2003","journal-title":"J. 
Proteome Res."},{"key":"2023020209480673200_B21","doi-asserted-by":"crossref","first-page":"3551","DOI":"10.1002\/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2","article-title":"Probability-based protein identification by searching sequence databases using mass spectrometry data","volume":"20","author":"Perkins","year":"1999","journal-title":"Electrophoresis"},{"key":"2023020209480673200_B22","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1089\/153623104773547507","article-title":"Standard mixtures for proteome studies","volume":"8","author":"Purvine","year":"2004","journal-title":"Omics"},{"key":"2023020209480673200_B23","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1021\/pr0498638","article-title":"Probability-based evaluation of peptide and protein identifications from tandem mass spectrometry and SEQUEST analysis: the human proteome","volume":"4","author":"Qian","year":"2005","journal-title":"J. Proteome Res."},{"key":"2023020209480673200_B24","doi-asserted-by":"crossref","first-page":"3792","DOI":"10.1021\/ac034157w","article-title":"A hypergeometric probability model for protein identification and validation using tandem mass spectral data and protein sequence databases","volume":"75","author":"Sadygov","year":"2003","journal-title":"Anal. Chem."},{"key":"2023020209480673200_B25","doi-asserted-by":"crossref","first-page":"654","DOI":"10.1021\/pr0604054","article-title":"MyriMatch: highly accurate tandem mass spectral peptide identification by multivariate hypergeometric analysis","volume":"6","author":"Tabb","year":"2007","journal-title":"J. 
Proteome Res."},{"key":"2023020209480673200_B26","doi-asserted-by":"crossref","first-page":"e481","DOI":"10.1093\/bioinformatics\/btl237","article-title":"A computational approach toward label-free protein quantification using predicted peptide detectability","volume":"22","author":"Tang","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020209480673200_B27","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1038\/85686","article-title":"Large-scale analysis of the yeast proteome by multidimensional protein identification technology","volume":"19","author":"Washburn","year":"2001","journal-title":"Nat. Biotechnol."},{"key":"2023020209480673200_B28","doi-asserted-by":"crossref","first-page":"6134","DOI":"10.1002\/pmic.200600070","article-title":"Protein probabilities in shotgun proteomics: evaluating different estimation methods using a semi-random sampling model","volume":"6","author":"Xue","year":"2006","journal-title":"Proteomics"},{"key":"2023020209480673200_B29","doi-asserted-by":"crossref","first-page":"1406","DOI":"10.1002\/1615-9861(200210)2:10<1406::AID-PROT1406>3.0.CO;2-9","article-title":"ProbID: a probabilistic algorithm to identify peptides through sequence database searching using tandem mass spectral 
data","volume":"2","author":"Zhang","year":"2002","journal-title":"Proteomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/2\/202\/49044334\/bioinformatics_24_2_202.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/2\/202\/49044334\/bioinformatics_24_2_202.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,22]],"date-time":"2025-01-22T16:20:58Z","timestamp":1737562858000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/2\/202\/227172"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,11,17]]},"references-count":29,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2008,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm555","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,1,15]]},"published":{"date-parts":[[2007,11,17]]}}}