{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T14:00:21Z","timestamp":1761919221081},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2749,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Tandem mass spectrometry (MS\/MS) offers fast and reliable characterization of complex protein mixtures, but suffers from low sensitivity in protein identification. In a typical shotgun proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other information available, e.g. the probability of a protein's presence is likely to correlate with its mRNA concentration.<\/jats:p>\n               <jats:p>Results: We develop a Bayesian score that estimates the posterior probability of a protein's presence in the sample given its identification in an MS\/MS experiment and its mRNA concentration measured under similar experimental conditions. Our method, MSpresso, substantially increases the number of proteins identified in an MS\/MS experiment at the same error rate, e.g. in yeast, MSpresso increases the number of proteins identified by \u223c40%. We apply MSpresso to data from different MS\/MS instruments, experimental conditions and organisms (Escherichia coli, human), and predict 19\u201363% more proteins across the different datasets. MSpresso demonstrates that incorporating prior knowledge of protein presence into shotgun proteomics experiments can substantially improve protein identification scores.<\/jats:p>\n               <jats:p>Availability and Implementation: Software is available upon request from the authors. Mass spectrometry datasets and supplementary information are available from http:\/\/www.marcottelab.org\/MSpresso\/.<\/jats:p>\n               <jats:p>Contact: \u00a0marcotte@icmb.utexas.edu; miranker@cs.utexas.edu<\/jats:p>\n               <jats:p>Supplementary Information: Supplementary data website: http:\/\/www.marcottelab.org\/MSpresso\/.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp168","type":"journal-article","created":{"date-parts":[[2009,3,25]],"date-time":"2009-03-25T02:18:30Z","timestamp":1237947510000},"page":"1397-1403","source":"Crossref","is-referenced-by-count":60,"title":["Integrating shotgun proteomics and mRNA expression data to improve protein identification"],"prefix":"10.1093","volume":"25","author":[{"given":"Smriti R.","family":"Ramakrishnan","sequence":"first","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Christine","family":"Vogel","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"John T.","family":"Prince","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Rong","family":"Wang","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Zhihua","family":"Li","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Luiz O.","family":"Penalva","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Margaret","family":"Myers","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Edward M.","family":"Marcotte","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]},{"given":"Daniel P.","family":"Miranker","sequence":"additional","affiliation":[{"name":"1 Department of Computer Sciences, 1 University Station C0500, 2Department of Chemistry and Biochemistry, Institute for Cellular and Molecular Biology, Center for Systems and Synthetic Biology, 2500 Speedway, The University of Texas at Austin, Austin, TX 78712 ,3Pathogen Functional Genomics Resource Center, J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and 4Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,3,24]]},"reference":[{"key":"2023013111502115600_B1","doi-asserted-by":"crossref","first-page":"6392","DOI":"10.1128\/JB.185.21.6392-6399.2003","article-title":"Genome-scale analysis of the uses of the Escherichia coli genome: model-driven analysis of heterogeneous data sets","volume":"185","author":"Allen","year":"2003","journal-title":"J. Bacteriol."},{"key":"2023013111502115600_B2","doi-asserted-by":"crossref","first-page":"2502","DOI":"10.1093\/bioinformatics\/btg363","article-title":"Characterizing gene sets with FuncAssociate","volume":"19","author":"Berriz","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013111502115600_B3","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1038\/nbt1300","article-title":"A high-quality catalog of the Drosophila melanogaster proteome","volume":"25","author":"Brunner","year":"2007","journal-title":"Nat. Biotechnol."},{"key":"2023013111502115600_B4","doi-asserted-by":"crossref","first-page":"2193","DOI":"10.1073\/pnas.0607084104","article-title":"Analysis of phosphorylation sites on proteins from Saccharomyces cerevisiae by electron transfer dissociation (ETD) mass spectrometry","volume":"104","author":"Chi","year":"2007","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013111502115600_B5","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1021\/pr700747q","article-title":"False discovery rates and related statistical concepts in mass spectrometry-based proteomics","volume":"7","author":"Choi","year":"2008","journal-title":"J. Proteome Res."},{"key":"2023013111502115600_B6","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1021\/pr7006818","article-title":"Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling","volume":"7","author":"Choi","year":"2008","journal-title":"J. Proteome Res."},{"key":"2023013111502115600_B7","doi-asserted-by":"crossref","first-page":"9232","DOI":"10.1073\/pnas.1533294100","article-title":"Toward a protein profile of Escherichia coli: comparison to its transcription profile","volume":"100","author":"Corbin","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013111502115600_B8","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1038\/nature02456","article-title":"Integrating high-throughput and computational data elucidates bacterial networks","volume":"429","author":"Covert","year":"2004","journal-title":"Nature"},{"key":"2023013111502115600_B9","doi-asserted-by":"crossref","first-page":"R50","DOI":"10.1186\/gb-2006-7-6-r50","article-title":"Status of complete proteome analysis by mass spectrometry: SILAC labeled yeast as a model system","volume":"7","author":"de Godoy","year":"2006","journal-title":"Genome Biol."},{"key":"2023013111502115600_B10","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/nmeth1019","article-title":"Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry","volume":"4","author":"Elias","year":"2007","journal-title":"Nat. Methods"},{"key":"2023013111502115600_B11","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recognit. Lett."},{"key":"2023013111502115600_B12","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1021\/pr7007303","article-title":"Modes of inference for evaluating the confidence of peptide identifications","volume":"7","author":"Fitzgibbon","year":"2008","journal-title":"J. Proteome Res."},{"key":"2023013111502115600_B13","doi-asserted-by":"crossref","first-page":"7357","DOI":"10.1128\/MCB.19.11.7357","article-title":"A sampling of the yeast proteome","volume":"19","author":"Futcher","year":"1999","journal-title":"Mol. Cell. Biol."},{"key":"2023013111502115600_B14","article-title":"SILAC-labeling and proteome quantitation of mouse embryonic stem cells to a depth of 5111 proteins","author":"Graumann","year":"2007","journal-title":"Mol. Cell Proteomics."},{"key":"2023013111502115600_B15","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1186\/gb-2003-4-9-117","article-title":"Comparing protein abundance and mRNA expression levels on a genomic scale","volume":"4","author":"Greenbaum","year":"2003","journal-title":"Genome Biol."},{"key":"2023013111502115600_B16","doi-asserted-by":"crossref","first-page":"994","DOI":"10.1038\/13690","article-title":"Quantitative analysis of complex protein mixtures using isotope-coded affinity tags","volume":"17","author":"Gygi","year":"1999","journal-title":"Nat. Biotechnol."},{"key":"2023013111502115600_B17","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1016\/S0092-8674(00)81641-4","article-title":"Dissecting the regulatory circuitry of a eukaryotic genome","volume":"95","author":"Holstege","year":"1998","journal-title":"Cell"},{"key":"2023013111502115600_B18","doi-asserted-by":"crossref","first-page":"1027","DOI":"10.1016\/j.jmb.2004.03.016","article-title":"A combined transmembrane topology and signal peptide prediction method","volume":"338","author":"Kall","year":"2004","journal-title":"J. Mol. Biol."},{"key":"2023013111502115600_B19","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1021\/pr700739d","article-title":"Posterior error probabilities and false discovery rates: two sides of the same coin","volume":"7","author":"Kall","year":"2008","journal-title":"J. Proteome Res."},{"key":"2023013111502115600_B20","doi-asserted-by":"crossref","first-page":"5383","DOI":"10.1021\/ac025747h","article-title":"Empirical statistical model to estimate the accuracy of peptide identifications made by MS\/MS and database search","volume":"74","author":"Keller","year":"2002","journal-title":"Anal. Chem."},{"key":"2023013111502115600_B21","doi-asserted-by":"crossref","first-page":"3354","DOI":"10.1021\/pr8001244","article-title":"Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases","volume":"7","author":"Kim","year":"2008","journal-title":"J. Proteome Res."},{"key":"2023013111502115600_B22","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/S0076-6879(07)29006-8","article-title":"Yeast phenotypic assays on translational control","volume":"429","author":"Lee","year":"2007","journal-title":"Methods Enzymol."},{"key":"2023013111502115600_B23","doi-asserted-by":"crossref","first-page":"1259","DOI":"10.1002\/elps.1150180807","article-title":"Comparing the predicted and observed properties of proteins encoded in the genome of Escherichia coli K-12","volume":"18","author":"Link","year":"1997","journal-title":"Electrophoresis"},{"key":"2023013111502115600_B24","doi-asserted-by":"crossref","first-page":"1205","DOI":"10.1074\/mcp.D500006-MCP200","article-title":"Localization, annotation, and comparison of the Escherichia coli K-12 proteome under two states of growth","volume":"4","author":"Lopez-Campistrous","year":"2005","journal-title":"Mol. Cell Proteomics"},{"key":"2023013111502115600_B25","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1038\/nbt1270","article-title":"Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation","volume":"25","author":"Lu","year":"2007","journal-title":"Nat. Biotechnol."},{"key":"2023013111502115600_B26","doi-asserted-by":"crossref","first-page":"D468","DOI":"10.1093\/nar\/gkl931","article-title":"Expanded protein information at SGD: new pages and proteome browser","volume":"35","author":"Nash","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013111502115600_B27","doi-asserted-by":"crossref","first-page":"4646","DOI":"10.1021\/ac0341261","article-title":"A statistical model for identifying proteins by tandem mass spectrometry","volume":"75","author":"Nesvizhskii","year":"2003","journal-title":"Anal. Chem."},{"key":"2023013111502115600_B28","doi-asserted-by":"crossref","DOI":"10.1038\/nature04785","article-title":"Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise","author":"Newman","year":"2006","journal-title":"Nature."},{"key":"2023013111502115600_B29","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1021\/pr025556v","article-title":"Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC\/LC-MS\/MS) for large-scale protein analysis: the yeast proteome","volume":"2","author":"Peng","year":"2003","journal-title":"J. Proteome Res."},{"key":"2023013111502115600_B30","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1002\/(SICI)1097-0061(19980330)14:5<471::AID-YEA241>3.0.CO;2-U","article-title":"The list of cytoplasmic ribosomal proteins of Saccharomyces cerevisiae","volume":"14","author":"Planta","year":"1998","journal-title":"Yeast"},{"key":"2023013111502115600_B31","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1038\/nature04187","article-title":"Global analysis of protein phosphorylation in yeast","volume":"438","author":"Ptacek","year":"2005","journal-title":"Nature"},{"key":"2023013111502115600_B32","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1186\/1471-2105-3-35","article-title":"FunSpec: a web-based cluster interpreter for yeast","volume":"3","author":"Robinson","year":"2002","journal-title":"BMC Bioinformatics"},{"key":"2023013111502115600_B33","doi-asserted-by":"crossref","first-page":"D300","DOI":"10.1093\/nar\/gkh087","article-title":"GenProtEC: an updated and improved analysis of functions of Escherichia coli K-12 proteins","volume":"32","author":"Serres","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013111502115600_B34","doi-asserted-by":"crossref","first-page":"9340","DOI":"10.1128\/MCB.25.21.9340-9349.2005","article-title":"Global gene expression profiling reveals widespread yet distinctive translational responses to different eukaryotic translation initiation factor 2B-targeting stress pathways","volume":"25","author":"Smirnova","year":"2005","journal-title":"Mol. Cell. Biol."},{"key":"2023013111502115600_B35","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1126\/science.270.5235.484","article-title":"Serial analysis of gene expression","volume":"270","author":"Velculescu","year":"1995","journal-title":"Science"},{"key":"2023013111502115600_B36","doi-asserted-by":"crossref","first-page":"5860","DOI":"10.1073\/pnas.092538799","article-title":"Precision and functional specificity in mRNA decay","volume":"99","author":"Wang","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013111502115600_B37","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1038\/85686","article-title":"Large-scale analysis of the yeast proteome by multidimensional protein identification technology","volume":"19","author":"Washburn","year":"2001","journal-title":"Nat. Biotechnol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/11\/1397\/48989825\/bioinformatics_25_11_1397.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/11\/1397\/48989825\/bioinformatics_25_11_1397.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T20:56:36Z","timestamp":1675198596000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/11\/1397\/330984"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,3,24]]},"references-count":37,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2009,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp168","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,6,1]]},"published":{"date-parts":[[2009,3,24]]}}}