{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T18:27:53Z","timestamp":1772735273607,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2025,8,11]],"date-time":"2025-08-11T00:00:00Z","timestamp":1754870400000},"content-version":"vor","delay-in-days":10,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["R01GM147653"],"award-info":[{"award-number":["R01GM147653"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,8,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Studying protein isoforms is an essential step in biomedical research; at present, the main approach for analyzing proteins is via bottom-up mass spectrometry proteomics, which return peptide identifications, that are indirectly used to infer the presence of protein isoforms. However, the detection and quantification processes are noisy; in particular, peptides may be erroneously detected, and most peptides, known as shared peptides, are associated to multiple protein isoforms. As a consequence, studying individual protein isoforms is challenging, and inferred protein results are often abstracted to the gene-level or to groups of protein isoforms.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here, we introduce IsoBayes, a novel statistical method to perform inference at the isoform level. 
Our method enhances the information available, by integrating mass spectrometry proteomics and transcriptomics data in a Bayesian probabilistic framework. To account for the uncertainty in the measurement process, we propose a two-layer latent variable approach: first, we sample if a peptide has been correctly detected (or, alternatively filter peptides); second, we allocate the abundance of such selected peptides across the protein(s) they are compatible with. This enables us, starting from peptide-level data, to recover protein-level data; in particular, we: (i) infer the presence\/absence of each protein isoform (via a posterior probability), (ii) estimate its abundance (and credible interval), and (iii) target isoforms where transcript and protein relative abundances significantly differ. We benchmarked our approach in simulations, and in two multi-protease real datasets: our method displays good sensitivity and specificity when detecting protein isoforms, its estimated abundances highly correlate with the ground truth, and can detect changes between protein and transcript relative abundances.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>IsoBayes is freely distributed as a Bioconductor R package, and is accompanied by an example usage vignette.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf450","type":"journal-article","created":{"date-parts":[[2025,8,12]],"date-time":"2025-08-12T23:52:15Z","timestamp":1755042735000},"source":"Crossref","is-referenced-by-count":1,"title":["IsoBayes: a Bayesian approach for single-isoform proteomics inference"],"prefix":"10.1093","volume":"41","author":[{"given":"Jordy","family":"Bollon","sequence":"first","affiliation":[{"name":"Computational and Chemical Biology, Italian Institute of Technology , Genova 16163,","place":["Italy"]},{"name":"Astronomical Observatory of the 
Autonomous Region of the Aosta Valley (OAVdA) , Nus 11020,","place":["Italy"]}]},{"given":"Michael R","family":"Shortreed","sequence":"additional","affiliation":[{"name":"Department of Chemistry, University of Wisconsin-Madison , Madison, WI 53706,","place":["United States"]}]},{"given":"Erin","family":"Jeffery","sequence":"additional","affiliation":[{"name":"Department of Molecular Physiology and Biological Physics, University of Virginia , Charlottesville, VA 22903,","place":["United States"]}]},{"given":"Ben T","family":"Jordan","sequence":"additional","affiliation":[{"name":"Frederick National Laboratory for Cancer Research , Frederick, MD 21701,","place":["United States"]}]},{"given":"Rachel","family":"Miller","sequence":"additional","affiliation":[{"name":"Department of Chemistry, University of Wisconsin-Madison , Madison, WI 53706,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4063-4502","authenticated-orcid":false,"given":"Andrea","family":"Cavalli","sequence":"additional","affiliation":[{"name":"Computational and Chemical Biology, Italian Institute of Technology , Genova 16163,","place":["Italy"]},{"name":"Centre Europ\u00e9en de Calcul Atomique et Mol\u00e9culaire, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne , Lausanne 1015,","place":["Switzerland"]}]},{"given":"Lloyd M","family":"Smith","sequence":"additional","affiliation":[{"name":"Department of Chemistry, University of Wisconsin-Madison , Madison, WI 53706,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1498-9254","authenticated-orcid":false,"given":"Colin N","family":"Dewey","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison , Madison, WI 53726,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4223-9947","authenticated-orcid":false,"given":"Gloria M","family":"Sheynkman","sequence":"additional","affiliation":[{"name":"Department of 
Molecular Physiology and Biological Physics, University of Virginia , Charlottesville, VA 22903,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3054-9964","authenticated-orcid":false,"given":"Simone","family":"Tiberi","sequence":"additional","affiliation":[{"name":"Department of Statistical Sciences, University of Bologna , Bologna 40126,","place":["Italy"]}]}],"member":"286","published-online":{"date-parts":[[2025,8,11]]},"reference":[{"key":"2025082519465327500_btaf450-B1","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1038\/nbt.3519","article-title":"Near-optimal probabilistic RNA-seq quantification","volume":"34","author":"Bray","year":"2016","journal-title":"Nat Biotechnol"},{"key":"2025082519465327500_btaf450-B2","doi-asserted-by":"crossref","first-page":"2072","DOI":"10.1021\/acs.jproteome.5b01008","article-title":"Hiquant: rapid postquantification analysis of large-scale MS-generated proteomics data","volume":"15","author":"Bryan","year":"2016","journal-title":"J Proteome Res"},{"key":"2025082519465327500_btaf450-B3","doi-asserted-by":"crossref","first-page":"3431","DOI":"10.1021\/acs.jproteome.8b00310","article-title":"Isoform-level interpretation of high-throughput proteomics data enabled by deep integration with RNA-seq","volume":"17","author":"Carlyle","year":"2018","journal-title":"J Proteome Res"},{"key":"2025082519465327500_btaf450-B4","doi-asserted-by":"crossref","first-page":"1367","DOI":"10.1038\/nbt.1511","article-title":"Maxquant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification","volume":"26","author":"Cox","year":"2008","journal-title":"Nat Biotechnol"},{"key":"2025082519465327500_btaf450-B5","first-page":"1512","article-title":"Global signatures of protein and mRNA expression levels","volume":"5","author":"de Sousa Abreu","year":"2009","journal-title":"Mol 
Biosyst"},{"key":"2025082519465327500_btaf450-B6","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1186\/s13059-023-02923-y","article-title":"Transformation of alignment files improves performance of variant callers for long-read RNA sequencing data","volume":"24","author":"de Souza","year":"2023","journal-title":"Genome Biol"},{"key":"2025082519465327500_btaf450-B7","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.cels.2017.12.005","article-title":"Universal alternative splicing of noncoding exons","volume":"6","author":"Deveson","year":"2018","journal-title":"Cell Syst"},{"key":"2025082519465327500_btaf450-B8","doi-asserted-by":"crossref","first-page":"883","DOI":"10.15252\/msb.20167144","article-title":"Gene-specific correlation of RNA and protein levels in human cells and tissues","volume":"12","author":"Edfors","year":"2016","journal-title":"Mol Syst Biol"},{"key":"2025082519465327500_btaf450-B9","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1080\/01621459.1990.10476213","article-title":"Sampling-based approaches to calculating marginal densities","volume":"85","author":"Gelfand","year":"1990","journal-title":"J Am Stat Assoc"},{"key":"2025082519465327500_btaf450-B10","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1109\/TPAMI.1984.4767596","article-title":"Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images","volume":"6","author":"Geman","year":"1984","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2025082519465327500_btaf450-B11","doi-asserted-by":"crossref","first-page":"923","DOI":"10.1038\/nmeth1113","article-title":"Semi-supervised learning for peptide identification from shotgun proteomics datasets","volume":"4","author":"K\u00e4ll","year":"2007","journal-title":"Nat Methods"},{"key":"2025082519465327500_btaf450-B12","doi-asserted-by":"crossref","first-page":"535","DOI":"10.1016\/j.cell.2016.03.014","article-title":"On the dependency of cellular protein levels 
on mRNA abundance","volume":"165","author":"Liu","year":"2016","journal-title":"Cell"},{"key":"2025082519465327500_btaf450-B13","doi-asserted-by":"crossref","first-page":"1229","DOI":"10.1016\/j.celrep.2017.07.025","article-title":"Impact of alternative splicing on the human proteome","volume":"20","author":"Liu","year":"2017","journal-title":"Cell Rep"},{"key":"2025082519465327500_btaf450-B14","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1038\/nbt1270","article-title":"Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation","volume":"25","author":"Lu","year":"2007","journal-title":"Nat Biotechnol"},{"key":"2025082519465327500_btaf450-B15","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1186\/s12859-017-1491-5","article-title":"Improvement of peptide identification with considering the abundance of mRNA and peptide","volume":"18","author":"Ma","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2025082519465327500_btaf450-B16","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.jprot.2019.02.010","article-title":"A protein identification algorithm for tandem mass spectrometry by incorporating the abundance of mRNA into a binomial probability scoring model","volume":"197","author":"Ma","year":"2019","journal-title":"J Proteomics"},{"key":"2025082519465327500_btaf450-B17","doi-asserted-by":"crossref","first-page":"3966","DOI":"10.1016\/j.febslet.2009.10.036","article-title":"Correlation of mRNA and protein in complex biological samples","volume":"583","author":"Maier","year":"2009","journal-title":"FEBS Lett"},{"key":"2025082519465327500_btaf450-B18","doi-asserted-by":"crossref","first-page":"3429","DOI":"10.1021\/acs.jproteome.9b00330","article-title":"Improved protein inference from multiple protease bottom-up mass spectrometry data","volume":"18","author":"Miller","year":"2019","journal-title":"J Proteome 
Res"},{"key":"2025082519465327500_btaf450-B19","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1186\/s13059-022-02624-y","article-title":"Enhanced protein isoform characterization through long-read proteogenomics","volume":"23","author":"Miller","year":"2022","journal-title":"Genome Biol"},{"key":"2025082519465327500_btaf450-B20","doi-asserted-by":"crossref","first-page":"4646","DOI":"10.1021\/ac0341261","article-title":"A statistical model for identifying proteins by tandem mass spectrometry","volume":"75","author":"Nesvizhskii","year":"2003","journal-title":"Anal Chem"},{"key":"2025082519465327500_btaf450-B21","doi-asserted-by":"crossref","first-page":"1060","DOI":"10.1021\/acs.jproteome.9b00566","article-title":"Epifany: a method for efficient high-confidence protein inference","volume":"19","author":"Pfeuffer","year":"2020","journal-title":"J Proteome Res"},{"key":"2025082519465327500_btaf450-B22","doi-asserted-by":"crossref","first-page":"1397","DOI":"10.1093\/bioinformatics\/btp168","article-title":"Integrating shotgun proteomics and mRNA expression data to improve protein identification","volume":"25","author":"Ramakrishnan","year":"2009","journal-title":"Bioinformatics"},{"key":"2025082519465327500_btaf450-B23","doi-asserted-by":"crossref","first-page":"741","DOI":"10.1038\/nmeth.3959","article-title":"Openms: a flexible open-source software platform for mass spectrometry data analysis","volume":"13","author":"R\u00f6st","year":"2016","journal-title":"Nat Methods"},{"key":"2025082519465327500_btaf450-B24","doi-asserted-by":"crossref","first-page":"e9170","DOI":"10.15252\/msb.20199170","article-title":"Isoform-resolved correlation analysis between mRNA abundance regulation and protein level degradation","volume":"16","author":"Salovska","year":"2020","journal-title":"Mol Syst Biol"},{"key":"2025082519465327500_btaf450-B25","doi-asserted-by":"crossref","first-page":"5346","DOI":"10.1021\/pr100594k","article-title":"Efficient marginalization to 
compute protein posterior probabilities from shotgun mass spectrometry data","volume":"9","author":"Serang","year":"2010","journal-title":"J Proteome Res"},{"key":"2025082519465327500_btaf450-B26","doi-asserted-by":"crossref","first-page":"2341","DOI":"10.1074\/mcp.O113.028142","article-title":"Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq","volume":"12","author":"Sheynkman","year":"2013","journal-title":"Mol Cell Proteomics"},{"key":"2025082519465327500_btaf450-B27","doi-asserted-by":"crossref","first-page":"1844","DOI":"10.1021\/acs.jproteome.7b00873","article-title":"Enhanced global post-translational modification discovery with MetaMorpheus","volume":"17","author":"Solntsev","year":"2018","journal-title":"J Proteome Res"},{"key":"2025082519465327500_btaf450-B28","doi-asserted-by":"crossref","first-page":"528","DOI":"10.1080\/01621459.1987.10478458","article-title":"The calculation of posterior distributions by data augmentation","volume":"82","author":"Tanner","year":"1987","journal-title":"J Am Stat Assoc"},{"key":"2025082519465327500_btaf450-B29","doi-asserted-by":"crossref","first-page":"1719","DOI":"10.1007\/s13361-016-1460-7","article-title":"Fast and accurate protein false discovery rates on large-scale proteomics data sets with percolator 3.0","volume":"27","author":"The","year":"2016","journal-title":"J Am Soc Mass Spectrom"},{"key":"2025082519465327500_btaf450-B30","doi-asserted-by":"crossref","first-page":"2988","DOI":"10.1021\/acs.jproteome.5b00121","article-title":"Pia: an intuitive protein inference engine with a web-based user interface","volume":"14","author":"Uszkoreit","year":"2015","journal-title":"J Proteome Res"},{"key":"2025082519465327500_btaf450-B31","doi-asserted-by":"crossref","first-page":"1419","DOI":"10.1074\/mcp.R500012-MCP200","article-title":"Interpretation of shotgun proteomic data: the protein inference problem","volume":"4","author":"Nesvizhskii","year":"2005","journal-title":"Mol 
Cell Proteomics"},{"key":"2025082519465327500_btaf450-B32","doi-asserted-by":"crossref","first-page":"e8503","DOI":"10.15252\/msb.20188503","article-title":"A deep proteome and transcriptome abundance atlas of 29 healthy human tissues","volume":"15","author":"Wang","year":"2019","journal-title":"Mol Syst Biol"},{"key":"2025082519465327500_btaf450-B33","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/d41586-018-05462-w","article-title":"Expanded human gene tally reignites debate","volume":"558","author":"Willyard","year":"2018","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf450\/64014876\/btaf450.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/8\/btaf450\/64014876\/btaf450.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/8\/btaf450\/64014876\/btaf450.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,8,25]],"date-time":"2025-08-25T23:47:03Z","timestamp":1756165623000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf450\/8231069"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,8]]},"references-count":33,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,8,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf450","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,8]]},"publi
shed":{"date-parts":[[2025,8]]},"article-number":"btaf450"}}