{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T06:42:06Z","timestamp":1776062526933,"version":"3.50.1"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"15","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Background: In high-throughput experimental biology, it is widely acknowledged that while expression levels measured at the levels of transcriptome and the corresponding proteome do not, in general, correlate well, messenger RNA levels are used as convenient proxies for protein levels. Our interest is in developing data-driven computational models that can bridge the gap between these two levels of measurement at which different mechanisms of regulation may act on different molecular species causing any observed lack of correlations. To this end, we build data-driven predictors of protein levels using mRNA levels and known proxies of translation efficiencies as covariates. Previous work showed that in such a setting, outliers with respect to the model are reliable candidates for post-translational regulation.<\/jats:p><jats:p>Results: Here, we present and compare two novel formulations of deriving a protein concentration predictor from which outliers may be extracted in a systematic manner. The first approach, outlier rejecting regression, allows explicit specification of a certain fraction of the data as outliers. In a regression setting, this is a non-convex optimization problem which we solve by deriving a difference of convex functions algorithm (DCA). With post-translationally regulated proteins, one expects their concentrations to be affected primarily by disruption of protein stability. 
Our second algorithm exploits this observation by minimizing an asymmetric loss using quantile regression and extracts outlier proteins whose measured concentrations are lower than what a genome-wide regression would predict. We validate the two approaches on a dataset of yeast transcriptome and proteome. A functional annotation check on detected outliers demonstrates that the methods are able to identify post-translationally regulated genes with high statistical confidence.<\/jats:p><jats:p>Contact: \u00a0mn@ecs.soton.ac.uk<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv182","type":"journal-article","created":{"date-parts":[[2015,3,30]],"date-time":"2015-03-30T00:12:02Z","timestamp":1427674322000},"page":"2530-2536","source":"Crossref","is-referenced-by-count":15,"title":["Outlier detection at the transcriptome-proteome interface"],"prefix":"10.1093","volume":"31","author":[{"given":"Yawwani","family":"Gunawardana","sequence":"first","affiliation":[{"name":"1 School of Electronics and Computer Science, University of Southampton, Southampton, UK,"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuhei","family":"Fujiwara","sequence":"additional","affiliation":[{"name":"2 Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan and"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Akiko","family":"Takeda","sequence":"additional","affiliation":[{"name":"2 Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan and"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeongmin","family":"Woo","sequence":"additional","affiliation":[{"name":"3 Faculty of Medicine, Southampton General Hospital, University of Southampton, Southampton, 
UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christopher","family":"Woelk","sequence":"additional","affiliation":[{"name":"3 Faculty of Medicine, Southampton General Hospital, University of Southampton, Southampton, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mahesan","family":"Niranjan","sequence":"additional","affiliation":[{"name":"1 School of Electronics and Computer Science, University of Southampton, Southampton, UK,"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2015,3,29]]},"reference":[{"key":"2023051308495564400_btv182-B1","first-page":"33","article-title":"Scalable training of L1-regularized log-linear models","author":"Andrew","year":"2007"},{"key":"2023051308495564400_btv182-B2","doi-asserted-by":"crossref","first-page":"3889","DOI":"10.1073\/pnas.0635171100","article-title":"Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae","volume":"100","author":"Arava","year":"2003","journal-title":"Proc. Natl. Acad. Sci."},{"key":"2023051308495564400_btv182-B3","author":"Bache","year":"2013"},{"key":"2023051308495564400_btv182-B4","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1016\/S0021-9258(18)71184-8","article-title":"The enzymatic phosphorylation of proteins","volume":"211","author":"Burnett","year":"1954","journal-title":"J. Biol. Chem."},{"key":"2023051308495564400_btv182-B5","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1074\/mcp.M700052-MCP200","article-title":"Analysis of the arabidopsis cytosolic ribosome proteome provides detailed insights into its components and their post-translational modification","volume":"7","author":"Carroll","year":"2008","journal-title":"Mol. Cell. 
Proteomics"},{"key":"2023051308495564400_btv182-B6","doi-asserted-by":"crossref","first-page":"1305","DOI":"10.1002\/sim.4780111005","article-title":"Smoothing reference centile curves: the LMS method and penalized likelihood","volume":"11","author":"Cole","year":"1992","journal-title":"Stat. Med."},{"key":"2023051308495564400_btv182-B7","first-page":"129","article-title":"Trading convexity for scalability","author":"Collobert","year":"2006"},{"key":"2023051308495564400_btv182-B8","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1534\/genetics.109.101105","article-title":"Genomewide analysis reveals novel pathways affecting endoplasmic reticulum homeostasis, protein modification and quality control","volume":"182","author":"\u010copi\u010d","year":"2009","journal-title":"Genetics"},{"key":"2023051308495564400_btv182-B9","doi-asserted-by":"crossref","first-page":"7357","DOI":"10.1128\/MCB.19.11.7357","article-title":"A sampling of the yeast proteome","volume":"19","author":"Futcher","year":"1999","journal-title":"Mol. Cell. Biol."},{"key":"2023051308495564400_btv182-B10","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1186\/gb-2003-4-9-117","article-title":"Comparing protein abundance and mRNA expression levels on a genomic scale","volume":"4","author":"Greenbaum","year":"2003","journal-title":"Genome Biol."},{"key":"2023051308495564400_btv182-B11","doi-asserted-by":"crossref","first-page":"3060","DOI":"10.1093\/bioinformatics\/btt537","article-title":"Bridging the gap between transcriptome and proteome measurements identifies post-translationally regulated genes","volume":"29","author":"Gunawardana","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051308495564400_btv182-B12","doi-asserted-by":"crossref","first-page":"1720","DOI":"10.1128\/MCB.19.3.1720","article-title":"Correlation between protein and mRNA abundance in yeast","volume":"19","author":"Gygi","year":"1999","journal-title":"Mol. Cell. 
Biol."},{"key":"2023051308495564400_btv182-B13","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1080\/00401706.1984.10487956","article-title":"Location of several outliers in multiple-regression data using elemental sets","volume":"26","author":"Hawkins","year":"1984","journal-title":"Technometrics"},{"key":"2023051308495564400_btv182-B14","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1111\/1467-9876.00170","article-title":"Semiparametric estimation of regression quantiles with application to standardizing weight for height and age in US children","volume":"48","author":"Heagerty","year":"1999","journal-title":"J. R. Stat. Soc. Ser. C (Appl. Stat.)"},{"key":"2023051308495564400_btv182-B15","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1080\/01621459.1992.10475175","article-title":"Hierarchical spline models for conditional quantiles and the demand for electricity","volume":"87","author":"Hendricks","year":"1992","journal-title":"J. Am. Stat. Assoc."},{"key":"2023051308495564400_btv182-B16","doi-asserted-by":"crossref","first-page":"1269","DOI":"10.1093\/bioinformatics\/bti130","article-title":"NetAcet: prediction of N-terminal acetylation sites","volume":"21","author":"Kiemer","year":"2005","journal-title":"Bioinformatics"},{"key":"2023051308495564400_btv182-B17","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511754098","volume-title":"Quantile Regression","author":"Koenker","year":"2005"},{"key":"2023051308495564400_btv182-B18","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1198\/016214501753168172","article-title":"Reappraising medfly longevity: a quantile regression survival analysis","volume":"96","author":"Koenker","year":"2001","journal-title":"J. Am. Stat. 
Assoc."},{"key":"2023051308495564400_btv182-B19","doi-asserted-by":"crossref","first-page":"e34370","DOI":"10.1371\/journal.pone.0034370","article-title":"GPS-ARM: computational analysis of the APC\/C recognition motif by predicting D-boxes and KEN-boxes","volume":"7","author":"Liu","year":"2012","journal-title":"PloS One"},{"key":"2023051308495564400_btv182-B20","doi-asserted-by":"crossref","first-page":"3448","DOI":"10.1093\/bioinformatics\/bti551","article-title":"BiNGO: a cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks","volume":"21","author":"Maere","year":"2005","journal-title":"Bioinformatics"},{"key":"2023051308495564400_btv182-B21","doi-asserted-by":"crossref","DOI":"10.1093\/database\/bar009","article-title":"Uniprot knowledgebase: a hub of integrated protein data","author":"Magrane","year":"2011","journal-title":"Database"},{"key":"2023051308495564400_btv182-B22","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1016\/j.cell.2012.09.019","article-title":"Quantitative analysis of fission yeast transcriptomes and proteomes in proliferating and quiescent cells","volume":"151","author":"Marguerat","year":"2012","journal-title":"Cell"},{"key":"2023051308495564400_btv182-B23","first-page":"289","article-title":"Convex analysis approach to D.C. programming: theory, algorithms and applications","volume":"22","author":"Pham Dinh","year":"1997","journal-title":"Acta Math. 
Vietnamica"},{"key":"2023051308495564400_btv182-B24","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1016\/S0168-9525(00)02024-2","article-title":"EMBOSS: the European molecular biology open software suite","volume":"16","author":"Rice","year":"2000","journal-title":"Trends Genet."},{"key":"2023051308495564400_btv182-B25","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1186\/1752-0509-7-83","article-title":"Post-translational regulation enables robust p53 regulation","volume":"7","author":"Shin","year":"2013","journal-title":"BMC Syst. Biol."},{"key":"2023051308495564400_btv182-B26","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1111\/j.1432-0436.2005.00028.x","article-title":"Functional analysis of p53 tumor suppressor in yeast","volume":"73","author":"\u0160mardov\u00e1","year":"2005","journal-title":"Differentiation"},{"key":"2023051308495564400_btv182-B27","doi-asserted-by":"crossref","first-page":"D535","DOI":"10.1093\/nar\/gkj109","article-title":"BioGRID: a general repository for interaction datasets","volume":"34","author":"Stark","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023051308495564400_btv182-B28","doi-asserted-by":"crossref","first-page":"2129","DOI":"10.1101\/gr.772403","article-title":"PANTHER: a library of protein families and subfamilies indexed by function","volume":"13","author":"Thomas","year":"2003","journal-title":"Genome Res."},{"key":"2023051308495564400_btv182-B29","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1994","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023051308495564400_btv182-B30","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. 
Ser. B (Methodological)"},{"key":"2023051308495564400_btv182-B31","doi-asserted-by":"crossref","first-page":"e248","DOI":"10.1371\/journal.pcbi.0030248","article-title":"Determinants of protein abundance and translation efficiency in S.Cerevisiae","volume":"3","author":"Tuller","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023051308495564400_btv182-B32","doi-asserted-by":"crossref","first-page":"5483","DOI":"10.1073\/pnas.0501761102","article-title":"Functional genomic analysis of the rates of protein evolution","volume":"102","author":"Wall","year":"2005","journal-title":"Proc. Natl. Acad. Sci. U.S.A."},{"key":"2023051308495564400_btv182-B33","doi-asserted-by":"crossref","first-page":"W214","DOI":"10.1093\/nar\/gkq537","article-title":"The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function","volume":"38","author":"Warde-Farley","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023051308495564400_btv182-B34","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1016\/S0968-0004(99)01460-7","article-title":"The economics of ribosome biosynthesis in yeast","volume":"24","author":"Warner","year":"1999","journal-title":"Trends Biochem. Sci."},{"key":"2023051308495564400_btv182-B35","doi-asserted-by":"crossref","first-page":"974","DOI":"10.1198\/016214507000000617","article-title":"Robust truncated hinge loss support vector machines","volume":"102","author":"Wu","year":"2007","journal-title":"J. Am. Stat. 
Assoc."},{"key":"2023051308495564400_btv182-B36","first-page":"536","article-title":"Robust support vector machine training via convex outlier ablation","volume-title":"American Association for Artificial Intelligence (AAAI)","author":"Xu","year":"2006"},{"key":"2023051308495564400_btv182-B37","first-page":"2532","article-title":"Relaxed clipping: a global training method for robust regression and classification","volume-title":"Neural Information Processing Systems, Curran Associates, Inc","author":"Yang","year":"2010"},{"key":"2023051308495564400_btv182-B38","doi-asserted-by":"crossref","first-page":"W741","DOI":"10.1093\/nar\/gki475","article-title":"WebGestalt: an integrated system for exploring gene sets in various biological contexts","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/15\/2530\/50307057\/bioinformatics_31_15_2530.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/15\/2530\/50307057\/bioinformatics_31_15_2530.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,8]],"date-time":"2024-06-08T03:00:47Z","timestamp":1717815647000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/15\/2530\/188674"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,29]]},"references-count":38,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2015,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv182","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,8,1]]},
"published":{"date-parts":[[2015,3,29]]}}}