{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T22:35:04Z","timestamp":1761863704287,"version":"3.37.3"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2017,4,12]],"date-time":"2017-04-12T00:00:00Z","timestamp":1491955200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/501100004359","name":"Swedish Research Council","doi-asserted-by":"publisher","award":["VR-NT 2012-5046"],"award-info":[{"award-number":["VR-NT 2012-5046"]}],"id":[{"id":"10.13039\/501100004359","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Knowledge of the correct protein subcellular localization is necessary for understanding the function of a protein. Unfortunately large-scale experimental studies are limited in their accuracy. Therefore, the development of prediction methods has been limited by the amount of accurate experimental data. However, recently large-scale experimental studies have provided new data that can be used to evaluate the accuracy of subcellular predictions in human cells. Using this data we examined the performance of state of the art methods and developed SubCons, an ensemble method that combines four predictors using a Random Forest classifier.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>SubCons outperforms earlier methods in a dataset of proteins where two independent methods confirm the subcellular localization. Given nine subcellular localizations, SubCons achieves an F1-Score of 0.79 compared to 0.70 of the second best method. Furthermore, at a FPR of 1% the true positive rate (TPR) is over 58% for SubCons compared to less than 50% for the best individual predictor.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>SubCons is freely available as a webserver (http:\/\/subcons.bioinfo.se) and source code from https:\/\/bitbucket.org\/salvatore_marco\/subcons-web-server. The golden dataset as well is available from http:\/\/subcons.bioinfo.se\/pred\/download.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx219","type":"journal-article","created":{"date-parts":[[2017,4,11]],"date-time":"2017-04-11T12:06:01Z","timestamp":1491912361000},"page":"2464-2470","source":"Crossref","is-referenced-by-count":27,"title":["SubCons: a new ensemble method for improved human subcellular localization predictions"],"prefix":"10.1093","volume":"33","author":[{"given":"M","family":"Salvatore","sequence":"first","affiliation":[{"name":"Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden"}]},{"given":"P","family":"Warholm","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden"}]},{"given":"N","family":"Shu","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden"},{"name":"Sweden Bioinformatics Infrastructure for Life Sciences (BILS), Stockholm University, Solna, Stockholm, Sweden"}]},{"given":"W","family":"Basile","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden"}]},{"given":"A","family":"Elofsson","sequence":"additional","affiliation":[{"name":"Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Solna, Sweden"}]}],"member":"286","published-online":{"date-parts":[[2017,4,12]]},"reference":[{"key":"2023020206250383400_btx219-B1","first-page":"113","article-title":"Reducing multiclass to binary: a unifying approach for margin classifiers","volume":"1","author":"Allwein","year":"2000","journal-title":"J. Mach. Learn. Res"},{"key":"2023020206250383400_btx219-B2","doi-asserted-by":"crossref","first-page":"W410","DOI":"10.1093\/nar\/gkw348","article-title":"The mpi bioinformatics toolkit as an integrative platform for advanced protein sequence and structure analysis","volume":"44","author":"Alva","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023020206250383400_btx219-B3","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1093\/bioinformatics\/16.5.412","article-title":"Assessing the accuracy of prediction algorithms for classification: an overview","volume":"16","author":"Baldi","year":"2000","journal-title":"Bioinformatics"},{"key":"2023020206250383400_btx219-B4","doi-asserted-by":"crossref","first-page":"1039","DOI":"10.1111\/tra.12310","article-title":"Mechanism regulating protein localization","volume":"16","author":"Bauer","year":"2015","journal-title":"Traffic"},{"key":"2023020206250383400_btx219-B5","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1186\/1471-2105-10-274","article-title":"Multiloc2: integrating phylogeny and gene ontology terms improves subcellular protein localization prediction","volume":"10","author":"Blum","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023020206250383400_btx219-B6","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.jprot.2013.02.019","article-title":"The effect of organelle discovery upon sub-cellular protein localisation","volume":"88","author":"Breckels","year":"2013","journal-title":"J. Proteomics"},{"key":"2023020206250383400_btx219-B7","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023020206250383400_btx219-B8","doi-asserted-by":"crossref","first-page":"5363","DOI":"10.1021\/pr900665y","article-title":"Sherloc2: a high-accuracy hybrid method for predicting subcellular localization of proteins","volume":"8","author":"Briesemeister","year":"2009","journal-title":"J. Proteome Res"},{"key":"2023020206250383400_btx219-B9","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1093\/nar\/gkq477","article-title":"Yloc-an interpretable web server for predicting subcellular localization","volume":"38","author":"Briesemeister","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023020206250383400_btx219-B10","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1007\/978-1-4939-0685-7_10","article-title":"Determining protein subcellular localization in mammalian cell culture with biochemical fractionation and itraq 8-plex quantification","volume":"1156","author":"Christoforou","year":"2014","journal-title":"Shotgun Proteomics Methods Protoc. Method Mol. Biol"},{"key":"2023020206250383400_btx219-B11","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1093\/embo-reports\/kvd092","article-title":"Finding nuclear localization signals","volume":"1","author":"Cokol","year":"2000","journal-title":"EMBO Rep"},{"key":"2023020206250383400_btx219-B12","doi-asserted-by":"crossref","first-page":"953","DOI":"10.1038\/nprot.2007.131","article-title":"Locating proteins in the cell using targetp, signalp and related tools","volume":"2","author":"Emanuelsson","year":"2007","journal-title":"Nat. Protoc"},{"key":"2023020206250383400_btx219-B13","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/S0022-2836(03)00553-9","article-title":"In silico prediction of the peroxisomal proteome in fungi, plants and animals","volume":"330","author":"Emanuelsson","year":"2003","journal-title":"J. Mol. Biol"},{"key":"2023020206250383400_btx219-B14","doi-asserted-by":"crossref","first-page":"1005","DOI":"10.1006\/jmbi.2000.3903","article-title":"Predicting subcellular localization of proteins based on their n-terminal amino acid sequence","volume":"30","author":"Emanuelsson","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023020206250383400_btx219-B15","doi-asserted-by":"crossref","first-page":"3766","DOI":"10.1021\/pr200379a","article-title":"Mapping the subcellular protein distribution in three human cell lines","volume":"10","author":"Fagerberg","year":"2011","journal-title":"J. Proteome Res"},{"key":"2023020206250383400_btx219-B16","doi-asserted-by":"crossref","first-page":"550","DOI":"10.2174\/138920209789503941","article-title":"Mechanisms and signals for the nuclear import of proteins","volume":"10","author":"Freitas","year":"2009","journal-title":"Curr. Genomics"},{"key":"2023020206250383400_btx219-B17","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1093\/bioinformatics\/bts390","article-title":"Loctree2 predicts localization for all domains of life","volume":"28","author":"Goldberg","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020206250383400_btx219-B18","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1093\/nar\/gkm259","article-title":"Wolfpsort: protein localization predictor","volume":"35","author":"Horton","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020206250383400_btx219-B19","doi-asserted-by":"crossref","first-page":"3970","DOI":"10.1002\/pmic.201000274","article-title":"Prediction of subcellular locations of proteins: Where to proceed?","volume":"10","author":"Imai","year":"2010","journal-title":"Proteomics"},{"key":"2023020206250383400_btx219-B20","doi-asserted-by":"crossref","first-page":"1236","DOI":"10.1093\/bioinformatics\/btu031","article-title":"Interproscan 5: genome-scale protein function classification","volume":"30","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020206250383400_btx219-B21","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1038\/nbt0908-1011","article-title":"What are decision trees?","volume":"26","author":"Kingsford","year":"2008","journal-title":"Nat. Biotechnol"},{"key":"2023020206250383400_btx219-B22","first-page":"5101","article-title":"Classical nuclear localization signals: Definition, function, and interaction with importin \u03b1","volume":"8","author":"Lande","year":"2007","journal-title":"J. Biol. Chem"},{"key":"2023020206250383400_btx219-B23","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","article-title":"Comparison of the predicted and observed secondary structure of T4 phage lysozyme","volume":"405","author":"Matthews","year":"1975","journal-title":"Biochim. Biophys. Acta"},{"key":"2023020206250383400_btx219-B24","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1016\/S0065-3233(00)54009-1","article-title":"Protein sorting signals and prediction of subcellular localization","volume":"54","author":"Nakai","year":"2000","journal-title":"Adv. Protein Chem"},{"key":"2023020206250383400_btx219-B25","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1002\/prot.340110203","article-title":"Expert system for predicting protein localization sites in gram-negative bacteria","volume":"11","author":"Nakai","year":"1991","journal-title":"Proteins"},{"volume-title":"Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms, Volume 10 of Current Topics in Microbiology and Immunology","year":"2015","author":"Nielsen","key":"2023020206250383400_btx219-B26"},{"key":"2023020206250383400_btx219-B27","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res"},{"key":"2023020206250383400_btx219-B28","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1038\/nmeth.1701","article-title":"Signalp 4.0: discriminating signal peptides from transmembrane regions","volume":"8","author":"Petersen","year":"2011","journal-title":"Nat. Methods"},{"key":"2023020206250383400_btx219-B29","doi-asserted-by":"crossref","first-page":"2973","DOI":"10.1093\/bioinformatics\/btu411","article-title":"Tppred2: improving the prediction of mitochondrial targeting peptide cleavage sites by exploiting sequence motifs","volume":"30","author":"Savojardo","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020206250383400_btx219-B30","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1016\/j.ygeno.2003.10.006","article-title":"TAFA: a novel secreted family with conserved cysteine residues and restricted expression in the brain","volume":"83","author":"Tom Tang","year":"2004","journal-title":"Genomics"},{"key":"2023020206250383400_btx219-B31","doi-asserted-by":"crossref","first-page":"1248","DOI":"10.1038\/nbt1210-1248","article-title":"Towards a knowledge-based human protein atlas","volume":"28","author":"Uhlen","year":"2010","journal-title":"Nat. Biotechnol"},{"key":"2023020206250383400_btx219-B32","doi-asserted-by":"crossref","first-page":"D204","DOI":"10.1093\/nar\/gku989","article-title":"Uniprot: a hub for protein information","volume":"43","author":"UniProt-Consortium","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020206250383400_btx219-B33","volume-title":"Information Retrieval","author":"Van Rijsbergen","year":"1979","edition":"2nd ed,"},{"key":"2023020206250383400_btx219-B34","doi-asserted-by":"crossref","first-page":"4683","DOI":"10.1093\/nar\/14.11.4683","article-title":"A new method for predicting signal sequence cleavage sites","volume":"14","author":"von Heijne","year":"1986","journal-title":"Nucleic Acids Res"},{"key":"2023020206250383400_btx219-B35","doi-asserted-by":"crossref","first-page":"735","DOI":"10.1053\/jhep.2003.50340","article-title":"A novel liver-specific zona pellucida domain containing protein that is expressed rarely in hepatocellular carcinoma","volume":"38","author":"Xu","year":"2003","journal-title":"Hepatology"},{"key":"2023020206250383400_btx219-B36","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1002\/prot.21018","article-title":"Prediction of protein subcellular localization","volume":"64","author":"Yu","year":"2006","journal-title":"Proteins Struct. Funct. Bioinf"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/16\/2464\/49041061\/bioinformatics_33_16_2464.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/16\/2464\/49041061\/bioinformatics_33_16_2464.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T06:26:53Z","timestamp":1675319213000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/16\/2464\/3603546"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,4,12]]},"references-count":36,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2017,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx219","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,8,15]]},"published":{"date-parts":[[2017,4,12]]}}}