{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T19:49:21Z","timestamp":1776109761522,"version":"3.50.1"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"W1","license":[{"start":{"date-parts":[[2022,4,30]],"date-time":"2022-04-30T00:00:00Z","timestamp":1651276800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100009708","name":"Novo Nordisk Fonden","doi-asserted-by":"publisher","award":["NNF20OC0062606"],"award-info":[{"award-number":["NNF20OC0062606"]}],"id":[{"id":"10.13039\/501100009708","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001732","name":"Danish National Research Foundation","doi-asserted-by":"publisher","award":["P1"],"award-info":[{"award-number":["P1"]}],"id":[{"id":"10.13039\/501100001732","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The prediction of protein subcellular localization is of great relevance for proteomics research. Here, we propose an update to the popular tool DeepLoc with multi-localization prediction and improvements in both performance and interpretability. For training and validation, we curate eukaryotic and human multi-location protein datasets with stringent homology partitioning and enriched with sorting signal information compiled from the literature. We achieve state-of-the-art performance in DeepLoc 2.0 by using a pre-trained protein language model. It has the further advantage that it uses sequence input rather than relying on slower protein profiles. 
We provide two means of better interpretability: an attention output along the sequence and highly accurate prediction of nine different types of protein sorting signals. We find that the attention output correlates well with the position of sorting signals. The webserver is available at services.healthtech.dtu.dk\/service.php?DeepLoc-2.0.<\/jats:p>","DOI":"10.1093\/nar\/gkac278","type":"journal-article","created":{"date-parts":[[2022,4,19]],"date-time":"2022-04-19T19:22:57Z","timestamp":1650396177000},"page":"W228-W234","source":"Crossref","is-referenced-by-count":637,"title":["DeepLoc 2.0: multi-label subcellular localization prediction using protein language models"],"prefix":"10.1093","volume":"50","author":[{"given":"Vineet","family":"Thumuluri","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Madras , Chennai 600036, India"}]},{"given":"Jos\u00e9 Juan","family":"Almagro\u00a0Armenteros","sequence":"additional","affiliation":[{"name":"Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen , Copenhagen 2200, Denmark"},{"name":"Department of Genetics, Stanford University School of Medicine , Stanford 94305, CA, USA"}]},{"given":"Alexander\u00a0Rosenberg","family":"Johansen","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Stanford University , Stanford 94305, CA, USA"},{"name":"Department of Genetics, Stanford University School of Medicine , Stanford 94305, CA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9412-9643","authenticated-orcid":false,"given":"Henrik","family":"Nielsen","sequence":"additional","affiliation":[{"name":"Section\u00a0for Bioinformatics, Department of Health Technology, Technical University of Denmark , Kongens Lyngby 2800, Denmark"}]},{"given":"Ole","family":"Winther","sequence":"additional","affiliation":[{"name":"Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital) , Copenhagen 2100, 
Denmark"},{"name":"Department of Biology, Bioinformatics Centre, University of Copenhagen , Copenhagen 2200, Denmark"},{"name":"Section\u00a0for Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark , Kongens Lyngby 2800, Denmark"}]}],"member":"286","published-online":{"date-parts":[[2022,4,30]]},"reference":[{"key":"2022070423590083400_B1","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1038\/nrd2897","article-title":"Subcellular targeting strategies for drug design and delivery","volume":"9","author":"Rajendran","year":"2010","journal-title":"Nat. Rev. Drug Discov."},{"key":"2022070423590083400_B2","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1016\/j.atherosclerosis.2015.11.027","article-title":"Protein sorting gone wrong \u2013 VPS10P domain receptors in cardiovascular and metabolic diseases","volume":"245","author":"Schmidt","year":"2016","journal-title":"Atherosclerosis"},{"key":"2022070423590083400_B3","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1146\/annurev-cellbio-100913-013012","article-title":"Protein sorting at the trans-Golgi network","volume":"30","author":"Guo","year":"2014","journal-title":"Ann. Rev. Cell Dev. Biol."},{"key":"2022070423590083400_B4","doi-asserted-by":"crossref","first-page":"26947","DOI":"10.1074\/jbc.M101870200","article-title":"Multiple mechanisms regulate subcellular localization of human CDC6","volume":"276","author":"Delmolino","year":"2001","journal-title":"J. Biol. 
Chem."},{"key":"2022070423590083400_B5","doi-asserted-by":"crossref","first-page":"1625","DOI":"10.1105\/tpc.109.066019","article-title":"Exploring the function-location nexus: using multiple lines of evidence in defining the subcellular location of plant proteins","volume":"21","author":"Millar","year":"2009","journal-title":"Plant Cell"},{"key":"2022070423590083400_B6","doi-asserted-by":"crossref","first-page":"13","DOI":"10.3389\/fcell.2018.00013","article-title":"Subcellular localization and dynamics of the Bcl-2 family of proteins","volume":"6","author":"Popgeorgiev","year":"2018","journal-title":"Front. Cell Dev. Biol."},{"key":"2022070423590083400_B7","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1083\/jcb1703fta1","article-title":"Lost in translation","volume":"170","author":"Leslie","year":"2005","journal-title":"J. Cell Biol."},{"key":"2022070423590083400_B8","doi-asserted-by":"crossref","first-page":"7920","DOI":"10.1074\/jbc.M207462200","article-title":"Co-translational targeting and translocation of the amino terminus of Opsin across the endoplasmic membrane requires GTP but Not ATP","volume":"278","author":"Kanner","year":"2003","journal-title":"J. Biol. 
Chem."},{"key":"2022070423590083400_B9","doi-asserted-by":"crossref","first-page":"e71112","DOI":"10.1371\/journal.pone.0071112","article-title":"The first transmembrane domain of lipid phosphatase SAC1 promotes Golgi localization","volume":"8","author":"Wang","year":"2013","journal-title":"PLoS ONE"},{"key":"2022070423590083400_B10","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1007\/s10930-019-09838-3","article-title":"A brief history of protein sorting prediction","volume":"38","author":"Nielsen","year":"2019","journal-title":"Protein J."},{"key":"2022070423590083400_B11","doi-asserted-by":"crossref","first-page":"1232","DOI":"10.1093\/bioinformatics\/btq115","article-title":"Going from where to why\u2014interpretable prediction of protein subcellular localization","volume":"26","author":"Briesemeister","year":"2010","journal-title":"Bioinformatics"},{"key":"2022070423590083400_B12","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1093\/bioinformatics\/btw717","article-title":"FUEL-mLoc: feature-unified prediction and explanation of multi-localization of cellular proteins in multiple organisms","volume":"33","author":"Wan","year":"2016","journal-title":"Bioinformatics"},{"key":"2022070423590083400_B13","doi-asserted-by":"crossref","first-page":"3387","DOI":"10.1093\/bioinformatics\/btx431","article-title":"DeepLoc: prediction of protein subcellular localization using deep learning","volume":"33","author":"Almagro\u00a0Armenteros","year":"2017","journal-title":"Bioinformatics"},{"key":"2022070423590083400_B14","doi-asserted-by":"crossref","first-page":"vbab035","DOI":"10.1093\/bioadv\/vbab035","article-title":"Light attention predicts protein location from the language of life","volume":"1","author":"St\u00e4rk","year":"2021","journal-title":"Bioinform. 
Adv."},{"key":"2022070423590083400_B15","first-page":"D158","article-title":"UniProt: the universal protein knowledgebase","volume":"45","author":"The\u00a0UniProt","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"2022070423590083400_B16","doi-asserted-by":"crossref","first-page":"eaal3321","DOI":"10.1126\/science.aal3321","article-title":"A subcellular map of the human proteome","volume":"356","author":"Thul","year":"2017","journal-title":"Science"},{"key":"2022070423590083400_B17","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1186\/s12859-016-0940-x","article-title":"Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins","volume":"17","author":"Wan","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2022070423590083400_B18","first-page":"5998","article-title":"Attention Is All You Need","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani","year":"2017"},{"key":"2022070423590083400_B19","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"Suzek","year":"2014","journal-title":"Bioinformatics"},{"key":"2022070423590083400_B20","first-page":"4171","article-title":"BERT: pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Devlin","year":"2019"},{"key":"2022070423590083400_B21","doi-asserted-by":"crossref","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological Structure and Function Emerge from Scaling Unsupervised Learning to 250 Million Protein Sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc. Natl. Acad. 
Sci."},{"key":"2022070423590083400_B22","first-page":"8844","article-title":"MSA Transformer","volume-title":"Proceedings of the 38th International Conference on Machine Learning, PMLR","author":"Rao","year":"2021"},{"key":"2022070423590083400_B23","doi-asserted-by":"crossref","DOI":"10.1101\/2020.12.15.422761","article-title":"Transformer protein language models are unsupervised structure learners","author":"Rao","year":"2020"},{"key":"2022070423590083400_B24","doi-asserted-by":"crossref","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"ProtTrans: towards cracking the language of life's code through self-supervised deep learning and high performance computing","volume-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence","author":"Elnaggar","year":"2021"},{"key":"2022070423590083400_B25","article-title":"BERTology meets biology: interpreting attention in protein language models","author":"Vig","year":"2021"},{"key":"2022070423590083400_B26","doi-asserted-by":"crossref","first-page":"2102","DOI":"10.1093\/bioinformatics\/btac020","article-title":"ProteinBERT: a universal deep-learning model of protein sequence and function","volume":"38","author":"Brandes","year":"2022","journal-title":"Bioinformatics"},{"key":"2022070423590083400_B27","doi-asserted-by":"crossref","first-page":"107596","DOI":"10.1016\/j.compbiolchem.2021.107596","article-title":"Deep protein representations enable recombinant protein expression prediction","volume":"95","author":"Martiny","year":"2021","journal-title":"Comput. Biol. 
Chem."},{"key":"2022070423590083400_B28","article-title":"Neural machine translation by jointly learning to align and translate","volume-title":"3rd International Conference on Learning Representations","author":"Bahdanau","year":"2015"},{"key":"2022070423590083400_B29","doi-asserted-by":"crossref","first-page":"2999","DOI":"10.1109\/ICCV.2017.324","article-title":"Focal loss for dense object detection","volume-title":"2017 IEEE International Conference on Computer Vision (ICCV)","author":"Lin","year":"2017"},{"key":"2022070423590083400_B30","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1186\/s12864-019-6413-7","article-title":"The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation","volume":"21","author":"Chicco","year":"2020","journal-title":"BMC Genomics"},{"key":"2022070423590083400_B31","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1186\/1471-2105-13-290","article-title":"mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines","volume":"13","author":"Wan","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2022070423590083400_B32","doi-asserted-by":"crossref","DOI":"10.1038\/s41587-021-01156-3","article-title":"SignalP 6.0 predicts all five types of signal peptides using protein language models","author":"Teufel","year":"2022","journal-title":"Nat. Biotechnol."},{"key":"2022070423590083400_B33","doi-asserted-by":"crossref","first-page":"e201900429","DOI":"10.26508\/lsa.201900429","article-title":"Detecting sequence signals in targeting peptides using deep learning","volume":"2","author":"Almagro\u00a0Armenteros","year":"2019","journal-title":"Life Sci. 
Allian."},{"key":"2022070423590083400_B34","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1016\/j.crbiot.2021.01.001","article-title":"Prediction of GPI-anchored proteins with pointer neural networks","volume":"3","author":"G\u00edslason","year":"2021","journal-title":"Curr. Res. Biotechnol."}],"container-title":["Nucleic Acids Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/nar\/article-pdf\/50\/W1\/W228\/44378499\/gkac278.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/nar\/article-pdf\/50\/W1\/W228\/44378499\/gkac278.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T17:07:08Z","timestamp":1675357628000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/nar\/article\/50\/W1\/W228\/6576357"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,30]]},"references-count":34,"journal-issue":{"issue":"W1","published-online":{"date-parts":[[2022,4,30]]},"published-print":{"date-parts":[[2022,7,5]]}},"URL":"https:\/\/doi.org\/10.1093\/nar\/gkac278","relation":{},"ISSN":["0305-1048","1362-4962"],"issn-type":[{"value":"0305-1048","type":"print"},{"value":"1362-4962","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,7,5]]},"published":{"date-parts":[[2022,4,30]]}}}