{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T19:59:08Z","timestamp":1771876748438,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1013925","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T00:00:00Z","timestamp":1771804800000}}],"reference-count":39,"publisher":"Public Library of Science (PLoS)","issue":"2","license":[{"start":{"date-parts":[[2026,2,9]],"date-time":"2026-02-09T00:00:00Z","timestamp":1770595200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Protein language models (PLMs) pretrained via a masked language modeling objective have proven effective across a range of structure-related tasks, including high-resolution structure prediction. However, it remains unclear to what extent these models factorize protein structural categories among their learned parameters. In this work, we introduce trainable subnetworks, which mask out the PLM weights responsible for language modeling performance on a structural category of proteins. We systematically trained 39 PLM subnetworks targeting both sequence- and residue-level features at varying degrees of resolution using annotations defined by the CATH taxonomy and secondary structure elements. Using these PLM subnetworks, we assessed how structural factorization in PLMs influences downstream structure prediction. Our results show that PLMs are highly sensitive to sequence-level features and can predominantly disentangle extremely coarse or fine-grained information. Furthermore, we observe that structure prediction is highly responsive to factorized PLM representations and that small changes in language modeling performance can significantly impair PLM-based structure prediction capabilities. Our work presents a framework for studying feature entanglement within pretrained PLMs and can be leveraged to improve the alignment of learned PLM representations with known biological concepts.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1013925","type":"journal-article","created":{"date-parts":[[2026,2,9]],"date-time":"2026-02-09T18:57:14Z","timestamp":1770663434000},"page":"e1013925","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":0,"title":["Trainable subnetworks reveal insights into structure knowledge organization in protein language models"],"prefix":"10.1371","volume":"22","author":[{"given":"Ria","family":"Vinod","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ava P.","family":"Amini","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lorin","family":"Crawford","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9045-6826","authenticated-orcid":true,"given":"Kevin K.","family":"Yang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2026,2,9]]},"reference":[{"issue":"15","key":"pcbi.1013925.ref001","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"A Rives","year":"2021","journal-title":"Proc Natl Acad Sci U S A."},{"key":"pcbi.1013925.ref002","doi-asserted-by":"crossref","unstructured":"Elnaggar A, Heinzinger M, Dallago C, Rehawi G, Wang Y, Jones L, et al. ProtTrans: towards cracking the language of life\u2019s code through self-supervised learning. openRxiv. 2020.https:\/\/doi.org\/10.1101\/2020.07.12.199554","DOI":"10.1101\/2020.07.12.199554"},{"key":"pcbi.1013925.ref003","doi-asserted-by":"crossref","unstructured":"Rao R, Liu J, Verkuil R, Meier J, Canny JF, Abbeel P, et al. MSA transformer. openRxiv. 2021. https:\/\/doi.org\/10.1101\/2021.02.12.430858","DOI":"10.1101\/2021.02.12.430858"},{"issue":"1","key":"pcbi.1013925.ref004","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1186\/s12859-019-3220-8","article-title":"Modeling aspects of the language of life through transfer-learning protein sequences","volume":"20","author":"M Heinzinger","year":"2019","journal-title":"BMC Bioinformatics."},{"key":"pcbi.1013925.ref005","doi-asserted-by":"crossref","unstructured":"Wu R, Ding F, Wang R, Shen R, Zhang X, Luo S, et al. High-resolution de novo structure prediction from primary sequence. openRxiv. 2022. https:\/\/doi.org\/10.1101\/2022.07.21.500999","DOI":"10.1101\/2022.07.21.500999"},{"key":"pcbi.1013925.ref006","doi-asserted-by":"crossref","unstructured":"Chowdhury R, Bouatta N, Biswas S, Rochereau C, Church GM, Sorger PK, et al. Single-sequence protein structure prediction using language models from deep learning. openRxiv. 2021. https:\/\/doi.org\/10.1101\/2021.08.02.454840","DOI":"10.1101\/2021.08.02.454840"},{"issue":"6637","key":"pcbi.1013925.ref007","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Z Lin","year":"2023","journal-title":"Science."},{"key":"pcbi.1013925.ref008","doi-asserted-by":"crossref","unstructured":"Rao R, Bhattacharya N, Thomas N, Duan Y, Chen X, Canny J. Evaluating protein transfer learning with TAPE. arXiv preprint. 2019. http:\/\/arxiv.org\/abs\/1906.08230","DOI":"10.1101\/676825"},{"issue":"5","key":"pcbi.1013925.ref009","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btaf170","article-title":"ProtNote: a multimodal method for protein-function annotation","volume":"41","author":"S Char","year":"2025","journal-title":"Bioinformatics."},{"key":"pcbi.1013925.ref010","doi-asserted-by":"crossref","unstructured":"Meier J, Rao R, Verkuil R, Liu J, Sercu T, Rives A. Language models enable zero-shot prediction of the effects of mutations on protein function. openRxiv. 2021. http:\/\/dx.doi.org\/10.1101\/2021.07.09.450648","DOI":"10.1101\/2021.07.09.450648"},{"key":"pcbi.1013925.ref011","doi-asserted-by":"crossref","unstructured":"Nijkamp E, Ruffolo J, Weinstein EN, Naik N, Madani A. ProGen2: exploring the boundaries of protein language models. arXiv preprint 2022. https:\/\/doi.org\/10.48550\/arXiv.2206.13517","DOI":"10.1016\/j.cels.2023.10.002"},{"key":"pcbi.1013925.ref012","doi-asserted-by":"crossref","unstructured":"Alamdari S, Thakkar N, van den Berg R, Tenenholtz N, Strome R, Moses AM, et al. Protein generation with evolutionary diffusion: sequence is all you need. openRxiv. 2023. https:\/\/doi.org\/10.1101\/2023.09.11.556673","DOI":"10.1101\/2023.09.11.556673"},{"key":"pcbi.1013925.ref013","doi-asserted-by":"crossref","unstructured":"Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, Lin Z, et al. Simulating 500 million years of evolution with a language model. openRxiv. 2024. https:\/\/doi.org\/10.1101\/2024.07.01.600583","DOI":"10.1101\/2024.07.01.600583"},{"issue":"3","key":"pcbi.1013925.ref014","article-title":"Convolutions are competitive with transformers for protein sequence pretraining","volume":"15","author":"KK Yang","year":"2024","journal-title":"Cell Syst."},{"issue":"12","key":"pcbi.1013925.ref015","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1038\/s41592-019-0598-1","article-title":"Unified rational protein engineering with sequence-based deep representation learning","volume":"16","author":"EC Alley","year":"2019","journal-title":"Nat Methods."},{"key":"pcbi.1013925.ref016","doi-asserted-by":"crossref","unstructured":"Chen B, Cheng X, Li P, Geng Y, Gong J, Li S, et al. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv preprint 2024. http:\/\/arxiv.org\/abs\/2401.06199","DOI":"10.1101\/2023.07.05.547496"},{"key":"pcbi.1013925.ref017","doi-asserted-by":"crossref","unstructured":"Li F-Z, Amini AP, Yue Y, Yang KK, Lu AX. Feature reuse and scaling: understanding transfer learning with protein language models. openRxiv. 2024.https:\/\/doi.org\/10.1101\/2024.02.05.578959","DOI":"10.1101\/2024.02.05.578959"},{"key":"pcbi.1013925.ref018","doi-asserted-by":"crossref","unstructured":"Vig J, Madani A, Varshney LR, Xiong C, Socher R, Rajani NF. BERTology meets biology: interpreting attention in protein language models. arXiv preprint 2021. http:\/\/arxiv.org\/abs\/2006.15222","DOI":"10.1101\/2020.06.26.174417"},{"issue":"45","key":"pcbi.1013925.ref019","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2406285121","article-title":"Protein language models learn evolutionary statistics of interacting sequence motifs","volume":"121","author":"Z Zhang","year":"2024","journal-title":"Proc Natl Acad Sci U S A."},{"key":"pcbi.1013925.ref020","doi-asserted-by":"crossref","unstructured":"Simon E, Zou J. InterPLM: discovering interpretable features in protein language models via sparse autoencoders. openRxiv. 2024. https:\/\/doi.org\/10.1101\/2024.11.14.623630","DOI":"10.1101\/2024.11.14.623630"},{"key":"pcbi.1013925.ref021","article-title":"From mechanistic interpretability to mechanistic biology: training, evaluating, and interpreting sparse autoencoders on protein language models","author":"E Adams","year":"2025","journal-title":"bioRxiv."},{"key":"pcbi.1013925.ref022","doi-asserted-by":"crossref","unstructured":"Bayazit D, Foroutan N, Chen Z, Weiss G, Bosselut A. Discovering knowledge-critical subnetworks in pretrained language models. In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. p. 6549\u201383. https:\/\/doi.org\/10.18653\/v1\/2024.emnlp-main.376","DOI":"10.18653\/v1\/2024.emnlp-main.376"},{"key":"pcbi.1013925.ref023","doi-asserted-by":"crossref","unstructured":"Cao B, Lin H, Han X, Sun L, Yan L, Liao M, et al. Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021. https:\/\/doi.org\/10.18653\/v1\/2021.acl-long.146","DOI":"10.18653\/v1\/2021.acl-long.146"},{"key":"pcbi.1013925.ref024","unstructured":"Sanh V, Wolf T, Rush AM. Movement pruning: adaptive sparsity by fine-tuning. arXiv preprint 2005. http:\/\/arxiv.org\/abs\/2005.07683"},{"key":"pcbi.1013925.ref025","doi-asserted-by":"crossref","unstructured":"Mallya A, Davis D, Lazebnik S. Piggyback: adapting a single network to multiple tasks by learning to mask weights. arXiv preprint 2018. http:\/\/arxiv.org\/abs\/1801.06519","DOI":"10.1007\/978-3-030-01225-0_5"},{"key":"pcbi.1013925.ref026","unstructured":"Csord\u00e1s R, Steenkiste S v, Schmidhuber J. Are neural nets modular? Inspecting functional modularity through differentiable weight masks. arXiv preprint 2021. http:\/\/arxiv.org\/abs\/2010.02066"},{"key":"pcbi.1013925.ref027","doi-asserted-by":"crossref","unstructured":"Zhang X, van de Meent J-W, Wallace B. Disentangling representations of text by masking transformers. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. p. 778\u201391. https:\/\/doi.org\/10.18653\/v1\/2021.emnlp-main.60","DOI":"10.18653\/v1\/2021.emnlp-main.60"},{"issue":"12","key":"pcbi.1013925.ref028","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"W Kabsch","year":"1983","journal-title":"Biopolymers."},{"key":"pcbi.1013925.ref029","doi-asserted-by":"crossref","unstructured":"Cao S, Sanh V, Rush AM. Low-complexity probing via finding subnetworks. arXiv preprint 2021. http:\/\/arxiv.org\/abs\/2104.03514","DOI":"10.18653\/v1\/2021.naacl-main.74"},{"key":"pcbi.1013925.ref030","unstructured":"Bengio Y, L\u00e9onard N, Courville A. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint 2013. http:\/\/arxiv.org\/abs\/1308.3432"},{"key":"pcbi.1013925.ref031","doi-asserted-by":"crossref","unstructured":"Yang KK, Alamdari S, Lee AJ, Kaymak-Loveless K, Char S, Brixi G, et al. The Dayhoff Atlas: scaling sequence diversity for improved protein generation. openRxiv. 2025. http:\/\/dx.doi.org\/10.1101\/2025.07.21.665991","DOI":"10.1101\/2025.07.21.665991"},{"issue":"8","key":"pcbi.1013925.ref032","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","article-title":"CATH\u2013a hierarchic classification of protein domain structures","volume":"5","author":"CA Orengo","year":"1997","journal-title":"Structure."},{"key":"pcbi.1013925.ref033","doi-asserted-by":"crossref","unstructured":"Ahdritz G, Bouatta N, Floristean C, Kadyan S, Xia Q, Gerecke W, et al. OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. openRxiv. 2022. https:\/\/doi.org\/10.1101\/2022.11.20.517210","DOI":"10.1101\/2022.11.20.517210"},{"key":"pcbi.1013925.ref034","unstructured":"Bepler T, Berger B. Learning protein sequence embeddings using information from structure. CoRR. 2019. https:\/\/doi.org\/abs\/1902.08661"},{"issue":"7873","key":"pcbi.1013925.ref035","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"J Jumper","year":"2021","journal-title":"Nature."},{"key":"pcbi.1013925.ref036","unstructured":"Ismail AA, Oikarinen T, Wang A, Adebayo J, Stanton S, Joren T. Concept bottleneck language models for protein design. arXiv preprint 2024. http:\/\/arxiv.org\/abs\/2411.06090"},{"key":"pcbi.1013925.ref037","doi-asserted-by":"crossref","unstructured":"Tenney I, Das D, Pavlick E. BERT rediscovers the classical NLP pipeline. CoRR. 2019. https:\/\/doi.org\/abs\/1905.05950","DOI":"10.18653\/v1\/P19-1452"},{"key":"pcbi.1013925.ref038","doi-asserted-by":"crossref","unstructured":"Liu NF, Gardner M, Belinkov Y, Peters ME, Smith NA. Linguistic knowledge and transferability of contextual representations. CoRR. 2019. https:\/\/doi.org\/abs\/1903.08855","DOI":"10.18653\/v1\/N19-1112"},{"issue":"6721","key":"pcbi.1013925.ref039","article-title":"Exploring structural diversity across the protein universe with The Encyclopedia of Domains","volume":"386","author":"AM Lau","year":"2024","journal-title":"Science."}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1013925","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T00:00:00Z","timestamp":1771804800000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013925","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T19:00:49Z","timestamp":1771873249000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013925"}},"subtitle":[],"editor":[{"given":"Rachel","family":"Kolodny","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2026,2,9]]},"references-count":39,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2,9]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1013925","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,9]]}}}