{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T12:13:14Z","timestamp":1771330394604,"version":"3.50.1"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T00:00:00Z","timestamp":1770249600000},"content-version":"vor","delay-in-days":4,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100012795","name":"Pritzker Institute of Biomedical Science and Engineering","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100012795","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,2,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Protein sequence variation analysis is a topic of broad interest in drug discovery and protein engineering to support modulation of protein function for diverse biotechnological and therapeutic applications. To assist in the analysis of multiple sequence alignments (MSAs) and identify residues that account for protein function specificity, computational tools have been developed. Yet, existing programs often omit consideration of amino acid properties, flexibility beyond fixed webserver interfaces, accessible source code, or compatibility with small MSAs.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>To address these limitations, we present PyTEA-O, a Python implementation of Two-Entropies Analysis that has been developed to be easy to use for the analysis of protein sequence variation. 
To help users analyze the MSA and screen for residues of interest, we generate modifiable and intuitive visualizations. These visualizations, together with a scoring approach for identifying alignment positions with (dis-)similar physicochemical properties, present a powerful tool for sequence variability analysis. To demonstrate its capabilities, we present a case study based on the deubiquitinase OTUD7B (Cezanne) where we identify a crucial position that modulates its affinity for its substrate.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>PyTEA-O is available at https:\/\/github.com\/CDDLeiden\/PyTEA-O\/ and archived via Zenodo (https:\/\/doi.org\/10.5281\/zenodo.15914598).<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btag043","type":"journal-article","created":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T12:21:31Z","timestamp":1770034891000},"source":"Crossref","is-referenced-by-count":0,"title":["PyTEA-O: a Python implementation of Two-Entropies Analysis for protein sequence variation analysis"],"prefix":"10.1093","volume":"42","author":[{"given":"Rosan C M","family":"Kuin","sequence":"first","affiliation":[{"name":"Division of Medicinal Chemistry, Leiden Academic Centre of Drug Research, Leiden University Computational Drug Discovery, , 2333 CC Leiden,","place":["The Netherlands"]}]},{"given":"Alexander T","family":"Julian","sequence":"additional","affiliation":[{"name":"Department of Biology, Illinois Institute of 
Technology , Chicago, IL 60616,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0717-1817","authenticated-orcid":false,"given":"Gerard J P","family":"van Westen","sequence":"additional","affiliation":[{"name":"Division of Medicinal Chemistry, Leiden Academic Centre of Drug Research, Leiden University Computational Drug Discovery, , 2333 CC Leiden,","place":["The Netherlands"]}]}],"member":"286","published-online":{"date-parts":[[2026,2,4]]},"reference":[{"key":"2026021706212996400_btag043-B1","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/1475-2859-5-2","article-title":"Enzymes: an integrated view of structure, dynamics and function","volume":"5","author":"Agarwal","year":"2006","journal-title":"Microb Cell Fact"},{"key":"2026021706212996400_btag043-B2","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"key":"2026021706212996400_btag043-B3","doi-asserted-by":"crossref","first-page":"21534","DOI":"10.1038\/s41598-022-25323-x","article-title":"Pan-cancer functional analysis of somatic mutations in G protein-coupled receptors","volume":"12","author":"Bongers","year":"2022","journal-title":"Sci Rep"},{"key":"2026021706212996400_btag043-B4","doi-asserted-by":"crossref","first-page":"e0119306","DOI":"10.1371\/journal.pone.0119306","article-title":"Prediction of protein structural features from sequence data based on Shannon entropy and Kolmogorov complexity","volume":"10","author":"Bywater","year":"2015","journal-title":"PLoS One"},{"key":"2026021706212996400_btag043-B5","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1093\/bib\/bbv045","article-title":"Practical analysis of specificity-determining residues in protein families","volume":"17","author":"Chagoyen","year":"2016","journal-title":"Brief 
Bioinform"},{"key":"2026021706212996400_btag043-B16","doi-asserted-by":"crossref","first-page":"3075","DOI":"10.1093\/bioinformatics\/btq595","article-title":"Identification of subfamily-specific sites based on active sites modeling and clustering","volume":"26","author":"de Melo-Minardi","year":"2010","journal-title":"Bioinformatics"},{"key":"2026021706212996400_btag043-B6","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1002\/pro.4222","article-title":"Substitutions at a rheostat position in human aldolase a cause a shift in the conformational population","volume":"31","author":"Fenton","year":"2022","journal-title":"Protein Sci"},{"key":"2026021706212996400_btag043-B7","doi-asserted-by":"crossref","first-page":"5141","DOI":"10.1038\/s41467-024-49119-x","article-title":"Simultaneous enhancement of multiple functional properties using evolution-informed protein design","volume":"15","author":"Fram","year":"2024","journal-title":"Nat Commun"},{"key":"2026021706212996400_btag043-B8","doi-asserted-by":"crossref","first-page":"14497","DOI":"10.1021\/jacs.6b09545","article-title":"Monomer\/oligomer quasi-racemic protein crystallography","volume":"138","author":"Gao","year":"2016","journal-title":"J Am Chem Soc"},{"key":"2026021706212996400_btag043-B9","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1002\/prot.21949","article-title":"An iterative knowledge-based scoring function for protein-protein recognition","volume":"72","author":"Huang","year":"2008","journal-title":"Proteins"},{"key":"2026021706212996400_btag043-B10","doi-asserted-by":"crossref","first-page":"e55","DOI":"10.1093\/nar\/gku077","article-title":"A knowledge-based scoring function for protein-RNA interactions derived from a statistical mechanics-based iterative method","volume":"42","author":"Huang","year":"2014","journal-title":"Nucleic Acids 
Res"},{"key":"2026021706212996400_btag043-B11","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1016\/j.jmb.2010.04.039","article-title":"Consensus protein design without phylogenetic bias","volume":"399","author":"J\u00e4ckel","year":"2010","journal-title":"J Mol Biol"},{"key":"2026021706212996400_btag043-B12","doi-asserted-by":"crossref","first-page":"vbab030","DOI":"10.1093\/bioadv\/vbab030","article-title":"3DFI: a pipeline to infer protein function using structural homology","volume":"1","author":"Julian","year":"2021","journal-title":"Bioinform Adv"},{"key":"2026021706212996400_btag043-B13","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: a novel method for rapid multiple sequence alignment based on fast fourier transform","volume":"30","author":"Katoh","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2026021706212996400_btag043-B14","volume-title":"Nucleic Acids Res","author":"Kuin","year":"1274"},{"key":"2026021706212996400_btag043-B15","author":"Lodish","year":"2021"},{"key":"2026021706212996400_btag043-B17","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1038\/nature19836","article-title":"Molecular basis of Lys11-polyubiquitin specificity in the deubiquitinase Cezanne","volume":"538","author":"Mevissen","year":"2016","journal-title":"Nature"},{"key":"2026021706212996400_btag043-B18","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1042\/EBC20200108","article-title":"Uncovering protein function: from classification to complexes","volume":"66","author":"Morris","year":"2022","journal-title":"Essays Biochem"},{"key":"2026021706212996400_btag043-B19","doi-asserted-by":"crossref","first-page":"544","DOI":"10.1002\/prot.10490","article-title":"Identification of functionally conserved residues with the use of entropy-variability 
plots","volume":"52","author":"Oliveira","year":"2003","journal-title":"Proteins"},{"key":"2026021706212996400_btag043-B20","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1021\/jm9700575","article-title":"New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids","volume":"41","author":"Sandberg","year":"1998","journal-title":"J Med Chem"},{"key":"2026021706212996400_btag043-B21","doi-asserted-by":"crossref","first-page":"D20","DOI":"10.1093\/nar\/gkae979","article-title":"Database resources of the national center for biotechnology information in 2025","volume":"53","author":"Sayers","year":"2025","journal-title":"Nucleic Acids Res"},{"key":"2026021706212996400_btag043-B22","doi-asserted-by":"crossref","first-page":"D609","DOI":"10.1093\/nar\/gkae1010","article-title":"UniProt: the Universal Protein Knowledgebase in 2025","volume":"53","author":"UniProt Consortium","year":"2025","journal-title":"Nucleic Acids Res"},{"key":"2026021706212996400_btag043-B25","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/1758-2946-5-41","article-title":"Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets","volume":"5","author":"van Westen","year":"2013","journal-title":"J Cheminform"},{"key":"2026021706212996400_btag043-B23","doi-asserted-by":"crossref","first-page":"e07454","DOI":"10.7554\/eLife.07454","article-title":"Contacts-based prediction of binding affinity in protein\u2013protein complexes","volume":"4","author":"Vangone","year":"2015","journal-title":"eLife"},{"key":"2026021706212996400_btag043-B24","doi-asserted-by":"crossref","first-page":"e0080221","DOI":"10.1128\/JVI.00802-21","article-title":"Targeting conserved sequences circumvents the evolution of resistance in a viral gene drive against human cytomegalovirus","volume":"95","author":"Walter","year":"2021","journal-title":"J 
Virol"},{"key":"2026021706212996400_btag043-B26","doi-asserted-by":"crossref","first-page":"1752","DOI":"10.1074\/mcp.R113.027771","article-title":"Identification of protein interactions involved in cellular signaling","volume":"12","author":"Westermarck","year":"2013","journal-title":"Mol Cell Proteomics"},{"key":"2026021706212996400_btag043-B27","doi-asserted-by":"crossref","first-page":"3676","DOI":"10.1093\/bioinformatics\/btw514","article-title":"PRODIGY: a web server for predicting the binding affinity of protein\u2013protein complexes","volume":"32","author":"Xue","year":"2016","journal-title":"Bioinformatics"},{"key":"2026021706212996400_btag043-B28","doi-asserted-by":"crossref","first-page":"1829","DOI":"10.1038\/s41596-020-0312-x","article-title":"The HDOCK server for integrated protein\u2013protein docking","volume":"15","author":"Yan","year":"2020","journal-title":"Nat Protoc"},{"key":"2026021706212996400_btag043-B29","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1002\/prot.25234","article-title":"Addressing recent docking challenges: a hybrid strategy to integrate template-based and free protein-protein docking","volume":"85","author":"Yan","year":"2017","journal-title":"Proteins"},{"key":"2026021706212996400_btag043-B30","doi-asserted-by":"crossref","first-page":"1018","DOI":"10.1002\/prot.20899","article-title":"A two-entropies analysis to identify functional positions in the transmembrane region of class a G protein-coupled receptors","volume":"63","author":"Ye","year":"2006","journal-title":"Proteins Struct Funct Bioinform"},{"key":"2026021706212996400_btag043-B31","doi-asserted-by":"crossref","first-page":"908","DOI":"10.1093\/bioinformatics\/btn057","article-title":"Tracing evolutionary 
pressure","volume":"24","author":"Ye","year":"2008","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag043\/66754491\/btag043.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag043\/66754491\/btag043.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btag043\/66754491\/btag043.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T11:21:40Z","timestamp":1771327300000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btag043\/8460757"}},"subtitle":[],"editor":[{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2026,2]]},"references-count":31,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,2,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btag043","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,2]]},"published":{"date-parts":[[2026,2]]},"article-number":"btag043"}}