{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T10:13:03Z","timestamp":1760609583631,"version":"3.41.2"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T00:00:00Z","timestamp":1711411200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Department of Computer Science, Brunel University London"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,3,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Computational methods to detect correlated amino acid positions in proteins have become a valuable tool to predict intra- and inter-residue protein contacts, protein structures, and effects of mutation on protein stability and function. While there are many tools and webservers to compute coevolution scoring matrices, there is no central repository of alignments and coevolution matrices for large-scale studies and pattern detection leveraging on biological and structural annotations already available in UniProt.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present a Python library, PyCoM, which enables users to query and analyze coevolution matrices and sequence alignments of 457\u00a0622 proteins, selected from UniProtKB\/Swiss-Prot database (length \u2264 500 residues), from a precompiled coevolution matrix database (PyCoMdb). PyCoM facilitates the development of statistical analyses of residue coevolution patterns using filters on biological and structural annotations from UniProtKB\/Swiss-Prot, with simple access to PyCoMdb for both novice and advanced users, supporting Jupyter Notebooks, Python scripts, and a web API access. The resource is open source and will help in generating data-driven computational models and methods to study and understand protein structures, stability, function, and design.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>PyCoM code is freely available from https:\/\/github.com\/scdantu\/pycom and PyCoMdb and the Jupyter Notebook tutorials are freely available from https:\/\/pycom.brunel.ac.uk.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae166","type":"journal-article","created":{"date-parts":[[2024,3,25]],"date-time":"2024-03-25T15:28:41Z","timestamp":1711380521000},"source":"Crossref","is-referenced-by-count":3,"title":["PyCoM: a python library for large-scale analysis of residue\u2013residue coevolution data"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-4807-8716","authenticated-orcid":false,"given":"Philipp E","family":"Glass","sequence":"first","affiliation":[{"name":"Department of Computer Science, Brunel University London , Uxbridge UB8 3PH,","place":["United Kingdom"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-4866-4147","authenticated-orcid":false,"given":"Sabriyeh","family":"Alibai","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Brunel University London , Uxbridge UB8 3PH,","place":["United Kingdom"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4158-233X","authenticated-orcid":false,"given":"Alessandro","family":"Pandini","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Brunel University London , Uxbridge UB8 3PH,","place":["United Kingdom"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2019-5311","authenticated-orcid":false,"given":"Sarath Chandra","family":"Dantu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Brunel University London , Uxbridge UB8 3PH,","place":["United Kingdom"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,3,26]]},"reference":[{"key":"2025052103470724200_btae166-B1","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1146\/annurev-chembioeng-011720-103410","article-title":"How do cells adapt? stories told in landscapes","volume":"11","author":"Agozzino","year":"2020","journal-title":"Annu Rev Chem Biomol Eng"},{"key":"2025052103470724200_btae166-B2","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"2025052103470724200_btae166-B3","first-page":"23","volume-title":"Methods Mol Biol","author":"Boutet","year":"2016"},{"key":"2025052103470724200_btae166-B4","doi-asserted-by":"crossref","first-page":"4175","DOI":"10.1038\/s41467-023-39909-0","article-title":"Discovering functionally important sites in proteins","volume":"14","author":"Cagiada","year":"2023","journal-title":"Nat Commun"},{"key":"2025052103470724200_btae166-B5","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1038\/nrg3414","article-title":"Emerging methods in protein co-evolution","volume":"14","author":"de Juan","year":"2013","journal-title":"Nat Rev Genet"},{"key":"2025052103470724200_btae166-B6","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.jcp.2014.07.024","article-title":"Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences","volume":"276","author":"Ekeberg","year":"2014","journal-title":"J Comput Phys"},{"key":"2025052103470724200_btae166-B7","doi-asserted-by":"crossref","first-page":"1018","DOI":"10.1093\/molbev\/msy007","article-title":"How pairwise coevolutionary models capture the collective residue variability in proteins?","volume":"35","author":"Figliuzzi","year":"2018","journal-title":"Mol Biol Evol"},{"key":"2025052103470724200_btae166-B8","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1016\/j.cell.2009.07.038","article-title":"Protein sectors: evolutionary units of three-dimensional structure","volume":"138","author":"Halabi","year":"2009","journal-title":"Cell"},{"key":"2025052103470724200_btae166-B9","doi-asserted-by":"crossref","first-page":"1582","DOI":"10.1093\/bioinformatics\/bty862","article-title":"The EVcouplings python framework for coevolutionary sequence analysis","volume":"35","author":"Hopf","year":"2019","journal-title":"Bioinformatics"},{"key":"2025052103470724200_btae166-B10","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1038\/nbt.3769","article-title":"Mutation effects predicted from sequence co-variation","volume":"35","author":"Hopf","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2025052103470724200_btae166-B11","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.03430","article-title":"Sequence co-evolution gives 3D contacts and structures of protein complexes","volume":"3","author":"Hopf","year":"2014","journal-title":"Elife"},{"key":"2025052103470724200_btae166-B12","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2025052103470724200_btae166-B13","doi-asserted-by":"crossref","first-page":"15674","DOI":"10.1073\/pnas.1314045110","article-title":"Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era","volume":"110","author":"Kamisetty","year":"2013","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025052103470724200_btae166-B14","doi-asserted-by":"crossref","first-page":"E94","DOI":"10.1093\/nar\/gkz536","article-title":"Evolutionary coupling analysis identifies the impact of disease-associated variants at less-conserved sites","volume":"47","author":"Kim","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2025052103470724200_btae166-B15","first-page":"1129","volume-title":"Science","author":"Lin","year":"2023"},{"key":"2025052103470724200_btae166-B16","doi-asserted-by":"crossref","first-page":"5743","DOI":"10.1038\/s41467-021-25976-8","article-title":"ECNet is an evolutionary context-integrated deep learning framework for protein engineering","volume":"12","author":"Luo","year":"2021","journal-title":"Nat Commun"},{"key":"2025052103470724200_btae166-B17","first-page":"1072","volume-title":"Nat Biotechnol","author":"Marks","year":"2012"},{"key":"2025052103470724200_btae166-B18","doi-asserted-by":"crossref","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"Morcos","year":"2011","journal-title":"Proc Natl Acad Sci USA"},{"key":"2025052103470724200_btae166-B19","doi-asserted-by":"crossref","first-page":"14","DOI":"10.3389\/fcell.2014.00014","article-title":"Practical aspects of protein co-evolution","volume":"2","author":"Ochoa","year":"2014","journal-title":"Front Cell Dev Biol"},{"key":"2025052103470724200_btae166-B20","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1126\/science.aba3304","article-title":"An evolution-based model for designing chorismate mutase enzymes","volume":"369","author":"Russ","year":"2020","journal-title":"Science"},{"volume-title":"eLife","year":"2018","author":"Salinas","key":"2025052103470724200_btae166-B21"},{"key":"2025052103470724200_btae166-B22","doi-asserted-by":"crossref","first-page":"3128","DOI":"10.1093\/bioinformatics\/btu500","article-title":"CCMpred\u2014fast and precise prediction of protein residue-residue contacts from correlated mutations","volume":"30","author":"Seemayer","year":"2014","journal-title":"Bioinformatics"},{"key":"2025052103470724200_btae166-B23","doi-asserted-by":"crossref","first-page":"2403","DOI":"10.1038\/s41467-021-22732-w","article-title":"Protein design and variant prediction using autoregressive generative models","volume":"12","author":"Shin","year":"2021","journal-title":"Nat Commun"},{"key":"2025052103470724200_btae166-B24","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2025052103470724200_btae166-B25","doi-asserted-by":"crossref","first-page":"847","DOI":"10.1002\/1873-3468.14067","article-title":"Sharing biological data: why, when, and how","volume":"595","author":"Wilson","year":"2021","journal-title":"FEBS Lett"},{"key":"2025052103470724200_btae166-B26","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1186\/s13059-019-1689-0","article-title":"Machine learning and complex biological data","volume":"20","author":"Xu","year":"2019","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae166\/57097752\/btae166.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/4\/btae166\/57218617\/btae166.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/4\/btae166\/57218617\/btae166.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,21]],"date-time":"2025-05-21T07:47:18Z","timestamp":1747813638000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae166\/7635577"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2024,3,26]]},"references-count":26,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,3,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae166","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2024,4,1]]},"published":{"date-parts":[[2024,3,26]]},"article-number":"btae166"}}