{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T00:54:52Z","timestamp":1775264092668,"version":"3.50.1"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Motivation: The accurate prediction of residue\u2013residue contacts, critical for maintaining the native fold of a protein, remains an open problem in the field of structural bioinformatics. Interest in this long-standing problem has increased recently with algorithmic improvements and the rapid growth in the sizes of sequence families. Progress could have major impacts in both structure and function prediction to name but two benefits. Sequence-based contact predictions are usually made by identifying correlated mutations within multiple sequence alignments (MSAs), most commonly through the information-theoretic approach of calculating mutual information between pairs of sites in proteins. These predictions are often inaccurate because the true covariation signal in the MSA is often masked by biases from many ancillary indirect-coupling or phylogenetic effects. Here we present a novel method, PSICOV, which introduces the use of sparse inverse covariance estimation to the problem of protein contact prediction. Our method builds on work which had previously demonstrated corrections for phylogenetic and entropic correlation noise and allows accurate discrimination of direct from indirectly coupled mutation correlations in the MSA.<\/jats:p>\n                  <jats:p>Results: PSICOV displays a mean precision substantially better than the best performing normalized mutual information approach and Bayesian networks. For 118 out of 150 targets, the L\/5 (i.e. 
top-L\/5 predictions for a protein of length L) precision for long-range contacts (sequence separation &gt;23) was \u22650.5, which represents an improvement sufficient to be of significant benefit in protein structure prediction or model quality assessment.<\/jats:p>\n                  <jats:p>Availability: The PSICOV source code can be downloaded from http:\/\/bioinf.cs.ucl.ac.uk\/downloads\/PSICOV<\/jats:p>\n                  <jats:p>Contact: \u00a0d.jones@cs.ucl.ac.uk<\/jats:p>\n                  <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr638","type":"journal-article","created":{"date-parts":[[2011,11,18]],"date-time":"2011-11-18T21:00:47Z","timestamp":1321650047000},"page":"184-190","source":"Crossref","is-referenced-by-count":700,"title":["PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments"],"prefix":"10.1093","volume":"28","author":[{"given":"David T.","family":"Jones","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, Bioinformatics Group and 2Department of Computer Science, Centre for Computational Statistics and Machine Learning, University College London, Malet Place, London WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daniel W. 
A.","family":"Buchan","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Bioinformatics Group and 2Department of Computer Science, Centre for Computational Statistics and Machine Learning, University College London, Malet Place, London WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Domenico","family":"Cozzetto","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Bioinformatics Group and 2Department of Computer Science, Centre for Computational Statistics and Machine Learning, University College London, Malet Place, London WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Massimiliano","family":"Pontil","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Bioinformatics Group and 2Department of Computer Science, Centre for Computational Statistics and Machine Learning, University College London, Malet Place, London WC1E 6BT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2011,11,17]]},"reference":[{"key":"2023012511345400800_B1","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1093\/protein\/gzp078","article-title":"Reducing phylogenetic bias in correlated mutation analysis","volume":"23","author":"Ashkenazy","year":"2010","journal-title":"Protein Eng. Des. Sel."},{"key":"2023012511345400800_B2","first-page":"485","article-title":"Model selection through sparse maximum likelihood estimation","volume":"9","author":"Banerjee","year":"2008","journal-title":"J. Mach. Learn. 
Res."},{"key":"2023012511345400800_B3","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein data bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012511345400800_B4","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-20192-9","volume-title":"Statistics for High-Dimensional Data: Methods, Theory and Applications.","author":"B\u00fchlmann","year":"2011"},{"key":"2023012511345400800_B5","doi-asserted-by":"crossref","first-page":"e1000633","DOI":"10.1371\/journal.pcbi.1000633","article-title":"Disentangling direct from indirect co-evolution of residues in protein alignments","volume":"6","author":"Burger","year":"2010","journal-title":"PLoS Comput. Biol."},{"key":"2023012511345400800_B6","doi-asserted-by":"crossref","first-page":"1125","DOI":"10.1093\/bioinformatics\/btp135","article-title":"Correction for phylogeny, small number of observations and data redundancy improves the identification of coevolving amino acid pairs using mutual information","volume":"25","author":"Buslje","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012511345400800_B7","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1093\/bioinformatics\/btm604","article-title":"Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction","volume":"24","author":"Dunn","year":"2008","journal-title":"Bioinformatics"},{"issue":"Suppl. 
9","key":"2023012511345400800_B8","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1002\/prot.22554","article-title":"Assessment of domain boundary predictions and the prediction of intramolecular contacts in CASP8","volume":"77","author":"Ezkurdia","year":"2009","journal-title":"Proteins"},{"key":"2023012511345400800_B9","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1093\/protein\/14.11.835","article-title":"Prediction of contact maps with neural networks and correlated mutations","volume":"14","author":"Fariselli","year":"2001","journal-title":"Protein Eng."},{"key":"2023012511345400800_B10","doi-asserted-by":"crossref","first-page":"D211","DOI":"10.1093\/nar\/gkp985","article-title":"The pfam protein families database","volume":"38","author":"Finn","year":"2010","journal-title":"Nucleic Acids Res."},{"issue":"Suppl. 5","key":"2023012511345400800_B11","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1002\/prot.10036","article-title":"CAFASP2: the second critical assessment of fully automated structure prediction methods","volume":"45","author":"Fischer","year":"2001","journal-title":"Proteins"},{"key":"2023012511345400800_B12","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/biostatistics\/kxm045","article-title":"Sparse inverse covariance estimation with the graphical Lasso","volume":"9","author":"Friedman","year":"2008","journal-title":"Biostatistics"},{"key":"2023012511345400800_B13","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1002\/prot.340180402","article-title":"Correlated mutations and residue contacts in proteins","volume":"18","author":"Gobel","year":"1994","journal-title":"Proteins"},{"issue":"Suppl. 
7","key":"2023012511345400800_B14","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1002\/prot.20739","article-title":"CASP6 assessment of contact prediction","volume":"61","author":"Gra\u00f1a","year":"2005","journal-title":"Proteins"},{"key":"2023012511345400800_B15","doi-asserted-by":"crossref","first-page":"W347","DOI":"10.1093\/nar\/gki411","article-title":"EVAcon: a protein contact prediction evaluation service","volume":"33","author":"Gra\u00f1a","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012511345400800_B16","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/j.pbiomolbio.2003.09.003","article-title":"Inter-residue interactions in protein folding and stability","volume":"86","author":"Gromiha","year":"2004","journal-title":"Prog. Biophys. Mol. Biol."},{"key":"2023012511345400800_B17","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1016\/j.cell.2009.07.038","article-title":"Protein sectors: evolutionary units of three-dimensional structure","volume":"138","author":"Halabi","year":"2009","journal-title":"Cell"},{"key":"2023012511345400800_B18","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1007\/978-1-60327-429-6_3","article-title":"An introduction to protein contact prediction","volume":"453","author":"Hamilton","year":"2008","journal-title":"Methods Mol. Biol."},{"key":"2023012511345400800_B19","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1002\/prot.20160","article-title":"Protein contact prediction using patterns of correlation","volume":"56","author":"Hamilton","year":"2004","journal-title":"Proteins"},{"key":"2023012511345400800_B20","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012511345400800_B21","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1016\/0022-2836(94)90032-9","article-title":"Position-based sequence weights","volume":"243","author":"Henikoff","year":"1994","journal-title":"J. Mol. Biol."},{"key":"2023012511345400800_B22","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1093\/bib\/bbm052","article-title":"Correlated substitution analysis and the prediction of amino acid structural contacts","volume":"9","author":"Horner","year":"2008","journal-title":"Brief. Bioinform."},{"key":"2023012511345400800_B23","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1214\/lnms\/1215455556","article-title":"Correlated mutations in protein sequences: Phylogenetic and structural effects","volume-title":"Proceedings of the AMS\/SIAM Conference on Statistics in Molecular Biology and Genetics","author":"Lapedes","year":"1999"},{"key":"2023012511345400800_B24","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1016\/S0927-5398(03)00007-0","article-title":"Improved estimation of the covariance matrix of stock returns with an application to portfolio selection","volume":"10","author":"Ledoit","year":"2003","journal-title":"J. Empir. Finance"},{"key":"2023012511345400800_B25","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1109\/TCBB.2010.91","article-title":"Is there an optimal substitution matrix for contact prediction with correlated mutations?","volume":"8","author":"Lena","year":"2011","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"2023012511345400800_B26","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1186\/1471-2105-8-60","article-title":"Supervised group Lasso with applications to microarray data analysis","volume":"8","author":"Ma","year":"2007","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 
1","key":"2023012511345400800_B27","doi-asserted-by":"crossref","first-page":"i224","DOI":"10.1093\/bioinformatics\/bth913","article-title":"Striped sheets and protein contact prediction","volume":"20","author":"MacCallum","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012511345400800_B28","doi-asserted-by":"crossref","first-page":"bar009","DOI":"10.1093\/database\/bar009","article-title":"UniProt knowledgebase: a hub of integrated protein data","volume":"2011","author":"Magrane","year":"2011","journal-title":"Database"},{"key":"2023012511345400800_B29","doi-asserted-by":"crossref","first-page":"4116","DOI":"10.1093\/bioinformatics\/bti671","article-title":"Using information theory to search for co-evolving residues in proteins","volume":"21","author":"Martin","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012511345400800_B30","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1016\/0022-2836(71)90390-1","article-title":"Tests for comparing related amino-acid sequences. cytochrome c and cytochrome c 551","volume":"61","author":"McLachlan","year":"1971","journal-title":"J. Mol. Biol."},{"key":"2023012511345400800_B31","doi-asserted-by":"crossref","first-page":"1436","DOI":"10.1214\/009053606000000281","article-title":"High dimensional graphs and variable selection with the Lasso","volume":"34","author":"Meinshausen","year":"2006","journal-title":"Ann. Stat."},{"key":"2023012511345400800_B32","doi-asserted-by":"crossref","first-page":"1575","DOI":"10.1093\/bioinformatics\/btn248","article-title":"Using inferred residue contacts to distinguish between correct and incorrect protein models","volume":"24","author":"Miller","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012511345400800_B33","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1073\/pnas.91.1.98","article-title":"How frequent are correlated changes in families of protein sequences?","volume":"91","author":"Neher","year":"1994","journal-title":"Proc. 
Natl Acad. Sci. USA"},{"key":"2023012511345400800_B34","doi-asserted-by":"crossref","first-page":"S25","DOI":"10.1016\/S1359-0278(97)00060-6","article-title":"Improving contact predictions by the combination of correlated mutations and other sources of sequence information","volume":"2","author":"Olmea","year":"1997","journal-title":"Fold Des."},{"issue":"Suppl. 1","key":"2023012511345400800_B35","doi-asserted-by":"crossref","first-page":"S62","DOI":"10.1093\/bioinformatics\/18.suppl_1.S62","article-title":"Prediction of contact maps by giohmms and recurrent neural networks using lateral propagation from all four cardinal corners","volume":"18","author":"Pollastri","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012511345400800_B36","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1093\/protein\/10.6.647","article-title":"Effectiveness of correlation analysis in identifying protein residues undergoing correlated evolution","volume":"10","author":"Pollock","year":"1997","journal-title":"Protein Eng."},{"key":"2023012511345400800_B37","doi-asserted-by":"crossref","first-page":"2960","DOI":"10.1093\/bioinformatics\/bti454","article-title":"PROFcon: novel prediction of long-range contacts","volume":"21","author":"Punta","year":"2005","journal-title":"Bioinformatics"},{"issue":"Suppl. 6","key":"2023012511345400800_B38","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1002\/prot.10539","article-title":"Predicting interresidue contacts using templates and pathways","volume":"53","author":"Shao","year":"2003","journal-title":"Proteins"},{"key":"2023012511345400800_B39","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1073\/pnas.0805923106","article-title":"Identification of direct residue contacts in protein-protein interaction by message passing","volume":"106","author":"Weigt","year":"2009","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012511345400800_B40","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1002\/prot.22329","article-title":"Predicting residue-residue contact maps by a two-layer, integrated neural-network method","volume":"76","author":"Xue","year":"2009","journal-title":"Proteins"},{"key":"2023012511345400800_B41","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1186\/1471-2105-6-248","article-title":"Better prediction of protein contact number using a support vector regression analysis of amino acid sequence","volume":"6","author":"Yuan","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023012511345400800_B42","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1093\/biomet\/asm018","article-title":"Model selection and estimation in the gaussian graphical model","volume":"91","author":"Yuan","year":"2007","journal-title":"Biometrika"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/2\/184\/48870285\/bioinformatics_28_2_184.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/2\/184\/48870285\/bioinformatics_28_2_184.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T06:39:19Z","timestamp":1674628759000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/2\/184\/198108"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,11,17]]},"references-count":42,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2012,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr638","relation":{"has-review":[{"id-type":"doi","id":"10.3410\/f.13945969.793513420","asserted-by":"object"},{"id-type":"doi","id":"10.3410\/f.13945969.15405106","asse
rted-by":"object"}]},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,1,15]]},"published":{"date-parts":[[2011,11,17]]}}}