{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:52:47Z","timestamp":1740135167265,"version":"3.37.3"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2016,12,15]],"date-time":"2016-12-15T00:00:00Z","timestamp":1481760000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2016,12,15]],"date-time":"2016-12-15T00:00:00Z","timestamp":1481760000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>The post-genomic era with its wealth of sequences gave rise to a broad range of protein residue-residue contact detecting methods. Although various coevolution methods such as PSICOV, DCA and plmDCA provide correct contact predictions, they do not completely overlap. Hence, new approaches and improvements of existing methods are needed to motivate further development and progress in the field. We present a new contact detecting method, COUSCOus, by combining the best shrinkage approach, the empirical Bayes covariance estimator and GLasso.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>Using the original PSICOV benchmark dataset, COUSCOus achieves mean accuracies of 0.74, 0.62 and 0.55 for the top <jats:italic>L<\/jats:italic>\/10 predicted long, medium and short range contacts, respectively. In addition, COUSCOus attains mean areas under the precision-recall curves of 0.25, 0.29 and 0.30 for long, medium and short contacts and outperforms PSICOV. 
We also observed that COUSCOus outperforms PSICOV w.r.t. Matthew\u2019s correlation coefficient criterion on full list of residue contacts. Furthermore, COUSCOus achieves on average 10% more gain in prediction accuracy compared to PSICOV on an independent test set composed of CASP11 protein targets. Finally, we showed that when using a simple random forest meta-classifier, by combining contact detecting techniques and sequence derived features, PSICOV predictions should be replaced by the more accurate COUSCOus predictions.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>We conclude that the consideration of superior covariance shrinkage approaches will boost several research fields that apply the GLasso procedure, amongst the presented one of residue-residue contact prediction as well as fields such as gene network reconstruction.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-016-1400-3","type":"journal-article","created":{"date-parts":[[2016,12,15]],"date-time":"2016-12-15T13:26:40Z","timestamp":1481808400000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["COUSCOus: improved protein contact prediction using an empirical Bayes covariance estimator"],"prefix":"10.1186","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0445-2325","authenticated-orcid":false,"given":"Reda","family":"Rawi","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Raghvendra","family":"Mall","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Khalid","family":"Kunji","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohammed","family":"El 
Anbari","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Aupetit","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ehsan","family":"Ullah","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Halima","family":"Bensmail","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2016,12,15]]},"reference":[{"key":"1400_CR1","doi-asserted-by":"publisher","first-page":"1593","DOI":"10.1126\/science.146.3651.1593","volume":"146","author":"C Yanofsky","year":"1964","unstructured":"Yanofsky C, Horn V, Thorpe D. Protein structure relationships revealed by mutual analysis. Science (New YorkNY). 1964; 146:1593\u20134.","journal-title":"Science (New YorkNY)"},{"issue":"5","key":"1400_CR2","doi-asserted-by":"publisher","first-page":"579","DOI":"10.1007\/BF00486096","volume":"4","author":"WM Fitch","year":"1970","unstructured":"Fitch WM, Markowitz E. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem Genet. 1970; 4(5):579\u201393.","journal-title":"Biochem Genet"},{"issue":"4","key":"1400_CR3","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1038\/nrg3414","volume":"14","author":"D de Juan","year":"2013","unstructured":"de Juan D, Pazos F, Valencia A. Emerging methods in protein co-evolution. Nat Rev Genet. 2013; 14(4):249\u201361.","journal-title":"Nat Rev Genet"},{"issue":"12","key":"1400_CR4","doi-asserted-by":"publisher","first-page":"e28766","DOI":"10.1371\/journal.pone.0028766","volume":"6","author":"DS Marks","year":"2011","unstructured":"Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, et al.Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE. 
2011; 6(12):e28766.","journal-title":"PLoS ONE"},{"issue":"11","key":"1400_CR5","doi-asserted-by":"publisher","first-page":"1072","DOI":"10.1038\/nbt.2419","volume":"30","author":"DS Marks","year":"2012","unstructured":"Marks DS, Hopf TA, Sander C. Protein structure prediction from sequence variation. Nat Biotechnol. 2012; 30(11):1072\u201380.","journal-title":"Nat Biotechnol"},{"issue":"7","key":"1400_CR6","doi-asserted-by":"publisher","first-page":"1607","DOI":"10.1016\/j.cell.2012.04.012","volume":"149","author":"TA Hopf","year":"2012","unstructured":"Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, Marks DS. Three-dimensional structures of membrane proteins from genomic sequencing. Cell. 2012; 149(7):1607\u201321.","journal-title":"Cell"},{"issue":"3","key":"1400_CR7","doi-asserted-by":"publisher","first-page":"e92197","DOI":"10.1371\/journal.pone.0092197","volume":"9","author":"T Kosciolek","year":"2014","unstructured":"Kosciolek T, Jones DT. De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts. PLoS ONE. 2014; 9(3):e92197.","journal-title":"PLoS ONE"},{"key":"1400_CR8","doi-asserted-by":"publisher","first-page":"e03430","DOI":"10.7554\/eLife.03430","volume":"3","author":"TA Hopf","year":"2014","unstructured":"Hopf TA, Sch\u00e4rfe CPI, Rodrigues JPGLM, Green AG, Kohlbacher O, Sander C, et al.Sequence co-evolution gives 3D contacts and structures of protein complexes. eLife. 2014; 3:e03430.","journal-title":"eLife"},{"key":"1400_CR9","doi-asserted-by":"publisher","first-page":"e02030","DOI":"10.7554\/eLife.02030","volume":"3","author":"S Ovchinnikov","year":"2014","unstructured":"Ovchinnikov S, Kamisetty H, Baker D. Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information. eLife. 
2014; 3:e02030.","journal-title":"eLife"},{"issue":"19","key":"1400_CR10","doi-asserted-by":"publisher","first-page":"7156","DOI":"10.1021\/bi050293e","volume":"44","author":"GB Gloor","year":"2005","unstructured":"Gloor GB, Martin LC, Wahl LM, Dunn SD. Mutual information in protein multiple sequence alignments reveals two classes of coevolving positions. Biochemistry. 2005; 44(19):7156\u201365.","journal-title":"Biochemistry"},{"issue":"22","key":"1400_CR11","doi-asserted-by":"publisher","first-page":"4116","DOI":"10.1093\/bioinformatics\/bti671","volume":"21","author":"LC Martin","year":"2005","unstructured":"Martin LC, Gloor GB, Dunn SD, Wahl LM. Using information theory to search for co-evolving residues in proteins. Bioinformatics (Oxford England). 2005; 21(22):4116\u201324.","journal-title":"Bioinformatics (Oxford England)"},{"issue":"3","key":"1400_CR12","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1093\/bioinformatics\/btm604","volume":"24","author":"SD Dunn","year":"2008","unstructured":"Dunn SD, Wahl LM, Gloor GB. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics (Oxford England). 2008; 24(3):333\u201340.","journal-title":"Bioinformatics (Oxford England)"},{"issue":"1","key":"1400_CR13","doi-asserted-by":"publisher","first-page":"e1000633","DOI":"10.1371\/journal.pcbi.1000633","volume":"6","author":"L Burger","year":"2010","unstructured":"Burger L, van Nimwegen E. Disentangling direct from indirect co-evolution of residues in protein alignments. PLoS Comput Biol. 2010; 6(1):e1000633.","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"1400_CR14","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1073\/pnas.0805923106","volume":"106","author":"M Weigt","year":"2009","unstructured":"Weigt M, White RA, Szurmant H, Hoch JA, Hwa T. Identification of direct residue contacts in protein-protein interaction by message passing. Proc Natl Acad Sci USA. 
2009; 106(1):67\u201372.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"49","key":"1400_CR15","doi-asserted-by":"publisher","first-page":"E1293\u2014301","DOI":"10.1073\/pnas.1111471108","volume":"108","author":"F Morcos","year":"2011","unstructured":"Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, et al.Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci USA. 2011; 108(49):E1293\u2014301.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"2","key":"1400_CR16","doi-asserted-by":"publisher","first-page":"184","DOI":"10.1093\/bioinformatics\/btr638","volume":"28","author":"DT Jones","year":"2012","unstructured":"Jones DT, Buchan DWA, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics (Oxford England). 2012; 28(2):184\u201390.","journal-title":"Bioinformatics (Oxford England)"},{"issue":"1","key":"1400_CR17","doi-asserted-by":"publisher","first-page":"012707","DOI":"10.1103\/PhysRevE.87.012707","volume":"87","author":"M Ekeberg","year":"2013","unstructured":"Ekeberg M, Lo\u0307vkvist C, Lan Y, Weigt M, Aurell E. Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models. Phys Rev E Stat Nonlinear Soft Matter Phys. 2013; 87(1):012707.","journal-title":"Phys Rev E Stat Nonlinear Soft Matter Phys"},{"issue":"39","key":"1400_CR18","doi-asserted-by":"publisher","first-page":"15674","DOI":"10.1073\/pnas.1314045110","volume":"110","author":"H Kamisetty","year":"2013","unstructured":"Kamisetty H, Ovchinnikov S, Baker D. Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era. Proc Natl Acad Sci USA. 
2013; 110(39):15674\u20139.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"23","key":"1400_CR19","doi-asserted-by":"publisher","first-page":"3066","DOI":"10.1093\/bioinformatics\/bts598","volume":"28","author":"J Eickholt","year":"2012","unstructured":"Eickholt J, Cheng J. Predicting protein residue-residue contacts using deep networks and boosting. Bioinformatics. 2012; 28(23):3066\u201372.","journal-title":"Bioinformatics"},{"issue":"14","key":"1400_CR20","doi-asserted-by":"publisher","first-page":"1815","DOI":"10.1093\/bioinformatics\/btt259","volume":"29","author":"MJ Skwark","year":"2013","unstructured":"Skwark MJ, Abdel-Rehim A, Elofsson A. PconsC: combination of direct information methods and alignments improves contact prediction. Bioinformatics. 2013; 29(14):1815\u20136.","journal-title":"Bioinformatics"},{"issue":"21","key":"1400_CR21","doi-asserted-by":"publisher","first-page":"3506","DOI":"10.1093\/bioinformatics\/btv472","volume":"31","author":"J Ma","year":"2015","unstructured":"Ma J, Wang S, Wang Z, Xu J. Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning. Bioinformatics. 2015; 31(21):3506\u201313.","journal-title":"Bioinformatics"},{"issue":"7","key":"1400_CR22","doi-asserted-by":"publisher","first-page":"999","DOI":"10.1093\/bioinformatics\/btu791","volume":"31","author":"DT Jones","year":"2015","unstructured":"Jones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics. 2015; 31(7):999\u20131006.","journal-title":"Bioinformatics"},{"issue":"3","key":"1400_CR23","doi-asserted-by":"publisher","first-page":"1436","DOI":"10.1214\/009053606000000281","volume":"34","author":"N Meinshausen","year":"2006","unstructured":"Meinshausen N, B\u00fchlmann P. High-dimensional graphs and variable selection with the Lasso. Ann Stat. 
2006; 34(3):1436\u201362.","journal-title":"Ann Stat"},{"issue":"3","key":"1400_CR24","doi-asserted-by":"publisher","first-page":"586","DOI":"10.1214\/aos\/1176345010","volume":"8","author":"LR Haff","year":"1980","unstructured":"Haff LR. Empirical Bayes Estimation of the Multivariate Normal Covariance Matrix. Ann Stat. 1980; 8(3):586\u201397.","journal-title":"Ann Stat"},{"issue":"4","key":"1400_CR25","doi-asserted-by":"publisher","first-page":"611","DOI":"10.1002\/prot.10180","volume":"48","author":"I Kass","year":"2002","unstructured":"Kass I, Horovitz A. Mapping pathways of allosteric communication in GroEL by analysis of correlated mutations. Proteins Struct Funct Genet. 2002; 48(4):611\u20137.","journal-title":"Proteins Struct Funct Genet"},{"issue":"18","key":"1400_CR26","doi-asserted-by":"publisher","first-page":"2681","DOI":"10.1093\/bioinformatics\/btu336","volume":"30","author":"A Bakan","year":"2014","unstructured":"Bakan A, Dutta A, Mao W, Liu Y, Chennubhotla C, Lezon TR, et al.Evol and ProDy for bridging protein sequence evolution and structural dynamics. Bioinformatics. 2014; 30(18):2681\u20133.","journal-title":"Bioinformatics"},{"issue":"1","key":"1400_CR27","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1186\/1471-2105-15-85","volume":"15","author":"L Kaj\u00e1n","year":"2014","unstructured":"Kaj\u00e1n L, Hopf TA, Kalas\u0306 M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinforma. 2014; 15(1):85.","journal-title":"BMC Bioinforma"},{"issue":"3","key":"1400_CR28","doi-asserted-by":"publisher","first-page":"432","DOI":"10.1093\/biostatistics\/kxm045","volume":"9","author":"J Friedman","year":"2008","unstructured":"Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 
2008; 9(3):432\u201341.","journal-title":"Biostatistics"},{"key":"1400_CR29","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198522195.001.0001","volume-title":"Graphical Models","author":"SL Lauritzen","year":"1996","unstructured":"Lauritzen SL. Graphical Models, 1st ed. Oxford: Oxford University Press; 1996."},{"issue":"2","key":"1400_CR30","doi-asserted-by":"publisher","first-page":"295","DOI":"10.1214\/aos\/1009210544","volume":"29","author":"IM Johnstone","year":"2001","unstructured":"Johnstone IM. On the Distribution of the Largest Eigenvalue in Principal Components Analysis. Ann Stat. 2001; 29(2):295\u2013327.","journal-title":"Ann Stat"},{"key":"1400_CR31","volume-title":"Proc. Fourth Berkeley Symp. Math. Statist. Prob","author":"W James","year":"1961","unstructured":"James W, Stein C. Estimation with quadratic loss. In: Proc. Fourth Berkeley Symp. Math. Statist. Prob. Berkeley: University of California Press: 1961. p. 361\u2013379."},{"issue":"5","key":"1400_CR32","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1016\/S0927-5398(03)00007-0","volume":"10","author":"O Ledoit","year":"2003","unstructured":"Ledoit O, Wolf M. Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J Empir Financ. 2003; 10(5):603\u201321.","journal-title":"J Empir Financ"},{"key":"1400_CR33","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","volume":"292","author":"DT Jones","year":"1999","unstructured":"Jones DT. Protein secondary structure prediction based on position-specific matrices. J Mol Biol. 1999; 292:195\u2013202.","journal-title":"J Mol Biol"},{"key":"1400_CR34","unstructured":"R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna; 2014. 
http:\/\/www.R-project.org\/."},{"issue":"21","key":"1400_CR35","doi-asserted-by":"publisher","first-page":"2695","DOI":"10.1093\/bioinformatics\/btl461","volume":"22","author":"BJ Grant","year":"2006","unstructured":"Grant BJ, Rodrigues APC, ElSawy KM, McCammon JA, Caves LSD. Bio3d: an R package for the comparative analysis of protein structures. Bioinformatics. 2006; 22(21):2695\u20136.","journal-title":"Bioinformatics"},{"issue":"1","key":"1400_CR36","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1186\/1756-0381-4-16","volume":"4","author":"D Heider","year":"2011","unstructured":"Heider D, Hoffmann D. Interpol: An R package for preprocessing of protein sequences. BioData Min. 2011; 4(1):16.","journal-title":"BioData Min"},{"key":"1400_CR37","doi-asserted-by":"publisher","first-page":"1097","DOI":"10.1002\/prot.24862","volume":"84","author":"H Park","year":"2016","unstructured":"Park H, DiMaio F, Baker D. CASP11 refinement experiments with ROSETTA. Proteins. 2016; 84:1097\u20130134.","journal-title":"Proteins"},{"issue":"S8","key":"1400_CR38","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1002\/prot.21637","volume":"69","author":"JMG Izarzugaza","year":"2007","unstructured":"Izarzugaza JMG, Gran\u0307a O, Tress ML, Valencia A, Clarke ND. Assessment of intramolecular contact predictions for CASP7. Proteins Struct Funct Bioinforma. 2007; 69(S8):152\u20138.","journal-title":"Proteins Struct Funct Bioinforma"},{"issue":"2","key":"1400_CR39","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta Protein Struct. 
1975; 405(2):442\u201351.","journal-title":"Biochim Biophys Acta Protein Struct"},{"issue":"D1","key":"1400_CR40","first-page":"D222\u2014D230","volume":"42","author":"RD Finn","year":"2014","unstructured":"Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res. 2014; 42(D1):D222\u2014D230.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"1400_CR41","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","volume":"28","author":"HM Berman","year":"2000","unstructured":"Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Res. 2000; 28(1):235\u201342.","journal-title":"Nucleic Acids Res"},{"issue":"17","key":"1400_CR42","doi-asserted-by":"publisher","first-page":"i482\u2014i488","DOI":"10.1093\/bioinformatics\/btu458","volume":"30","author":"M Michel","year":"2014","unstructured":"Michel M, Hayat S, Skwark MJ, Sander C, Marks DS, Elofsson A. PconsFold: improved contact predictions improve protein models. Bioinformatics. 
2014; 30(17):i482\u2014i488.","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-1400-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-016-1400-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-1400-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T18:11:41Z","timestamp":1706811101000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-016-1400-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,12,15]]},"references-count":42,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2016,12]]}},"alternative-id":["1400"],"URL":"https:\/\/doi.org\/10.1186\/s12859-016-1400-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2016,12,15]]},"assertion":[{"value":"17 June 2016","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 December 2016","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 December 2016","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"533"}}