{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T08:45:08Z","timestamp":1773391508788,"version":"3.50.1"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3196,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The prediction of ligand-binding residues or catalytically active residues of a protein may give important hints that can guide further genetic or biochemical studies. Existing sequence-based prediction methods mostly rank residue positions by evolutionary conservation calculated from a multiple sequence alignment of homologs. A problem hampering more wide-spread application of these methods is the low per-residue precision, which at 20% sensitivity is around 35% for ligand-binding residues and 20% for catalytic residues.<\/jats:p><jats:p>Results: We combine information from the conservation at each site, its amino acid distribution, as well as its predicted secondary structure (ss) and relative solvent accessibility (rsa). First, we measure conservation by how much the amino acid distribution at each site differs from the distribution expected for the predicted ss and rsa states. Second, we include the conservation of neighboring residues in a weighted linear score by analytically optimizing the signal-to-noise ratio of the total score. 
Third, we use conditional probability density estimation to calculate the probability of each site to be functional given its conservation, the observed amino acid distribution, and the predicted ss and rsa states.<\/jats:p><jats:p>We have constructed two large data sets, one based on the Catalytic Site Atlas and the other on PDB SITE records, to benchmark methods for predicting functional residues. The new method FRcons predicts ligand-binding and catalytic residues with higher precision than alternative methods over the entire sensitivity range, reaching 50% and 40% precision at 20% sensitivity, respectively.<\/jats:p><jats:p>Availability: Server: http:\/\/frpred.tuebingen.mpg.de. Data sets: ftp:\/\/ftp.tuebingen.mpg.de\/pub\/protevo\/FRpred\/<\/jats:p><jats:p>Contact: \u00a0soeding@lmb.uni-muenchen.de<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics Online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm626","type":"journal-article","created":{"date-parts":[[2008,1,4]],"date-time":"2008-01-04T01:13:32Z","timestamp":1199409212000},"page":"613-620","source":"Crossref","is-referenced-by-count":110,"title":["Prediction of protein functional residues from sequence by probability density estimation"],"prefix":"10.1093","volume":"24","author":[{"given":"J. D.","family":"Fischer","sequence":"first","affiliation":[{"name":"Department for Protein Evolution, Max Planck Institute for Developmental Biology, Spemannstr. 35, 72076 T\u00fcbingen, Germany"}]},{"given":"C. E.","family":"Mayer","sequence":"additional","affiliation":[{"name":"Department for Protein Evolution, Max Planck Institute for Developmental Biology, Spemannstr. 35, 72076 T\u00fcbingen, Germany"}]},{"given":"J.","family":"S\u00f6ding","sequence":"additional","affiliation":[{"name":"Department for Protein Evolution, Max Planck Institute for Developmental Biology, Spemannstr. 
35, 72076 T\u00fcbingen, Germany"}]}],"member":"286","published-online":{"date-parts":[[2008,1,2]]},"reference":[{"key":"2023020210110098800_B1","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1002\/prot.20176","article-title":"Accurate prediction of solvent accessibility using neural networks-based regression","volume":"56","author":"Adamczak","year":"2004","journal-title":"Proteins"},{"key":"2023020210110098800_B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020210110098800_B3","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/bioinformatics\/btm270","article-title":"Predicting functionally important residues from sequence conservation","volume":"23","author":"Capra","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020210110098800_B4","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1038\/nsb0295-171","article-title":"A method to predict functional residues in proteins","volume":"2","author":"Casari","year":"1995","journal-title":"Nat. Struct. Biol"},{"key":"2023020210110098800_B5","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1016\/j.jmb.2004.08.022","article-title":"Distinguishing structural and functional restraints in evolution in order to identify interaction sites","volume":"342","author":"Chelliah","year":"2004","journal-title":"J. Mol. Biol"},{"key":"2023020210110098800_B6","first-page":"233","article-title":"The relationship between Precision-Recall and ROC curves","author":"Davis","year":"2006"},{"key":"2023020210110098800_B7","doi-asserted-by":"crossref","first-page":"1289","DOI":"10.1016\/S0022-2836(02)01451-1","article-title":"Automatic methods for predicting functionally important residues","volume":"326","author":"del Sol Mesa","year":"2003","journal-title":"J. 
Mol. Biol"},{"key":"2023020210110098800_B8","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological sequence analysis: Probabilistic models of proteins and nucleic acids.","author":"Durbin","year":"1998"},{"key":"2023020210110098800_B9","doi-asserted-by":"crossref","first-page":"041905","DOI":"10.1103\/PhysRevE.65.041905","article-title":"Analysis of symbolic sequences using the Jensen-Shannon divergence","volume":"65","author":"Grosse","year":"2002","journal-title":"Phys. Rev. E"},{"key":"2023020210110098800_B10","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1016\/S0022-2836(03)00515-1","article-title":"Using a neural network and spatial clustering to predict the location of active sites in enzymes","volume":"330","author":"Gutteridge","year":"2003","journal-title":"J. Mol. Biol"},{"key":"2023020210110098800_B11","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1006\/jmbi.2000.4036","article-title":"Analysis and prediction of functional sub-types from protein sequence alignments","volume":"303","author":"Hannenhalli","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023020210110098800_B12","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1093\/nar\/gkj063","article-title":"The PROSITE database","volume":"34","author":"Hulo","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023020210110098800_B13","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023020210110098800_B14","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.cbpa.2003.11.001","article-title":"Searching for functional sites in protein structures","volume":"8","author":"Jones","year":"2004","journal-title":"Curr. Opin. Chem. 
Biol"},{"key":"2023020210110098800_B15","doi-asserted-by":"crossref","first-page":"424","DOI":"10.1093\/nar\/gkh391","article-title":"SDPpred: a tool for prediction of amino acid residues that determine differences in functional specificity of homologous proteins","volume":"32","author":"Kalinina","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023020210110098800_B16","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1093\/nar\/gkm297","article-title":"Firestar\u2013prediction of functionally important residues using structural templates and alignment reliability","volume":"35","author":"L\u00f3pez","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023020210110098800_B17","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1006\/jmbi.2001.5327","article-title":"Structural clusters of evolutionary trace residues are statistically significant and common in proteins","volume":"316","author":"Madabushi","year":"2002","journal-title":"J. Mol. Biol"},{"key":"2023020210110098800_B18","doi-asserted-by":"crossref","first-page":"2466","DOI":"10.1093\/bioinformatics\/btl411","article-title":"Bayesian search of functionally divergent protein subgroups and their function specific residues","volume":"22","author":"Marttinen","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020210110098800_B19","doi-asserted-by":"crossref","first-page":"1781","DOI":"10.1093\/molbev\/msh194","article-title":"Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior","volume":"21","author":"Mayrose","year":"2004","journal-title":"Mol. Biol. Evol"},{"key":"2023020210110098800_B20","doi-asserted-by":"crossref","first-page":"1265","DOI":"10.1016\/j.jmb.2003.12.078","article-title":"A family of evolution-entropy hybrid methods for ranking protein residues by importance","volume":"336","author":"Mihalek","year":"2004","journal-title":"J. Mol. 
Biol"},{"key":"2023020210110098800_B21","doi-asserted-by":"crossref","first-page":"e129","DOI":"10.1371\/journal.pcbi.0030129","article-title":"A primer on learning in Bayesian networks for computational biology","volume":"3","author":"Needham","year":"2007","journal-title":"PLoS Comput. Biol"},{"key":"2023020210110098800_B22","doi-asserted-by":"crossref","first-page":"164","DOI":"10.1093\/bioinformatics\/bti766","article-title":"Prediction of functional specificity determinants from protein sequences using log-likelihood ratios","volume":"22","author":"Pei","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020210110098800_B23","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1093\/bioinformatics\/17.8.700","article-title":"AL2CO: calculation of positional conservation in a protein sequence alignment","volume":"17","author":"Pei","year":"2001","journal-title":"Bioinformatics"},{"key":"2023020210110098800_B24","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1186\/1471-2105-7-312","article-title":"Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties","volume":"7","author":"Petrova","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020210110098800_B25","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1093\/nar\/gkh028","article-title":"The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data","volume":"32","author":"Porter","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023020210110098800_B26","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1093\/bioinformatics\/18.suppl_1.S71","article-title":"Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their 
homologues","volume":"18","author":"Pupko","year":"2002","journal-title":"Bioinformatics"},{"key":"2023020210110098800_B27","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198568315.001.0001","volume-title":"Data Analysis. A Bayesian tutorial.","author":"Sivia","year":"2006"},{"key":"2023020210110098800_B28","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1002\/prot.10146","article-title":"Scoring residue conservation","volume":"48","author":"Valdar","year":"2002","journal-title":"Proteins"},{"key":"2023020210110098800_B29","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1002\/1097-0134(20010101)42:1<108::AID-PROT110>3.0.CO;2-O","article-title":"Protein\u2013protein interfaces: analysis of amino acid conservation in homodimers","volume":"42","author":"Valdar","year":"2001","journal-title":"Proteins"},{"key":"2023020210110098800_B30","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1186\/1471-2105-7-385","article-title":"Incorporating background frequency improves entropy-based residue conservation measures","volume":"7","author":"Wang","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020210110098800_B31","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1110\/ps.062523907","article-title":"Evaluation of features for catalytic residue prediction in novel folds","volume":"16","author":"Youn","year":"2007","journal-title":"Protein Sci"},{"key":"2023020210110098800_B32","article-title":"Estimating residue evolutionary conservation by introducing von Neumann entropy and a novel gap-treating approach","author":"Zhang","year":"2007","journal-title":"Amino 
Acids"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/5\/613\/49051655\/bioinformatics_24_5_613.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/5\/613\/49051655\/bioinformatics_24_5_613.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,25]],"date-time":"2025-01-25T09:02:20Z","timestamp":1737795740000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/5\/613\/200952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,1,2]]},"references-count":32,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2008,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm626","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,3,1]]},"published":{"date-parts":[[2008,1,2]]}}}