{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T04:12:36Z","timestamp":1768882356896,"version":"3.49.0"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The success of genome sequencing has resulted in many protein sequences without functional annotation. We present ConFunc, an automated Gene Ontology (GO)-based protein function prediction approach, which uses conserved residues to generate sequence profiles to infer function. ConFunc splits sets of sequences identified by PSI-BLAST into sub-alignments according to their GO annotations. Conserved residues are identified for each GO term sub-alignment for which a position specific scoring matrix is generated. This combination of steps produces a set of feature (GO annotation) derived profiles from which protein function is predicted.<\/jats:p>\n               <jats:p>Results: We assess the ability of ConFunc, BLAST and PSI-BLAST to predict protein function in the twilight zone of sequence similarity. ConFunc significantly outperforms BLAST &amp; PSI-BLAST, obtaining levels of recall and precision that are not obtained by either method and maximum precision 24% greater than BLAST. Further, for a large test set of sequences with homologues of low sequence identity, at high levels of precision, ConFunc obtains recall six times greater than BLAST. 
These results demonstrate the potential for ConFunc to form part of an automated genomics annotation pipeline.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/www.sbg.bio.ic.ac.uk\/confunc<\/jats:p>\n               <jats:p>Contact: \u00a0m.sternberg@imperial.ac.uk<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn037","type":"journal-article","created":{"date-parts":[[2008,2,9]],"date-time":"2008-02-09T01:53:52Z","timestamp":1202522032000},"page":"798-806","source":"Crossref","is-referenced-by-count":96,"title":["ConFunc\u2014functional annotation in the twilight zone"],"prefix":"10.1093","volume":"24","author":[{"given":"Mark N.","family":"Wass","sequence":"first","affiliation":[{"name":"Structural Bioinformatics Group, Biochemistry Building, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK"}]},{"given":"Michael J. E.","family":"Sternberg","sequence":"additional","affiliation":[{"name":"Structural Bioinformatics Group, Biochemistry Building, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK"}]}],"member":"286","published-online":{"date-parts":[[2008,2,8]]},"reference":[{"key":"2023020209512544100_B1","doi-asserted-by":"crossref","first-page":"D197","DOI":"10.1093\/nar\/gki067","article-title":"FunShift: a database of function shift analysis on protein subfamilies","volume":"33","author":"Abhiman","year":"2005","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B2","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1006\/jmbi.2001.4870","article-title":"Automated structure-based prediction of functional sites in proteins: applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking","volume":"311","author":"Aloy","year":"2001","journal-title":"J. Mol. 
Biol"},{"key":"2023020209512544100_B3","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol"},{"key":"2023020209512544100_B4","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B5","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The gene ontology consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet"},{"key":"2023020209512544100_B6","doi-asserted-by":"crossref","first-page":"1322","DOI":"10.1093\/bioinformatics\/bth070","article-title":"ConSeq: the identification of functionally and structurally important residues in protein sequences","volume":"20","author":"Berezin","year":"2004","journal-title":"Bioinformatics"},{"key":"2023020209512544100_B7","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1016\/S0168-9525(99)01706-0","article-title":"Errors in genome annotation","volume":"15","author":"Brenner","year":"1999","journal-title":"Trends Genet"},{"key":"2023020209512544100_B8","first-page":"5","article-title":"The gene ontology annotation (GOA) database\u2014an integrated resource of GO annotations to the UniProt Knowledgebase","volume":"4","author":"Camon","year":"2004","journal-title":"Int. 
Silico Biol"},{"key":"2023020209512544100_B9","doi-asserted-by":"crossref","DOI":"10.1145\/1143844.1143874","article-title":"The relationship between Precision\u2013Recall and ROC Curves","author":"Davis","year":"2006"},{"key":"2023020209512544100_B10","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1002\/1097-0134(20001001)41:1<98::AID-PROT120>3.0.CO;2-S","article-title":"Practical limits of function prediction","volume":"41","author":"Devos","year":"2000","journal-title":"Proteins"},{"key":"2023020209512544100_B11","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1016\/S0168-9525(01)02348-4","article-title":"Intrinsic errors in genome annotation","volume":"17","author":"Devos","year":"2001","journal-title":"Trends Genet"},{"key":"2023020209512544100_B12","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B13","doi-asserted-by":"crossref","first-page":"e45","DOI":"10.1371\/journal.pcbi.0010045","article-title":"Protein molecular function prediction by bayesian phylogenomics","volume":"1","author":"Engelhardt","year":"2005","journal-title":"PLoS Comput. Biol"},{"key":"2023020209512544100_B14","doi-asserted-by":"crossref","first-page":"D247","DOI":"10.1093\/nar\/gkj149","article-title":"Pfam: clans, web tools and services","volume":"34","author":"Finn","year":"2006","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B15","doi-asserted-by":"crossref","first-page":"12299","DOI":"10.1073\/pnas.0504833102","article-title":"Effective function annotation through catalytic residue conservation","volume":"102","author":"George","year":"2005","journal-title":"Proc Natl Acad. Sci. 
USA"},{"key":"2023020209512544100_B16","doi-asserted-by":"crossref","first-page":"W313","DOI":"10.1093\/nar\/gkh406","article-title":"GOblet: a platform for gene ontology annotation of anonymous sequence data","volume":"32","author":"Groth","year":"2004","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B17","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1006\/jmbi.2000.4036","article-title":"Analysis and prediction of functional sub-types from protein sequence alignments","volume":"303","author":"Hannenhalli","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023020209512544100_B18","doi-asserted-by":"crossref","first-page":"1550","DOI":"10.1110\/ps.062153506","article-title":"Enhanced automated function prediction using distantly related sequences and contextual association by PFP","volume":"15","author":"Hawkins","year":"2006","journal-title":"Protein Sci"},{"key":"2023020209512544100_B19","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1006\/jmbi.1999.2661","article-title":"The relationship between protein structure and function: a comprehensive survey with application to the yeast genome","volume":"288","author":"Hegyi","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023020209512544100_B20","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023020209512544100_B21","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1186\/1471-2105-6-272","article-title":"Automated methods of predicting the function of biological sequences using GO and BLAST","volume":"6","author":"Jones","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023020209512544100_B22","doi-asserted-by":"crossref","first-page":"2484","DOI":"10.1093\/bioinformatics\/btg338","article-title":"GoFigure: automated Gene Ontology annotation","volume":"19","author":"Khan","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020209512544100_B23","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1093\/bioinformatics\/18.1.77","article-title":"Tolerating some redundancy significantly speeds up clustering of large protein databases","volume":"18","author":"Li","year":"2002","journal-title":"Bioinformatics"},{"key":"2023020209512544100_B24","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1006\/jmbi.1996.0167","article-title":"An evolutionary trace method defines binding surfaces common to protein families","volume":"257","author":"Lichtarge","year":"1996","journal-title":"J. Mol. 
Biol"},{"key":"2023020209512544100_B25","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1093\/bioinformatics\/btg153","article-title":"Investigating semantic similarity measures across the gene ontology: the relationship between sequence and annotation","volume":"19","author":"Lord","year":"2003","journal-title":"Bioinformatics"},{"key":"2023020209512544100_B26","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1186\/1471-2105-5-178","article-title":"GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes","volume":"5","author":"Martin","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023020209512544100_B27","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1007\/BF02295996","article-title":"Note on the sampling error of the difference between correlated proportions or percentages","volume":"12","author":"McNemar","year":"1947","journal-title":"Psychometrika"},{"key":"2023020209512544100_B28","doi-asserted-by":"crossref","first-page":"D201","DOI":"10.1093\/nar\/gki106","article-title":"InterPro, progress and status in 2005","volume":"33","author":"Mulder","year":"2005","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B29","doi-asserted-by":"crossref","first-page":"14754","DOI":"10.1073\/pnas.0404569101","article-title":"Automated prediction of protein function and detection of functional sites from structure","volume":"101","author":"Pazos","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020209512544100_B30","doi-asserted-by":"crossref","first-page":"D129","DOI":"10.1093\/nar\/gkh028","article-title":"The catalytic site atlas: a resource of catalytic sites and residues identified in enzymes using structural data","volume":"32","author":"Porter","year":"2004","journal-title":"Nucl. 
Acids Res"},{"key":"2023020209512544100_B31","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein Eng"},{"key":"2023020209512544100_B32","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1016\/S0022-2836(02)00016-5","article-title":"Enzyme function less conserved than anticipated","volume":"318","author":"Rost","year":"2002","journal-title":"J. Mol. Biol"},{"key":"2023020209512544100_B33","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1186\/1471-2105-7-302","article-title":"A new measure for functional similarity of gene products based on Gene Ontology","volume":"7","author":"Schlicker","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020209512544100_B34","doi-asserted-by":"crossref","first-page":"863","DOI":"10.1016\/j.jmb.2003.08.057","article-title":"How well is enzyme function conserved as a function of pairwise sequence identity?","volume":"333","author":"Tian","year":"2003","journal-title":"J. Mol. Biol"},{"key":"2023020209512544100_B35","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1006\/jmbi.2001.4513","article-title":"Evolution of function in protein superfamilies, from a structural perspective","volume":"307","author":"Todd","year":"2001","journal-title":"J. Mol. 
Biol"},{"key":"2023020209512544100_B36","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1002\/prot.10146","article-title":"Scoring residue conservation","volume":"48","author":"Valdar","year":"2002","journal-title":"Proteins"},{"key":"2023020209512544100_B37","doi-asserted-by":"crossref","first-page":"1544","DOI":"10.1110\/ps.062184006","article-title":"A categorization approach to automated ontological function annotation","volume":"15","author":"Verspoor","year":"2006","journal-title":"Protein Sci"},{"key":"2023020209512544100_B38","first-page":"115","article-title":"A fast and sensitive multiple sequence alignment algorithm","volume":"5","author":"Vingron","year":"1989","journal-title":"Comput. Appl. Biosci"},{"key":"2023020209512544100_B39","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1006\/jmbi.2000.3550","article-title":"Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores","volume":"297","author":"Wilson","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023020209512544100_B40","doi-asserted-by":"crossref","first-page":"D187","DOI":"10.1093\/nar\/gkj161","article-title":"The universal protein resource (UniProt): an expanding universe of protein information","volume":"34","author":"Wu","year":"2006","journal-title":"Nucl. Acids Res"},{"key":"2023020209512544100_B41","doi-asserted-by":"crossref","first-page":"3799","DOI":"10.1093\/nar\/gkg555","article-title":"OntoBlast function: from sequence similarities directly to potential functional annotations by ontology terms","volume":"31","author":"Zehetner","year":"2003","journal-title":"Nucl. 
Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/6\/798\/49046397\/bioinformatics_24_6_798.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/6\/798\/49046397\/bioinformatics_24_6_798.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T10:46:38Z","timestamp":1675334798000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/6\/798\/194173"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,2,8]]},"references-count":41,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2008,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn037","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,3,15]]},"published":{"date-parts":[[2008,2,8]]}}}