{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,4,1]],"date-time":"2024-04-01T18:11:54Z","timestamp":1711995114242},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2010,12,1]],"date-time":"2010-12-01T00:00:00Z","timestamp":1291161600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/2.0"},{"start":{"date-parts":[[2010,12,1]],"date-time":"2010-12-01T00:00:00Z","timestamp":1291161600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2010,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Models for the simulation of metabolic networks require the accurate prediction of enzyme function. Based on a genomic sequence, enzymatic functions of gene products are today mainly predicted by sequence database searching and operon analysis. Other methods can support these techniques: We have developed an automatic method \"BrEPS\" that creates highly specific sequence patterns for the functional annotation of enzymes.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>The enzymes in the UniprotKB are identified and their sequences compared against each other with BLAST. The enzymes are then clustered into a number of trees, where each tree node is associated with a set of EC-numbers. The enzyme sequences in the tree nodes are aligned with ClustalW. The conserved columns of the resulting multiple alignments are used to construct sequence patterns. In the last step, we verify the quality of the patterns by computing their specificity. Patterns with low specificity are omitted and recomputed further down in the tree. The final high-quality patterns can be used for functional annotation. We ran our protocol on a recent Swiss-Prot release and show statistics, as well as a comparison to PRIAM, a probabilistic method that is also specialized on the functional annotation of enzymes. We determine the amount of true positive annotations for five common microorganisms with data from BRENDA and AMENDA serving as standard of truth. BrEPS is almost on par with PRIAM, a fact which we discuss in the context of five manually investigated cases.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Our protocol computes highly specific sequence patterns that can be used to support the functional annotation of enzymes. The main advantages of our method are that it is automatic and unsupervised, and quite fast once the patterns are evaluated. The results show that BrEPS can be a valuable addition to the reconstruction of metabolic networks.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-11-589","type":"journal-article","created":{"date-parts":[[2010,12,1]],"date-time":"2010-12-01T19:16:42Z","timestamp":1291231002000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["BrEPS: a flexible and automatic protocol to compute enzyme-specific sequence profiles for functional annotation"],"prefix":"10.1186","volume":"11","author":[{"given":"C","family":"Bannert","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"A","family":"Welfle","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"C","family":"aus dem Spring","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"D","family":"Schomburg","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2010,12,1]]},"reference":[{"key":"4172_CR1","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215: 403\u2013410.","journal-title":"J Mol Biol"},{"key":"4172_CR2","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1016\/S0022-2836(02)00016-5","volume":"318","author":"B Rost","year":"2002","unstructured":"Rost B: Enzyme function less conserved than anticipated. J Mol Biol 2002, 318: 595\u2013608. 10.1016\/S0022-2836(02)00016-5","journal-title":"J Mol Biol"},{"key":"4172_CR3","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1002\/0471721204.ch2","volume-title":"Structural Bioinformatics","author":"ED Scheeff","year":"2003","unstructured":"Scheeff ED, Fink JL: Fundamentals of protein structure. In Structural Bioinformatics. Edited by: Bourne PE, Weissig H. Hoboken, Wiley-Liss Inc; 2003:15\u201339."},{"key":"4172_CR4","doi-asserted-by":"publisher","first-page":"435","DOI":"10.1007\/s00018-004-4416-1","volume":"62","author":"E Bornberg-Bauer","year":"2005","unstructured":"Bornberg-Bauer E, Beaussart F, Kummerfeld SK, Teichmann SA, Weiner J: The evolution of domain arangements in proteins and interaction networks. Cell Mol Life Sci 2005, 62: 435\u2013445. 10.1007\/s00018-004-4416-1","journal-title":"Cell Mol Life Sci"},{"issue":"1","key":"4172_CR5","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1093\/bib\/1.1.45","volume":"1","author":"TK Attwood","year":"2000","unstructured":"Attwood TK: The role of pattern databases in sequence analysis. Brief Bioinform 2000, 1(1):45\u201359. 10.1093\/bib\/1.1.45","journal-title":"Brief Bioinform"},{"key":"4172_CR6","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1016\/S0959-440X(00)00211-6","volume":"11","author":"EV Kriventseva","year":"2001","unstructured":"Kriventseva EV, Biswas M, Apweiler R: Clustering and analysis of protein families. Curr Opi Struct Biol 2001, 11: 334\u2013339. 10.1016\/S0959-440X(00)00211-6","journal-title":"Curr Opi Struct Biol"},{"issue":"1","key":"4172_CR7","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1016\/S1367-5931(02)00003-0","volume":"7","author":"J Liu","year":"2003","unstructured":"Liu J, Rost B: Domains, motifs and clusters in the protein universe. Curr Opin Chem Biol 2003, 7(1):5\u201311. 10.1016\/S1367-5931(02)00003-0","journal-title":"Curr Opin Chem Biol"},{"issue":"I","key":"4172_CR8","first-page":"1","volume":"3","author":"NJ Mulder","year":"2001","unstructured":"Mulder NJ, Apweiler R: Tools and resources for identifying protein families, domains and motifs. Genome Biol 2001, 3(I):1\u20138.","journal-title":"Genome Biol"},{"key":"4172_CR9","first-page":"279","volume":"5","author":"A Brazma","year":"1998","unstructured":"Brazma A, Jonassen I, Eidhammer I, Gilbert D: Approaches to the automatic discovery of patterns in biosequences. JCB 1998, 5: 279\u2013305.","journal-title":"JCB"},{"issue":"8","key":"4172_CR10","doi-asserted-by":"publisher","first-page":"1587","DOI":"10.1002\/pro.5560040817","volume":"4","author":"I Jonassen","year":"1995","unstructured":"Jonassen I, Collins JF, Higgins DG: Finding flexible patterns in unaligned protein sequences. Protein Sci 1995, 4(8):1587\u201395. 10.1002\/pro.5560040817","journal-title":"Protein Sci"},{"issue":"1","key":"4172_CR11","first-page":"21","volume":"10","author":"AH Liu","year":"2003","unstructured":"Liu AH, Califano A: CASTOR: Clustering algorithm for sequence taxonomical organization and relationships. JCB 2003, 10(1):21\u201345.","journal-title":"JCB"},{"key":"4172_CR12","doi-asserted-by":"crossref","unstructured":"Hulo N, Bairoch A, Bulliard V, Cerutti L, Cuche BA, de Castro E, Lachaize C, Langendijk-Genevaux PS, Sigrist CJ: The 20 years of PROSITE. Nucl Acids Res 2007, (36 Database):D245-D249. 10.1093\/nar\/gkm977","DOI":"10.1093\/nar\/gkm977"},{"issue":"22","key":"4172_CR13","doi-asserted-by":"publisher","first-page":"6633","DOI":"10.1093\/nar\/gkg847","volume":"31","author":"C Claudel-Renard","year":"2003","unstructured":"Claudel-Renard C, Chevalet C, Faraut T, Kahn D: Enzyme-specific profiles for genome annotation: PRIAM. Nucl Ac Res 2003, 31(22):6633\u20136639. 10.1093\/nar\/gkg847","journal-title":"Nucl Ac Res"},{"issue":"1","key":"4172_CR14","doi-asserted-by":"publisher","first-page":"304","DOI":"10.1093\/nar\/28.1.304","volume":"28","author":"A Bairoch","year":"2000","unstructured":"Bairoch A: The ENZYME database in 2000. Nucl Ac Res 2000, 28(1):304\u2013305. 10.1093\/nar\/28.1.304","journal-title":"Nucl Ac Res"},{"issue":"17","key":"4172_CR15","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"SF Altschul","year":"1997","unstructured":"Altschul SF, Madden TL, Sch\u00e4ffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Ac Res 1997, 25(17):3389\u20133402. 10.1093\/nar\/25.17.3389","journal-title":"Nucl Ac Res"},{"issue":"22","key":"4172_CR16","doi-asserted-by":"publisher","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","volume":"22","author":"JD Thompson","year":"1994","unstructured":"Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl Ac Res 1994, 22(22):4673\u20134680. 10.1093\/nar\/22.22.4673","journal-title":"Nucl Ac Res"},{"key":"4172_CR17","doi-asserted-by":"crossref","unstructured":"Chang A, Scheer M, Grote A, Schomburg I, Schomburg D: BRENDA, AMENDA and FRENDA the enzyme information system: new content and tools in 2009. Nucl Ac Res 2009, (37 Database):D588\u201392. 10.1093\/nar\/gkn820","DOI":"10.1093\/nar\/gkn820"},{"key":"4172_CR18","doi-asserted-by":"publisher","first-page":"149","DOI":"10.1016\/0097-8485(93)85006-X","volume":"17","author":"JC Wootton","year":"1993","unstructured":"Wootton JC, Federhen S: Statistics of local complexity in amino acid sequences and sequence databases. Comput Chem 1993, 17: 149\u2013163. 10.1016\/0097-8485(93)85006-X","journal-title":"Comput Chem"},{"issue":"S2","key":"4172_CR19","doi-asserted-by":"publisher","first-page":"S182","DOI":"10.1093\/bioinformatics\/18.suppl_2.S182","volume":"18","author":"P Pipenbacher","year":"2002","unstructured":"Pipenbacher P, Schliep A, Schneckener S, Sch\u00f6nhuth A, Schomburg D, Schrader R: ProClust: improved clustering of protein sequences with an extended graph-based approach. Bioinformatics 2002, 18(S2):S182-S191.","journal-title":"Bioinformatics"},{"key":"4172_CR20","doi-asserted-by":"crossref","unstructured":"UniProt Consortium: The Universal Protein Resource (UniProt) 2009. Nucl Ac Res 2009, (37 Database):D169\u201374.","DOI":"10.1093\/nar\/gkn664"},{"key":"4172_CR21","unstructured":"Part of the BLAST [1] package, some online description at[http:\/\/www.ncbi.nlm.nih.gov\/staff\/tao\/URLAPI\/wwwblast\/node20.html]"},{"key":"4172_CR22","unstructured":"Bernard T: [PRIAM Team]: Personal communication."},{"key":"4172_CR23","volume-title":"PHD thesis","author":"C aus dem Spring","year":"2006","unstructured":"aus dem Spring C: Identifizierung \u00e4hnlicher Reaktionsmechanismen in homologen Enzymen unterschiedlicher Funktion unter Verwendung konservierter Sequenzdom\u00e4nen (in German). In PHD thesis. K\u00f6ln, Germany; 2006."},{"key":"4172_CR24","volume-title":"PHD thesis","author":"A Welfle","year":"2008","unstructured":"Welfle A: Erweiterte Identifizierung, automatische Generierung und Analyse von konservierten Sequenzmustern und vergleichende Analyse enzymatischer Reaktionen unter Verwendung von homologen Enzymdom\u00e4nen (in German). In PHD thesis. K\u00f6ln, Germany; 2008."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-11-589.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/1471-2105-11-589\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-11-589.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,1]],"date-time":"2024-04-01T17:42:48Z","timestamp":1711993368000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-11-589"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,12]]},"references-count":24,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,12]]}},"alternative-id":["4172"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-11-589","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,12]]},"assertion":[{"value":"25 March 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 December 2010","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 December 2010","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"589"}}