{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,5,14]],"date-time":"2023-05-14T04:40:39Z","timestamp":1684039239010},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"22","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Availability of large volumes of genomic and enzymatic data for taxonomically and phenotypically diverse organisms allows for exploration of the adaptive mechanisms that led to diversification of enzymatic functions. We present Chisel, a computational framework and a pipeline for an automated, high-resolution analysis of evolutionary variations of enzymes. Chisel allows automatic as well as interactive identification, and characterization of enzymatic sequences. Such knowledge can be utilized for comparative genomics, microbial diagnostics, metabolic engineering, drug design and analysis of metagenomes.<\/jats:p><jats:p>Results: Chisel is a comprehensive resource that contains 8575 clusters and subsequent computational models specific for 939 distinct enzymatic functions and, when data is sufficient, their taxonomic variations. Application of Chisel to identification of enzymatic sequences in newly sequenced genomes, analysis of organism-specific metabolic networks, \u2018binning\u2019 of metagenomes and other biological problems are presented. We also provide a thorough analysis of Chisel performance with other similar resources and manual annotations on Shewanella oneidensis MR1 genome.<\/jats:p><jats:p>Availability: Chisel is available for interactive use at http:\/\/compbio.mcs.anl.gov\/CHISEL. The website also provides a user manual, clusters and function-specific computational models.<\/jats:p><jats:p>Contact: \u00a0arodri7@mcs.anl.gov or maltsev@mcs.anl.gov<\/jats:p><jats:p>Supplementary information: Additional data can be found at http:\/\/compbio.mcs.anl.gov\/CHISEL\/htmls\/refs.html<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm421","type":"journal-article","created":{"date-parts":[[2007,9,14]],"date-time":"2007-09-14T00:23:31Z","timestamp":1189729411000},"page":"2961-2968","source":"Crossref","is-referenced-by-count":1,"title":["Evolutionary analysis of enzymes using Chisel"],"prefix":"10.1093","volume":"23","author":[{"given":"Alexis A.","family":"Rodriguez","sequence":"first","affiliation":[{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"},{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tanuja","family":"Bompada","sequence":"additional","affiliation":[{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mustafa","family":"Syed","sequence":"additional","affiliation":[{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Parantu K.","family":"Shah","sequence":"additional","affiliation":[{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Natalia","family":"Maltsev","sequence":"additional","affiliation":[{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"},{"name":"1 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439, 2Computation Institute, The University of Chicago, 5640 S. Ellis Avenue, RI 405, Chicago, IL 60637 and 3Department of Human Genetics, The University of Chicago, 920 E. 58th Street, Chicago, IL 60637, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2007,9,13]]},"reference":[{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D115","DOI":"10.1093\/nar\/gkh131","article-title":"UniProt: The Universal Protein knowledgebase","volume":"32","author":"Apweiler","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1093\/nar\/gkg030","article-title":"PRINTS and its automatic supplement, prePRINTS","volume":"31","author":"Attwood","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1093\/nar\/28.1.304","article-title":"The ENZYME database in 2000","volume":"28","author":"Bairoch","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1038\/ng0498-313","article-title":"Predicting functions from protein sequences \u2013 where are the bottlenecks?","volume":"18","author":"Bork","year":"1998","journal-title":"Nat. Genet."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/1471-2148-4-33","article-title":"Reconstruction of ancestral protein sequences and its applications","volume":"4","author":"Cai","year":"2004","journal-title":"BMC Evol. Biol."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"6633","DOI":"10.1093\/nar\/gkg847","article-title":"Enzyme-specific profiles for genome annotation: PRIAM","volume":"31","author":"Claudel-Renard","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1016\/S0959-440X(96)80056-X","article-title":"Hidden Markov models","volume":"6","author":"Eddy","year":"1996","journal-title":"Curr. Opin. Struct. Biol."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile Hidden Markov Models","volume":"14","author":"Eddy","year":"1998","journal-title":"Bioinformatics"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1086\/284325","article-title":"Phylogenies and the comparative method","volume":"125","author":"Felsenstein","year":"1985","journal-title":"Am. Nat."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1023\/A:1003705601428","article-title":"Functional genomics and enzyme evolution. Homologous and analogous enzymes encoded in microbial genomes","volume":"106","author":"Galperin","year":"1999","journal-title":"Genetica"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"1446","DOI":"10.1093\/bioinformatics\/btg175","article-title":"POAVIZ: a partial order multiple sequence alignment visualizer","volume":"19","author":"Grasso","year":"2003","journal-title":"Bioinformatics"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"4355","DOI":"10.1073\/pnas.84.13.4355","article-title":"Profile analysis: detection of distantly related proteins","volume":"84","author":"Gribskov","year":"1987","journal-title":"Proc. Natl Acad. Sci.USA"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/S0076-6879(96)66008-X","article-title":"Blocks database and its applications","volume":"26","author":"Henikoff","year":"1996","journal-title":"Meth. Enzymol."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D227","DOI":"10.1093\/nar\/gkj063","article-title":"The PROSITE database","volume":"34","author":"Hulo","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D354","DOI":"10.1093\/nar\/gkj102","article-title":"From genomics to chemical genomics: new developments in KEGG","volume":"34","author":"Kanehisa","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1006\/jmbi.2000.4315","article-title":"Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes","volume":"305","author":"Krogh","year":"2001","journal-title":"J. Mol. Biol."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1093\/bioinformatics\/18.3.452","article-title":"Multiple sequence alignment using partial order graphs","volume":"18","author":"Lee","year":"2002","journal-title":"Bioinformatics"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D369","DOI":"10.1093\/nar\/gkj095","article-title":"PUMA2 \u2013 grid-based high-throughput analysis of genomes and metabolic pathways","volume":"34","author":"Maltsev","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Midori","year":"2000","journal-title":"Nat. Genet."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D201","DOI":"10.1093\/nar\/gki106","article-title":"InterPro, progress and status in 2005","volume":"33","author":"Mulder","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. Biol."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/S0968-0004(98)01336-X","article-title":"PSORT: a program for detecting the sorting signals of proteins and predicting their subcellular localization","volume":"24","author":"Nakai","year":"1999","journal-title":"Trends Biochem. Sci."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"e337","DOI":"10.1371\/journal.pone.0000337","article-title":"Probabilistic protein function prediction from heterogeneous genome-wide data","volume":"2","author":"Nariai","year":"2007","journal-title":"PLoS ONE"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1110\/ps.16802","article-title":"The CATH extended protein-family database: providing structural annotations for genome sequences","volume":"11","author":"Pearl","year":"2002","journal-title":"Protein Sci."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1101\/gr.10.3.379","article-title":"HOBACGEN: database system for comparative genomics in bacteria","volume":"10","author":"Perriere","year":"2000","journal-title":"Genome Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D501","DOI":"10.1093\/nar\/gki025","article-title":"NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins","volume":"33","author":"Pruitt","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","first-page":"406","article-title":"The neighbor-joining method: a new method for reconstructing phylogenetic trees","volume":"4","author":"Saitou","year":"1987","journal-title":"Mol. Biol. Evol."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1002\/1438-826X(200010)1:3\/4<109::AID-GNFD109>3.0.CO;2-O","article-title":"Enzyme data and metabolic information: BRENDA, a resource for research in biology, biochemistry, and medicine","volume":"3\u20134","author":"Schomburg","year":"2000","journal-title":"Gene Funct. Dis."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1093\/nar\/24.1.26","article-title":"The metabolic pathway collection from EMP: the enzymes and metabolic pathways database","volume":"24","author":"Selkov","year":"1996","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1093\/nar\/25.1.37","article-title":"The metabolic pathway collection: an update","volume":"25","author":"Selkov","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1007\/s10877-005-3463-y","article-title":"Gnare: automated system for high-throughput genome analysis with grid computation backend","volume":"19","author":"Sulakhe","year":"2005","journal-title":"J. Clin. Monit. Comput."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1126\/science.278.5338.631","article-title":"A genomic perspective on protein families","volume":"278","author":"Tatusov","year":"1997","journal-title":"Science"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","article-title":"CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice","volume":"22","author":"Thompson","year":"1994","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"6226","DOI":"10.1093\/nar\/gkh956","article-title":"EFICAz: a comprehensive approach for accurate genome-scale enzyme function inference","volume":"32","author":"Tian","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1038\/nature02340","article-title":"Community structure and metabolism through reconstruction of microbial genomes from the environment","volume":"428","author":"Tyson","year":"2004","journal-title":"Nature"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1126\/science.1093857","article-title":"Environmental genome shotgun sequencing of the Sargasso Sea","volume":"304","author":"Venter","year":"2004","journal-title":"Science"},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D173","DOI":"10.1093\/nar\/gkj158","article-title":"Database resources of the National Center for Biotechnology Information","volume":"34","author":"Wheeler","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D112","DOI":"10.1093\/nar\/gkh097","article-title":"PIRSF: family classification system at the Protein Information Resource","volume":"32","author":"Wu","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"D187","DOI":"10.1093\/nar\/gkj161","article-title":"The Universal Protein Resource (UniProt): an expanding universe of protein information","volume":"34","author":"Wu","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023041208260216600_","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1007\/BF01886884","article-title":"An analysis of protein folding type prediction by seed-propagated sampling and jackknife test","volume":"14","author":"Zhang","year":"1995","journal-title":"J. Protein Chem."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/22\/2961\/49857549\/bioinformatics_23_22_2961.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/22\/2961\/49857549\/bioinformatics_23_22_2961.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,14]],"date-time":"2023-05-14T04:11:56Z","timestamp":1684037516000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/22\/2961\/207468"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,9,13]]},"references-count":43,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2007,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm421","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,11,15]]},"published":{"date-parts":[[2007,9,13]]}}}