{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:19:16Z","timestamp":1761895156081},"reference-count":62,"publisher":"Oxford University Press (OUP)","issue":"21","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Computational techniques for microbial genomic sequence analysis are becoming increasingly important. With next-generation sequencing technology and the human microbiome project underway, current sequencing capacity is significantly greater than the speed at which organisms of interest can be studied experimentally. Most related computational work has been focused on sequence assembly, gene annotation and metabolic network reconstruction. We have developed a method that will primarily use available sequence data in order to determine prokaryotic transcription factor (TF) binding specificities.<\/jats:p>\n               <jats:p>Results: Specificity determining residues (critical residues) were identified from crystal structures of DNA\u2013protein complexes and TFs with the same critical residues were grouped into specificity classes. The putative binding regions for each class were defined as the set of promoters for each TF itself (autoregulatory) and the immediately upstream and downstream operons. MEME was used to find putative motifs within each separate class. Tests on the LacI and TetR TF families, using RegulonDB annotated sites, showed the sensitivity of prediction 86% and 80%, respectively.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/ural.wustl.edu\/\u223cgsahota\/HTHmotif\/<\/jats:p>\n               <jats:p>Contact: \u00a0stormo@wustl.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq501","type":"journal-article","created":{"date-parts":[[2010,9,1]],"date-time":"2010-09-01T01:10:24Z","timestamp":1283303424000},"page":"2672-2677","source":"Crossref","is-referenced-by-count":22,"title":["Novel sequence-based method for identifying transcription factor binding sites in prokaryotic genomes"],"prefix":"10.1093","volume":"26","author":[{"given":"Gurmukh","family":"Sahota","sequence":"first","affiliation":[{"name":"Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63108, USA"}]},{"given":"Gary D.","family":"Stormo","sequence":"additional","affiliation":[{"name":"Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63108, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,8,31]]},"reference":[{"key":"2023012507535083300_B1","doi-asserted-by":"crossref","first-page":"1362","DOI":"10.1101\/gr.2242604","article-title":"Regulog analysis: detection of conserved regulatory networks across bacteria: application to Staphylococcus aureus","volume":"14","author":"Alkema","year":"2004","journal-title":"Genome Res."},{"key":"2023012507535083300_B2","first-page":"28","article-title":"Fitting a mixture model by expectation maximization to discover motifs in biopolymers","volume":"2","author":"Bailey","year":"1994","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. Biol."},{"key":"2023012507535083300_B3","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1101\/gr.1642804","article-title":"CONREAL: conserved regulatory elements anchored alignment algorithm for identification of transcription factor binding sites by phylogenetic footprinting","volume":"14","author":"Berezikov","year":"2004","journal-title":"Genome Res."},{"key":"2023012507535083300_B4","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1101\/gr.6902","article-title":"Discovery of regulatory elements by a computational method for phylogenetic footprinting","volume":"12","author":"Blanchette","year":"2002","journal-title":"Genome Res."},{"key":"2023012507535083300_B5","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1089\/10665270252935430","article-title":"Finding motifs using random projections","volume":"9","author":"Buhler","year":"2002","journal-title":"J. Comput. Biol."},{"key":"2023012507535083300_B6","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1016\/0022-2836(92)90723-W","article-title":"Expectation maximization algorithm for identifying protein-binding sites with variable lengths from unaligned DNA fragments","volume":"223","author":"Cardon","year":"1992","journal-title":"J. Mol. Biol."},{"key":"2023012507535083300_B7","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1126\/science.1084337","article-title":"Finding functional features in Saccharomyces genomes by phylogenetic footprinting","volume":"301","author":"Cliften","year":"2003","journal-title":"Science"},{"key":"2023012507535083300_B8","doi-asserted-by":"crossref","first-page":"e74","DOI":"10.1093\/bioinformatics\/btl215","article-title":"Comparative footprinting of DNA-binding proteins","volume":"22","author":"Contreras-Moreira","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012507535083300_B9","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1002\/prot.22525","article-title":"Comparison of DNA binding across protein superfamilies","volume":"78","author":"Contreras-Moreira","year":"2010","journal-title":"Proteins"},{"key":"2023012507535083300_B10","doi-asserted-by":"crossref","first-page":"W522","DOI":"10.1093\/nar\/gkm276","article-title":"PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations","volume":"35","author":"Dolinsky","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B11","doi-asserted-by":"crossref","first-page":"1445","DOI":"10.1093\/nar\/gki282","article-title":"NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence","volume":"33","author":"Down","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12033-008-9127-7","article-title":"Data deposition and annotation at the worldwide protein data bank","volume":"42","author":"Dutta","year":"2009","journal-title":"Mol. Biotechnol."},{"key":"2023012507535083300_B13","first-page":"205","article-title":"A new generation of homology search tools based on probabilistic inference","volume":"23","author":"Eddy","year":"2009","journal-title":"Genome Inform."},{"key":"2023012507535083300_B14","doi-asserted-by":"crossref","first-page":"D211","DOI":"10.1093\/nar\/gkp985","article-title":"The Pfam protein families database","volume":"38","author":"Finn","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B15","doi-asserted-by":"crossref","first-page":"D120","DOI":"10.1093\/nar\/gkm994","article-title":"RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation","volume":"36","author":"Gama-Castro","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B16","doi-asserted-by":"crossref","first-page":"695","DOI":"10.1093\/nar\/28.3.695","article-title":"Prediction of transcription regulatory sites in Archaea by a comparative genomic approach","volume":"28","author":"Gelfand","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B17","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1093\/bib\/1.4.357","article-title":"Comparative analysis of regulatory patterns in bacterial genomes","volume":"1","author":"Gelfand","year":"2000","journal-title":"Brief. Bioinformatics"},{"key":"2023012507535083300_B18","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1038\/ismej.2009.97","article-title":"Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data","volume":"4","author":"Hamady","year":"2009","journal-title":"ISME J."},{"key":"2023012507535083300_B19","doi-asserted-by":"crossref","first-page":"715","DOI":"10.1038\/353715a0","article-title":"A structural taxonomy of DNA-binding domains","volume":"353","author":"Harrison","year":"1991","journal-title":"Nature"},{"key":"2023012507535083300_B20","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1093\/bioinformatics\/15.7.563","article-title":"Identifying DNA and protein patterns with statistically significant alignments of multiple sequences","volume":"15","author":"Hertz","year":"1999","journal-title":"Bioinformatics"},{"key":"2023012507535083300_B21","first-page":"81","article-title":"Identification of consensus patterns in unaligned DNA sequences known to be functionally related","volume":"6","author":"Hertz","year":"1990","journal-title":"Comput. Appl. Biosci."},{"key":"2023012507535083300_B22","doi-asserted-by":"crossref","first-page":"3832","DOI":"10.1093\/bioinformatics\/bti628","article-title":"Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes","volume":"21","author":"Jensen","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012507535083300_B23","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1038\/nature01644","article-title":"Sequencing and comparison of yeast species to identify genes and regulatory elements","volume":"423","author":"Kellis","year":"2003","journal-title":"Nature"},{"key":"2023012507535083300_B24","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1002\/nav.20053","article-title":"The Hungarian method for the assignment problem","volume":"52","author":"Kuhn","year":"2005","journal-title":"Nav. Res. Logist."},{"key":"2023012507535083300_B25","doi-asserted-by":"crossref","first-page":"6778","DOI":"10.1093\/nar\/gkg891","article-title":"Solution structure and DNA binding of the effector domain from the global regulator PrrA (RegA) from Rhodobacter sphaeroides: insights into DNA binding specificity","volume":"31","author":"Laguri","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B26","doi-asserted-by":"crossref","first-page":"5376","DOI":"10.1093\/nar\/gkn515","article-title":"The cis-regulatory map of Shewanella genomes","volume":"36","author":"Liu","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B27","first-page":"127","article-title":"BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes","volume":"6","author":"Liu","year":"2001","journal-title":"Pac. Symp. Biocomput."},{"key":"2023012507535083300_B28","doi-asserted-by":"crossref","first-page":"3434","DOI":"10.1093\/nar\/gkl423","article-title":"Bacterial regulatory networks are extremely flexible in evolution","volume":"34","author":"Lozada-Ch\u00e1vez","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B29","doi-asserted-by":"crossref","first-page":"i297","DOI":"10.1093\/bioinformatics\/btm215","article-title":"Inferring protein DNA dependencies using motif alignments and mutual information","volume":"23","author":"Mahony","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012507535083300_B30","doi-asserted-by":"crossref","first-page":"3197","DOI":"10.1099\/mic.0.28167-0","article-title":"Combining microarray and genomic data to predict DNA binding motifs","volume":"151","author":"Mao","year":"2005","journal-title":"Microbiology"},{"key":"2023012507535083300_B31","doi-asserted-by":"crossref","first-page":"482","DOI":"10.1016\/j.mib.2003.09.002","article-title":"Identifying global regulators in transcriptional regulatory networks in bacteria","volume":"6","author":"Mart\u00ednez-Antonio","year":"2003","journal-title":"Curr. Opin. Microbiol."},{"key":"2023012507535083300_B32","doi-asserted-by":"crossref","first-page":"1523","DOI":"10.1101\/gr.323602","article-title":"Factors influencing the identification of transcription factor binding sites by cross-species comparison","volume":"12","author":"McCue","year":"2002","journal-title":"Genome Res."},{"key":"2023012507535083300_B33","doi-asserted-by":"crossref","first-page":"7068","DOI":"10.1073\/pnas.0701356104","article-title":"Connecting protein structure with predictions of regulatory sites","volume":"104","author":"Morozov","year":"2007","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507535083300_B34","first-page":"324","article-title":"Phylogenetic motif detection by expectation-maximization on evolutionary mixtures","volume":"9","author":"Moses","year":"2004","journal-title":"Pac. Symp. Biocomput."},{"key":"2023012507535083300_B35","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol."},{"key":"2023012507535083300_B36","doi-asserted-by":"crossref","first-page":"1277","DOI":"10.1016\/j.cell.2008.05.023","article-title":"Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites","volume":"133","author":"Noyes","year":"2008","journal-title":"Cell"},{"issue":"Suppl. 1","key":"2023012507535083300_B37","doi-asserted-by":"crossref","first-page":"S207","DOI":"10.1093\/bioinformatics\/17.suppl_1.S207","article-title":"An algorithm for finding signals of unknown length in DNA sequences","volume":"17","author":"Pavesi","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012507535083300_B38","doi-asserted-by":"crossref","first-page":"e5437","DOI":"10.1371\/journal.pone.0005437","article-title":"Diversity of 23S rRNA genes within individual prokaryotic genomes","volume":"4","author":"Pei","year":"2009","journal-title":"PLoS ONE"},{"key":"2023012507535083300_B39","doi-asserted-by":"crossref","first-page":"1838","DOI":"10.1093\/nar\/28.8.1838","article-title":"The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12","volume":"28","author":"Perez-Rueda","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B40","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1093\/bib\/bbp026","article-title":"Genome assembly reborn: recent computational challenges","volume":"10","author":"Pop","year":"2009","journal-title":"Brief. Bioinform."},{"key":"2023012507535083300_B41","first-page":"348","article-title":"Motif discovery in heterogeneous sequence data","volume":"9","author":"Prakash","year":"2004","journal-title":"Pac. Symp. Biocomput."},{"key":"2023012507535083300_B42","doi-asserted-by":"crossref","first-page":"880","DOI":"10.1093\/nar\/gki232","article-title":"A novel method for accurate operon predictions in all sequenced prokaryotes","volume":"33","author":"Price","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B43","doi-asserted-by":"crossref","first-page":"1739","DOI":"10.1371\/journal.pcbi.0030175","article-title":"Orthologous transcription factors in bacteria have different functions and regulate different genes","volume":"3","author":"Price","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023012507535083300_B44","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nature08821","article-title":"A human gut microbial gene catalogue established by metagenomic sequencing","volume":"464","author":"Qin","year":"2010","journal-title":"Nature"},{"key":"2023012507535083300_B45","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/nbt802","article-title":"Identification of co-regulated genes through Bayesian clustering of predicted regulatory binding sites","volume":"21","author":"Qin","year":"2003","journal-title":"Nat. Biotechnol."},{"key":"2023012507535083300_B46","doi-asserted-by":"crossref","first-page":"909","DOI":"10.1126\/science.8346441","article-title":"Determinants of binding-site specificity among yeast C6 zinc cluster proteins","volume":"261","author":"Reece","year":"1993","journal-title":"Science"},{"key":"2023012507535083300_B47","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1146\/annurev.genet.38.072902.091216","article-title":"METAGENOMICS: genomic analysis of microbial communities","volume":"38","author":"Riesenfeld","year":"2004","journal-title":"Annu. Rev. Genet."},{"key":"2023012507535083300_B48","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1111\/j.1574-6976.2008.00154.x","article-title":"A phylogenomic analysis of bacterial helix-turn-helix transcription factors","volume":"33","author":"Santos","year":"2009","journal-title":"FEMS Microbiol. Rev."},{"key":"2023012507535083300_B49","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1186\/1471-2105-11-52","article-title":"Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function","volume":"11","author":"Selengut","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012507535083300_B50","doi-asserted-by":"crossref","first-page":"e67","DOI":"10.1371\/journal.pcbi.0010067","article-title":"PhyloGibbs: a Gibbs sampling motif finder that incorporates phylogeny","volume":"1","author":"Siddharthan","year":"2005","journal-title":"PLoS Comput. Biol."},{"key":"2023012507535083300_B51","doi-asserted-by":"crossref","first-page":"1027","DOI":"10.1016\/j.jmb.2004.11.010","article-title":"Structural alignment of protein\u2013DNA interfaces: insights into the determinants of binding specificity","volume":"345","author":"Siggers","year":"2005","journal-title":"J. Mol. Biol."},{"key":"2023012507535083300_B52","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1007\/978-1-59745-514-5_19","article-title":"PhyME: a software tool for finding motifs in sets of orthologous sequences","volume":"395","author":"Sinha","year":"2007","journal-title":"Methods Mol. Biol."},{"key":"2023012507535083300_B53","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1093\/nar\/gkn931","article-title":"Systematic prediction of control proteins and their DNA binding sites","volume":"37","author":"Sorokin","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B54","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1101\/gr.3069205","article-title":"Making connections between novel transcription factors and their DNA motifs","volume":"15","author":"Tan","year":"2005","journal-title":"Genome Res."},{"key":"2023012507535083300_B55","doi-asserted-by":"crossref","first-page":"3580","DOI":"10.1093\/nar\/gkg608","article-title":"Gibbs Recursive Sampler: finding transcription factor binding sites","volume":"31","author":"Thompson","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023012507535083300_B56","doi-asserted-by":"crossref","first-page":"6656","DOI":"10.1128\/JB.186.19.6656-6660.2004","article-title":"DNA binding activity of the Escherichia coli nitric oxide sensor NorR suggests a conserved target sequence in diverse proteobacteria","volume":"186","author":"Tucker","year":"2004","journal-title":"J. Bacteriol."},{"key":"2023012507535083300_B57","doi-asserted-by":"crossref","first-page":"804","DOI":"10.1038\/nature06244","article-title":"The human microbiome project","volume":"449","author":"Turnbaugh","year":"2007","journal-title":"Nature"},{"key":"2023012507535083300_B58","doi-asserted-by":"crossref","first-page":"2369","DOI":"10.1093\/bioinformatics\/btg329","article-title":"Combining phylogenetic data with co-regulated genes to identify regulatory motifs","volume":"19","author":"Wang","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012507535083300_B59","doi-asserted-by":"crossref","first-page":"17400","DOI":"10.1073\/pnas.0505147102","article-title":"Identifying the conserved network of cis-regulatory sites of a eukaryotic genome","volume":"102","author":"Wang","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012507535083300_B60","doi-asserted-by":"crossref","first-page":"e1000465","DOI":"10.1371\/journal.pcbi.1000465","article-title":"A parsimony approach to biological pathway reconstruction\/inference for genomes and metagenomes","volume":"5","author":"Ye","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012507535083300_B61","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1142\/S0219720009004151","article-title":"An ORFome assembly approach to metagenomics sequences analysis","volume":"7","author":"Ye","year":"2009","journal-title":"J. Bioinform. Comput. Biol."},{"key":"2023012507535083300_B62","doi-asserted-by":"crossref","first-page":"1107","DOI":"10.1101\/gr.1774904","article-title":"Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs","volume":"14","author":"Yu","year":"2004","journal-title":"Genome Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/21\/2672\/48851255\/bioinformatics_26_21_2672.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/21\/2672\/48851255\/bioinformatics_26_21_2672.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:54:17Z","timestamp":1674633257000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/21\/2672\/213133"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,8,31]]},"references-count":62,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2010,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq501","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,11,1]]},"published":{"date-parts":[[2010,8,31]]}}}