{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T18:56:50Z","timestamp":1706813810982},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Identifying regulatory elements in genomic sequences is a key component in understanding the control of gene expression. Computationally, this problem is often addressed by motif discovery, where the goal is to find a set of mutually similar subsequences within a collection of input sequences. Though motif discovery is widely studied and many approaches to it have been suggested, it remains a challenging and as yet unresolved problem.<\/jats:p>\n               <jats:p>Results: We introduce SAMF (Solution-Aggregating Motif Finder), a novel approach for motif discovery. SAMF is based on a Markov Random Field formulation, and its key idea is to uncover and aggregate multiple statistically significant solutions to the given motif finding problem. In contrast to many earlier methods, SAMF does not require prior estimates on the number of motif instances present in the data, is not limited by motif length, and allows motifs to overlap. Though SAMF is broadly applicable, these features make it particularly well suited for addressing the challenges of prokaryotic regulatory element detection. We test SAMF's ability to find transcription factor binding sites in an Escherichia coli dataset and show that it outperforms previous methods. 
Additionally, we uncover a number of previously unidentified binding sites in this data, and provide evidence that they correspond to actual regulatory elements.<\/jats:p>\n               <jats:p>Contact: \u00a0cyanover@fhcrc.org, msingh@cs.princeton.edu,elenaz@cs.princeton.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp090","type":"journal-article","created":{"date-parts":[[2009,2,18]],"date-time":"2009-02-18T03:25:13Z","timestamp":1234927513000},"page":"868-874","source":"Crossref","is-referenced-by-count":15,"title":["<i>M<\/i> are better than one: an ensemble-based motif finder and its application to regulatory element prediction"],"prefix":"10.1093","volume":"25","author":[{"given":"Chen","family":"Yanover","sequence":"first","affiliation":[{"name":"1 Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA and 2Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mona","family":"Singh","sequence":"additional","affiliation":[{"name":"1 Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA and 2Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Elena","family":"Zaslavsky","sequence":"additional","affiliation":[{"name":"1 Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA and 2Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2009,2,17]]},"reference":[{"key":"2023013110155256400_B1","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1007\/BF00993379","article-title":"Unsupervised learning of multiple motifs in biopolymers using expectation maximization","volume":"21","author":"Bailey","year":"1995","journal-title":"Mach. Learn."},{"key":"2023013110155256400_B2","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1016\/S0969-2126(02)00761-X","article-title":"Tandem DNA recognition by PhoB, a two-component signal transduction transcriptional activator","volume":"10","author":"Blanco","year":"2002","journal-title":"Structure"},{"key":"2023013110155256400_B3","doi-asserted-by":"crossref","first-page":"2207","DOI":"10.1099\/mic.0.28912-0","article-title":"Transcriptional regulation of the fad regulon genes of Escherichia coli by ArcA","volume":"152","author":"Cho","year":"2006","journal-title":"Microbiology"},{"key":"2023013110155256400_B4","doi-asserted-by":"crossref","first-page":"S21","DOI":"10.1186\/1471-2105-8-S7-S21","article-title":"A survey of DNA motif finding algorithms","volume":"8","author":"Das","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023013110155256400_B5","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.molcel.2007.09.027","article-title":"A universal framework for regulatory element discovery across all genomes and data-types","volume":"28","author":"Elemento","year":"2007","journal-title":"Mol. 
Cell"},{"key":"2023013110155256400_B6","doi-asserted-by":"crossref","DOI":"10.1002\/prot.22280","article-title":"Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space","author":"Fromer","year":"2009","journal-title":"Proteins Struct."},{"key":"2023013110155256400_B7","doi-asserted-by":"crossref","first-page":"e164","DOI":"10.1371\/journal.pcbi.0020164","article-title":"Transcriptional regulation by competing transcription factor modules","volume":"2","author":"Hermsen","year":"2006","journal-title":"PLoS Comput. Biol."},{"key":"2023013110155256400_B8","doi-asserted-by":"crossref","first-page":"1047","DOI":"10.1093\/bioinformatics\/btl037","article-title":"A deterministic motif finding algorithm with application to the human genome","volume":"22","author":"Hon","year":"2006","journal-title":"Bioinformatics"},{"key":"2023013110155256400_B9","doi-asserted-by":"crossref","first-page":"4899","DOI":"10.1093\/nar\/gki791","article-title":"Limitations and potentials of current motif discovery algorithms","volume":"33","author":"Hu","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023013110155256400_B10","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1186\/1471-2105-7-342","article-title":"EMD: an ensemble algorithm for discovering regulatory motifs in DNA sequences","volume":"7","author":"Hu","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013110155256400_B11","doi-asserted-by":"crossref","first-page":"7577","DOI":"10.1093\/nar\/gkm740","article-title":"Multidimensional annotation of the Escherichia coli K-12 genome","volume":"35","author":"Karp","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013110155256400_B12","doi-asserted-by":"crossref","first-page":"1159","DOI":"10.1016\/j.jmb.2004.09.010","article-title":"Oligomeric assemblies of the Escherichia coli MalT transcriptional activator revealed by cryo-electron microscopy and image 
processing","volume":"343","author":"Larquet","year":"2004","journal-title":"J. Mol. Biol."},{"key":"2023013110155256400_B13","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1126\/science.1075090","article-title":"Transcriptional regulatory networks in Saccharomyces cerevisiae","volume":"298","author":"Lee","year":"2002","journal-title":"Science"},{"key":"2023013110155256400_B14","doi-asserted-by":"crossref","first-page":"e36","DOI":"10.1371\/journal.pcbi.0020036","article-title":"Practical strategies for discovering regulatory DNA sequence motifs","volume":"2","author":"MacIsaac","year":"2006","journal-title":"PLoS Comput. Biol."},{"key":"2023013110155256400_B15","doi-asserted-by":"crossref","first-page":"W253","DOI":"10.1093\/nar\/gkm272","article-title":"STAMP: a web tool for exploring DNA-binding motif similarities","volume":"35","author":"Mahony","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013110155256400_B16","doi-asserted-by":"crossref","first-page":"744","DOI":"10.1101\/gr.10.6.744","article-title":"Conservation of DNA regulatory motifs and discovery of new motifs in microbial genomes","volume":"10","author":"McGuire","year":"2000","journal-title":"Genome Res."},{"key":"2023013110155256400_B17","doi-asserted-by":"crossref","first-page":"1331","DOI":"10.1038\/ng1473","article-title":"Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays","volume":"36","author":"Mukherjee","year":"2004","journal-title":"Nat. 
Genet."},{"key":"2023013110155256400_B18","doi-asserted-by":"crossref","first-page":"3516","DOI":"10.1093\/bioinformatics\/bth438","article-title":"Comparative analysis of methods for representing and searching for transcription factor binding sites","volume":"20","author":"Osada","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013110155256400_B19","doi-asserted-by":"crossref","first-page":"W199","DOI":"10.1093\/nar\/gkh465","article-title":"Weeder web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes","volume":"32","author":"Pavesi","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013110155256400_B20","article-title":"Probabilistic Reasoning in Intelligent Systems","author":"Pearl","year":"1988","journal-title":"Networks of Plausible Inference"},{"key":"2023013110155256400_B21","doi-asserted-by":"crossref","first-page":"e90","DOI":"10.1371\/journal.pcbi.0030090","article-title":"Binding site graphs: a new graph theoretical framework for prediction of transcription factor binding sites","volume":"3","author":"Reddy","year":"2007","journal-title":"PLoS Comput. Biol."},{"key":"2023013110155256400_B22","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1038\/nmeth1068","article-title":"Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing","volume":"4","author":"Robertson","year":"2007","journal-title":"Nat. Methods"},{"key":"2023013110155256400_B23","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1006\/jmbi.1998.2160","article-title":"A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete Escherichia coli K-12 genome","volume":"284","author":"Robison","year":"1998","journal-title":"J. Mol. 
Biol."},{"key":"2023013110155256400_B24","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1038\/nbt1098-939","article-title":"Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation","volume":"16","author":"Roth","year":"1998","journal-title":"Nat. Biotechnol."},{"key":"2023013110155256400_B25","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1038\/nature06340","article-title":"Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures","volume":"450","author":"Stark","year":"2007","journal-title":"Nature"},{"key":"2023013110155256400_B26","doi-asserted-by":"crossref","first-page":"12091","DOI":"10.1073\/pnas.91.25.12091","article-title":"Detection of conserved segments in proteins: Iterative scanning of sequence databases with alignment blocks","volume":"91","author":"Tatusov","year":"1994","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110155256400_B27","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1038\/10343","article-title":"Systematic determination of genetic network architecture","volume":"22","author":"Tavazoie","year":"1999","journal-title":"Nat. Genet."},{"key":"2023013110155256400_B28","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1093\/bioinformatics\/17.12.1113","article-title":"A higher order background model improves the detection of regulatory elements by Gibbs Sampling","volume":"17","author":"Thijs","year":"2001","journal-title":"Bioinformatics"},{"key":"2023013110155256400_B29","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1038\/nbt1053","article-title":"Assessing computational tools for the discovery of transcription factor binding sites","volume":"23","author":"Tompa","year":"2005","journal-title":"Nat. 
Biotechnol."},{"key":"2023013110155256400_B30","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/j.jmb.2005.03.059","article-title":"Structural analysis and solution studies of the activated regulatory domain of the response regulator ArcA: a symmetric dimer mediated by the \u03b14-\u03b25\u2212\u03b15 face","volume":"349","author":"Toro-Roman","year":"2005","journal-title":"J. Mol. Biol."},{"key":"2023013110155256400_B31","doi-asserted-by":"crossref","first-page":"e1000077","DOI":"10.1371\/journal.pcbi.1000077","article-title":"Measuring global credibility with application to local sequence alignment","volume":"4","author":"Webb-Robertson","year":"2008","journal-title":"PLoS Comput. Biol."},{"key":"2023013110155256400_B32","doi-asserted-by":"crossref","first-page":"2288","DOI":"10.1093\/bioinformatics\/btn420","article-title":"MotifVoter: a novel ensemble method for fine-grained integration of generic motif finders","volume":"24","author":"Wijaya","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013110155256400_B33","first-page":"1457","article-title":"Approximate inference and protein-folding","volume-title":"In NIPS 15","author":"Yanover","year":"2003"},{"key":"2023013110155256400_B34","article-title":"Finding the M most probable configurations using loopy belief propagation","volume-title":"In NIPS 16.","author":"Yanover","year":"2004"},{"key":"2023013110155256400_B35","article-title":"Understanding belief propagation and its generalizations","author":"Yedidia","year":"2001","journal-title":"In IJCAI (distinguished lecture track)"},{"key":"2023013110155256400_B36","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1186\/1748-7188-1-13","article-title":"A combinatorial optimization approach for diverse motif finding applications","volume":"1","author":"Zaslavsky","year":"2006","journal-title":"Algorithms Mol. 
Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/7\/868\/48984446\/bioinformatics_25_7_868.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/7\/868\/48984446\/bioinformatics_25_7_868.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T20:18:47Z","timestamp":1675196327000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/7\/868\/211358"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,2,17]]},"references-count":36,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2009,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp090","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,4,1]]},"published":{"date-parts":[[2009,2,17]]}}}