{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T13:56:48Z","timestamp":1761919008420},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The sequence specificity of DNA-binding proteins is typically represented as a position weight matrix in which each base position contributes independently to relative affinity. Assessment of the accuracy and broad applicability of this representation has been limited by the lack of extensive DNA-binding data. However, new microarray techniques, in which preferences for all possible K-mers are measured, enable a broad comparison of both motif representation and methods for motif discovery. Here, we consider the problem of accounting for all of the binding data in such experiments, rather than the highest affinity binding data. We introduce the RankMotif++, an algorithm designed for finding motifs whenever sequences are associated with a semi-quantitative measure of protein-DNA-binding affinity. RankMotif++ learns motif models by maximizing the likelihood of a set of binding preferences under a probabilistic model of how sequence binding affinity translates into binding preference observations. 
Because RankMotif++ makes few assumptions about the relationship between binding affinity and the semi-quantitative readout, it is applicable to a wide variety of experimental assays of DNA-binding preference.<\/jats:p>\n               <jats:p>Results: By several criteria, RankMotif++ predicts binding affinity better than two widely used motif finding algorithms (MDScan, MatrixREDUCE) or more recently developed algorithms (PREGO, Seed and Wobble), and its performance is comparable to a motif model that separately assigns affinities to 8-mers. Our results validate the PWM model and provide an approximation of the precision and recall that can be expected in a genomic scan.<\/jats:p>\n               <jats:p>Availability: RankMotif++ is available upon request.<\/jats:p>\n               <jats:p>Contact: quaid.morris@utoronto.ca<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm224","type":"journal-article","created":{"date-parts":[[2007,7,23]],"date-time":"2007-07-23T16:13:46Z","timestamp":1185207226000},"page":"i72-i79","source":"Crossref","is-referenced-by-count":52,"title":["RankMotif++: a motif-search algorithm that accounts for relative ranks of K-mers in binding transcription factors"],"prefix":"10.1093","volume":"23","author":[{"given":"Xiaoyu","family":"Chen","sequence":"first","affiliation":[{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA"},{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, 
USA"},{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA"}]},{"given":"Timothy R.","family":"Hughes","sequence":"additional","affiliation":[{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA"},{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA"}]},{"given":"Quaid","family":"Morris","sequence":"additional","affiliation":[{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA"},{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA"},{"name":"1 Banting and Best Department of Medical Research, 2Department of Medical Genetics and Microbiology, 3Department of Computer Science, University of Toronto, Toronto, ON, Canada and 4Department of Computer Science and Engineering, University of Washington, Seattle, WA, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2007,7,1]]},"reference":[{"key":"2023062708513459900_B1","first-page":"28","article-title":"Fitting a mixture model by expectation maximization to discover motifs in biopolymers","volume":"2","author":"Bailey","year":"1994","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. Biol."},{"key":"2023062708513459900_B2","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1016\/0022-2836(87)90354-8","article-title":"Selection of DNA binding sites by regulatory proteins. Statistical-mechanical theory and application to operators and promoters","volume":"193","author":"Berg","year":"1987","journal-title":"J. Mol. Biol"},{"key":"2023062708513459900_B3","doi-asserted-by":"crossref","first-page":"1429","DOI":"10.1038\/nbt1246","article-title":"Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities","volume":"24","author":"Berger","year":"2006","journal-title":"Nat. Biotechnol"},{"key":"2023062708513459900_B4","doi-asserted-by":"crossref","first-page":"12045","DOI":"10.1073\/pnas.0605140103","article-title":"Identifying transcription factor functions and targets by phenotypic activation","volume":"103","author":"Chua","year":"2006","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062708513459900_B5","doi-asserted-by":"crossref","first-page":"17675","DOI":"10.1073\/pnas.0503803102","article-title":"Profiling condition-specific, genome-wide regulation of mRNA stability in yeast","volume":"102","author":"Foat","year":"2005","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023062708513459900_B6","doi-asserted-by":"crossref","first-page":"e141","DOI":"10.1093\/bioinformatics\/btl223","article-title":"Statistical mechanical modeling of genome-wide transcription factor occupancy data by MatrixREDUCE","volume":"22","author":"Foat","year":"2006","journal-title":"Bioinformatics"},{"key":"2023062708513459900_B7","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1186\/1471-2105-7-113","article-title":"An improved map of conserved regulatory sites for Saccharomyces cerevisiae","volume":"7","author":"MacIsaac","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023062708513459900_B8","doi-asserted-by":"crossref","first-page":"R87","DOI":"10.1186\/gb-2005-6-10-r87","article-title":"Explicit equilibrium modeling of transcription-factor binding and gene regulation","volume":"6","author":"Granek","year":"2005","journal-title":"Genome. Biol"},{"key":"2023062708513459900_B9","first-page":"127","article-title":"BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes","volume-title":"Pac. Symp. Biocomput","author":"Liu","year":"2001"},{"key":"2023062708513459900_B10","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1038\/nbt717","article-title":"An algorithm for finding protein\u2013DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments","volume":"20","author":"Liu","year":"2002","journal-title":"Nat. 
Biotechnol"},{"key":"2023062708513459900_B11","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1101\/gr.3256505","article-title":"DIP-chip: rapid and accurate determination of DNA-binding specificity","volume":"15","author":"Liu","year":"2005","journal-title":"Genome Res"},{"key":"2023062708513459900_B12","doi-asserted-by":"crossref","first-page":"1517","DOI":"10.1101\/gr.5655606","article-title":"Whole-genome comparison of Leu3 binding in vitro and in vivo reveals the importance of nucleosome occupancy in target site selection","volume":"16","author":"Liu","year":"2006","journal-title":"Genome Res"},{"key":"2023062708513459900_B13","doi-asserted-by":"crossref","first-page":"2471","DOI":"10.1093\/nar\/29.12.2471","article-title":"Non-independence of Mnt repressor-operator interaction determined by a new quantitative multiple fluorescence relative affinity (QuMFRA) assay","volume":"29","author":"Man","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023062708513459900_B14","doi-asserted-by":"crossref","first-page":"2041","DOI":"10.1101\/gr.2584104","article-title":"An ORFeome-based analysis of human transcription factor genes and the construction of a microarray to interrogate their expression","volume":"14","author":"Messina","year":"2004","journal-title":"Genome Res"},{"key":"2023062708513459900_B15","doi-asserted-by":"crossref","first-page":"1331","DOI":"10.1038\/ng1473","article-title":"Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays","volume":"36","author":"Mukherjee","year":"2004","journal-title":"Nat. 
Genet"},{"key":"2023062708513459900_B16","article-title":"Numerical Recipes in C++","author":"Press","year":"2002","edition":"2nd edn"},{"key":"2023062708513459900_B17","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt718","article-title":"High-throughput SELEX SAGE method for quantitative modeling of transcription-factor binding sites","volume":"8","author":"Roulet","year":"2002","journal-title":"Nat. Biotechnol"},{"key":"2023062708513459900_B18","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1038\/nature05295","article-title":"In vivo enhancer analysis of human conserved non-coding sequences","volume":"444","author":"Pennacchio","year":"2006","journal-title":"Nature"},{"key":"2023062708513459900_B19","doi-asserted-by":"crossref","first-page":"1034","DOI":"10.1101\/gr.3715005","article-title":"Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes","volume":"8","author":"Siepel","year":"2005","journal-title":"Genome. Res"},{"key":"2023062708513459900_B20","doi-asserted-by":"crossref","first-page":"962","DOI":"10.1101\/gr.5113606","article-title":"Extensive low-affinity transcriptional interactions in the yeast genome","volume":"16","author":"Tanay","year":"2006","journal-title":"Genome. Res"},{"key":"2023062708513459900_B21","doi-asserted-by":"crossref","first-page":"D95","DOI":"10.1093\/nar\/gkj115","article-title":"A new generation of JASPAR, the open-access repository for transcription factor binding site profiles","volume":"34","author":"Vlieghe","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023062708513459900_B22","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1073\/pnas.0509843102","article-title":"Defining the sequence-recognition profile of DNA-binding molecules","volume":"103","author":"Warren","year":"2006","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023062708513459900_B23","doi-asserted-by":"crossref","first-page":"W389","DOI":"10.1093\/nar\/gki439","article-title":"enoLOGOS: a versatile web tool for energy normalized sequence logos","volume":"33","author":"Workman","year":"2005","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/13\/i72\/50718094\/bioinformatics_23_13_i72.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/13\/i72\/50718094\/bioinformatics_23_13_i72.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T08:54:04Z","timestamp":1687856044000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/13\/i72\/237219"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,7,1]]},"references-count":23,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2007,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm224","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,7]]},"published":{"date-parts":[[2007,7,1]]}}}