{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T08:33:08Z","timestamp":1758875588826,"version":"3.35.0"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3015,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The identification of transcription factor (TF) binding sites and the regulatory circuitry that they define is currently an area of intense research. Data from whole-genome chromatin immunoprecipitation (ChIP\u2013chip), whole-genome expression microarrays, and sequencing of multiple closely related genomes have all proven useful. By and large, existing methods treat the interpretation of functional data as a classification problem (between bound and unbound DNA), and the analysis of comparative data as a problem of local alignment (to recover phylogenetic footprints of presumably functional elements). Both of these approaches suffer from the inability to model and detect low-affinity binding sites, which have recently been shown to be abundant and functional.<\/jats:p><jats:p>Results: We have developed a method that discovers functional regulatory targets of TFs by predicting the total affinity of each promoter for those factors and then comparing that affinity across orthologous promoters in closely related species. At each promoter, we consider the minimum affinity among orthologs to be the fraction of the affinity that is functional. Because we calculate the affinity of the entire promoter, our method is independent of local alignment. 
By comparing with functional annotation information and gene expression data in Saccharomyces cerevisiae, we have validated that this biophysically motivated use of evolutionary conservation gives rise to dramatic improvement in prediction of regulatory connectivity and factor\u2013factor interactions compared to the use of a single genome. We propose novel biological functions for several yeast TFs, including the factors Snt2 and Stb4, for which no function has been reported. Our affinity-based approach towards comparative genomics may allow a more quantitative analysis of the principles governing the evolution of non-coding DNA.<\/jats:p><jats:p>Availability: The MatrixREDUCE software package is available from http:\/\/www.bussemakerlab.org\/software\/MatrixREDUCE<\/jats:p><jats:p>Contact: \u00a0Harmen.Bussemaker@columbia.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn154","type":"journal-article","created":{"date-parts":[[2008,6,27]],"date-time":"2008-06-27T07:43:13Z","timestamp":1214552593000},"page":"i165-i171","source":"Crossref","is-referenced-by-count":52,"title":["Predicting functional transcription factor binding through alignment-free and affinity-based analysis of orthologous promoter sequences"],"prefix":"10.1093","volume":"24","author":[{"given":"Lucas D.","family":"Ward","sequence":"first","affiliation":[{"name":"1 Department of Biological Sciences, Columbia University, New York, NY 10027 and 2Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA"}]},{"given":"Harmen J.","family":"Bussemaker","sequence":"additional","affiliation":[{"name":"1 Department of Biological Sciences, Columbia University, New York, NY 
10027 and 2Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA"}]}],"member":"286","published-online":{"date-parts":[[2008,7,1]]},"reference":[{"key":"2023020210374010700_B1","doi-asserted-by":"crossref","first-page":"2181","DOI":"10.1093\/nar\/29.10.2181","article-title":"Phenotypic analysis of genes encoding yeast zinc cluster proteins","volume":"29","author":"Akache","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023020210374010700_B2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet"},{"key":"2023020210374010700_B3","doi-asserted-by":"crossref","first-page":"7024","DOI":"10.1093\/nar\/gkg894","article-title":"Identifying cooperativity among transcription factors controlling the cell cycle in yeast","volume":"31","author":"Banerjee","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023020210374010700_B4","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Statist. Soc. B"},{"key":"2023020210374010700_B5","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1016\/0022-2836(87)90354-8","article-title":"Selection of DNA binding sites by regulatory proteins. Statistical-mechanical theory and application to operators and promoters","volume":"193","author":"Berg","year":"1987","journal-title":"J. Mol. 
Biol."},{"key":"2023020210374010700_B6","doi-asserted-by":"crossref","first-page":"R62","DOI":"10.1186\/gb-2004-5-9-r62","article-title":"Global nucleosome occupancy in yeast","volume":"5","author":"Bernstein","year":"2004","journal-title":"Genome Biol"},{"key":"2023020210374010700_B7","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1016\/j.gde.2005.02.007","article-title":"Transcriptional regulation by the numbers: models","volume":"15","author":"Bintu","year":"2005","journal-title":"Curr. Opin. Genet. Dev."},{"key":"2023020210374010700_B8","doi-asserted-by":"crossref","first-page":"W592","DOI":"10.1093\/nar\/gki484","article-title":"T-profiler: scoring the activity of predefined groups of genes using gene expression data","volume":"33","author":"Boorsma","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023020210374010700_B9","doi-asserted-by":"crossref","first-page":"S6","DOI":"10.1186\/1471-2105-8-S6-S6","article-title":"Dissecting complex transcriptional responses using pathway-level scores based on prior information","volume":"8","author":"Bussemaker","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023020210374010700_B10","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1146\/annurev.biophys.36.040306.132725","article-title":"Predictive modeling of genome-wide mRNA expression: from modules to molecules","volume":"36","author":"Bussemaker","year":"2007","journal-title":"Annu. Rev. Biophys. Biomol. 
Struct."},{"key":"2023020210374010700_B11","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1093\/nar\/26.1.73","article-title":"SGD: Saccharomyces Genome Database","volume":"26","author":"Cherry","year":"1998","journal-title":"Nucleic Acids Res."},{"key":"2023020210374010700_B12","doi-asserted-by":"crossref","first-page":"R43","DOI":"10.1186\/gb-2003-4-7-r43","article-title":"Phylogenetically and spatially conserved word pairs associated with gene-expression changes in yeasts","volume":"4","author":"Chiang","year":"2003","journal-title":"Genome Biol"},{"key":"2023020210374010700_B13","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1126\/science.1084337","article-title":"Finding functional features in Saccharomyces genomes by phylogenetic footprinting","volume":"301","author":"Cliften","year":"2003","journal-title":"Science"},{"key":"2023020210374010700_B14","doi-asserted-by":"crossref","first-page":"1114","DOI":"10.1093\/oxfordjournals.molbev.a004169","article-title":"Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover","volume":"19","author":"Dermitzakis","year":"2002","journal-title":"Mol. Biol. Evol."},{"key":"2023020210374010700_B15","doi-asserted-by":"crossref","first-page":"2381","DOI":"10.1101\/gr.1271603","article-title":"A biophysical approach to transcription factor binding site discovery","volume":"13","author":"Djordjevic","year":"2003","journal-title":"Genome Res."},{"key":"2023020210374010700_B16","doi-asserted-by":"crossref","first-page":"6982","DOI":"10.1128\/MCB.17.12.6982","article-title":"Yap, a novel family of eight bZIP proteins in Saccharomyces cerevisiae with distinct biological functions","volume":"17","author":"Fernandes","year":"1997","journal-title":"Mol. 
Cell Biol."},{"key":"2023020210374010700_B17","doi-asserted-by":"crossref","first-page":"e141","DOI":"10.1093\/bioinformatics\/btl223","article-title":"Statistical mechanical modeling of genome-wide transcription factor occupancy data by MatrixREDUCE","volume":"22","author":"Foat","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020210374010700_B18","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkm828","article-title":"TransfactomeDB: a resource for exploring the nucleotide sequence specificity and condition-specific regulatory activity of trans-acting factors","author":"Foat","year":"2007","journal-title":"Nucleic Acids Res"},{"issue":"31","key":"2023020210374010700_B19","article-title":"Defining transcriptional networks through integrative modeling of mRNA expression and transcription factor binding data","volume":"5","author":"Gao","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023020210374010700_B20","doi-asserted-by":"crossref","first-page":"R80","DOI":"10.1186\/gb-2004-5-10-r80","article-title":"Bioconductor: open software development for computational biology and bioinformatics","volume":"5","author":"Gentleman","year":"2004","journal-title":"Genome Biol"},{"key":"2023020210374010700_B21","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1038\/nature02800","article-title":"Transcriptional regulatory code of a eukaryotic genome","volume":"431","author":"Harbison","year":"2004","journal-title":"Nature"},{"key":"2023020210374010700_B22","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1016\/S0092-8674(00)00015-5","article-title":"Functional discovery via a compendium of expression profiles","volume":"102","author":"Hughes","year":"2000","journal-title":"Cell"},{"key":"2023020210374010700_B23","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1038\/nature01644","article-title":"Sequencing and comparison of yeast species to identify genes and regulatory 
elements","volume":"423","author":"Kellis","year":"2003","journal-title":"Nature"},{"key":"2023020210374010700_B24","doi-asserted-by":"crossref","first-page":"9481","DOI":"10.1073\/pnas.0501620102","article-title":"Sampling motifs on phylogenetic trees","volume":"102","author":"Li","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020210374010700_B25","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1038\/ng569","article-title":"Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association","volume":"28","author":"Lieb","year":"2001","journal-title":"Nat. Genet"},{"key":"2023020210374010700_B26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0022-2836(02)00894-X","article-title":"Rationalization of gene regulation by a eukaryotic transcription factor: calculation of regulatory region occupancy from predicted binding affinities","volume":"323","author":"Liu","year":"2002","journal-title":"J. Mol. Biol"},{"key":"2023020210374010700_B27","doi-asserted-by":"crossref","first-page":"634","DOI":"10.1016\/S0959-437X(02)00355-6","article-title":"Functional evolution of noncoding DNA","volume":"12","author":"Ludwig","year":"2002","journal-title":"Curr. Opin. Genet. Dev"},{"key":"2023020210374010700_B28","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1186\/1471-2105-7-113","article-title":"An improved map of conserved regulatory sites for Saccharomyces cerevisiae","volume":"7","author":"MacIsaac","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020210374010700_B29","doi-asserted-by":"crossref","first-page":"14315","DOI":"10.1073\/pnas.0405353101","article-title":"Sfp1 is a stress- and nutrient-sensitive regulator of ribosomal protein gene expression","volume":"101","author":"Marion","year":"2004","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023020210374010700_B30","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1016\/S0014-5793(98)00249-X","article-title":"Yeast putative transcription factors involved in salt tolerance","volume":"425","author":"Mendizabal","year":"1998","journal-title":"FEBS Lett"},{"key":"2023020210374010700_B31","doi-asserted-by":"crossref","first-page":"R98","DOI":"10.1186\/gb-2004-5-12-r98","article-title":"MONKEY: identifying conserved transcription-factor binding sites in multiple alignments using a binding site-specific evolutionary model","volume":"5","author":"Moses","year":"2004","journal-title":"Genome Biol"},{"key":"2023020210374010700_B32","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1038\/ng724","article-title":"Identifying regulatory networks by combinatorial analysis of promoter elements","volume":"29","author":"Pilpel","year":"2001","journal-title":"Nat. Genet"},{"key":"2023020210374010700_B33","doi-asserted-by":"crossref","first-page":"3034","DOI":"10.1101\/gad.1034302","article-title":"Conserved homeodomain proteins interact with MADS box protein Mcm1 to restrict ECB-dependent transcription to the M\/G1 phase of the cell cycle","volume":"16","author":"Pramila","year":"2002","journal-title":"Genes Dev"},{"key":"2023020210374010700_B34","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1101\/gr.1739204","article-title":"Whole-genome discovery of transcription factor binding sites by network-level conservation","volume":"14","author":"Pritsker","year":"2004","journal-title":"Genome Res"},{"key":"2023020210374010700_B35","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1093\/bioinformatics\/btl565","article-title":"Predicting transcription factor affinities to DNA from a biophysical model","volume":"23","author":"Roider","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020210374010700_B36","doi-asserted-by":"crossref","first-page":"10555","DOI":"10.1073\/pnas.152046799","article-title":"Assigning numbers 
to the arrows: parameterizing a gene regulation network by using accurate expression kinetics","volume":"99","author":"Ronen","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020210374010700_B37","doi-asserted-by":"crossref","first-page":"W510","DOI":"10.1093\/nar\/gkl329","article-title":"JProGO: a novel tool for the functional interpretation of prokaryotic microarray data using Gene Ontology information","volume":"34","author":"Scheer","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023020210374010700_B38","doi-asserted-by":"crossref","first-page":"e67","DOI":"10.1371\/journal.pcbi.0010067","article-title":"PhyloGibbs: a Gibbs sampling motif finder that incorporates phylogeny","volume":"1","author":"Siddharthan","year":"2005","journal-title":"PLoS Comput. Biol"},{"key":"2023020210374010700_B39","doi-asserted-by":"crossref","first-page":"3940","DOI":"10.1093\/bioinformatics\/bti623","article-title":"ROCR: visualizing classifier performance in R","volume":"21","author":"Sing","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020210374010700_B40","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1186\/1471-2105-5-129","article-title":"Cross-species comparison significantly improves genome-wide prediction of cis-regulatory modules in Drosophila","volume":"5","author":"Sinha","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023020210374010700_B41","doi-asserted-by":"crossref","first-page":"6661","DOI":"10.1093\/nar\/14.16.6661","article-title":"Quantitative analysis of the relationship between nucleotide sequence and functional activity","volume":"14","author":"Stormo","year":"1986","journal-title":"Nucleic Acids Res"},{"key":"2023020210374010700_B42","doi-asserted-by":"crossref","first-page":"1723","DOI":"10.1101\/gr.301202","article-title":"Genome-wide co-occurrence of promoter elements reveals a cis-regulatory cassette of rRNA transcription motifs in Saccharomyces 
cerevisiae","volume":"12","author":"Sudarsanam","year":"2002","journal-title":"Genome Res"},{"key":"2023020210374010700_B43","doi-asserted-by":"crossref","first-page":"962","DOI":"10.1101\/gr.5113606","article-title":"Extensive low-affinity transcriptional interactions in the yeast genome","volume":"16","author":"Tanay","year":"2006","journal-title":"Genome Res"},{"key":"2023020210374010700_B44","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1016\/S0959-437X(00)00130-1","article-title":"Evolution of transcriptional regulation","volume":"10","author":"Tautz","year":"2000","journal-title":"Curr. Opin. Genet. Dev"},{"key":"2023020210374010700_B45","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1016\/S0968-0004(99)01460-7","article-title":"The economics of ribosome biosynthesis in yeast","volume":"24","author":"Warner","year":"1999","journal-title":"Trends Biochem. Sci"},{"key":"2023020210374010700_B46","first-page":"675","article-title":"Transcriptional regulation and the evolution of development","volume":"47","author":"Wray","year":"2003","journal-title":"Int. J. Dev. Biol"},{"key":"2023020210374010700_B47","doi-asserted-by":"crossref","first-page":"5279","DOI":"10.1128\/MCB.19.8.5279","article-title":"Chromatin opening and transactivator potentiation by Rap1 in Saccharomyces cerevisiae","volume":"19","author":"Yu","year":"1999","journal-title":"Mol. 
Cell Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/13\/i165\/49053301\/bioinformatics_24_13_i165.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/13\/i165\/49053301\/bioinformatics_24_13_i165.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,30]],"date-time":"2025-01-30T21:40:22Z","timestamp":1738273222000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/13\/i165\/230200"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7,1]]},"references-count":47,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2008,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn154","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2008,7,1]]},"published":{"date-parts":[[2008,7,1]]}}}