{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T07:35:45Z","timestamp":1762673745446},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"24","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Much of a cell's regulatory response to changing environments occurs at the transcriptional level. Particularly in higher organisms, transcription factors (TFs), microRNAs and epigenetic modifications can combine to form a complex regulatory network. Part of this system can be modeled as a collection of regulatory modules: co-regulated genes, the conditions under which they are co-regulated and sequence-level regulatory motifs.<\/jats:p>\n               <jats:p>Results: We present the Combinatorial Algorithm for Expression and Sequence-based Cluster Extraction (COALESCE) system for regulatory module prediction. The algorithm is efficient enough to discover expression biclusters and putative regulatory motifs in metazoan genomes (&gt;20 000 genes) and very large microarray compendia (&gt;10 000 conditions). Using Bayesian data integration, it can also include diverse supporting data types such as evolutionary conservation or nucleosome placement. We validate its performance using a functional evaluation of co-clustered genes, known yeast and Escherichia coli TF targets, synthetic data and various metazoan data compendia. In all cases, COALESCE performs as well as or better than current biclustering and motif prediction tools, with high accuracy in functional and TF\/target assignments and zero false positives on synthetic data. 
COALESCE provides an efficient and flexible platform within which large, diverse data collections can be integrated to predict metazoan regulatory networks.<\/jats:p>\n               <jats:p>Availability: Source code (C++) is available at http:\/\/function.princeton.edu\/sleipnir, and supporting data and a web interface are provided at http:\/\/function.princeton.edu\/coalesce.<\/jats:p>\n               <jats:p>Contact: \u00a0ogt@cs.princeton.edu; hcoller@princeton.edu.<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp588","type":"journal-article","created":{"date-parts":[[2009,10,14]],"date-time":"2009-10-14T03:13:28Z","timestamp":1255490008000},"page":"3267-3274","source":"Crossref","is-referenced-by-count":72,"title":["Detailing regulatory networks through large scale data integration"],"prefix":"10.1093","volume":"25","author":[{"given":"Curtis","family":"Huttenhower","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"},{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]},{"given":"K. 
Tsheko","family":"Mutungu","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]},{"given":"Natasha","family":"Indik","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]},{"given":"Woongcheol","family":"Yang","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]},{"given":"Mark","family":"Schroeder","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]},{"given":"Joshua J.","family":"Forman","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas 
Laboratory, Princeton, NJ 08544, USA"}]},{"given":"Olga G.","family":"Troyanskaya","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"},{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]},{"given":"Hilary A.","family":"Coller","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, Princeton University, 35 Olden Street, Princeton, NJ 08540, 2Lewis-Sigler Institute for Integrative Genomics, Princeton University, Carl Icahn Laboratory, Princeton, NJ 08544 and 3Department of Molecular Biology, Princeton University, Lewis Thomas Laboratory, Princeton, NJ 08544, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,10,13]]},"reference":[{"key":"2023013112143668800_B1","doi-asserted-by":"crossref","first-page":"D885","DOI":"10.1093\/nar\/gkn764","article-title":"NCBI GEO: archive for high-throughput functional genomic data","volume":"37","author":"Barrett","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023013112143668800_B2","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/S0092-8674(04)00304-6","article-title":"Predicting gene expression from sequence","volume":"117","author":"Beer","year":"2004","journal-title":"Cell"},{"key":"2023013112143668800_B3","doi-asserted-by":"crossref","first-page":"658","DOI":"10.1038\/nchembio.122","article-title":"Learning biological networks: from modules to 
dynamics","volume":"4","author":"Bonneau","year":"2008","journal-title":"Nat. Chem. Biol."},{"key":"2023013112143668800_B4","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1091\/mbc.e07-08-0779","article-title":"Coordination of growth rate, cell cycle, stress response, and metabolic activity in yeast","volume":"19","author":"Brauer","year":"2008","journal-title":"Mol. Biol. Cell"},{"key":"2023013112143668800_B5","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1146\/annurev.biophys.36.040306.132725","article-title":"Predictive modeling of genome-wide mRNA expression: from modules to molecules","volume":"36","author":"Bussemaker","year":"2007","journal-title":"Ann. Rev. Biophys. Biomol. Struct."},{"key":"2023013112143668800_B6","doi-asserted-by":"crossref","first-page":"2245","DOI":"10.1016\/j.cub.2004.12.030","article-title":"Identification of thermosensory and olfactory neuron-specific genes via expression profiling of single neuron types","volume":"14","author":"Colosimo","year":"2004","journal-title":"Curr. Biol."},{"key":"2023013112143668800_B7","doi-asserted-by":"crossref","first-page":"3439","DOI":"10.1093\/bioinformatics\/bti525","article-title":"BioMart and bioconductor: a powerful link between biological databases and microarray data analysis","volume":"21","author":"Durinck","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013112143668800_B8","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1016\/j.molcel.2007.09.027","article-title":"A universal framework for regulatory element discovery across all genomes and data types","volume":"28","author":"Elemento","year":"2007","journal-title":"Mol. 
Cell"},{"key":"2023013112143668800_B9","doi-asserted-by":"crossref","first-page":"D120","DOI":"10.1093\/nar\/gkm994","article-title":"RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation","volume":"36","author":"Gama-Castro","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023013112143668800_B10","doi-asserted-by":"crossref","first-page":"1566","DOI":"10.1093\/nar\/gkn1064","article-title":"Allegro: analyzing expression and sequence in concert to discover regulatory programs","volume":"37","author":"Halperin","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023013112143668800_B11","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1093\/bioinformatics\/btn198","article-title":"Eukaryotic transcription factor binding sites\u2013modeling and integrative search methods","volume":"24","author":"Hannenhalli","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013112143668800_B12","doi-asserted-by":"crossref","first-page":"i330","DOI":"10.1093\/bioinformatics\/btn160","article-title":"Assessing the functional structure of genomic data","volume":"24","author":"Huttenhower","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013112143668800_B13","doi-asserted-by":"crossref","first-page":"1559","DOI":"10.1093\/bioinformatics\/btn237","article-title":"The Sleipnir library for computational functional genomics","volume":"24","author":"Huttenhower","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013112143668800_B14","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1038\/nature01644","article-title":"Sequencing and comparison of yeast species to identify genes and regulatory elements","volume":"423","author":"Kellis","year":"2003","journal-title":"Nature"},{"key":"2023013112143668800_B15","doi-asserted-by":"crossref","first-page":"1172","DOI":"10.1093\/bioinformatics\/bti096","article-title":"Finding 
regulatory modules through large-scale gene-expression data analysis","volume":"21","author":"Kloster","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013112143668800_B16","doi-asserted-by":"crossref","first-page":"e1000224","DOI":"10.1371\/journal.pcbi.1000224","article-title":"A predictive model of the oxygen and heme regulatory network in yeast","volume":"4","author":"Kundaje","year":"2008","journal-title":"PLoS Comput. Biol."},{"key":"2023013112143668800_B17","doi-asserted-by":"crossref","first-page":"R27","DOI":"10.1186\/gb-2009-10-3-r27","article-title":"DISTILLER: a data integration framework to reveal condition dependency of complex regulons in Escherichia coli","volume":"10","author":"Lemmens","year":"2009","journal-title":"Genome Biol."},{"key":"2023013112143668800_B18","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1093\/bioinformatics\/btl606","article-title":"Functional genomics via multiscale analysis: application to gene expression and ChIP-on-chip data","volume":"23","author":"Lerman","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013112143668800_B19","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1146\/annurev.cellbio.24.110707.175408","article-title":"Systems approaches to identifying gene regulatory networks in plants","volume":"24","author":"Long","year":"2008","journal-title":"Ann. Rev. Cell Dev. Biol."},{"issue":"Suppl. 
1","key":"2023013112143668800_B20","doi-asserted-by":"crossref","first-page":"S7","DOI":"10.1186\/1471-2105-7-S1-S7","article-title":"ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context","volume":"7","author":"Margolin","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112143668800_B21","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1146\/annurev.genom.7.080505.115623","article-title":"Transcriptional regulatory elements in the human genome","volume":"7","author":"Maston","year":"2006","journal-title":"Ann. Rev. Genomics Hum. Genet."},{"key":"2023013112143668800_B22","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1186\/1471-2164-7-187","article-title":"Finding function: evaluation methods for functional genomic data","volume":"7","author":"Myers","year":"2006","journal-title":"BMC Genomics"},{"key":"2023013112143668800_B23","doi-asserted-by":"crossref","first-page":"W199","DOI":"10.1093\/nar\/gkh465","article-title":"Weeder web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes","volume":"32","author":"Pavesi","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013112143668800_B24","doi-asserted-by":"crossref","first-page":"1089","DOI":"10.1126\/science.1063443","article-title":"Epigenetic reprogramming in mammalian development","volume":"293","author":"Reik","year":"2001","journal-title":"Science"},{"key":"2023013112143668800_B25","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1186\/1471-2105-7-280","article-title":"Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks","volume":"7","author":"Reiss","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112143668800_B26","doi-asserted-by":"crossref","first-page":"939","DOI":"10.1038\/nbt1098-939","article-title":"Finding DNA regulatory motifs within unaligned noncoding sequences clustered by 
whole-genome mRNA quantitation","volume":"16","author":"Roth","year":"1998","journal-title":"Nat. Biotechnol."},{"key":"2023013112143668800_B27","doi-asserted-by":"crossref","first-page":"1850","DOI":"10.1101\/gr.6597907","article-title":"Evolution, biogenesis, expression, and target predictions of a substantially expanded set of Drosophila microRNAs","volume":"17","author":"Ruby","year":"2007","journal-title":"Genome Res."},{"key":"2023013112143668800_B28","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1038\/ng1165","article-title":"Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data","volume":"34","author":"Segal","year":"2003","journal-title":"Nat. Genet."},{"key":"2023013112143668800_B29","doi-asserted-by":"crossref","first-page":"2503","DOI":"10.1101\/gad.937701","article-title":"Core promoters: active contributors to combinatorial gene regulation","volume":"15","author":"Smale","year":"2001","journal-title":"Genes Dev."},{"key":"2023013112143668800_B30","doi-asserted-by":"crossref","first-page":"848","DOI":"10.1126\/science.1136678","article-title":"Relative impact of nucleotide and copy number variation on gene expression phenotypes","volume":"315","author":"Stranger","year":"2007","journal-title":"Science"},{"key":"2023013112143668800_B31","doi-asserted-by":"crossref","first-page":"2981","DOI":"10.1073\/pnas.0308661100","article-title":"Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data","volume":"101","author":"Tanay","year":"2004","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013112143668800_B32","doi-asserted-by":"crossref","first-page":"D446","DOI":"10.1093\/nar\/gkj013","article-title":"The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae","volume":"34","author":"Teixeira","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013112143668800_B33","doi-asserted-by":"crossref","first-page":"W119","DOI":"10.1093\/nar\/gkn304","article-title":"RSAT: regulatory sequence analysis tools","volume":"36","author":"Thomas-Chollier","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023013112143668800_B34","doi-asserted-by":"crossref","first-page":"e1000227","DOI":"10.1371\/journal.pcbi.1000227","article-title":"Analyzing ChIP-chip data using bioconductor","volume":"4","author":"Toedling","year":"2008","journal-title":"PLoS Comput. Biol."},{"key":"2023013112143668800_B35","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1139\/o00-077","article-title":"Genetic and biochemical diversity in the Pax gene family","volume":"78","author":"Underhill","year":"2000","journal-title":"Biochem. Cell Biol."},{"key":"2023013112143668800_B36","doi-asserted-by":"crossref","first-page":"R135","DOI":"10.1186\/gb-2007-8-7-r135","article-title":"Cell-specific microarray profiling experiments reveal a comprehensive picture of gene expression in the C. elegans nervous system","volume":"8","author":"Von Stetina","year":"2007","journal-title":"Genome Biol."},{"key":"2023013112143668800_B37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.molcel.2007.12.010","article-title":"Let me count the ways: mechanisms of gene regulation by miRNAs and siRNAs","volume":"29","author":"Wu","year":"2008","journal-title":"Mol. 
Cell"},{"key":"2023013112143668800_B38","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1111\/j.1440-169X.2005.00797.x","article-title":"Sp1-like transcription factors are regulators of embryonic development in vertebrates","volume":"47","author":"Zhao","year":"2005","journal-title":"Dev. Growth Differ."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/24\/3267\/48997108\/bioinformatics_25_24_3267.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/24\/3267\/48997108\/bioinformatics_25_24_3267.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:48:35Z","timestamp":1675201715000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/24\/3267\/235923"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,10,13]]},"references-count":38,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2009,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp588","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,12,15]]},"published":{"date-parts":[[2009,10,13]]}}}