{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,25]],"date-time":"2026-02-25T00:22:20Z","timestamp":1771978940615,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Spatial clusters of genes conserved across multiple genomes provide important clues to gene functions and evolution of genome organization. Existing methods of identifying these clusters often made restrictive assumptions, such as exact conservation of gene order, and relied on heuristic algorithms.<\/jats:p>\n               <jats:p>Results: We developed a very efficient algorithm based on a \u2018gene teams\u2019 model that allows genes in the clusters to appear in different orders. This allows us to detect conserved gene clusters under flexible evolutionary constraints in a large number of genomes. Our statistical evaluation incorporates the evolutionary relationship among genomes, a key aspect that has been missing in most previous studies. We conducted a large-scale analysis of 133 bacterial genomes. Our results confirm that our approach is an effective way of uncovering functionally related genes. The comparison with known operons and the analysis of the structural properties of our predicted clusters suggest that operons are an important source of constraint, but there are also other forces that determine evolution of gene order and arrangement. Using our method, we predicted functions of many poorly characterized genes in bacterial. The combined algorithmic and statistical methods we present here provide a rigorous framework for systematically studying evolutionary constraints of genomic contexts.<\/jats:p>\n               <jats:p>Availability: The software, data and the full results of this article are available online at http:\/\/www.ews.uiuc.edu\/~xuling\/mcmusec.<\/jats:p>\n               <jats:p>Contact: \u00a0xuling@uiuc.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp027","type":"journal-article","created":{"date-parts":[[2009,1,22]],"date-time":"2009-01-22T01:44:17Z","timestamp":1232588657000},"page":"571-577","source":"Crossref","is-referenced-by-count":43,"title":["Detecting gene clusters under evolutionary constraint in a large number of genomes"],"prefix":"10.1093","volume":"25","author":[{"given":"Xu","family":"Ling","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA"}]},{"given":"Xin","family":"He","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA"}]},{"given":"Dong","family":"Xin","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA"}]}],"member":"286","published-online":{"date-parts":[[2009,1,21]]},"reference":[{"key":"2023013110113914100_B1","first-page":"487","article-title":"Fast algorithms for mining association rules in large databases","volume-title":"VLDB'94, Proceedings of 20th International Conference on Very Large Data Bases, September 12-15, 1994, Santiago de Chile, Chile.","author":"Agrawal","year":"1994"},{"key":"2023013110113914100_B2","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1073\/pnas.012602299","article-title":"A genome-scale analysis for identification of genes required for growth or survival of Haemophilus influenzae","volume":"99","author":"Akerley","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110113914100_B3","doi-asserted-by":"crossref","first-page":"945","DOI":"10.1038\/ng2071","article-title":"Evolution of chromosome organization driven by selection for reduced gene expression noise","volume":"39","author":"Batada","year":"2007","journal-title":"Nat. Genet."},{"key":"2023013110113914100_B4","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1073\/pnas.0609683104","article-title":"Eukaryotic operon-like transcription of functionally related genes in Drosophila","volume":"104","author":"Ben-Shahar","year":"2007","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110113914100_B5","first-page":"464","article-title":"The algorithmic of gene teams","volume-title":"WABI '02: Proceedings of the Second International Workshop on Algorithms in Bioinformatics.","author":"Bergeron","year":"2002"},{"key":"2023013110113914100_B6","doi-asserted-by":"crossref","first-page":"1283","DOI":"10.1126\/science.1123061","article-title":"Toward automatic reconstruction of a highly resolved tree of life","volume":"311","author":"Ciccarelli","year":"2006","journal-title":"Science"},{"key":"2023013110113914100_B7","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1093\/nar\/gkl1018","article-title":"Operon prediction using both genome-specific and general genomic information","volume":"35","author":"Dam","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013110113914100_B8","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1023\/B:DAMI.0000005258.31418.83","article-title":"Mining frequent patterns without candidate generation: a frequent-pattern tree approach","volume":"8","author":"Han","year":"2004","journal-title":"Data Min. Knowl. Discov."},{"key":"2023013110113914100_B9","doi-asserted-by":"crossref","first-page":"638","DOI":"10.1089\/cmb.2005.12.638","article-title":"Identifying conserved gene clusters in the presence of homology families","volume":"12","author":"He","year":"2005","journal-title":"J. Comput. Biol."},{"key":"2023013110113914100_B10","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1089\/cmb.2005.12.1083","article-title":"The statistical analysis of spatially clustered genes under the maximum gap criterion","volume":"12","author":"Hoberman","year":"2005","journal-title":"J. Comput. Biol."},{"key":"2023013110113914100_B11","doi-asserted-by":"crossref","first-page":"5849","DOI":"10.1073\/pnas.95.11.5849","article-title":"Measuring genome evolution","volume":"95","author":"Huynen","year":"1998","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110113914100_B12","doi-asserted-by":"crossref","first-page":"1204","DOI":"10.1101\/gr.10.8.1204","article-title":"Predicting protein function by genomic context: quantitative evaluation and qualitative inferences","volume":"10","author":"Huynen","year":"2000","journal-title":"Genome Res."},{"key":"2023013110113914100_B13","doi-asserted-by":"crossref","first-page":"332","DOI":"10.1093\/oxfordjournals.molbev.a026114","article-title":"Evolutionary instability of operon structures disclosed by sequence comparisons of complete microbial genomes","volume":"16","author":"Itoh","year":"1999","journal-title":"Mol. Biol. Evol."},{"key":"2023013110113914100_B14","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1016\/S0022-2836(61)80072-7","article-title":"Genetic regulatory mechanisms in the synthesis of proteins","volume":"3","author":"Jacob","year":"1961","journal-title":"J. Mol. Biol."},{"key":"2023013110113914100_B15","doi-asserted-by":"crossref","first-page":"2083","DOI":"10.1101\/gad.1561207","article-title":"The interaction of DiaA and DnaA regulates the replication cycle in E. coli by directly promoting ATP DnaA-specific initiation complexes","volume":"21","author":"Keyamura","year":"2007","journal-title":"Genes Dev."},{"key":"2023013110113914100_B16","doi-asserted-by":"crossref","first-page":"1919","DOI":"10.1101\/gr.7090407","article-title":"Reliable prediction of regulator targets using 12 Drosophila genomes","volume":"17","author":"Kheradpour","year":"2007","journal-title":"Genome Res."},{"key":"2023013110113914100_B17","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1109\/CSB.2005.33","article-title":"Gene teams with relaxed proximity constraint","volume-title":"CSB '05: Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference (CSB'05).","author":"Kim","year":"2005"},{"key":"2023013110113914100_B18","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1006\/jmbi.2001.4701","article-title":"Snapping up functionally related genes based on context information: a colinearity-free approach","volume":"311","author":"Kolesov","year":"2001","journal-title":"J. Mol. Biol."},{"key":"2023013110113914100_B19","doi-asserted-by":"crossref","first-page":"474","DOI":"10.1016\/S0968-0004(00)01663-7","article-title":"Gene context conservation of a higher order than operons","volume":"25","author":"Lathe","year":"2000","journal-title":"Trends Biochem. Sci."},{"key":"2023013110113914100_B20","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1016\/S0092-8674(02)00900-5","article-title":"Shared strategies in gene organization among prokaryotes and eukaryotes","volume":"110","author":"Lawrence","year":"2002","journal-title":"Cell"},{"key":"2023013110113914100_B21","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1089\/cmb.2008.0010","article-title":"Efficiently identifying max-gap clusters in pairwise genome comparison","volume":"15","author":"Ling","year":"2008","journal-title":"J. Comput. Biol."},{"key":"2023013110113914100_B22","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/S1476-9271(02)00097-X","article-title":"Gene teams: a new formalization of gene clusters for comparative genomics","volume":"27","author":"Luc","year":"2003","journal-title":"Comput. Biol. Chem"},{"key":"2023013110113914100_B23","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1016\/0168-9525(96)20006-X","article-title":"Gene order is not conserved in bacterial evolution","volume":"12","author":"Mushegian","year":"1996","journal-title":"Trends Genet."},{"key":"2023013110113914100_B24","doi-asserted-by":"crossref","first-page":"4552","DOI":"10.1128\/JB.187.13.4552-4561.2005","article-title":"Characterization of six lipoproteins in the sigmaE regulon","volume":"187","author":"Onufryk","year":"2005","journal-title":"J. Bacteriol."},{"key":"2023013110113914100_B25","doi-asserted-by":"crossref","first-page":"2896","DOI":"10.1073\/pnas.96.6.2896","article-title":"The use of gene clusters to infer functional coupling","volume":"96","author":"Overbeek","year":"1999","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110113914100_B26","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1101\/gr.3638405","article-title":"Identification of genomic features using microsyntenies of domains: domain teams","volume":"15","author":"Pasek","year":"2005","journal-title":"Genome Res."},{"key":"2023013110113914100_B27","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1101\/gr.3368805","article-title":"Operon formation is driven by co-regulation and not by horizontal gene transfer","volume":"15","author":"Price","year":"2005","journal-title":"Genome Res."},{"key":"2023013110113914100_B28","doi-asserted-by":"crossref","first-page":"e96","DOI":"10.1371\/journal.pgen.0020096","article-title":"The life-cycle of operons","volume":"2","author":"Price","year":"2006","journal-title":"PLoS Genet."},{"key":"2023013110113914100_B29","doi-asserted-by":"crossref","first-page":"2212","DOI":"10.1093\/nar\/30.10.2212","article-title":"Connected gene neighborhoods in prokaryotic genomes","volume":"30","author":"Rogozin","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023013110113914100_B30","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1093\/bib\/5.2.131","article-title":"Computational approaches for the analysis of gene neighbourhoods in prokaryotic genomes","volume":"5","author":"Rogozin","year":"2004","journal-title":"Brief. Bioinform."},{"key":"2023013110113914100_B31","doi-asserted-by":"crossref","first-page":"D394","DOI":"10.1093\/nar\/gkj156","article-title":"Regulondb (version 5.0): Escherichia coli k-12 transcriptional regulatory network, operon organization, and growth conditions","volume":"1","author":"Salgado","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013110113914100_B32","doi-asserted-by":"crossref","first-page":"5890","DOI":"10.1073\/pnas.092632599","article-title":"The identification of functional modules from the genomic association of genes","volume":"99","author":"Snel","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110113914100_B33","doi-asserted-by":"crossref","first-page":"775","DOI":"10.1038\/nrg1688","article-title":"The role of chromatin structure in regulating the expression of clustered genes","volume":"6","author":"Sproul","year":"2005","journal-title":"Nat. Rev. Genet."},{"key":"2023013110113914100_B34","doi-asserted-by":"crossref","DOI":"10.1186\/gb-2001-2-6-research0020","article-title":"Evolution of gene order conservation in prokaryotes","volume":"2","author":"Tamames","year":"2001","journal-title":"Genome Biol."},{"key":"2023013110113914100_B35","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/1471-2105-4-41","article-title":"The cog database: an updated version includes eukaryotes","volume":"4","author":"Tatusov","year":"2003","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 1","key":"2023013110113914100_B36","doi-asserted-by":"crossref","first-page":"S57","DOI":"10.1007\/PL00000052","article-title":"Genome plasticity as a paradigm of eubacteria evolution","volume":"44","author":"Watanabe","year":"1997","journal-title":"J. Mol. Evol."},{"key":"2023013110113914100_B37","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1101\/gr.161901","article-title":"Genome alignment, evolution of prokaryotic genome organization, and prediction of gene function using genomic context","volume":"11","author":"Wolf","year":"2001","journal-title":"Genome Res."},{"key":"2023013110113914100_B38","first-page":"247","article-title":"Prediction of functional modules based on gene distributions in microbial genomes","volume":"16","author":"Wu","year":"2005","journal-title":"Genome Inform."},{"key":"2023013110113914100_B39","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1101\/gr.072322.107","article-title":"Large-scale analysis of gene clustering in bacteria","volume":"18","author":"Yang","year":"2008","journal-title":"Genome Res."},{"key":"2023013110113914100_B40","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1186\/1471-2105-6-243","article-title":"Phylogenetic detection of conserved gene clusters in microbial genomes","volume":"6","author":"Zheng","year":"2005","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/5\/571\/48984216\/bioinformatics_25_5_571.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/5\/571\/48984216\/bioinformatics_25_5_571.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T19:49:15Z","timestamp":1675194555000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/5\/571\/183132"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,1,21]]},"references-count":40,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2009,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp027","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,3,1]]},"published":{"date-parts":[[2009,1,21]]}}}