{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,19]],"date-time":"2026-01-19T13:54:35Z","timestamp":1768830875463,"version":"3.49.0"},"reference-count":27,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However the posterior classification of the output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in the DNA sequences, which might indicate not only that the pattern is important, but also provide hints of the positions where these patterns occur preferentially.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we have compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed better in this dataset, we proceeded to study the positional distribution of several well known cis-regulatory elements, in the promoter sequences of different organisms (<jats:italic>S. cerevisiae<\/jats:italic>, <jats:italic>H. sapiens<\/jats:italic>, <jats:italic>D. melanogaster<\/jats:italic>, <jats:italic>E. coli<\/jats:italic> and several Dicotyledons plants). The results show that position conservation is relevant for the transcriptional machinery.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes, and therefore, that non-uniformity is a good indicator of biological relevance and can be used to complement over-representation tests commonly used. In this article we present the results obtained for the <jats:italic>S. cerevisiae<\/jats:italic> data sets.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-89","type":"journal-article","created":{"date-parts":[[2008,2,7]],"date-time":"2008-02-07T19:14:57Z","timestamp":1202411697000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance"],"prefix":"10.1186","volume":"9","author":[{"given":"Ana C","family":"Casimiro","sequence":"first","affiliation":[]},{"given":"Susana","family":"Vinga","sequence":"additional","affiliation":[]},{"given":"Ana T","family":"Freitas","sequence":"additional","affiliation":[]},{"given":"Arlindo L","family":"Oliveira","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2008,2,7]]},"reference":[{"key":"2074_CR1","volume-title":"Algorithms on Strings, Trees, and Sequences","author":"D Gusfield","year":"1999","unstructured":"Gusfield D: Algorithms on Strings, Trees, and Sequences. Cambridge University Press; 1999."},{"key":"2074_CR2","first-page":"28","volume-title":"Proceedings of the seventh Annual International Conference on computational molecular biology","author":"Y Barash","year":"2003","unstructured":"Barash Y, Elidan G, Friedman N, Kaplan T: Modeling dependencies in protein-DNA binding sites. Proceedings of the seventh Annual International Conference on computational molecular biology 2003, 28\u201337."},{"key":"2074_CR3","first-page":"268","volume":"5","author":"J Schug","year":"1997","unstructured":"Schug J, Overton GC: Modeling transcription factor binding sites with Gibbs Sampling and Minimum Description Length encoding. Proceedings of the International Conference on Intelligent Systems for Molecular Biology 1997, 5: 268\u2013271.","journal-title":"Proceedings of the International Conference on Intelligent Systems for Molecular Biology"},{"key":"2074_CR4","first-page":"28","volume-title":"Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology","author":"T Bailey","year":"1994","unstructured":"Bailey T, Elkan C: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology 1994, 28\u201336."},{"issue":"2","key":"2074_CR5","doi-asserted-by":"publisher","first-page":"126","DOI":"10.1109\/TCBB.2006.16","volume":"3","author":"AM Carvalho","year":"2006","unstructured":"Carvalho AM, Freitas AT, Oliveira AL, Sagot MF: An efficient algorithm for the identification of structured motifs in DNA promoter sequences. IEEE Transactions on Computational Biology and Bioinformatics 2006, 3(2):126\u2013140.","journal-title":"IEEE Transactions on Computational Biology and Bioinformatics"},{"issue":"3\/4","key":"2074_CR6","doi-asserted-by":"publisher","first-page":"345","DOI":"10.1089\/106652700750050826","volume":"7","author":"L Marsan","year":"2000","unstructured":"Marsan L, Sagot MF: Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification. Journal of Computational Biology 2000, 7(3\/4):345\u2013360.","journal-title":"Journal of Computational Biology"},{"key":"2074_CR7","volume-title":"DNA words and models","author":"S Robin","year":"2005","unstructured":"Robin S, Rodolphe F, Schbath S: DNA words and models. Cambridge University Press, NY; 2005."},{"key":"2074_CR8","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1186\/1748-7188-1-8","volume":"1","author":"N Li","year":"2006","unstructured":"Li N, Tompa M: Analysis of computational approaches for motif discovery. Algorithms for Molecular Biology 2006, 1: 8.","journal-title":"Algorithms for Molecular Biology"},{"issue":"5","key":"2074_CR9","doi-asserted-by":"publisher","first-page":"1205","DOI":"10.1006\/jmbi.2000.3519","volume":"296","author":"JD Hudges","year":"2000","unstructured":"Hudges JD, Estep PW, Tavazoie S, Church GM: Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae . Journal of Molecular Biology 2000, 296(5):1205\u20131214.","journal-title":"Journal of Molecular Biology"},{"issue":"6","key":"2074_CR10","first-page":"735","volume":"24","author":"G Yagil","year":"2007","unstructured":"Yagil G: Binary DNA tracts as transcription control elements. Journal of Biomolecular Structure and Dynamics 2007, 24(6):735\u2013736.","journal-title":"Journal of Biomolecular Structure and Dynamics"},{"issue":"7104","key":"2074_CR11","doi-asserted-by":"publisher","first-page":"772","DOI":"10.1038\/nature04979","volume":"442","author":"E Segal","year":"2006","unstructured":"Segal E, Fondufe-Mittendorf Y, Chen LY, Thastrom A, Field Y, Moore IK, Wang JPZ, Widom J: A genomic code for nucleosome positioning. Nature 2006, 442(7104):772\u2013778.","journal-title":"Nature"},{"issue":"14","key":"2074_CR12","doi-asserted-by":"publisher","first-page":"e384","DOI":"10.1093\/bioinformatics\/btl251","volume":"22","author":"L Narlikar","year":"2006","unstructured":"Narlikar L, Gordan R, Ohler U, Hartemink AJ: Informative priors based on transcription factor structural class improve de novo motif discovery. Bioinformatics 2006, 22(14):e384\u201392.","journal-title":"Bioinformatics"},{"issue":"11","key":"2074_CR13","doi-asserted-by":"publisher","first-page":"3851","DOI":"10.1073\/pnas.0400611101","volume":"101","author":"A Erives","year":"2004","unstructured":"Erives A, Levine M: Coordinate enhancers share common organizational features in the Drosophila genome. Proc Natl Acad Sci U S A. 2004, 101(11):3851\u20133856.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"20","key":"2074_CR14","doi-asserted-by":"publisher","first-page":"6016","DOI":"10.1093\/nar\/gkg799","volume":"31","author":"VJ Makeev","year":"2003","unstructured":"Makeev VJ, Lifanov AP, Nazina AG, Papatsenko DA: Distance preferences in the arrangement of binding motifs and hierarchical levels in organization of transcription regulatory information. Nucleic Acids Research 2003, 31(20):6016\u20136026.","journal-title":"Nucleic Acids Research"},{"key":"2074_CR15","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511802843","volume-title":"Bootstrap methods and their application","author":"AC Davison","year":"1997","unstructured":"Davison AC, Hinkley DV: Bootstrap methods and their application. Cambridge University Press; 1997."},{"issue":"5","key":"2074_CR16","doi-asserted-by":"publisher","first-page":"699","DOI":"10.1016\/S0092-8674(04)00205-3","volume":"116","author":"AD Basehoar","year":"2004","unstructured":"Basehoar AD, Zanton SJ, Pugh BF: Identification and distinct regulation of yeast TATA box-containing genes. Cell 2004, 116(5):699\u2013709.","journal-title":"Cell"},{"issue":"4","key":"2074_CR17","doi-asserted-by":"publisher","first-page":"563","DOI":"10.1016\/0022-2836(90)90223-9","volume":"212","author":"P Bucher","year":"1990","unstructured":"Bucher P: Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J Mol Biol 1990, 212(4):563\u201378.","journal-title":"J Mol Biol"},{"issue":"20","key":"2074_CR18","doi-asserted-by":"publisher","first-page":"2583","DOI":"10.1101\/gad.1026202","volume":"16","author":"JEF Butler","year":"2002","unstructured":"Butler JEF, Kadonaga JT: The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev 2002, 16(20):2583\u20132592.","journal-title":"Genes Dev"},{"issue":"4","key":"2074_CR19","doi-asserted-by":"publisher","first-page":"636","DOI":"10.1101\/gad.4.4.636","volume":"4","author":"VL Singer","year":"1990","unstructured":"Singer VL, Wobbe CR, Struhl K: A wide variety of DNA sequences can functionally replace a yeast TATA element for transcriptional activation. Genes & Development 1990, 4(4):636\u2013645.","journal-title":"Genes & Development"},{"issue":"14","key":"2074_CR20","doi-asserted-by":"crossref","first-page":"4190","DOI":"10.1128\/jb.177.14.4190-4193.1995","volume":"177","author":"D Blinder","year":"1995","unstructured":"Blinder D, Magasanik B: Recognition of nitrogen-responsive upstream activation sequences of Saccharomyces cerevisiae by the product of the GLN3 gene . J Bacteriol 1995, 177(14):4190\u20134193.","journal-title":"J Bacteriol"},{"issue":"11","key":"2074_CR21","doi-asserted-by":"crossref","first-page":"3416","DOI":"10.1128\/jb.179.11.3416-3429.1997","volume":"179","author":"JA Coffman","year":"1997","unstructured":"Coffman JA, Rai R, Loprete DM, Cunninham T, Svetlov V, Cooper TG: Cross regulation of four GATA factors that control nitrogen catabolic gene expression in Saccharomyces cerevisiae . J Bacteriol 1997, 179(11):3416\u20133429.","journal-title":"J Bacteriol"},{"issue":"6","key":"2074_CR22","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1089\/10665270260518254","volume":"9","author":"S Robin","year":"2002","unstructured":"Robin S, Daudin JJ, Richard H, Sagot MF, Schbath S: Occurrence Probability of Structured Motifs in Random Sequences. Journal of Computational Biology 2002, 9(6):761\u2013774.","journal-title":"Journal of Computational Biology"},{"key":"2074_CR23","first-page":"D446","volume-title":"Nucleic Acids Research","author":"MC Teixeira","year":"2006","unstructured":"Teixeira MC, Monteiro P, Jain P, Tenreiro S, Fernandes AR, Mira NP, Alenquer M, Freitas AT, Oliveira AL, S\u00e1-Correia I: The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Research 2006, (34 Database):D446-D451."},{"key":"2074_CR24","first-page":"D82","volume-title":"Nucleic Acids Research","author":"C Schmid","year":"2006","unstructured":"Schmid C, Rouaida DP, Praz V, Bucher P: EPD in its twentieth year: towards complete promoter coverage of selected model organisms. Nucleic Acids Research 2006, (34 Database):D82-D85."},{"issue":"13","key":"2074_CR25","doi-asserted-by":"publisher","first-page":"4754","DOI":"10.1128\/MCB.20.13.4754-4764.2000","volume":"20","author":"AK Kutach","year":"2000","unstructured":"Kutach AK, Kadonaga JT: The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters. Mol Cell Biol 2000, 20(13):4754\u20134764.","journal-title":"Mol Cell Biol"},{"key":"2074_CR26","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1093\/nar\/26.1.55","volume":"26","author":"AM Huerta","year":"1998","unstructured":"Huerta AM, Salgado H, Thieffry D, Collado-Vides J: RegulonDB: a database on transcriptional regulation in Escherichia coli. Nucleic Acids Research 1998, 26: 55\u201359.","journal-title":"Nucleic Acids Research"},{"key":"2074_CR27","doi-asserted-by":"publisher","first-page":"114","DOI":"10.1093\/nar\/gkg041","volume":"31","author":"IA Shahmuradov","year":"2003","unstructured":"Shahmuradov IA, Gammerman AJ, Hancock JM, Bramley PM, Solovyev VV: PlantProm: a database of plant promoter sequences. Nucleic Acids Research 2003, 31: 114\u2013117.","journal-title":"Nucleic Acids Research"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-89.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T03:14:52Z","timestamp":1630466092000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-89"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,2,7]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2074"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-89","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,2,7]]},"assertion":[{"value":"17 August 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 February 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 February 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"89"}}