{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,21]],"date-time":"2025-09-21T18:32:11Z","timestamp":1758479531287},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search (or predict) known binding sites in a new DNA sequence. For this reason, all subsequences of the given DNA sequence are scored based on a scoring function and the prediction is done by selecting the best score. By assuming no dependency between binding site base positions, most of the available tools for known binding site prediction are designed. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependency or independency in binding site base positions.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We propose a new scoring function based on the dependency between all positions in binding site base positions. 
This scoring function uses joint information content and mutual information as a measure of dependency between positions in transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on the real data sets extracted from JASPAR and TRANSFAC data bases, and compare the obtained results with two other well known scoring functions.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple mathematical calculations. By implementing our method on several biological data sets, it can be induced that this method performs better than methods that do not consider dependencies.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-10-93","type":"journal-article","created":{"date-parts":[[2009,3,21]],"date-time":"2009-03-21T07:13:06Z","timestamp":1237619586000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["New scoring schema for finding motifs in DNA 
Sequences"],"prefix":"10.1186","volume":"10","author":[{"given":"Fatemeh","family":"Zare-Mirakabad","sequence":"first","affiliation":[]},{"given":"Hayedeh","family":"Ahrabian","sequence":"additional","affiliation":[]},{"given":"Mehdei","family":"Sadeghi","sequence":"additional","affiliation":[]},{"given":"Abbas","family":"Nowzari-Dalini","sequence":"additional","affiliation":[]},{"given":"Bahram","family":"Goliaei","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,3,20]]},"reference":[{"key":"2823_CR1","doi-asserted-by":"publisher","first-page":"909","DOI":"10.1093\/bioinformatics\/bth006","volume":"20","author":"Q Zhou","year":"2004","unstructured":"Zhou Q, Liu J: Modeling within-motif dependence for transcription factor binding site predictions. Bioinformatics. 2004, 20: 909-916.","journal-title":"Bioinformatics"},{"key":"2823_CR2","doi-asserted-by":"publisher","first-page":"314","DOI":"10.1089\/cmb.2005.12.314","volume":"12","author":"L Hertzberg","year":"2005","unstructured":"Hertzberg L, Zuk O, Getz G, Domany E: Finding Motifs in Promoter Regions. J Compu Biology. 2005, 12: 314-330.","journal-title":"J Compu Biology"},{"key":"2823_CR3","doi-asserted-by":"publisher","first-page":"W249","DOI":"10.1093\/nar\/gkh372","volume":"32","author":"A Sandelin","year":"2004","unstructured":"Sandelin A, Wasserman W, Lenhard B: ConSite: web-based prediction of regulatory elements using cross-species comparision. Nucleic Acids Res. 2004, 32: W249-W252.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR4","doi-asserted-by":"publisher","first-page":"3576","DOI":"10.1093\/nar\/gkg585","volume":"31","author":"A Kel","year":"2003","unstructured":"Kel A, G\u00f6\u00dfling E, Reuter I, Cheremushkin E, Kel-Margoulis O, Wingender E: MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. 
2003, 31: 3576-3579.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR5","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1186\/1471-2105-6-79","volume":"6","author":"V Marinescu","year":"2005","unstructured":"Marinescu V, Kohane I, Riva A: MAPPER: A search engine for the computational identification of putative transcription factor binding sites in multiple genomes. BMC Bioinformatics. 2005, 6: 79-","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"2823_CR6","first-page":"81","volume":"6","author":"G Hertz","year":"1990","unstructured":"Hertz G, Hartzell G, Stormo G: Identification of consensus patterns in unaligned DNA sequences known to be functionally related. Comput Appl Biosci. 1990, 6 (2): 81-92.","journal-title":"Comput Appl Biosci"},{"key":"2823_CR7","doi-asserted-by":"publisher","first-page":"W217","DOI":"10.1093\/nar\/gkh383","volume":"32","author":"G Loots","year":"2004","unstructured":"Loots G, Ovcharenkol I: rVISTA 2.0: Evolutionary analysis of transcription factor binding sites. Nucleic Acids Res. 2004, 32: W217-W221.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR8","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1126\/science.8211139","volume":"262","author":"C Lawrence","year":"1993","unstructured":"Lawrence C, Altschul S, Bogusky M, Liu J, Neuwald A, Wootton J: Detecting subtle sequence signals: Gibbs sampling strategy for multiple alignment. Science. 1993, 262: 208-214.","journal-title":"Science"},{"key":"2823_CR9","doi-asserted-by":"publisher","first-page":"1205","DOI":"10.1006\/jmbi.2000.3519","volume":"296","author":"J Hughes","year":"2000","unstructured":"Hughes J, Estep P, Tavazoie S, Church G: Computational identification of cis-regulatory elements associated with functionally coherent groups of genes in Saccharomyces Cerevisiae. J Mol Biology. 
2000, 296: 1205-1214.","journal-title":"J Mol Biology"},{"key":"2823_CR10","first-page":"21","volume-title":"Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology","author":"T Bailey","year":"1995","unstructured":"Bailey T, Elkan C: The value of priori knowledge in discovering motifs with MEME. Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology. 1995, AAAI Press, Menlo Park, CA, 21-29."},{"key":"2823_CR11","doi-asserted-by":"publisher","first-page":"3586","DOI":"10.1093\/nar\/gkg618","volume":"31","author":"S Sinha","year":"2003","unstructured":"Sinha S, Tompa M: YMF: A program for discovery of novel transcription factor binding sites by statistical overrepresentation. Nucleic Acids Res. 2003, 31: 3586-3588.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR12","doi-asserted-by":"publisher","first-page":"1093","DOI":"10.1093\/nar\/20.5.1093","volume":"20","author":"W Day","year":"1992","unstructured":"Day W, McMorris F: Critical comparision of consensus methods for molecular sequences. Nucleic Acids Res. 1992, 20: 1093-1099.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR13","doi-asserted-by":"publisher","first-page":"2971","DOI":"10.1093\/nar\/10.9.2971","volume":"10","author":"G Stormo","year":"1982","unstructured":"Stormo G, Schneider T, Gold L: Characterization of translational initiation sites in E. Coli. Nucleic Acids Res. 1982, 10: 2971-2996.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR14","doi-asserted-by":"publisher","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","volume":"18","author":"T Schneider","year":"1990","unstructured":"Schneider T, Stephens R: Sequence logos: A new way to display consensus sequences. Nucleic Acids Res. 
1990, 18: 6097-6100.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR15","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1101\/gr.6902","volume":"12","author":"M Blanchette","year":"2002","unstructured":"Blanchette M, Tompa M: Discovery of regulatory elements by a computational method for phylogenetic footprinting. Genome Res. 2002, 12: 739-748.","journal-title":"Genome Res"},{"key":"2823_CR16","doi-asserted-by":"publisher","first-page":"345","DOI":"10.1089\/106652700750050826","volume":"7","author":"L Marsan","year":"2000","unstructured":"Marsan L, Sagot M: Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification. J Comput Biol. 2000, 7: 345-360.","journal-title":"J Comput Biol"},{"key":"2823_CR17","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1186\/1471-2105-6-121","volume":"6","author":"S Bortoluzzi","year":"2005","unstructured":"Bortoluzzi S, Coppe A, Bisognin A, Pizzi C, Danieli G: A multistep bioinformatic approach detects putative regulatory elements in gene promoters. BMC Bioinformatics. 2005, 6: 121-136.","journal-title":"BMC Bioinformatics"},{"key":"2823_CR18","doi-asserted-by":"publisher","first-page":"4442","DOI":"10.1093\/nar\/gkf578","volume":"30","author":"P Benos","year":"2002","unstructured":"Benos P, Bulyk M, Stormo G: Additivity in Protein-DNA interactions: how good an approximation is it?. Nucleic Acids Res. 2002, 30: 4442-4451.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR19","doi-asserted-by":"publisher","first-page":"1255","DOI":"10.1093\/nar\/30.5.1255","volume":"30","author":"M Bulyk","year":"2002","unstructured":"Bulyk M, Johnson P, Church G: Nucleotides of transcription factor binding site exert independent effects on the binding affinities of transcription factors. Nucleic Acids Res. 
2002, 30: 1255-1261.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR20","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1145\/640075.640079","volume-title":"Proceedings of the seventh annual international conference on Research in computational molecular biology","author":"Y Barash","year":"2003","unstructured":"Barash Y, Elidan G, Friedman N, Kaplan T: Modeling dependencies in protein-DNA binding sites. Proceedings of the seventh annual international conference on Research in computational molecular biology. 2003, Berlin, Germany: ACM, New York, NY, 28-37."},{"key":"2823_CR21","doi-asserted-by":"publisher","first-page":"894","DOI":"10.1089\/cmb.2005.12.894","volume":"12","author":"X Zhao","year":"2005","unstructured":"Zhao X, Huang H, Speed T: Finding short DNA motifs using permuted Markov models. J Comput Biol. 2005, 12: 894-906.","journal-title":"J Comput Biol"},{"key":"2823_CR22","doi-asserted-by":"publisher","first-page":"S100","DOI":"10.1093\/bioinformatics\/18.suppl_2.S100","volume":"18 Suppl 2","author":"K Ellrott","year":"2002","unstructured":"Ellrott K, Yang C, Sladek F, Jiang T: Identifiying transcription factor binding sites through Markov chain optimization. Bioinformatics. 2002, 18 Suppl 2: S100-S109.","journal-title":"Bioinformatics"},{"key":"2823_CR23","doi-asserted-by":"publisher","first-page":"e116","DOI":"10.1093\/nar\/gng117","volume":"31","author":"O King","year":"2003","unstructured":"King O, Roth F: A non-parametric model for transcription factor binding sites. Nucleic Acids Res. 2003, 31: e116-","journal-title":"Nucleic Acids Res"},{"key":"2823_CR24","doi-asserted-by":"publisher","first-page":"933","DOI":"10.1093\/bioinformatics\/btm055","volume":"23","author":"A Tomovic","year":"2007","unstructured":"Tomovic A, Oakeley E: Position dependencies in transcription factor binding sites. Bioinformatics. 
2007, 23: 933-941.","journal-title":"Bioinformatics"},{"key":"2823_CR25","first-page":"269","volume-title":"Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology","author":"P Pevzner","year":"2000","unstructured":"Pevzner P, Sze S: Combinatorial approaches to finding subtle signals in DNA sequences. Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology. 2000, AAAI Press, Menlo Park, CA, 269-278."},{"key":"2823_CR26","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1006\/jtbi.1998.0785","volume":"195","author":"G Stormo","year":"1998","unstructured":"Stormo G: Information content and free energy in DNA-Protein interaction. J Theor Biol. 1998, 195: 135-137.","journal-title":"J Theor Biol"},{"key":"2823_CR27","doi-asserted-by":"publisher","first-page":"701","DOI":"10.1016\/S0022-2836(02)00917-8","volume":"323","author":"P Benos","year":"2002","unstructured":"Benos P, Lapedes A, Stormo G: Probabilistic code for DNA recognition by proteins of EGR family. J Mol Biol. 2002, 323: 701-727.","journal-title":"J Mol Biol"},{"key":"2823_CR28","doi-asserted-by":"publisher","first-page":"1135","DOI":"10.1093\/bioinformatics\/18.8.1135","volume":"18","author":"B Lenhard","year":"2002","unstructured":"Lenhard B, Wasserman W: TFBS: Computational framework for transcription factor binding site analysis. Bioinformatics. 2002, 18: 1135-1136.","journal-title":"Bioinformatics"},{"key":"2823_CR29","doi-asserted-by":"publisher","first-page":"238","DOI":"10.1093\/nar\/24.1.238","volume":"24","author":"E Wingender","year":"1996","unstructured":"Wingender E, Dietze P, Karas H, Knuppel R: TRANSFAC: A database on transcription factors and their DNA binding sites. Nucleic Acids Res. 
1996, 24: 238-241.","journal-title":"Nucleic Acids Res"},{"key":"2823_CR30","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1186\/1471-2105-8-193","volume":"8","author":"G Sandve","year":"2007","unstructured":"Sandve G, Abul O, Walseng V, Drabl\u00f8s F: Improved benchmarks for computational motif discovery. BMC Bioinformatics. 2007, 8: 193-","journal-title":"BMC Bioinformatics"},{"key":"2823_CR31","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1038\/nbt1053","volume":"23","author":"M Tompa","year":"2005","unstructured":"Tompa M, Li N, Bailey T, Church G, De Moor B, Eskin E, Favorov A, Frith M, Fu Y, Kent W, Makeev V, Mironov A, Noble W, Pavesi G, Pesole G, Regnier M, Simonis N, Sinha S, Thijs G, van Helden J, Vandenbogaert M, Weng Z, Workman C, Ye C, Zhu Z: Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotechnol. 2005, 23: 137-144.","journal-title":"Nat Biotechnol"},{"key":"2823_CR32","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1006\/geno.1996.0298","volume":"34","author":"M Burset","year":"1996","unstructured":"Burset M, Guigo R: Evaluation of gene structure prediction programs. Genomics. 1996, 34: 353-367.","journal-title":"Genomics"},{"key":"2823_CR33","doi-asserted-by":"publisher","first-page":"563","DOI":"10.1093\/bioinformatics\/15.7.607","volume":"15","author":"J Zhu","year":"1999","unstructured":"Zhu J, Zhang M: SCPD: A promoter database of yeast Saccharomyces Cerevisiae. Bioinformatics. 
1999, 15: 563-577.","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-93.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T21:32:58Z","timestamp":1630445578000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-93"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,3,20]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["2823"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-93","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,3,20]]},"assertion":[{"value":"14 September 2008","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 March 2009","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 March 2009","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"93"}}