{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T19:12:27Z","timestamp":1761765147535,"version":"3.37.0"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2719,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Understanding transcriptional regulation is one of the main challenges in computational biology. An important problem is the identification of transcription factor (TF) binding sites in promoter regions of potential TF target genes. It is typically approached by position weight matrix-based motif identification algorithms using Gibbs sampling, or heuristics to extend seed oligos. Such algorithms succeed in identifying single, relatively well-conserved binding sites, but tend to fail when it comes to the identification of combinations of several degenerate binding sites, as those often found in cis-regulatory modules.<\/jats:p><jats:p>Results: We propose a new algorithm that combines the benefits of existing motif finding with the ones of support vector machines (SVMs) to find degenerate motifs in order to improve the modeling of regulatory modules. 
In experiments on microarray data from Arabidopsis thaliana, we were able to show that the newly developed strategy significantly improves the recognition of TF targets.<\/jats:p><jats:p>Availability: The python source code (open source-licensed under GPL), the data for the experiments and a Galaxy-based web service are available at http:\/\/www.fml.mpg.de\/raetsch\/suppl\/kirmes\/<\/jats:p><jats:p>Contact: sebi@tuebingen.mpg.de<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp278","type":"journal-article","created":{"date-parts":[[2009,4,24]],"date-time":"2009-04-24T00:25:34Z","timestamp":1240532734000},"page":"2126-2133","source":"Crossref","is-referenced-by-count":14,"title":["KIRMES: kernel-based identification of regulatory modules in euchromatic sequences"],"prefix":"10.1093","volume":"25","author":[{"given":"Sebastian J.","family":"Schultheiss","sequence":"first","affiliation":[{"name":"1 Friedrich Miescher Laboratory of the Max Planck Society, and 2Max Planck Institute for Developmental Biology, T\u00fcbingen, 3Department of Stem Cell Biology, University of Heidelberg and 4Wilhelm Schickard Institute for Computer Science, University of T\u00fcbingen, Germany"}]},{"given":"Wolfgang","family":"Busch","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory of the Max Planck Society, and 2Max Planck Institute for Developmental Biology, T\u00fcbingen, 3Department of Stem Cell Biology, University of Heidelberg and 4Wilhelm Schickard Institute for Computer Science, University of T\u00fcbingen, Germany"}]},{"given":"Jan 
U.","family":"Lohmann","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory of the Max Planck Society, and 2Max Planck Institute for Developmental Biology, T\u00fcbingen, 3Department of Stem Cell Biology, University of Heidelberg and 4Wilhelm Schickard Institute for Computer Science, University of T\u00fcbingen, Germany"}]},{"given":"Oliver","family":"Kohlbacher","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory of the Max Planck Society, and 2Max Planck Institute for Developmental Biology, T\u00fcbingen, 3Department of Stem Cell Biology, University of Heidelberg and 4Wilhelm Schickard Institute for Computer Science, University of T\u00fcbingen, Germany"}]},{"given":"Gunnar","family":"R\u00e4tsch","sequence":"additional","affiliation":[{"name":"1 Friedrich Miescher Laboratory of the Max Planck Society, and 2Max Planck Institute for Developmental Biology, T\u00fcbingen, 3Department of Stem Cell Biology, University of Heidelberg and 4Wilhelm Schickard Institute for Computer Science, University of T\u00fcbingen, Germany"}]}],"member":"286","published-online":{"date-parts":[[2009,4,23]]},"reference":[{"key":"2023013112090906300_B1","first-page":"28","article-title":"Fitting a mixture model by expectation maximization to discover motifs in biopolymers","volume-title":"Proceedings of ISMB'94","author":"Bailey","year":"1994"},{"key":"2023013112090906300_B2","doi-asserted-by":"crossref","first-page":"e1000173","DOI":"10.1371\/journal.pcbi.1000173","article-title":"Support vector machines and kernels for computational biology","volume":"4","author":"Ben-Hur","year":"2008","journal-title":"PLoS Comput. 
Biol."},{"key":"2023013112090906300_B3","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1145\/130385.130401","article-title":"A training algorithm for optimal margin classifiers","volume-title":"Proceedings COLT '92.","author":"Boser","year":"1992"},{"key":"2023013112090906300_B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.1365-313X.2004.02272.x","article-title":"Identification of novel heat shock factor-dependent genes and biochemical pathways in A. thaliana","volume":"41","author":"Busch","year":"2005","journal-title":"Plant J."},{"key":"2023013112090906300_B5","doi-asserted-by":"crossref","first-page":"e1000071","DOI":"10.1371\/journal.pcbi.1000071","article-title":"Discovering sequence motifs with arbitrary insertions and deletions","volume":"4","author":"Frith","year":"2008","journal-title":"PLoS Comput. Biol."},{"key":"2023013112090906300_B6","doi-asserted-by":"crossref","first-page":"1451","DOI":"10.1101\/gr.4086505","article-title":"Galaxy: a platform for interactive large-scale genome analysis","volume":"15","author":"Giardine","year":"2005","journal-title":"Genome Res."},{"key":"2023013112090906300_B7","first-page":"98","article-title":"A fast, alignment-free, conservation-based method for transcription factor binding site discovery","volume-title":"Lecture Notes in Computer Science: RECOMB 2008","author":"Gord\u00e2n","year":"2008"},{"key":"2023013112090906300_B8","doi-asserted-by":"crossref","first-page":"7079","DOI":"10.1073\/pnas.0408743102","article-title":"De novo cis-regulatory module elicitation for eukaryotic genomes","volume":"102","author":"Gupta","year":"2005","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023013112090906300_B9","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1038\/nature02800","article-title":"Transcriptional regulatory code of a eukaryotic genome","volume":"431","author":"Harbison","year":"2004","journal-title":"Nature"},{"key":"2023013112090906300_B10","article-title":"Making large-scale SVM learning practical","volume-title":"Advances in Kernel Methods - Support Vector Learning.","author":"Joachims","year":"1999"},{"key":"2023013112090906300_B11","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1126\/science.8211139","article-title":"Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment","volume":"262","author":"Lawrence","year":"1993","journal-title":"Science"},{"key":"2023013112090906300_B12","doi-asserted-by":"crossref","first-page":"1172","DOI":"10.1038\/nature04270","article-title":"Wuschel controls meristem function by direct regulation of cytokinin-inducible response regulators","volume":"438","author":"Leibfried","year":"2005","journal-title":"Nature"},{"key":"2023013112090906300_B13","first-page":"564","article-title":"The spectrum kernel: a string kernel for SVM protein classification","volume-title":"Proceedings of the Pacific Symposium on Biocomputing","author":"Leslie","year":"2002"},{"key":"2023013112090906300_B14","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1093\/bioinformatics\/btg431","article-title":"Mismatch string kernels for discriminative protein classification","volume":"20","author":"Leslie","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013112090906300_B15","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/nar\/gkg108","article-title":"Transfac: transcriptional regulation, from patterns to profiles","volume":"31","author":"Matys","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023013112090906300_B16","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1186\/1471-2105-5-169","article-title":"Oligo 
kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites","volume":"5","author":"Meinicke","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023013112090906300_B17","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s11263-005-3848-x","article-title":"A comparison of affine region detectors","volume":"65","author":"Mikolajczyk","year":"2005","journal-title":"Int. J. Comput. Vis."},{"key":"2023013112090906300_B18","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1109\/72.914517","article-title":"An introduction to kernel-based learning algorithms","volume":"12","author":"M\u00fcller","year":"2001","journal-title":"IEEE Trans. Neural Netw."},{"key":"2023013112090906300_B19","doi-asserted-by":"crossref","first-page":"1565","DOI":"10.1038\/nbt1206-1565","article-title":"What is a support vector machine?","volume":"24","author":"Noble","year":"2006","journal-title":"Nat. Biotechnol."},{"key":"2023013112090906300_B20","doi-asserted-by":"crossref","DOI":"10.1007\/11744085_38","article-title":"Sampling strategies for bag-of-features image classification","volume-title":"European Conference on Computer Vision","author":"Nowak","year":"2006"},{"key":"2023013112090906300_B21","doi-asserted-by":"crossref","first-page":"277","DOI":"10.7551\/mitpress\/4057.003.0018","article-title":"Accurate splice site detection for Caenorhabditis elegans","volume-title":"Kernel Methods in Computational Biology","author":"R\u00e4tsch","year":"2004"},{"issue":"Suppl. 1","key":"2023013112090906300_B22","doi-asserted-by":"crossref","first-page":"i369","DOI":"10.1093\/bioinformatics\/bti1053","article-title":"RASE: recognition of alternatively spliced exons in C. 
elegans","volume":"21","author":"R\u00e4tsch","year":"2005","journal-title":"Bioinformatics"},{"key":"2023013112090906300_B23","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1111\/j.1365-313X.2004.02061.x","article-title":"Development and evaluation of an Arabidopsis whole genome affymetrix probe array","volume":"38","author":"Redman","year":"2004","journal-title":"Plant J."},{"key":"2023013112090906300_B24","doi-asserted-by":"crossref","first-page":"D91","DOI":"10.1093\/nar\/gkh012","article-title":"Jaspar: an open-access database for eukaryotic transcription factor binding profiles","volume":"32","author":"Sandelin","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"2023013112090906300_B25","doi-asserted-by":"crossref","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","article-title":"Sequence logos: a new way to display consensus sequences","volume":"18","author":"Schneider","year":"1990","journal-title":"Nucleic Acids Res."},{"key":"2023013112090906300_B26","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1016\/0022-2836(86)90165-8","article-title":"Information content of binding sites on nucleotide sequences","volume":"188","author":"Schneider","year":"1986","journal-title":"J. Mol. 
Biol."},{"key":"2023013112090906300_B27","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/4175.001.0001","volume-title":"Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond","author":"Sch\u00f6lkopf","year":"2001"},{"volume-title":"Learning with Kernels","year":"2002","author":"Sch\u00f6lkopf","key":"2023013112090906300_B28"},{"key":"2023013112090906300_B29","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/4057.001.0001","volume-title":"Kernel Methods In Computational Biology","author":"Sch\u00f6lkopf","year":"2004"},{"key":"2023013112090906300_B30","doi-asserted-by":"crossref","first-page":"822","DOI":"10.1089\/cmb.2005.12.822","article-title":"A discriminative model for identifying spatial cis-regulatory modules","volume":"12","author":"Segal","year":"2005","journal-title":"J. Comput. Biol."},{"key":"2023013112090906300_B31","doi-asserted-by":"crossref","first-page":"5549","DOI":"10.1093\/nar\/gkf669","article-title":"Discovery of novel transcription factor binding sites by statistical overrepresentation","volume":"30","author":"Sinha","year":"2002","journal-title":"Nucleic Acids Res."},{"issue":"Suppl. 1","key":"2023013112090906300_B32","doi-asserted-by":"crossref","first-page":"S15","DOI":"10.1186\/1471-2148-7-S1-S15","article-title":"Evolution of motif variants and positional bias of the cyclic-amp response element","volume":"7","author":"Smith","year":"2007","journal-title":"BMC Evol. Biol."},{"key":"2023013112090906300_B33","first-page":"1531","article-title":"Large scale multiple kernel learning","volume":"7","author":"Sonnenburg","year":"2006","journal-title":"J. Mach. Learn. Res."},{"issue":"Suppl. 
10","key":"2023013112090906300_B34","doi-asserted-by":"crossref","first-page":"S7","DOI":"10.1186\/1471-2105-8-S10-S7","article-title":"Accurate splice site prediction using support vector machines","volume":"8","author":"Sonnenburg","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023013112090906300_B35","doi-asserted-by":"crossref","first-page":"73","DOI":"10.7551\/mitpress\/7496.003.0006","article-title":"Large scale learning with string kernels","volume-title":"Large Scale Kernel Machines","author":"Sonnenburg","year":"2007"},{"key":"2023013112090906300_B36","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1093\/bioinformatics\/btn170","article-title":"POIMs: positional oligomer importance matrices\u2013understanding support vector machine-based signal detectors","volume":"24","author":"Sonnenburg","year":"2008","journal-title":"Bioinformatics"},{"key":"2023013112090906300_B37","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1093\/bioinformatics\/16.1.16","article-title":"DNA binding sites: representation and discovery","volume":"16","author":"Stormo","year":"2000","journal-title":"Bioinformatics"},{"issue":"Database issue","key":"2023013112090906300_B38","doi-asserted-by":"crossref","first-page":"D1009","DOI":"10.1093\/nar\/gkm965","article-title":"The Arabidopsis Information Resource (TAIR): gene structure and function annotation","volume":"36","author":"Swarbreck","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023013112090906300_B39","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1093\/bioinformatics\/18.2.331","article-title":"Inclusive: integrated clustering, upstream sequence retrieval and motif sampling","volume":"18","author":"Thijs","year":"2002","journal-title":"Bioinformatics"},{"key":"2023013112090906300_B40","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1093\/bioinformatics\/14.4.317","article-title":"Automatic extraction of motifs represented in the hidden Markov model from a number 
of DNA sequences","volume":"14","author":"Yada","year":"1998","journal-title":"Bioinformatics"},{"key":"2023013112090906300_B41","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1093\/bioinformatics\/16.9.799","article-title":"Engineering support vector machine kernels that recognize translation initiation sites","volume":"16","author":"Zien","year":"2000","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/16\/2126\/48992513\/bioinformatics_25_16_2126.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/16\/2126\/48992513\/bioinformatics_25_16_2126.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,9]],"date-time":"2025-02-09T04:37:54Z","timestamp":1739075874000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/16\/2126\/204231"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,4,23]]},"references-count":41,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2009,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp278","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2009,8,15]]},"published":{"date-parts":[[2009,4,23]]}}}