{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T06:07:51Z","timestamp":1761718071817},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2008,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Exonic splicing enhancers (ESEs) activate nearby splice sites and promote the inclusion (vs. exclusion) of exons in which they reside, while being a binding site for SR proteins. To study the impact of ESEs on alternative splicing it would be useful to have a possibility to detect them in exons. Identifying SR protein-binding sites in human DNA sequences by machine learning techniques is a formidable task, since the exon sequences are also constrained by their functional role in coding for proteins.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>The choice of training examples needed for machine learning approaches is difficult since there are only few exact locations of human ESEs described in the literature which could be considered as positive examples. Additionally, it is unclear which sequences are suitable as negative examples. Therefore, we developed a motif-oriented data-extraction method that extracts exon sequences around experimentally or theoretically determined ESE patterns. Positive examples are restricted by heuristics based on known properties of ESEs, e.g. location in the vicinity of a splice site, whereas negative examples are taken in the same way from the middle of long exons. 
We show that a suitably chosen SVM using optimized sequence kernels (e.g., combined oligo kernel) can extract meaningful properties from these training examples. Once the classifier is trained, every potential ESE sequence can be passed to the SVM for verification. Using SVMs with the combined oligo kernel yields a high accuracy of about 90 percent and well interpretable parameters.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The motif-oriented data-extraction method seems to produce consistent training and test data leading to good classification rates and thus allows verification of potential ESE motifs. The best results were obtained using an SVM with the combined oligo kernel, while oligo kernels with oligomers of a certain length could be used to extract relevant features.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-9-369","type":"journal-article","created":{"date-parts":[[2008,9,10]],"date-time":"2008-09-10T18:13:54Z","timestamp":1221070434000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Automatic detection of exonic splicing enhancers (ESEs) using 
SVMs"],"prefix":"10.1186","volume":"9","author":[{"given":"Britta","family":"Mersch","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Gepperth","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"S\u00e1ndor","family":"Suhai","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Agnes","family":"Hotz-Wagenblatt","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2008,9,10]]},"reference":[{"issue":"3","key":"2354_CR1","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1016\/S0968-0004(00)01549-8","volume":"25","author":"BJ Blencowe","year":"2000","unstructured":"Blencowe BJ: Exonic splicing enhancers: mechanism of action, diversity and role in human genetic diseases. Trends Biochem Sci 2000, 25(3):106\u2013110.","journal-title":"Trends Biochem Sci"},{"issue":"4","key":"2354_CR2","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1038\/nrg775","volume":"3","author":"L Cartegni","year":"2002","unstructured":"Cartegni L, Chew SL, Krainer AR: Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat Rev Genet 2002, 3(4):285\u2013298.","journal-title":"Nat Rev Genet"},{"issue":"9","key":"2354_CR3","doi-asserted-by":"publisher","first-page":"1197","DOI":"10.1017\/S1355838200000960","volume":"6","author":"BR Graveley","year":"2000","unstructured":"Graveley BR: Sorting out the complexity of SR protein functions. RNA 2000, 6(9):1197\u20131211.","journal-title":"RNA"},{"issue":"6","key":"2354_CR4","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1017\/S1355838201010524","volume":"7","author":"LA Boukis","year":"2001","unstructured":"Boukis LA, Bruzik JP: Functional selection of splicing enhancers that stimulate trans-splicing in vitro. 
RNA 2001, 7(6):793\u2013805.","journal-title":"RNA"},{"issue":"4","key":"2354_CR5","doi-asserted-by":"publisher","first-page":"2143","DOI":"10.1128\/MCB.17.4.2143","volume":"17","author":"LR Coulter","year":"1997","unstructured":"Coulter LR, Landree MA, Cooper TA: Identification of a new class of exonic splicing enhancers by in vivo selection. Mol Cell Biol 1997, 17(4):2143\u20132150.","journal-title":"Mol Cell Biol"},{"issue":"13","key":"2354_CR6","doi-asserted-by":"publisher","first-page":"1998","DOI":"10.1101\/gad.12.13.1998","volume":"12","author":"HX Liu","year":"1998","unstructured":"Liu HX, Zhang M, Krainer AR: Identification of functional exonic splicing enhancer motifs recognized by individual SR proteins. Genes Dev 1998, 12(13):1998\u20132012.","journal-title":"Genes Dev"},{"issue":"3","key":"2354_CR7","doi-asserted-by":"publisher","first-page":"1063","DOI":"10.1128\/MCB.20.3.1063-1071.2000","volume":"20","author":"HX Liu","year":"2000","unstructured":"Liu HX, Chew SL, Cartegni L, Zhang MQ, Krainer AR: Exonic splicing enhancer motif recognized by human SC35 under splicing conditions. Mol Cell Biol 2000, 20(3):1063\u20131071.","journal-title":"Mol Cell Biol"},{"issue":"3","key":"2354_CR8","doi-asserted-by":"publisher","first-page":"1705","DOI":"10.1128\/MCB.19.3.1705","volume":"19","author":"TD Schaal","year":"1999","unstructured":"Schaal TD, Maniatis T: Selection and characterization of pre-mRNA splicing enhancers: identification of novel SR protein-specific enhancer sequences. Mol Cell Biol 1999, 19(3):1705\u20131719.","journal-title":"Mol Cell Biol"},{"issue":"11","key":"2354_CR9","doi-asserted-by":"publisher","first-page":"6291","DOI":"10.1128\/MCB.15.11.6291","volume":"15","author":"H Tian","year":"1995","unstructured":"Tian H, Kole R: Selection of novel exon recognition elements from a pool of random sequences. 
Mol Cell Biol 1995, 15(11):6291\u20136298.","journal-title":"Mol Cell Biol"},{"issue":"5583","key":"2354_CR10","doi-asserted-by":"publisher","first-page":"1007","DOI":"10.1126\/science.1073774","volume":"297","author":"WG Fairbrother","year":"2002","unstructured":"Fairbrother WG, Yeh RF, Sharp PA, Burge CB: Predictive identification of exonic splicing enhancers in human genes. Science 2002, 297(5583):1007\u20131013.","journal-title":"Science"},{"issue":"11","key":"2354_CR11","doi-asserted-by":"publisher","first-page":"1241","DOI":"10.1101\/gad.1195304","volume":"18","author":"XHF Zhang","year":"2004","unstructured":"Zhang XHF, Chasin LA: Computational definition of sequence motifs governing constitutive exon splicing. Genes Dev 2004, 18(11):1241\u20131250.","journal-title":"Genes Dev"},{"issue":"13","key":"2354_CR12","doi-asserted-by":"publisher","first-page":"3568","DOI":"10.1093\/nar\/gkg616","volume":"31","author":"L Cartegni","year":"2003","unstructured":"Cartegni L, Wang J, Zhu Z, Zhang MQ, Krainer AR: ESEfinder: A web resource to identify exonic splicing enhancers. Nucleic Acids Res 2003, 31(13):3568\u20133571.","journal-title":"Nucleic Acids Res"},{"key":"2354_CR13","unstructured":"SEE ESE[http:\/\/www.cbcb.umd.edu\/software\/SeeEse\/]"},{"key":"2354_CR14","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511801389","volume-title":"An Introduction to Support Vector Machines and Other Kernel-based Learning Methods","author":"N Cristianini","year":"2000","unstructured":"Cristianini N, Shawe-Taylor J: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press; 2000."},{"key":"2354_CR15","volume-title":"Kernel Methods in Computational Biology. Computational Molecular Biology","year":"2004","unstructured":"Sch\u00f6lkopf B, Tsuda K, Vert JP, (Eds): Kernel Methods in Computational Biology. Computational Molecular Biology. 
MIT Press; 2004."},{"key":"2354_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning Theory","author":"V Vapnik","year":"1995","unstructured":"Vapnik V: The Nature of Statistical Learning Theory. New York, USA: Springer-Verlag; 1995."},{"key":"2354_CR17","doi-asserted-by":"publisher","first-page":"419","DOI":"10.1186\/1471-2105-7-419","volume":"7","author":"T Down","year":"2006","unstructured":"Down T, Leong B, Hubbard TJP: A machine learning strategy to identify candidate binding sites in human protein-coding sequence. BMC Bioinformatics 2006, 7: 419.","journal-title":"BMC Bioinformatics"},{"key":"2354_CR18","doi-asserted-by":"crossref","unstructured":"Ashurst JL, Chen CK, Gilbert JGR, Jekosch K, Keenan S, Meidl P, Searle SM, Stalker J, Storey R, Trevanion S, Wilming L, Hubbard T: The Vertebrate Genome Annotation (Vega) database. Nucleic Acids Res 2005, (33 Database):D459-D465.","DOI":"10.1093\/nar\/gki135"},{"issue":"6","key":"2354_CR19","doi-asserted-by":"publisher","first-page":"2411","DOI":"10.1074\/jbc.270.6.2411","volume":"270","author":"SM Berget","year":"1995","unstructured":"Berget SM: Exon recognition in vertebrate splicing. J Biol Chem 1995, 270(6):2411\u20132414.","journal-title":"J Biol Chem"},{"issue":"11","key":"2354_CR20","doi-asserted-by":"publisher","first-page":"7347","DOI":"10.1128\/MCB.19.11.7347","volume":"19","author":"CF Bourgeois","year":"1999","unstructured":"Bourgeois CF, Popielarz M, Hildwein G, Stevenin J: Identification of a bidirectional splicing enhancer: differential involvement of SR proteins in 5' or 3' splice site activation. 
Mol Cell Biol 1999, 19(11):7347\u20137356.","journal-title":"Mol Cell Biol"},{"issue":"17","key":"2354_CR21","doi-asserted-by":"publisher","first-page":"e117","DOI":"10.1093\/nar\/gkl544","volume":"34","author":"M Hiller","year":"2006","unstructured":"Hiller M, Pudimat R, Busch A, Backofen R: Using RNA secondary structures to guide sequence motif finding towards single-stranded regions. Nucleic Acids Res 2006, 34(17):e117.","journal-title":"Nucleic Acids Res"},{"key":"2354_CR22","doi-asserted-by":"publisher","first-page":"169","DOI":"10.1186\/1471-2105-5-169","volume":"5","author":"P Meinicke","year":"2004","unstructured":"Meinicke P, Tech M, Morgenstern B, Merkl R: Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites. BMC Bioinformatics 2004, 5: 169.","journal-title":"BMC Bioinformatics"},{"issue":"9","key":"2354_CR23","doi-asserted-by":"publisher","first-page":"799","DOI":"10.1093\/bioinformatics\/16.9.799","volume":"16","author":"A Zien","year":"2000","unstructured":"Zien A, R\u00e4tsch G, Mika S, Sch\u00f6lkopf B, Lengauer T, M\u00fcller KR: Engineering support vector machine kernels that recognize translation initiation sites. Bioinformatics 2000, 16(9):799\u2013807.","journal-title":"Bioinformatics"},{"issue":"2","key":"2354_CR24","doi-asserted-by":"publisher","first-page":"216","DOI":"10.1109\/TCBB.2007.070208","volume":"4","author":"C Igel","year":"2007","unstructured":"Igel C, Glasmachers T, Mersch B, Pfeifer N, Meinicke P: Gradient-based optimization of kernel-target alignment for sequence kernels applied to bacterial gene start detection. 
IEEE\/ACM Trans Comput Biol Bioinform 2007, 4(2):216\u2013226.","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"issue":"2","key":"2354_CR25","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta 1975, 405(2):442\u2013451.","journal-title":"Biochim Biophys Acta"},{"key":"2354_CR26","doi-asserted-by":"publisher","first-page":"199","DOI":"10.1007\/3-540-33019-4_9","volume-title":"Multi-Objective Machine Learning","author":"T Suttorp","year":"2006","unstructured":"Suttorp T, Igel C: Multi-objective optimization of support vector machines. In Multi-Objective Machine Learning. Volume 16. Edited by: Jin Y. Springer-Verlag; 2006:199\u2013220."},{"key":"2354_CR27","first-page":"584","volume-title":"PPSN IV: Proceedings of the 4th International Conference on Parallel Problem Solving from Nature","author":"CM Fonseca","year":"1996","unstructured":"Fonseca CM, Fleming PJ: On the Performance Assessment and Comparison of Stochastic Multiobjective Optimizers. In PPSN IV: Proceedings of the 4th International Conference on Parallel Problem Solving from Nature. London, UK: Springer-Verlag; 1996:584\u2013593."},{"key":"2354_CR28","volume-title":"Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond","author":"B Sch\u00f6lkopf","year":"2002","unstructured":"Sch\u00f6lkopf B, Smola A: Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. The MIT Press; 2002."},{"key":"2354_CR29","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1186\/1471-2105-8-159","volume":"8","author":"M Pertea","year":"2007","unstructured":"Pertea M, Mount SM, Salzberg SL: A computational survey of candidate exonic splicing enhancer motifs in the model plant Arabidopsis thaliana. 
BMC Bioinformatics 2007, 8: 159.","journal-title":"BMC Bioinformatics"},{"issue":"5","key":"2354_CR30","doi-asserted-by":"publisher","first-page":"369","DOI":"10.1142\/S0129065707001214","volume":"17","author":"B Mersch","year":"2007","unstructured":"Mersch B, Glasmachers T, Meinicke P, Igel C: Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts. International Journal of Neural Systems 2007, 17(5):369\u2013381.","journal-title":"International Journal of Neural Systems"},{"issue":"Suppl 2","key":"2354_CR31","doi-asserted-by":"publisher","first-page":"S75","DOI":"10.1093\/bioinformatics\/18.suppl_2.S75","volume":"18","author":"S Degroeve","year":"2002","unstructured":"Degroeve S, Baets BD, de Peer YV, Rouz\u00e9 P: Feature subset selection for splice site prediction. Bioinformatics 2002, 18(Suppl 2):S75-S83.","journal-title":"Bioinformatics"},{"key":"2354_CR32","first-page":"564","volume-title":"Proceedings of the Pacific Symposium on Biocomputing","author":"C Leslie","year":"2002","unstructured":"Leslie C, Eskin E, Noble WS: The Spectrum Kernel: A string kernel for SVM protein classification. In Proceedings of the Pacific Symposium on Biocomputing. Edited by: Altman RB, Dunker AK, Hunter L, Lauerdale H, Klein TE. World Scientific; 2002:564\u2013575."},{"key":"2354_CR33","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1016\/S0167-7306(08)60461-5","volume-title":"Computational Methods in Molecular Biology","author":"A Krogh","year":"1998","unstructured":"Krogh A: An introduction to Hidden Markov Models for biological sequences. In Computational Methods in Molecular Biology. Edited by: Salzberg SL, Searls DB, Kasif S. Elsevier; 1998:45\u201363."},{"issue":"2","key":"2354_CR34","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1109\/TCBB.2005.27","volume":"2","author":"JC Rajapakse","year":"2005","unstructured":"Rajapakse JC, Ho LS: Markov encoding for detecting signals in genomic sequences. 
IEEE\/ACM Trans Comput Biol Bioinform 2005, 2(2):131\u2013142.","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"issue":"10","key":"2354_CR35","doi-asserted-by":"publisher","first-page":"1579","DOI":"10.1261\/rna.2990205","volume":"11","author":"AK Dubey","year":"2005","unstructured":"Dubey AK, Baker CS, Romeo T, Babitzke P: RNA sequence and secondary structure participate in high-affinity CsrA-RNA interaction. RNA 2005, 11(10):1579\u20131587.","journal-title":"RNA"},{"key":"2354_CR36","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1093\/nar\/gki153","volume":"33","author":"T Hori","year":"2005","unstructured":"Hori T, Taguchi Y, Uesugi S, Kurihara Y: The RNA ligands for mouse proline-rich RNA-binding protein (mouse Prrp) contain two consensus sequences in separate loop structure. Nucleic Acids Res 2005, 33: 190\u2013200.","journal-title":"Nucleic Acids Res"},{"issue":"20","key":"2354_CR37","doi-asserted-by":"publisher","first-page":"17484","DOI":"10.1074\/jbc.M010594200","volume":"276","author":"T Thisted","year":"2001","unstructured":"Thisted T, Lyakhov DL, Liebhaber SA: Optimized RNA targets of two closely related triple KH domain proteins, heterogeneous nuclear ribonucleoprotein K and alphaCP-2KL, suggest Distinct modes of RNA recognition. J Biol Chem 2001, 276(20):17484\u201317496.","journal-title":"J Biol Chem"},{"key":"2354_CR38","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1007\/BF00818163","volume":"125","author":"IL Hofacker","year":"1994","unstructured":"Hofacker IL, Fontana W, Stadler PF, Bonhoeffer LS, Tacker M, Schuster P: Fast Folding and Comparison of RNA Secondary Structures. 
Monatshefte Chemie 1994, 125: 167\u2013188.","journal-title":"Monatshefte Chemie"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-9-369.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T11:04:12Z","timestamp":1630494252000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-9-369"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,9,10]]},"references-count":38,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2008,12]]}},"alternative-id":["2354"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-9-369","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2008,9,10]]},"assertion":[{"value":"20 December 2007","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 September 2008","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 September 2008","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"369"}}